ChatPredict

Can an AI read the news and call the stock? We're testing it — live.

Twice every trading day a machine collects US-stock headlines, has three language models judge each one — YES, NO, or UNKNOWN — and records everything raw. A forward, out-of-sample replication of Lopez-Lira & Tang (2023).

The paper: “Can ChatGPT Forecast Stock Price Movements?”, arXiv 2304.07619 · paper ledger only, no real trades.


Afternoon round completed: fired 2026-06-17 at 2:30 PM ET (on time) · 683 articles · 595 judged · database committed 3:32 PM ET.

Rounds fire 8:00 AM & 2:30 PM ET, every trading day

How a round works

Live numbers from the latest round: afternoon round · Jun 17, fired 2:30 PM ET.

940 collected · 683 kept (−257 dropped) · 595 judged · 1,785 verdicts · committed 3:32 PM ET
01 The clock fires
Two rounds every trading day: morning for overnight and pre-market news, afternoon for trading-day news. Never early; late is allowed and recorded; never twice.
Fired 2:30 PM ET · on time

Morning round — fires 8:00 AM ET, judges P3 (published after 4pm yesterday) + P1 (published before 6am today). Decisions feed today's 9:30 OPEN (orders by 9:28); collection window: since yesterday 14:30 ET.

Afternoon round — fires 2:30 PM ET, judges P2 (published 6am–4pm today). Decisions feed today's 16:00 CLOSE (orders by 15:58); collection window: since today 06:00 ET.
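The P1 / P2 / P3 split described above can be sketched as a function of the publish time in Eastern Time. This is a minimal illustration, not the project's code: the name `publish_bucket` is hypothetical, and the treatment of the exact 6:00 and 16:00 boundaries is an assumption, since the text only says "before 6am", "6am–4pm" and "after 4pm".

```python
from datetime import datetime, time
from zoneinfo import ZoneInfo

ET = ZoneInfo("America/New_York")

def publish_bucket(published: datetime) -> str:
    """Return 'P1', 'P2' or 'P3' for a timezone-aware publish time.

    Assumed boundaries: P1 is before 06:00 ET, P2 is 06:00 up to
    (not including) 16:00 ET, P3 is 16:00 ET and later.
    """
    if published.tzinfo is None:
        raise ValueError("publish time must be timezone-aware")
    local = published.astimezone(ET).time()
    if local < time(6, 0):
        return "P1"   # overnight news, judged by the morning round
    if local < time(16, 0):
        return "P2"   # trading-day news, judged by the afternoon round
    return "P3"       # after the close, judged by the next morning round
```

A morning round would then judge articles bucketed P3 (from yesterday) and P1 (from today), and an afternoon round those bucketed P2.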

An early round would truncate its window and silently miss news, so the gate forbids it. A late round still collects everything published before its target — the slip is recorded in the run ledger, and the analysis judges each day's fidelity. Schedule and gate: src/chatpredict/schedule.py · source available on request
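The three gate rules (never early, late allowed and recorded, never twice) can be expressed in a few lines. The sketch below is an assumption about how such a gate could look; `gate`, its arguments and its return shape are hypothetical and are not taken from src/chatpredict/schedule.py.

```python
from datetime import datetime
from typing import Tuple

def gate(now: datetime, target: datetime, already_ran: bool) -> Tuple[bool, float]:
    """Decide whether a round may start.

    Returns (allowed, slip_seconds). slip_seconds is how late the round
    starts relative to its target and is meant to be written to the ledger.
    """
    if already_ran:
        return (False, 0.0)   # never twice
    if now < target:
        return (False, 0.0)   # never early: the window would be truncated
    return (True, (now - target).total_seconds())   # late is allowed, slip recorded
```

Under these assumptions, a round targeted at 14:30 that starts at 14:35 is allowed with a recorded slip of 300 seconds, while one attempted at 14:29 is refused.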

02 Collect the news
Market-wide feeds plus hot-list look-ups, pulled in parallel. The publish time is the law.
940 articles fetched

Market-wide feeds deliver ticker-tagged articles for the whole US market at once:

Alpaca news (Benzinga wire) · 376
Market-wide professional news wire; every article arrives tagged with its tickers.
Polygon.io news · 180
Aggregated market news API, ticker-tagged (free tier lags ~1h — recorded as-is).
GlobeNewswire press releases · 37
Official company press releases (earnings, deals, guidance) straight from the wire.
PR Newswire press releases
The other big press-release wire — includes the law-firm 'investor alert' spam we record raw and filter in analysis.
SEC EDGAR 8-K filings · 62
Material-event filings companies are legally required to make — the ground-truth feed.

“What's hot” lists name stocks that should have news today — 87 stocks were flagged this round (every one recorded, including those where we then found nothing):

Top gainers & losers
Alpaca screener: the day's 50 biggest gainers + 50 biggest losers (EOD snapshot — its own freshness stamp is recorded every round).
Most-active stocks
Alpaca screener: the 30 highest-volume names — volume spikes often mean news.
Earnings calendar
Finnhub's calendar of companies reporting today/tomorrow — earnings days make news.

Targeted look-ups then query flagged stocks the feeds didn't cover:

Finnhub per-stock news
First stop for a targeted look-up on a hot-list stock the feeds missed.
Yahoo Finance per-stock news
Fallback look-up if Finnhub is rate-limited — one exhausted source never sinks a round.

Pipeline declared once in src/chatpredict/flow.py · source available on request — the runner builds from it and this page renders it, so the diagram cannot drift from what runs.

Market-wide news feeds · 655 articles kept · 5 feeds, every article pre-tagged with its tickers
Hot lists → targeted look-ups · 28 articles kept · 87 stocks flagged; the uncovered ones looked up one by one
03 Keep what qualifies
Exact ticker matches against 5,852 US common stocks; everything else is dropped and the reason logged.
683 articles kept (683 = 655 + 28)

An article is kept only if it maps to exactly one US common stock — via the source's own ticker tag, an official SEC identifier, or an explicit ticker in the text. We never fuzzy-match company names; that is how news ends up filed under the wrong company. Every rejected article is stored in the drop log with its reason — “no record” never means “quietly discarded”.
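The exact-match rule can be illustrated with a small resolver. Everything named here is hypothetical: the tiny `UNIVERSE`, the `CIK_TO_TICKER` map, the `EXPLICIT` pattern for tickers in text, the article dictionary keys and the `resolve` function are assumptions for the sketch, not the contents of src/chatpredict/resolver.py. Mapping an ambiguous article (more than one in-universe ticker) to the "no ticker" reason is also an assumption.

```python
import re
from typing import Optional, Tuple

# Hypothetical miniature universe; the real one holds 5,852 US common stocks.
UNIVERSE = {"AAPL", "AMD", "AMZN"}
# Hypothetical SEC identifier (CIK) map with a single entry.
CIK_TO_TICKER = {"0000320193": "AAPL"}
# Assumed explicit-ticker formats: "(NASDAQ: AMD)", "(NYSE: BB)" or "$AMD".
EXPLICIT = re.compile(r"\((?:NASDAQ|NYSE):\s*([A-Z]{1,5})\)|\$([A-Z]{1,5})\b")

def resolve(article: dict) -> Tuple[Optional[str], str]:
    """Return (ticker, reason). The ticker is None when the article is dropped."""
    # 1. The source's own ticker tag.
    candidates = set(article.get("tickers", []))
    # 2. An official SEC identifier.
    if not candidates and article.get("cik") in CIK_TO_TICKER:
        candidates = {CIK_TO_TICKER[article["cik"]]}
    # 3. An explicit ticker in the text. Company names are never matched.
    if not candidates:
        for m in EXPLICIT.finditer(article.get("text", "")):
            candidates.add(m.group(1) or m.group(2))
    if not candidates:
        return None, "no ticker"
    in_universe = candidates & UNIVERSE
    if len(in_universe) == 1:
        return next(iter(in_universe)), "kept"
    if not in_universe:
        return None, "out of universe"
    return None, "no ticker"   # ambiguous: no single exact stock (assumed reason)
```

Note that a headline mentioning only a company name, with no tag, identifier or explicit ticker, is dropped with a logged reason rather than guessed at.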

dropped | reason | meaning
237 | out of universe | not a US common stock in our universe
20 | no ticker | no exact stock could be identified

Resolution: src/chatpredict/resolver.py · source available on request
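The round's published counts can be cross-checked against each other. The short script below uses only numbers stated on this page; the variable and function names are illustrative.

```python
# Figures as published for the Jun 17 afternoon round.
fetched, kept, judged, verdicts = 940, 683, 595, 1785
dropped = {"out of universe": 237, "no ticker": 20}
kept_by_path = {"market-wide feeds": 655, "targeted look-ups": 28}
models = 3

def reconcile() -> dict:
    """Return each consistency check with its pass/fail result."""
    return {
        "drops explain fetched - kept": sum(dropped.values()) == fetched - kept,
        "paths sum to kept": sum(kept_by_path.values()) == kept,
        "verdicts = judged x models": judged * models == verdicts,
    }

for name, ok in reconcile().items():
    print(f"{name}: {'ok' if ok else 'MISMATCH'}")
```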

04 Three models judge every headline
The paper's exact question, word for word, at temperature 0: YES, NO, or UNKNOWN.
595 headlines judged

Each (stock × headline) is judged once per model, ever — at temperature 0 a re-ask returns the same answer. A rate-capped or failed model is retried next round; one that answered is never re-asked. The free models are paced to their providers' limits, which is why a round takes about 80 minutes — the panel runs in parallel, so the wall-clock is the slowest model. Tallies below are this round's.
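The "judged once per model, ever" rule amounts to a cache keyed by (model, stock, headline) in which only successful answers are final. The sketch below is an assumption about one way to do it: `judge_once`, `parse_verdict`, the dictionary store and the first-word parsing rule are all hypothetical, and `ask` stands in for whatever client call src/chatpredict/model_clients.py actually makes.

```python
import time
from typing import Callable, Dict, Tuple

Key = Tuple[str, str, str]   # (model, ticker, headline)

def parse_verdict(raw: str) -> str:
    """Map a raw reply to YES, NO or UNKNOWN using its first word (assumed rule)."""
    words = raw.strip().split()
    first = words[0].strip(".,:;!").upper() if words else ""
    return first if first in {"YES", "NO"} else "UNKNOWN"

def judge_once(store: Dict[Key, dict], model: str, ticker: str, headline: str,
               ask: Callable[[str, str], str], pause_s: float = 0.0) -> dict:
    """Ask a model about one (stock, headline) pair unless it already answered."""
    key = (model, ticker, headline)
    prior = store.get(key)
    if prior is not None and prior["status"] == "ok":
        return prior                       # answered before: never re-asked
    time.sleep(pause_s)                    # pacing, e.g. 6.1 for a 10 calls/min tier
    try:
        raw = ask(ticker, headline)
        row = {"raw": raw, "answer": parse_verdict(raw), "status": "ok"}
    except Exception:
        # Failed or rate-capped: stored as a gap, eligible for retry next round.
        row = {"raw": None, "answer": None, "status": "failed"}
    store[key] = row
    return row
```

Keeping the raw reply next to the parsed answer is what lets a parser fault be told apart from a model that never says NO.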

gpt-4.1-nano
OpenAI · Paid anchor — the model that replicates the paper (the gpt-5 family rejects the paper's required temperature=0; verified live).
~paid tier, 0.2s between calls
155 yes · 43 no · 396 unknown
gemini-2.5-flash
Google · Free panel member — does the signal survive a different model family?
free tier, 10 calls/min (6.1s pacing)
221 yes · 73 no · 301 unknown
gpt-4o-mini
OpenAI · Third panel member — a second, distinct OpenAI model. Replaced Groq llama-3.1-8b-instant on 2026-06-15: that 8B model never emits NO (0/472 on day one, even on bankruptcy/fraud — verified it's the model, not our parser), so it could never short.
~paid tier, 0.2s between calls
191 yes · 141 no · 263 unknown

Clients and pacing: src/chatpredict/model_clients.py · source available on request

05 Everything stored raw
Verdicts, drops, no-news checks: all of it. The strategy is computed later, as an analysis over a complete record.
1,785 verdict rows

A verdict row keeps the model's raw reply verbatim alongside the parsed answer, the relevance tag, and a status (ok / failed / rate-capped) — so every gap is explainable. Articles keep their raw source payloads; flagged-but-newsless stocks are recorded too. UNKNOWN and not-relevant answers are kept like everything else: no-signal is also a result. Nothing is filtered live.
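A verdict row with those fields could be stored in SQLite roughly as follows. The table name, column names and key are assumptions made for illustration; the project's real schema is not shown on this page.

```python
import sqlite3

# Hypothetical schema: one row per (article, ticker, model).
SCHEMA = """
CREATE TABLE IF NOT EXISTS verdicts (
    article_id TEXT NOT NULL,
    ticker     TEXT NOT NULL,
    model      TEXT NOT NULL,
    raw_reply  TEXT,                 -- the model's reply, verbatim
    answer     TEXT CHECK (answer IN ('YES', 'NO', 'UNKNOWN')),
    relevance  TEXT,
    status     TEXT NOT NULL CHECK (status IN ('ok', 'failed', 'rate-capped')),
    PRIMARY KEY (article_id, ticker, model)
);
"""

def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the store and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn

def record(conn: sqlite3.Connection, row: tuple) -> None:
    """Insert one verdict row; a failed call is stored with NULL reply and answer."""
    conn.execute("INSERT INTO verdicts VALUES (?, ?, ?, ?, ?, ?, ?)", row)
    conn.commit()
```

With this key, a retry of a failed or rate-capped row would need an update rather than a second insert; how the project handles that is not stated here.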

06 Committed to the record
The database is committed to a git repository: append-only history, every change timestamped.
Committed 3:32 PM ET

After every round the SQLite file is committed back to the repository — simultaneously the store, the backup, and the audit trail. If new code lands mid-round, the save step rebases and retries; if saving still fails, the data is parked as a downloadable artifact. This site is regenerated from that file on every commit and has no write access to anything. See the run ledger.
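The save step (commit, then rebase and retry if the push is rejected, then park the data) might look like the sketch below. It is an assumption throughout: the file name `chatpredict.sqlite`, the commit message, the retry count and the `save_round` function are hypothetical, and returning "parked" merely stands in for uploading the file as a downloadable artifact.

```python
import subprocess

def save_round(run=subprocess.run, attempts: int = 3) -> str:
    """Commit the database; rebase and retry if the push is rejected.

    `run` is injectable so the flow can be exercised without a repository.
    """
    run(["git", "add", "chatpredict.sqlite"], check=True)
    run(["git", "commit", "-m", "round data"], check=True)
    for _ in range(attempts):
        if run(["git", "push"]).returncode == 0:
            return "pushed"
        # New code landed mid-round: replay our commit on top of it, then retry.
        run(["git", "pull", "--rebase"], check=True)
    return "parked"   # caller would upload the file as an artifact instead
```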

What the models said today

gpt-4.1-nano
Paid anchor
277 yes · 112 no · 877 unknown
gemini-2.5-flash
Free panel member
414 yes · 192 no · 662 unknown
gpt-4o-mini
Third panel member
365 yes · 362 no · 541 unknown

Where at least two models agreed on a relevant headline — 331 stocks:

AAL yes · AAL no · AAT yes · ABCL yes · ABOS yes · ACM no · ACN yes · ADMA no · ADPT yes · ADTX no · AEHR yes · AIFA yes · AIFA no · AIZ yes · AMAT yes · AMC yes · AMD no · AMH yes · AMKR yes · AMZN yes · AMZN no · APLD yes · ARWR yes · ASBP yes · ASTS yes · AVAV no · AVGO yes · AVLN yes · AXIL yes · AXSM yes · AZO yes · BB yes · BE yes · BGDE yes · BGSI yes · BIAF yes · BIAF no · BIIB no · BMNR no · BMY yes · + 291 more

from the verdicts × articles of Jun 17's 2 rounds — full table on News & Verdicts
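The "at least two models agreed" rule for a single headline can be sketched as a small vote count. The function name `consensus` is hypothetical, and treating only YES and NO as agreeable directions (so that shared UNKNOWN answers do not count) is an assumption based on the list above showing only yes and no entries.

```python
from collections import Counter
from typing import Dict, Optional

def consensus(answers: Dict[str, Optional[str]], min_agree: int = 2) -> Optional[str]:
    """Return 'yes' or 'no' when at least min_agree models give that answer.

    `answers` maps model name to its parsed answer. UNKNOWN and missing
    answers never form a consensus (assumption).
    """
    counts = Counter(a.upper() for a in answers.values() if a)
    for direction in ("YES", "NO"):
        if counts[direction] >= min_agree:
            return direction.lower()
    return None
```

Because agreement is computed per headline, one stock can show both a yes and a no on the same day when different headlines point in different directions.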

The record so far

Trading days live: 3 (see history →)
Rounds run: 6 (the ledger →)
Articles collected: 4,076 (browse all →)
Verdicts recorded: 11,013 (every one shown →)