ChatPredict

Can an AI read the news and call the stock? We're testing it — live.

Twice every trading day a machine collects US-stock headlines, has three language models judge each one — YES, NO, or UNKNOWN — and records everything raw. A forward, out-of-sample replication of Lopez-Lira & Tang (2023).

The paper: “Can ChatGPT Forecast Stock Price Movements?”, arXiv 2304.07619 · paper ledger only, no real trades.


Morning round of 2026-06-17 fired at 8:00 AM ET (on time) · 722 articles · 673 judged · database committed 9:11 AM ET.

Rounds fire 8:00 AM & 2:30 PM ET, every trading day

How a round works

Live numbers from the latest round: the morning round of Jun 17, fired 8:00 AM ET.

899 collected · 722 kept (−177 dropped) · 673 judged · 2,019 verdicts · committed 9:11 AM ET

01 The clock fires
Two rounds every trading day: morning for overnight and pre-market news, afternoon for trading-day news. Never early; late is allowed and recorded; never twice.
Fired 8:00 AM ET · on time

Morning round — fires 8:00 AM ET, judges P3 (published after 4pm yesterday) + P1 (published before 6am today). Decisions feed today's 9:30 OPEN (orders by 9:28); collection window: since yesterday 14:30 ET.

Afternoon round — fires 2:30 PM ET, judges P2 (published 6am–4pm today). Decisions feed today's 16:00 CLOSE (orders by 15:58); collection window: since today 06:00 ET.
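
The publish-time buckets above can be sketched as a small classifier. This is a minimal illustration, not the project's code: the function name `period_of` is hypothetical, and the handling of the exact 06:00 and 16:00 boundaries is an assumption, since the page does not say which side they fall on.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

ET = ZoneInfo("America/New_York")

def period_of(published: datetime) -> str:
    """Map a timezone-aware publish time to P1, P2, or P3 (hypothetical helper)."""
    local = published.astimezone(ET)
    if local.hour < 6:
        return "P1"   # before 6am: judged by the morning round
    if local.hour < 16:
        return "P2"   # 6am to 4pm: judged by the afternoon round (boundary is an assumption)
    return "P3"       # 4pm or later: judged by the next morning round
```

Converting to Eastern Time first matters because the feeds may report publish times in UTC; the bucket depends on the New York clock, not on the source's.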

An early round would truncate its window and silently miss news, so the gate forbids it. A late round still collects everything published before its target — the slip is recorded in the run ledger, and the analysis judges each day's fidelity. Schedule and gate: src/chatpredict/schedule.py · source available on request
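
The gate's three rules (never early, late allowed and recorded, never twice) could look roughly like the following. The names `gate`, `GateDecision`, and `already_ran` are hypothetical, and the one-minute threshold for "on time" is an assumption; the real logic lives in src/chatpredict/schedule.py, which this sketch does not reproduce.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo

ET = ZoneInfo("America/New_York")

@dataclass
class GateDecision:
    allowed: bool
    reason: str
    slip: timedelta = timedelta(0)   # how late the round fired

def gate(now: datetime, target: datetime, already_ran: bool) -> GateDecision:
    """Decide whether a round may fire (illustrative sketch only)."""
    if already_ran:
        return GateDecision(False, "already ran")   # never twice
    if now < target:
        return GateDecision(False, "too early")     # never early
    slip = now - target
    # Assumed threshold: under one minute of slip still counts as on time.
    reason = "on time" if slip < timedelta(minutes=1) else "late"
    return GateDecision(True, reason, slip)         # late is allowed, slip recorded
```

Returning the slip alongside the decision lets the caller write it to the run ledger without recomputing it.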

02 Collect the news
Market-wide feeds plus hot-list look-ups, pulled in parallel. The publish time is the law.
899 articles fetched

Market-wide feeds deliver ticker-tagged articles for the whole US market at once:

Alpaca news (Benzinga wire): 387
Market-wide professional news wire; every article arrives tagged with its tickers.
Polygon.io news: 213
Aggregated market news API, ticker-tagged (free tier lags ~1h — recorded as-is).
GlobeNewswire press releases: 17
Official company press releases (earnings, deals, guidance) straight from the wire.
PR Newswire press releases
The other big press-release wire — includes the law-firm 'investor alert' spam we record raw and filter in analysis.
SEC EDGAR 8-K filings: 65
Material-event filings companies are legally required to make — the ground-truth feed.

“What's hot” lists name stocks that should have news today — 73 stocks were flagged this round (every one recorded, including those where we then found nothing):

Top gainers & losers
Alpaca screener: the day's 50 biggest gainers + 50 biggest losers (EOD snapshot — its own freshness stamp is recorded every round).
Most-active stocks
Alpaca screener: the 30 highest-volume names — volume spikes often mean news.
Earnings calendar
Finnhub's calendar of companies reporting today/tomorrow — earnings days make news.

Targeted look-ups then query flagged stocks the feeds didn't cover:

Finnhub per-stock news
First stop for a targeted look-up on a hot-list stock the feeds missed.
Yahoo Finance per-stock news
Fallback look-up if Finnhub is rate-limited — one exhausted source never sinks a round.
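
The look-up order described above (Finnhub first, Yahoo Finance if Finnhub is rate-limited) can be sketched as a simple fallback chain. Everything named here is hypothetical: `RateLimited`, `look_up`, and the fetcher callables stand in for whatever the real clients expose, and no actual Finnhub or Yahoo API call is shown.

```python
from typing import Callable, Iterable, Optional

class RateLimited(Exception):
    """Raised by a source that has exhausted its quota (hypothetical)."""

Fetcher = Callable[[str], list]

def look_up(ticker: str, sources: Iterable[tuple]) -> tuple:
    """Try each (name, fetcher) in order; skip any that is rate-limited."""
    for name, fetch in sources:
        try:
            return name, fetch(ticker)
        except RateLimited:
            continue   # exhausted source: fall through to the next one
    no_source: Optional[str] = None
    return no_source, []   # nothing available; the round carries on regardless
```

Returning the source name with the articles keeps the record honest about where each look-up was actually served from.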

Pipeline declared once in src/chatpredict/flow.py · source available on request — the runner builds from it and this page renders it, so the diagram cannot drift from what runs.

Market-wide news feeds: 682 articles kept (5 feeds, every article pre-tagged with its tickers)
Hot lists → targeted look-ups: 40 articles kept (73 stocks flagged; the uncovered ones looked up one by one)
03 Keep what qualifies
Exact ticker matches against 5,842 US common stocks; everything else is dropped and the reason logged.
722 articles kept (722 = 682 + 40)

An article is kept only if it maps to exactly one US common stock — via the source's own ticker tag, an official SEC identifier, or an explicit ticker in the text. We never fuzzy-match company names; that is how news ends up filed under the wrong company. Every rejected article is stored in the drop log with its reason — “no record” never means “quietly discarded”.

dropped | reason | meaning
153 | out of universe | not a US common stock in our universe
24 | no ticker | no exact stock could be identified

Resolution: src/chatpredict/resolver.py · source available on request
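
A minimal sketch of exact-match resolution with logged drop reasons follows. It is not the contents of resolver.py: the `resolve` function, the toy `UNIVERSE`, and the `(NASDAQ: XYZ)` text pattern are all assumptions made for illustration, the SEC-identifier path is omitted, and the page does not say how an article naming several in-universe stocks is handled.

```python
import re
from typing import Optional

# Toy universe; the page cites 5,842 US common stocks.
UNIVERSE = {"AAL", "AMD", "AMZN"}
# Assumed form of an "explicit ticker in the text", e.g. "(NASDAQ: AMD)".
TICKER_IN_TEXT = re.compile(r"\((?:NASDAQ|NYSE):\s*([A-Z.]{1,6})\)")

def resolve(tags: list, text: str) -> tuple:
    """Return (ticker, None) if kept, or (None, drop_reason) if dropped."""
    candidates = set(tags) or set(TICKER_IN_TEXT.findall(text))
    if not candidates:
        return None, "no ticker"
    in_universe = candidates & UNIVERSE
    if not in_universe:
        return None, "out of universe"
    if len(in_universe) != 1:
        return None, "no ticker"   # ambiguous match: this reason is an assumption
    kept: Optional[str] = next(iter(in_universe))
    return kept, None
```

Note that a company name alone ("Advanced Micro Devices rallies") never resolves here, which mirrors the no-fuzzy-matching rule.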

04 Three models judge every headline
The paper's exact question, word for word, at temperature 0: YES, NO, or UNKNOWN.
673 headlines judged

Each (stock × headline) is judged once per model, ever: at temperature 0 a re-ask is expected to return the same answer, so repeating it adds nothing. A rate-capped or failed model is retried next round; one that answered is never re-asked. The free-tier model is paced to its provider's limit, which is why a round takes about 80 minutes: the panel runs in parallel, so the wall-clock time is set by the slowest model. Tallies below are this round's.

gpt-4.1-nano
OpenAI · Paid anchor — the model that replicates the paper (the gpt-5 family rejects the paper's required temperature=0; verified live).
~paid tier, 0.2s between calls
122 yes · 69 no · 481 unknown
gemini-2.5-flash
Google · Free panel member — does the signal survive a different model family?
free tier, 10 calls/min (6.1s pacing)
193 yes · 119 no · 361 unknown
gpt-4o-mini
OpenAI · Third panel member — a second, distinct OpenAI model. Replaced Groq llama-3.1-8b-instant on 2026-06-15: that 8B model never emits NO (0/472 on day one, even on bankruptcy/fraud — verified it's the model, not our parser), so it could never short.
~paid tier, 0.2s between calls
174 yes · 221 no · 278 unknown

Clients and pacing: src/chatpredict/model_clients.py · source available on request
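
The pacing figures above are enough for a rough estimate of how long a round takes. The arithmetic below uses only numbers shown on this page (673 headlines, 0.2 s and 6.1 s between calls); the function name `pacing_minutes` is made up for the example, and response latency and retries are not modelled.

```python
def pacing_minutes(calls: int, seconds_between_calls: float) -> float:
    """Lower bound on wall-clock time from pacing alone, in minutes."""
    return calls * seconds_between_calls / 60

HEADLINES = 673  # this round's judged headlines
panel = {
    "gpt-4.1-nano": 0.2,
    "gemini-2.5-flash": 6.1,
    "gpt-4o-mini": 0.2,
}
per_model = {name: pacing_minutes(HEADLINES, gap) for name, gap in panel.items()}

# The panel runs in parallel, so the round is bounded by the slowest member.
slowest = max(per_model, key=per_model.get)
print(slowest, round(per_model[slowest], 1))   # gemini-2.5-flash 68.4
```

Pacing alone accounts for about 68 minutes for the slowest member, in line with the 71 minutes between this round's 8:00 AM fire and its 9:11 AM commit.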

05 Everything stored raw
Verdicts, drops, no-news checks: all of it. The strategy is computed later, as an analysis over a complete record.
2,019 verdict rows

A verdict row keeps the model's raw reply verbatim alongside the parsed answer, the relevance tag, and a status (ok / failed / rate-capped) — so every gap is explainable. Articles keep their raw source payloads; flagged-but-newsless stocks are recorded too. UNKNOWN and not-relevant answers are kept like everything else: no-signal is also a result. Nothing is filtered live.
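
One way to picture such a verdict row, together with the ask-once rule from step 04, is a SQLite table keyed on (stock, headline, model). The schema, column names, and the helpers `needs_asking` and `record` are assumptions for illustration; the project's real schema is not shown on this page.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE verdict (
        ticker    TEXT NOT NULL,
        headline  TEXT NOT NULL,
        model     TEXT NOT NULL,
        raw_reply TEXT,            -- the model's reply, verbatim
        answer    TEXT,            -- parsed: YES / NO / UNKNOWN
        relevance TEXT,            -- relevance tag
        status    TEXT NOT NULL,   -- ok / failed / rate-capped
        PRIMARY KEY (ticker, headline, model)
    )
""")

def needs_asking(ticker: str, headline: str, model: str) -> bool:
    """True unless this model already answered this (stock x headline)."""
    row = conn.execute(
        "SELECT status FROM verdict WHERE ticker=? AND headline=? AND model=?",
        (ticker, headline, model),
    ).fetchone()
    return row is None or row[0] != "ok"

def record(ticker, headline, model, raw_reply, answer, relevance, status):
    # Assumption: a retried failure overwrites its earlier placeholder row.
    conn.execute(
        "INSERT OR REPLACE INTO verdict VALUES (?, ?, ?, ?, ?, ?, ?)",
        (ticker, headline, model, raw_reply, answer, relevance, status),
    )
```

Because failed and rate-capped attempts get a row too, a missing answer can always be traced to a recorded status rather than to silence.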

06 Committed to the record
The database is committed to a git repository: append-only history, every change timestamped.
Committed 9:11 AM ET

After every round the SQLite file is committed back to the repository — simultaneously the store, the backup, and the audit trail. If new code lands mid-round, the save step rebases and retries; if saving still fails, the data is parked as a downloadable artifact. This site is regenerated from that file on every commit and has no write access to anything. See the run ledger.
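
The save step described above (commit, then rebase and retry if new code landed, then park the data if saving still fails) might be sketched as follows. The function `save_round`, the attempt count, and the "parked" return value are hypothetical; only standard git subcommands are used, and the command runner is injectable so the logic can be exercised without a repository.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], int]   # returns the command's exit code

def run_git(args: Sequence[str]) -> int:
    return subprocess.run(["git", *args]).returncode

def save_round(db_path: str, message: str, run: Runner = run_git, attempts: int = 3) -> str:
    """Commit the database and push; rebase and retry if the push is rejected."""
    run(["add", db_path])
    run(["commit", "-m", message])
    for _ in range(attempts):
        if run(["push"]) == 0:
            return "committed"
        run(["pull", "--rebase"])   # new code landed mid-round: replay on top of it
    return "parked"                 # caller would upload the file as an artifact instead
```

A rebase is a reasonable fit here as long as only the round's own commit touches the database file, so there is nothing to merge by hand.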

What the models said today

gpt-4.1-nano
Paid anchor
122 yes · 69 no · 481 unknown
gemini-2.5-flash
Free panel member
193 yes · 119 no · 361 unknown
gpt-4o-mini
Third panel member
174 yes · 221 no · 278 unknown

Where at least two models agreed on a relevant headline — 185 stocks:

AAL yes · AAL no · ABCL yes · ACM no · ACN yes · ADMA no · ADTX no · AEHR yes · AMD no · AMKR yes · AMZN yes · AMZN no · ASTS yes · AVAV no · AVLN yes · AXIL yes · AZO yes · BE yes · BGSI yes · BIAF yes · BIIB no · BMNR no · BOF yes · BP no · BR yes · BTGO yes · BTGO no · BTU no · CANF yes · CAVA yes · CIIT yes · CLWT yes · CME no · CNL yes · COAG yes · COHR yes · COIN yes · COIN no · CRMT no · CRVL yes · + 145 more

Computed from the verdicts and articles of Jun 17's single round so far; the full table is on News & Verdicts.
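
The two-model agreement list can be derived from the stored verdicts after the fact. The sketch below assumes each verdict is a dict with `ticker`, `headline`, `answer`, and `relevant` keys (hypothetical field names) and that only YES and NO count as agreement; it also shows why one stock can appear with both answers, since agreement is checked per headline.

```python
from collections import Counter, defaultdict

def consensus(verdicts: list, min_models: int = 2) -> set:
    """Return (ticker, answer) pairs where >= min_models agree on one headline."""
    by_headline = defaultdict(Counter)
    for v in verdicts:
        # Assumption: UNKNOWN and not-relevant verdicts never form a consensus.
        if v["relevant"] and v["answer"] in ("YES", "NO"):
            by_headline[(v["ticker"], v["headline"])][v["answer"]] += 1
    return {
        (ticker, answer)
        for (ticker, _), counts in by_headline.items()
        for answer, n in counts.items()
        if n >= min_models
    }
```

Running this over the complete record, rather than filtering during the round, keeps the analysis separate from collection, as step 05 describes.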

The record so far

Trading days live: 3 (see history)
Rounds run: 5 (the ledger)
Articles collected: 3,393 (browse all)
Verdicts recorded: 9,228 (every one shown)