ChatPredict

Can an AI read the news and call the stock? We're testing it — live.

Twice every trading day a machine collects US-stock headlines, has three language models judge each one — YES, NO, or UNKNOWN — and records everything raw. A forward, out-of-sample replication of Lopez-Lira & Tang (2023).

The paper: “Can ChatGPT Forecast Stock Price Movements?”, arXiv 2304.07619 · paper ledger only, no real trades.


Morning round completed 2026-06-18: fired 8:00 AM ET (on time) · 639 articles kept · 601 judged · database committed 9:02 AM ET.

Rounds fire 8:00 AM & 2:30 PM ET, every trading day

How a round works

Live numbers from the latest round: the morning round of Jun 18, fired 8:00 AM ET.

812 collected · 639 kept (−173 dropped) · 601 judged · 1,803 verdicts · committed 9:02 AM ET
01 The clock fires
Two rounds every trading day: morning for overnight and pre-market news, afternoon for trading-day news. Never early; late is allowed and recorded; never twice.
fired 8:00 AM ET · on time

Morning round — fires 8:00 AM ET, judges P3 (published after 4pm yesterday) + P1 (published before 6am today). Decisions feed today's 9:30 OPEN (orders by 9:28); collection window: since yesterday 14:30 ET.

Afternoon round — fires 2:30 PM ET, judges P2 (published 6am–4pm today). Decisions feed today's 16:00 CLOSE (orders by 15:58); collection window: since today 06:00 ET.
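The three publish-time buckets can be sketched as a time-of-day rule. This is a minimal illustration, not the project's code: the names `news_period`, `round_for` and `ROUND_PERIODS` are hypothetical, the datetimes are assumed to be naive values already expressed in ET, and the handling of the exact 06:00 and 16:00 instants (each goes to the later bucket) is an assumption the page does not state. Weekends and holidays are ignored here.

```python
from datetime import datetime, time

# Which buckets each round judges, as described above.
ROUND_PERIODS = {"morning": ("P3", "P1"), "afternoon": ("P2",)}

def news_period(published_et: datetime) -> str:
    """Bucket an article by its publish time of day (naive datetime, assumed ET)."""
    t = published_et.time()
    if t < time(6, 0):
        return "P1"   # before 6am: overnight / pre-market news
    if t < time(16, 0):
        return "P2"   # 6am up to 4pm: trading-day news
    return "P3"       # 4pm onward: judged by the next morning round

def round_for(published_et: datetime) -> str:
    """Name the round that judges an article published at this time."""
    period = news_period(published_et)
    return next(name for name, periods in ROUND_PERIODS.items() if period in periods)
```

Under these assumptions an article published at 5:30 PM lands in P3 and waits for the next morning round, while one published at 10:00 AM lands in P2 and is judged the same afternoon.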

An early round would truncate its window and silently miss news, so the gate forbids it. A late round still collects everything published before its target — the slip is recorded in the run ledger, and the analysis judges each day's fidelity. Schedule and gate: src/chatpredict/schedule.py · source available on request
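The three gate rules (never early, late allowed and recorded, never twice) can be sketched as one pure function. This is an illustration under stated assumptions, not the contents of schedule.py: `GateDecision`, `gate`, and the reason strings are hypothetical names, and the caller is assumed to know whether the round has already run.

```python
from datetime import datetime, timedelta
from typing import NamedTuple

class GateDecision(NamedTuple):
    run: bool          # may the round start now?
    reason: str        # "early", "duplicate", or "ok" (hypothetical labels)
    slip: timedelta    # how late the round starts; zero when it does not run

def gate(now: datetime, target: datetime, already_ran: bool) -> GateDecision:
    """Decide whether a round may fire at `now` for its scheduled `target`."""
    if already_ran:
        # Never twice: a round that already ran is not repeated.
        return GateDecision(False, "duplicate", timedelta(0))
    if now < target:
        # Never early: firing now would truncate the collection window.
        return GateDecision(False, "early", timedelta(0))
    # On time or late: run, and hand back the slip so it can be recorded.
    return GateDecision(True, "ok", now - target)
```

Returning the slip instead of a yes/no answer keeps the "late is recorded" rule in the same place as the decision, so a ledger entry can be written from the return value alone.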

02 Collect the news
Market-wide feeds plus hot-list look-ups, pulled in parallel. The publish time is the law.
812 articles fetched

Market-wide feeds deliver ticker-tagged articles for the whole US market at once:

Alpaca news (Benzinga wire): 358
Market-wide professional news wire; every article arrives tagged with its tickers.
Polygon.io news: 171
Aggregated market news API, ticker-tagged (free tier lags ~1h — recorded as-is).
GlobeNewswire press releases: 16
Official company press releases (earnings, deals, guidance) straight from the wire.
PR Newswire press releases
The other big press-release wire — includes the law-firm 'investor alert' spam we record raw and filter in analysis.
SEC EDGAR 8-K filings: 62
Material-event filings companies are legally required to make — the ground-truth feed.

“What's hot” lists name stocks that should have news today — 72 stocks were flagged this round (every one recorded, including those where we then found nothing):

Top gainers & losers
Alpaca screener: the day's 50 biggest gainers + 50 biggest losers (EOD snapshot — its own freshness stamp is recorded every round).
Most-active stocks
Alpaca screener: the 30 highest-volume names — volume spikes often mean news.
Earnings calendar
Finnhub's calendar of companies reporting today/tomorrow — earnings days make news.

Targeted look-ups then query flagged stocks the feeds didn't cover:

Finnhub per-stock news
First stop for a targeted look-up on a hot-list stock the feeds missed.
Yahoo Finance per-stock news
Fallback look-up if Finnhub is rate-limited — one exhausted source never sinks a round.
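The "first stop, then fallback" order for targeted look-ups can be sketched as a loop over sources. Everything here is hypothetical: `RateLimited`, `Fetcher`, and `lookup_with_fallback` are illustrative names, and the fetchers are plain callables supplied by the caller. No real Finnhub or Yahoo Finance client call is shown or implied.

```python
from typing import Callable, List, Optional, Sequence, Tuple

class RateLimited(Exception):
    """Raised by a fetcher when its provider refuses further calls (hypothetical)."""

Fetcher = Callable[[str], List[dict]]

def lookup_with_fallback(
    ticker: str, sources: Sequence[Tuple[str, Fetcher]]
) -> Tuple[Optional[str], List[dict]]:
    """Try each (name, fetcher) in order; skip any source that is rate-limited."""
    for name, fetch in sources:
        try:
            return name, fetch(ticker)   # first source that answers wins
        except RateLimited:
            continue                     # exhausted source: move to the next one
    return None, []                      # every source exhausted; the round goes on
```

In this sketch a source that answers with an empty list still counts as an answer, which matches recording flagged stocks for which nothing was found.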

Pipeline declared once in src/chatpredict/flow.py · source available on request — the runner builds from it and this page renders it, so the diagram cannot drift from what runs.

Market-wide news feeds · 607 articles kept · 5 feeds, every article pre-tagged with its tickers
Hot lists → targeted look-ups · 32 articles kept · 72 stocks flagged; the uncovered ones looked up one by one
03 Keep what qualifies
Exact ticker matches against 5,853 US common stocks; everything else is dropped and the reason logged.
639 articles kept · 639 = 607 + 32

An article is kept only if it maps to exactly one US common stock — via the source's own ticker tag, an official SEC identifier, or an explicit ticker in the text. We never fuzzy-match company names; that is how news ends up filed under the wrong company. Every rejected article is stored in the drop log with its reason — “no record” never means “quietly discarded”.

dropped   reason            meaning
138       out of universe   not a US common stock in our universe
35        no ticker         no exact stock could be identified

Resolution: src/chatpredict/resolver.py · source available on request
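Exact-match resolution with a logged drop reason might look like the sketch below. It is not resolver.py: the function name, the `(NYSE: XYZ)` / `(NASDAQ: XYZ)` pattern used to find an explicit ticker in text, and the single-tag input are assumptions made for illustration. The SEC-identifier route is omitted. The two drop reasons are the ones listed in the table above.

```python
import re
from typing import Optional, Set, Tuple

# Hypothetical pattern for an explicit ticker in text, e.g. "(NASDAQ: AAPL)".
EXPLICIT_TICKER = re.compile(r"\((?:NYSE|NASDAQ):\s*([A-Z]{1,5})\)")

def resolve(
    source_tag: Optional[str], text: str, universe: Set[str]
) -> Tuple[Optional[str], Optional[str]]:
    """Return (ticker, None) if kept, or (None, drop_reason) if dropped."""
    ticker = source_tag
    if ticker is None:
        match = EXPLICIT_TICKER.search(text)
        ticker = match.group(1) if match else None
    if ticker is None:
        return None, "no ticker"          # nothing exact to match on
    if ticker not in universe:
        return None, "out of universe"    # exact ticker, but not a tracked stock
    return ticker, None                   # exact match only, never a name guess
```

A headline that mentions a company only by name resolves to "no ticker" in this sketch; that is the deliberate cost of refusing to fuzzy-match.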

04 Three models judge every headline
The paper's exact question, word for word, at temperature 0: YES, NO, or UNKNOWN.
601 headlines judged

Each (stock × headline) is judged once per model, ever — at temperature 0 a re-ask returns the same answer. A rate-capped or failed model is retried next round; one that answered is never re-asked. The free models are paced to their providers' limits, which is why a round takes about 80 minutes — the panel runs in parallel, so the wall-clock is the slowest model. Tallies below are this round's.

gpt-4.1-nano
OpenAI · Paid anchor — the model that replicates the paper (the gpt-5 family rejects the paper's required temperature=0; verified live).
~paid tier, 0.2s between calls
126 yes · 60 no · 415 unknown
gemini-2.5-flash
Google · Free panel member — does the signal survive a different model family?
free tier, 10 calls/min (6.1s pacing)
197 yes · 99 no · 305 unknown
gpt-4o-mini
OpenAI · Third panel member — a second, distinct OpenAI model. Replaced Groq llama-3.1-8b-instant on 2026-06-15: that 8B model never emits NO (0/472 on day one, even on bankruptcy/fraud — verified it's the model, not our parser), so it could never short.
~paid tier, 0.2s between calls
176 yes · 183 no · 242 unknown

Clients and pacing: src/chatpredict/model_clients.py · source available on request
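The judge-once rule and the pacing arithmetic can be sketched in a few lines. This is an illustration, not model_clients.py: `pending` and `pacing_seconds` are hypothetical names, the status strings are assumed to be ok / failed / rate-capped, and the 0.1 s safety margin is an assumption chosen only because it turns a 10 calls/min limit into the 6.1 s pacing shown above.

```python
from typing import Dict, Iterable, List, Tuple

Key = Tuple[str, str, str]   # (ticker, headline, model)

def pending(keys: Iterable[Key], status_by_key: Dict[Key, str]) -> List[Key]:
    """Keys that still need a model call: never asked, failed, or rate-capped."""
    return [key for key in keys if status_by_key.get(key) != "ok"]

def pacing_seconds(calls_per_minute: float, margin: float = 0.1) -> float:
    """Minimum gap between calls for a rate limit, plus an assumed safety margin."""
    return 60.0 / calls_per_minute + margin
```

At 6.1 s per call, 601 headlines take roughly 61 minutes for the paced model alone, which is why the slowest model sets the wall-clock time when the panel runs in parallel.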

05 Everything stored raw
Verdicts, drops, no-news checks: all of it. The strategy is computed later, as an analysis over a complete record.
1,803 verdict rows

A verdict row keeps the model's raw reply verbatim alongside the parsed answer, the relevance tag, and a status (ok / failed / rate-capped) — so every gap is explainable. Articles keep their raw source payloads; flagged-but-newsless stocks are recorded too. UNKNOWN and not-relevant answers are kept like everything else: no-signal is also a result. Nothing is filtered live.
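A verdict row that keeps the raw reply next to the parsed answer could be stored as sketched below. The table layout, column names, and the `parse_answer` rule (take a leading YES, NO, or UNKNOWN, case-insensitively) are assumptions for illustration; the project's actual schema and parser are not shown on this page.

```python
import re
import sqlite3
from typing import Optional

ANSWER = re.compile(r"\s*(YES|NO|UNKNOWN)\b", re.IGNORECASE)

def parse_answer(raw_reply: str) -> Optional[str]:
    """Parsed answer from the start of the reply, or None if it is unparseable."""
    match = ANSWER.match(raw_reply)
    return match.group(1).upper() if match else None

# Hypothetical schema: one row per model call, raw reply kept verbatim.
SCHEMA = """
CREATE TABLE IF NOT EXISTS verdicts (
    ticker    TEXT NOT NULL,
    headline  TEXT NOT NULL,
    model     TEXT NOT NULL,
    raw_reply TEXT,
    answer    TEXT,
    relevance TEXT,
    status    TEXT NOT NULL CHECK (status IN ('ok', 'failed', 'rate-capped'))
)
"""

def store_verdict(conn, ticker, headline, model, raw_reply, relevance, status):
    """Append one verdict row; the answer is derived, the raw reply is untouched."""
    conn.execute(
        "INSERT INTO verdicts VALUES (?, ?, ?, ?, ?, ?, ?)",
        (ticker, headline, model, raw_reply,
         parse_answer(raw_reply or ""), relevance, status),
    )
```

Because the raw reply is stored verbatim, a parsing rule that later turns out to be wrong can be corrected and re-run over the record without asking any model again.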

06 Committed to the record
The database is committed to a git repository: append-only history, every change timestamped.
committed 9:02 AM ET

After every round the SQLite file is committed back to the repository — simultaneously the store, the backup, and the audit trail. If new code lands mid-round, the save step rebases and retries; if saving still fails, the data is parked as a downloadable artifact. This site is regenerated from that file on every commit and has no write access to anything. See the run ledger.
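The save step (commit, push, rebase and retry on rejection, park on final failure) can be sketched as follows. This is a hedged illustration rather than the project's save code: `save_round`, the `run` and `park` callables, and the retry count of three are all hypothetical. The only git commands used are `add`, `commit -m`, `push`, and `pull --rebase`.

```python
import subprocess
from typing import Callable

def git(*args: str) -> bool:
    """Run a git command in the current repository; True on exit code 0."""
    return subprocess.run(["git", *args]).returncode == 0

def save_round(
    db_path: str,
    message: str,
    run: Callable[..., bool] = git,
    park: Callable[[str], None] = lambda path: None,
    attempts: int = 3,
) -> str:
    """Commit the database and push it; rebase and retry if the push is rejected."""
    run("add", db_path)
    run("commit", "-m", message)
    for _ in range(attempts):
        if run("push"):
            return "pushed"
        # New commits landed upstream mid-round: replay ours on top, then retry.
        run("pull", "--rebase")
    park(db_path)   # last resort: hand the file to an artifact store (hypothetical)
    return "parked"
```

Passing `run` and `park` in as parameters keeps the retry logic testable without a real repository.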

What the models said today

gpt-4.1-nano
Paid anchor
126 yes · 60 no · 415 unknown
gemini-2.5-flash
Free panel member
197 yes · 99 no · 305 unknown
gpt-4o-mini
Third panel member
176 yes · 183 no · 242 unknown

Where at least two models agreed on a relevant headline — 172 stocks:

AA yes · AAPL yes · AAPL no · ACN yes · ACN no · ADUR yes · AIXC yes · ALGM yes · ALKS yes · ALL no · AMBO yes · AMD yes · AMZN yes · APYX yes · AS yes · AVGO yes · AXTI yes · AYI yes · BABA yes · BBUC yes · BCE yes · BEAM yes · BIIB yes · BKSY no · BMNR no · BRN yes · BTDR yes · BTGO yes · BTGO no · BYAH yes · C no · CAST yes · CERT yes · CERT no · CLPT yes · CLWT yes · CLX yes · CMND yes · CNMD yes · COIN yes · + 132 more

Computed by joining verdicts to articles for the single round run on Jun 18; the full table is on News & Verdicts.
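The "at least two models agreed" list can be sketched as a count per (stock, headline). The function name and input shape are hypothetical, and two things are assumed rather than confirmed by the page: only YES and NO count toward agreement, and each model contributes at most one row per headline. A stock can appear with both answers when different headlines point in different directions, as AAPL does above.

```python
from collections import Counter, defaultdict
from typing import Iterable, List, Tuple

Verdict = Tuple[str, str, str, str]   # (ticker, headline, model, answer)

def consensus(verdicts: Iterable[Verdict], min_models: int = 2) -> List[Tuple[str, str]]:
    """Sorted (ticker, answer) pairs where at least min_models agree on one headline."""
    per_headline = defaultdict(Counter)
    for ticker, headline, _model, answer in verdicts:
        if answer in ("YES", "NO"):          # UNKNOWN never counts as agreement here
            per_headline[(ticker, headline)][answer] += 1
    agreed = set()
    for (ticker, _headline), counts in per_headline.items():
        for answer, n in counts.items():
            if n >= min_models:
                agreed.add((ticker, answer.lower()))
    return sorted(agreed)
```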

The record so far

Trading days live: 4
Rounds run: 7
Articles collected: 4,715
Verdicts recorded: 12,816