ChatPredict

Can an AI read the news and call the stock? We're testing it — live.

Twice every trading day a machine collects US-stock headlines, has three language models judge each one — YES, NO, or UNKNOWN — and records everything raw. A forward, out-of-sample replication of Lopez-Lira & Tang (2023).

The paper: “Can ChatGPT Forecast Stock Price Movements?”, arXiv 2304.07619 · paper ledger only, no real trades.


Morning round completed 2026-06-19 at 8:00 AM ET (on time) · 613 articles · 581 judged · database committed 9:01 AM ET.

Rounds fire 8:00 AM & 2:30 PM ET, every trading day

How a round works

Live numbers from the latest round: morning round · Jun 19, fired Jun 19, 8:00 AM ET.

810 collected · 613 kept (−197 dropped) · 581 judged · 1,743 verdicts · committed 9:01 AM ET
01 The clock fires
Two rounds every trading day: morning for overnight and pre-market news, afternoon for trading-day news. Never early; late is allowed and recorded; never twice.
Fired 8:00 AM ET · on time

Morning round — fires 8:00 AM ET, judges P3 (published after 4pm yesterday) + P1 (published before 6am today). Decisions feed today's 9:30 OPEN (orders by 9:28); collection window: since yesterday 14:30 ET.

Afternoon round — fires 2:30 PM ET, judges P2 (published 6am–4pm today). Decisions feed today's 16:00 CLOSE (orders by 15:58); collection window: since today 06:00 ET.
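The window labels amount to a time-of-day rule. A minimal sketch, assuming publish times are already expressed in Eastern Time and that the 6:00 and 16:00 boundaries belong to the later window (the page does not say which side they fall on). `news_period` and `ROUND_FOR_PERIOD` are illustrative names, not taken from the project's code:

```python
from datetime import datetime, time

# Boundaries quoted on this page, in Eastern Time.
PREMARKET_CUTOFF = time(6, 0)   # before this: P1
CLOSE_CUTOFF = time(16, 0)      # from this time on: P3

def news_period(published_et: datetime) -> str:
    """Label an article by its publish time of day (assumed to be in ET)."""
    t = published_et.time()
    if t < PREMARKET_CUTOFF:
        return "P1"   # overnight / pre-market, judged by the morning round
    if t < CLOSE_CUTOFF:
        return "P2"   # trading-day news, judged by the afternoon round
    return "P3"       # after the close, judged by the next morning round

# Which round judges which window, as described above.
ROUND_FOR_PERIOD = {"P1": "morning", "P2": "afternoon", "P3": "morning"}
```

For example, an article stamped 17:45 ET is labelled P3 and waits for the next morning round.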

An early round would truncate its window and silently miss news, so the gate forbids it. A late round still collects everything published before its target — the slip is recorded in the run ledger, and the analysis judges each day's fidelity. Schedule and gate: src/chatpredict/schedule.py · source available on request
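The gate rule (never early, late allowed and recorded, never twice) can be sketched as a small pure function. This is an illustration under stated assumptions, not the contents of schedule.py: the function name `gate`, its arguments, and the decision strings are all hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple

def gate(now: datetime, target: datetime, already_ran: bool) -> Tuple[str, Optional[timedelta]]:
    """Decide whether a round may fire, returning (decision, slip).

    slip is how late the round is; it is only set when the round runs.
    """
    if already_ran:
        return "refuse: already ran", None      # never twice
    if now < target:
        return "refuse: early", None            # an early run would truncate the window
    return "run", now - target                  # late is allowed; the slip is recorded
```

A round started at 8:07 against an 8:00 target would run with a recorded slip of seven minutes, while one started at 7:59 would be refused.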

02 Collect the news
Market-wide feeds plus hot-list look-ups, pulled in parallel. The publish time is the law.
810 articles fetched

Market-wide feeds deliver ticker-tagged articles for the whole US market at once:

Alpaca news (Benzinga wire) · 259
Market-wide professional news wire; every article arrives tagged with its tickers.
Polygon.io news · 212
Aggregated market news API, ticker-tagged (free tier lags ~1h — recorded as-is).
GlobeNewswire press releases · 34
Official company press releases (earnings, deals, guidance) straight from the wire.
PR Newswire press releases
The other big press-release wire — includes the law-firm 'investor alert' spam we record raw and filter in analysis.
SEC EDGAR 8-K filings · 79
Material-event filings companies are legally required to make — the ground-truth feed.

“What's hot” lists name stocks that should have news today — 67 stocks were flagged this round (every one recorded, including those where we then found nothing):

Top gainers & losers
Alpaca screener: the day's 50 biggest gainers + 50 biggest losers (EOD snapshot — its own freshness stamp is recorded every round).
Most-active stocks
Alpaca screener: the 30 highest-volume names — volume spikes often mean news.
Earnings calendar
Finnhub's calendar of companies reporting today/tomorrow — earnings days make news.

Targeted look-ups then query flagged stocks the feeds didn't cover:

Finnhub per-stock news
First stop for a targeted look-up on a hot-list stock the feeds missed.
Yahoo Finance per-stock news
Fallback look-up if Finnhub is rate-limited — one exhausted source never sinks a round.

Pipeline declared once in src/chatpredict/flow.py · source available on request — the runner builds from it and this page renders it, so the diagram cannot drift from what runs.

Market-wide news feeds · 584 articles kept · 5 feeds, every article pre-tagged with its tickers
Hot lists → targeted look-ups · 29 articles kept · 67 stocks flagged; the uncovered ones looked up one by one
03 Keep what qualifies
Exact ticker matches against 5,856 US common stocks; everything else is dropped and the reason logged.
613 articles kept · 613 = 584 + 29

An article is kept only if it maps to exactly one US common stock — via the source's own ticker tag, an official SEC identifier, or an explicit ticker in the text. We never fuzzy-match company names; that is how news ends up filed under the wrong company. Every rejected article is stored in the drop log with its reason — “no record” never means “quietly discarded”.

dropped  reason           meaning
166      out of universe  not a US common stock in our universe
31       no ticker        no exact stock could be identified
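The round's figures can be cross-checked against each other. The short script below only re-adds numbers printed on this page; the variable names are illustrative.

```python
# Figures copied from this round's summary and drop table.
collected = 810
dropped = {"out of universe": 166, "no ticker": 31}
kept_by_route = {"market-wide feeds": 584, "targeted look-ups": 29}
judged, models = 581, 3

# Per-model tallies as (yes, no, unknown).
tallies = {
    "gpt-4.1-nano": (106, 59, 416),
    "gemini-2.5-flash": (166, 115, 300),
    "gpt-4o-mini": (146, 175, 260),
}

kept = collected - sum(dropped.values())          # 810 - 197
assert kept == 613
assert sum(kept_by_route.values()) == kept        # 584 + 29

verdict_rows = judged * models                    # one row per headline per model
assert verdict_rows == 1743

# Every model's tally accounts for every judged headline.
assert all(sum(t) == judged for t in tallies.values())
```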

Resolution: src/chatpredict/resolver.py · source available on request
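The exact-match rule can be illustrated with a short sketch. It is not the contents of resolver.py: the article fields (`tickers`, `cik`, `text`), the lookup order, the regular expression for explicit tickers, and the choice to file multi-ticker articles under "no ticker" are all assumptions made for the example.

```python
import re
from typing import Dict, Optional, Set, Tuple

# Assumed pattern for an explicit ticker in text, e.g. "(NASDAQ: AAPL)".
EXPLICIT_TICKER = re.compile(r"\((?:NYSE|NASDAQ|AMEX):\s*([A-Z][A-Z.\-]{0,5})\)")

def resolve(article: dict, universe: Set[str],
            cik_to_ticker: Dict[str, str]) -> Tuple[Optional[str], Optional[str]]:
    """Return (ticker, None) if the article is kept, else (None, drop_reason)."""
    # 1. The source's own ticker tag.
    candidates = set(article.get("tickers") or [])
    # 2. An official SEC identifier (CIK), if the source supplied one.
    if not candidates and article.get("cik") in cik_to_ticker:
        candidates = {cik_to_ticker[article["cik"]]}
    # 3. An explicit ticker in the text. Company names are never fuzzy-matched.
    if not candidates:
        candidates = set(EXPLICIT_TICKER.findall(article.get("text", "")))

    if len(candidates) != 1:
        return None, "no ticker"          # zero or several: no exact stock identified
    ticker = next(iter(candidates))
    if ticker not in universe:
        return None, "out of universe"
    return ticker, None
```

Returning the reason alongside the result is what lets every rejection be written to the drop log instead of disappearing.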

04 Three models judge every headline
The paper's exact question, word for word, at temperature 0: YES, NO, or UNKNOWN.
581 headlines judged

Each (stock × headline) is judged once per model, ever — at temperature 0 a re-ask returns the same answer. A rate-capped or failed model is retried next round; one that answered is never re-asked. The free models are paced to their providers' limits, which is why a round takes about 80 minutes — the panel runs in parallel, so the wall-clock is the slowest model. Tallies below are this round's.

gpt-4.1-nano
OpenAI · Paid anchor — the model that replicates the paper (the gpt-5 family rejects the paper's required temperature=0; verified live).
~paid tier, 0.2s between calls
106 yes · 59 no · 416 unknown
gemini-2.5-flash
Google · Free panel member — does the signal survive a different model family?
free tier, 10 calls/min (6.1s pacing)
166 yes · 115 no · 300 unknown
gpt-4o-mini
OpenAI · Third panel member — a second, distinct OpenAI model. Replaced Groq llama-3.1-8b-instant on 2026-06-15: that 8B model never emits NO (0/472 on day one, even on bankruptcy/fraud — verified it's the model, not our parser), so it could never short.
~paid tier, 0.2s between calls
146 yes · 175 no · 260 unknown

Clients and pacing: src/chatpredict/model_clients.py · source available on request
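The judge-once rule can be sketched with a plain dictionary standing in for the verdict store. This is an illustration, not the code in model_clients.py: `judge_once`, the key layout, and the row fields are hypothetical, and the real statuses also distinguish rate-capped from failed.

```python
from typing import Callable, Dict, Tuple

Key = Tuple[str, str, str]   # (ticker, article_id, model), an assumed key layout

def judge_once(ledger: Dict[Key, dict], key: Key, ask: Callable[[], str]) -> dict:
    """Ask the model only if no successful answer is on record for this key."""
    prior = ledger.get(key)
    if prior is not None and prior["status"] == "ok":
        return prior                       # answered before: never re-asked
    try:
        row = {"status": "ok", "raw": ask()}
    except Exception as exc:               # a failed call is recorded, then retried next round
        row = {"status": "failed", "raw": repr(exc)}
    ledger[key] = row
    return row
```

Calling `judge_once` twice with the same key triggers one model call; a key whose last attempt failed is asked again on the next call.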

05 Everything stored raw
Verdicts, drops, no-news checks: all of it. The strategy is computed later, as an analysis over a complete record.
1,743 verdict rows

A verdict row keeps the model's raw reply verbatim alongside the parsed answer, the relevance tag, and a status (ok / failed / rate-capped) — so every gap is explainable. Articles keep their raw source payloads; flagged-but-newsless stocks are recorded too. UNKNOWN and not-relevant answers are kept like everything else: no-signal is also a result. Nothing is filtered live.
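A verdict row of this shape might look like the following in SQLite. The table name, column names, and parsing rule are assumptions for illustration; the project's actual schema is not shown on this page.

```python
import re
import sqlite3
from typing import Optional

def parse_verdict(raw: str) -> Optional[str]:
    """Read YES / NO / UNKNOWN from the start of a reply; None if it does not parse."""
    match = re.match(r"\s*(YES|NO|UNKNOWN)\b", raw, flags=re.IGNORECASE)
    return match.group(1).upper() if match else None

def open_store(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute("""
        CREATE TABLE IF NOT EXISTS verdict (
            ticker     TEXT NOT NULL,
            article_id TEXT NOT NULL,
            model      TEXT NOT NULL,
            raw_reply  TEXT,              -- the model's reply, verbatim
            answer     TEXT,              -- parsed YES / NO / UNKNOWN, or NULL
            relevant   INTEGER,           -- relevance tag, NULL when unknown
            status     TEXT NOT NULL CHECK (status IN ('ok', 'failed', 'rate-capped')),
            PRIMARY KEY (ticker, article_id, model)
        )""")
    return conn

def store_verdict(conn, ticker, article_id, model, raw_reply, status, relevant=None):
    """Store the raw reply next to whatever could be parsed from it."""
    answer = parse_verdict(raw_reply) if status == "ok" and raw_reply else None
    conn.execute(
        "INSERT INTO verdict VALUES (?, ?, ?, ?, ?, ?, ?)",
        (ticker, article_id, model, raw_reply, answer, relevant, status),
    )
    conn.commit()
```

Keeping `raw_reply` beside `answer` means a parser mistake can be corrected later without asking the model again.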

06 Committed to the record
The database is committed to a git repository: append-only history, every change timestamped.
Committed 9:01 AM ET

After every round the SQLite file is committed back to the repository — simultaneously the store, the backup, and the audit trail. If new code lands mid-round, the save step rebases and retries; if saving still fails, the data is parked as a downloadable artifact. This site is regenerated from that file on every commit and has no write access to anything. See the run ledger.
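The save step's fallback chain (try to commit, rebase and retry, park as an artifact) can be sketched with the git operations passed in as callables. The function name, the attempt count, and the return strings are hypothetical; in a real runner the callables would wrap commands such as `git push` and `git pull --rebase`.

```python
from typing import Callable

def save_round(commit_and_push: Callable[[], bool],
               rebase: Callable[[], None],
               park_artifact: Callable[[], None],
               attempts: int = 3) -> str:
    """Try to commit the database; rebase and retry on failure; park it as a last resort."""
    for attempt in range(1, attempts + 1):
        if commit_and_push():
            return "committed"
        if attempt < attempts:
            rebase()                # new code landed mid-round: replay on top and retry
    park_artifact()                 # still failing: keep the data downloadable
    return "parked"
```

Passing the operations in keeps the retry logic testable without a repository.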

What the models said today

gpt-4.1-nano
Paid anchor
106 yes · 59 no · 416 unknown
gemini-2.5-flash
Free panel member
166 yes · 115 no · 300 unknown
gpt-4o-mini
Third panel member
146 yes · 175 no · 260 unknown

Where at least two models agreed on a relevant headline — 156 stocks:

AAPL yes · ABBV yes · AEP yes · AGI no · ALB yes · AMC yes · AMR no · AMZN yes · ANDE yes · APA no · APLE yes · AR no · ATNI yes · AUGO yes · AZTR yes · BAM yes · BCBP no · BCML no · BEPC yes · BFS yes · BIVI no · BMNR yes · BMNR no · BRK-A yes · BRK-B yes · CAKE no · CALX no · CAST yes · CCEC yes · CCL yes · CCZ no · CDE yes · CERT no · CHDN yes · CHTR no · CMCSA no · COCO no · COIN no · CPA yes · CRK no · + 116 more

From the verdicts × articles of Jun 19's one round; full table on News & Verdicts.
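The two-of-three agreement can be sketched as a small vote count per headline. It assumes that only YES and NO count toward agreement (the list above shows no UNKNOWN entries) and that relevance has already been filtered; `consensus` is an illustrative name.

```python
from collections import Counter
from typing import Dict, Optional

def consensus(verdicts: Dict[str, str], needed: int = 2) -> Optional[str]:
    """Return 'YES' or 'NO' if at least `needed` models gave that answer, else None."""
    counts = Counter(v for v in verdicts.values() if v in ("YES", "NO"))
    for answer, votes in counts.most_common(1):
        if votes >= needed:
            return answer
    return None
```

Agreement is computed per headline, which is how one stock (BMNR above) can appear with both a yes and a no.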

The record so far

Trading days live: 5
Rounds run: 9
Articles collected: 6,100
Verdicts recorded: 16,563