ChatPredict

Can an AI read the news and call the stock? We're testing it — live.

Twice every trading day a machine collects US-stock headlines, has three language models judge each one — YES, NO, or UNKNOWN — and records everything raw. A forward, out-of-sample replication of Lopez-Lira & Tang (2023).

The paper: “Can ChatGPT Forecast Stock Price Movements?”, arXiv 2304.07619 · paper ledger only, no real trades.


Afternoon round, 2026-06-16: fired 2:30 PM ET (on time) · 726 articles · 633 judged · database committed 3:37 PM ET.

Rounds fire 8:00 AM & 2:30 PM ET, every trading day

How a round works

Live numbers from the latest round: the afternoon round of Jun 16, fired 2:30 PM ET.

954 collected · 726 kept (−228 dropped) · 633 judged · 1,899 verdicts · committed 3:37 PM ET
01 The clock fires
Two rounds every trading day: morning for overnight and pre-market news, afternoon for trading-day news. Never early; late is allowed and recorded; never twice.
fired 2:30 PM ET · on time

Morning round — fires 8:00 AM ET, judges P3 (published after 4pm yesterday) + P1 (published before 6am today). Decisions feed today's 9:30 OPEN (orders by 9:28); collection window: since yesterday 14:30 ET.

Afternoon round — fires 2:30 PM ET, judges P2 (published 6am–4pm today). Decisions feed today's 16:00 CLOSE (orders by 15:58); collection window: since today 06:00 ET.
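
The three publish-time buckets can be sketched as a small classifier. This is a minimal sketch rather than the project's code: the name `news_period` is hypothetical, the boundaries come from the text above, and putting exactly 06:00 in P2 and exactly 16:00 in P3 is an assumption, since the page does not say which side the boundaries fall on.

```python
from datetime import datetime, time
from zoneinfo import ZoneInfo

ET = ZoneInfo("America/New_York")

def news_period(published: datetime) -> str:
    """Bucket an article by its publish time of day in ET (hypothetical helper).

    `published` must be timezone-aware; it is converted to ET before comparing.
    """
    t = published.astimezone(ET).time()
    if t < time(6, 0):
        return "P1"   # before 6am: judged by the morning round
    if t < time(16, 0):
        return "P2"   # 6am to 4pm: judged by the afternoon round
    return "P3"       # after 4pm: judged by the next morning round
```

Converting to ET first matters because feeds commonly report timestamps in UTC, and the buckets are defined on the exchange's clock.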

An early round would truncate its window and silently miss news, so the gate forbids it. A late round still collects everything published before its target — the slip is recorded in the run ledger, and the analysis judges each day's fidelity. Schedule and gate: src/chatpredict/schedule.py · source available on request
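
The gate's three rules (never early, late allowed and recorded, never twice) can be sketched as one pure function. The name `gate` and its return shape are assumptions made for illustration; the real logic lives in schedule.py and may differ.

```python
from datetime import datetime

def gate(now: datetime, target: datetime, already_ran: bool) -> tuple[bool, float]:
    """Return (may_run, lateness_seconds) for one round (hypothetical helper)."""
    if already_ran:
        return False, 0.0   # never twice
    if now < target:
        return False, 0.0   # never early: the collection window would be truncated
    # Late is allowed; the slip is returned so the caller can record it.
    return True, (now - target).total_seconds()
```

A round that starts five minutes after its target would get `(True, 300.0)`, and the 300 seconds would go into the run ledger.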

02 Collect the news
Market-wide feeds plus hot-list look-ups, pulled in parallel. The publish time is the law.
954 articles fetched

Market-wide feeds deliver ticker-tagged articles for the whole US market at once:

Alpaca news (Benzinga wire) · 396
Market-wide professional news wire; every article arrives tagged with its tickers.
Polygon.io news · 197
Aggregated market news API, ticker-tagged (free tier lags ~1h — recorded as-is).
GlobeNewswire press releases · 36
Official company press releases (earnings, deals, guidance) straight from the wire.
PR Newswire press releases
The other big press-release wire — includes the law-firm 'investor alert' spam we record raw and filter in analysis.
SEC EDGAR 8-K filings · 63
Material-event filings companies are legally required to make — the ground-truth feed.

“What's hot” lists name stocks that should have news today — 75 stocks were flagged this round (every one recorded, including those where we then found nothing):

Top gainers & losers
Alpaca screener: the day's 50 biggest gainers + 50 biggest losers (EOD snapshot — its own freshness stamp is recorded every round).
Most-active stocks
Alpaca screener: the 30 highest-volume names — volume spikes often mean news.
Earnings calendar
Finnhub's calendar of companies reporting today/tomorrow — earnings days make news.

Targeted look-ups then query flagged stocks the feeds didn't cover:

Finnhub per-stock news
First stop for a targeted look-up on a hot-list stock the feeds missed.
Yahoo Finance per-stock news
Fallback look-up if Finnhub is rate-limited — one exhausted source never sinks a round.
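
The Finnhub-then-Yahoo fallback can be sketched as an ordered chain of sources. Everything named here (`RateLimited`, `lookup`, the fetcher signature) is hypothetical; the sketch only illustrates that a rate-limited source is skipped rather than allowed to fail the round.

```python
from typing import Callable, Optional

class RateLimited(Exception):
    """Raised by a source that has exhausted its quota (hypothetical)."""

Fetcher = Callable[[str], list]

def lookup(ticker: str, sources: list[tuple[str, Fetcher]]) -> tuple[Optional[str], list]:
    """Try each source in order; move to the next only when one is rate-limited."""
    for name, fetch in sources:
        try:
            return name, fetch(ticker)
        except RateLimited:
            continue   # one exhausted source must not sink the round
    return None, []    # every source was exhausted for this ticker
```

Returning the source name alongside the articles lets the caller record where each article came from, or that nothing was found.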

Pipeline declared once in src/chatpredict/flow.py · source available on request — the runner builds from it and this page renders it, so the diagram cannot drift from what runs.

Market-wide news feeds · 692 articles kept · 5 feeds, every article pre-tagged with its tickers
Hot lists → targeted look-ups · 34 articles kept · 75 stocks flagged; the uncovered ones looked up one by one
03 Keep what qualifies
Exact ticker matches against 5,852 US common stocks; everything else is dropped and the reason logged.
726 articles kept · 726 = 692 + 34

An article is kept only if it maps to exactly one US common stock — via the source's own ticker tag, an official SEC identifier, or an explicit ticker in the text. We never fuzzy-match company names; that is how news ends up filed under the wrong company. Every rejected article is stored in the drop log with its reason — “no record” never means “quietly discarded”.

dropped  reason           meaning
197      out of universe  not a US common stock in our universe
31       no ticker        no exact stock could be identified
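
The keep-or-drop rule can be sketched as exact set membership. This is a sketch under assumptions: `resolve` is a hypothetical name, `candidates` stands for the tickers found via the source tag, SEC identifier, or explicit mention, and the "ambiguous" reason for more than one match is invented here for illustration; the table above lists only the two reasons seen this round.

```python
from typing import Optional

def resolve(candidates: set[str], universe: set[str]) -> tuple[Optional[str], Optional[str]]:
    """Return (ticker, drop_reason); exactly one of the two is None."""
    if not candidates:
        return None, "no ticker"
    matches = candidates & universe   # exact symbol equality only, no fuzzy name matching
    if not matches:
        return None, "out of universe"
    if len(matches) > 1:
        return None, "ambiguous"      # ASSUMPTION: this reason name is not from the page
    return next(iter(matches)), None
```

Because every path returns a reason when it returns no ticker, each rejected article can be written to the drop log with an explanation.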

Resolution: src/chatpredict/resolver.py · source available on request

04 Three models judge every headline
The paper's exact question, word for word, at temperature 0: YES, NO, or UNKNOWN.
633 headlines judged

Each (stock × headline) is judged once per model, ever — at temperature 0 a re-ask returns the same answer. A rate-capped or failed model is retried next round; one that answered is never re-asked. The free models are paced to their providers' limits, which is why a round takes about 80 minutes — the panel runs in parallel, so the wall-clock is the slowest model. Tallies below are this round's.
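
The judge-once rule can be sketched as a filter over stored statuses. The function name `pending` and the dictionary layout are assumptions; the status values (ok, failed, rate-capped) are the ones this page lists for verdict rows.

```python
def pending(pairs: list[tuple[str, str]],
            models: list[str],
            status: dict[tuple[str, str, str], str]) -> list[tuple[str, str, str]]:
    """List the (stock, headline, model) triples that still need an answer.

    A triple with status "ok" is never re-asked; one that is missing,
    "failed", or "rate-capped" is asked (again) this round.
    """
    return [
        (stock, headline, model)
        for stock, headline in pairs
        for model in models
        if status.get((stock, headline, model)) != "ok"
    ]
```

Keying on the full triple is what lets one model's retry proceed without touching the answers another model already gave.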

gpt-4.1-nano
OpenAI · Paid anchor — the model that replicates the paper (the gpt-5 family rejects the paper's required temperature=0; verified live).
~paid tier, 0.2s between calls
151 yes · 58 no · 424 unknown
gemini-2.5-flash
Google · Free panel member — does the signal survive a different model family?
free tier, 10 calls/min (6.1s pacing)
228 yes · 109 no · 296 unknown
gpt-4o-mini
OpenAI · Third panel member — a second, distinct OpenAI model. Replaced Groq llama-3.1-8b-instant on 2026-06-15: that 8B model never emits NO (0/472 on day one, even on bankruptcy/fraud — verified it's the model, not our parser), so it could never short.
~paid tier, 0.2s between calls
201 yes · 158 no · 274 unknown

Clients and pacing: src/chatpredict/model_clients.py · source available on request
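
Pacing to a provider limit can be sketched as a minimum-interval gate between calls. `Pacer` is a hypothetical helper, not the code in model_clients.py; the clock and sleep functions are injectable only so the behaviour can be checked without waiting.

```python
import time

class Pacer:
    """Enforce a minimum interval between consecutive calls (hypothetical helper)."""

    def __init__(self, min_interval_s, clock=time.monotonic, sleep=time.sleep):
        self.min_interval_s = min_interval_s
        self._clock = clock
        self._sleep = sleep
        self._last = None   # time of the previous call, None before the first

    def wait(self):
        """Block until at least min_interval_s has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval_s - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

At the 6.1 s pacing listed for the free tier, 633 calls need about 64 minutes of pacing alone (633 × 6.1 s ≈ 3,861 s), before any response latency or retries.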

05 Everything stored raw
Verdicts, drops, no-news checks: all of it. The strategy is computed later, as an analysis over a complete record.
1,899 verdict rows

A verdict row keeps the model's raw reply verbatim alongside the parsed answer, the relevance tag, and a status (ok / failed / rate-capped) — so every gap is explainable. Articles keep their raw source payloads; flagged-but-newsless stocks are recorded too. UNKNOWN and not-relevant answers are kept like everything else: no-signal is also a result. Nothing is filtered live.
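
A verdict row that keeps the raw reply next to the parsed answer can be sketched as follows. The field names, the `parse_answer` helper, and the rule that an unrecognised reply parses to UNKNOWN are all assumptions for illustration; the relevance tag is left out because the page does not say how it is derived.

```python
import re
from dataclasses import dataclass

@dataclass(frozen=True)
class Verdict:
    raw_reply: str   # the model's reply, stored verbatim
    answer: str      # parsed answer: "YES", "NO" or "UNKNOWN"
    status: str      # "ok", "failed" or "rate-capped"

def parse_answer(raw_reply: str) -> str:
    """Read the leading word of the reply (ASSUMPTION: anything else is UNKNOWN)."""
    m = re.match(r"\s*(YES|NO|UNKNOWN)\b", raw_reply, flags=re.IGNORECASE)
    return m.group(1).upper() if m else "UNKNOWN"

def make_verdict(raw_reply: str) -> Verdict:
    """Build an "ok" row; the raw text is kept so the parse can be audited later."""
    return Verdict(raw_reply=raw_reply, answer=parse_answer(raw_reply), status="ok")
```

Storing `raw_reply` unchanged means a parser bug can be fixed later and every answer re-derived without asking the model again.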

06 Committed to the record
The database is committed to a git repository: append-only history, every change timestamped.
committed 3:37 PM ET

After every round the SQLite file is committed back to the repository — simultaneously the store, the backup, and the audit trail. If new code lands mid-round, the save step rebases and retries; if saving still fails, the data is parked as a downloadable artifact. This site is regenerated from that file on every commit and has no write access to anything. See the run ledger.
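
The save step's rebase-and-retry behaviour can be sketched as a small loop. The three callables stand in for the real git and artifact operations, which this page does not show; `save_round` and the default of three attempts are assumptions.

```python
from typing import Callable

def save_round(push: Callable[[], bool],
               rebase: Callable[[], None],
               park: Callable[[], None],
               attempts: int = 3) -> str:
    """Push the database commit; rebase and retry on rejection; park on final failure."""
    for attempt in range(attempts):
        if push():
            return "committed"
        if attempt < attempts - 1:
            rebase()   # new code landed mid-round: replay our commit on top of it
    park()             # last resort: keep the data as a downloadable artifact
    return "parked"
```

Either outcome leaves the round's data somewhere recoverable, which is the property the paragraph above describes.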

What the models said today

gpt-4.1-nano
Paid anchor
276 yes · 128 no · 883 unknown
gemini-2.5-flash
Free panel member
416 yes · 207 no · 663 unknown
gpt-4o-mini
Third panel member
373 yes · 361 no · 553 unknown

Where at least two models agreed on a relevant headline — 337 stocks:

AAL yes · AAPL no · ABM yes · ABOS yes · ABVX yes · ACIW yes · ACN yes · ADPT yes · AEVA yes · AIOT no · AIT yes · AKTS yes · ALM yes · ALSN yes · AMAT yes · AMC yes · AMD yes · AMD no · AMZN no · AON no · APPF yes · ARBB no · ARDX yes · ARES yes · ASML yes · ATEN yes · AVAV yes · AVAV no · AVB yes · AVTX yes · AXON yes · AXP no · AYA yes · BABA yes · BAND no · BDC yes · BE yes · BEEM yes · BETR yes · BIOA yes · + 297 more

from the verdicts × articles of Jun 16's 2 rounds — full table on News & Verdicts
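
The "at least two models agreed" rule can be sketched per headline. This is a sketch, not the site's analysis code: `consensus` is a hypothetical name, and counting only YES and NO (so that two UNKNOWNs do not form an agreement) is an assumption based on the list above showing only yes and no.

```python
from collections import Counter
from typing import Optional

def consensus(answers: dict[str, str], min_agree: int = 2) -> Optional[str]:
    """Return "YES" or "NO" if at least min_agree models gave that answer, else None."""
    counts = Counter(a for a in answers.values() if a in ("YES", "NO"))
    for answer, n in counts.most_common(1):
        if n >= min_agree:
            return answer
    return None
```

Because agreement is computed per headline, one stock can appear with both directions on the same day, as AMD and AVAV do above.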

The record so far

Trading days live: 2 (see history)
Rounds run: 4 (the ledger)
Articles collected: 2,671 (browse all)
Verdicts recorded: 7,209 (every one shown)