Can an AI read the news and call the stock? We're testing it — live.

Twice every trading day a machine collects US-stock headlines, has three language models judge each one — YES, NO, or UNKNOWN — and records everything raw. A forward, out-of-sample replication of Lopez-Lira & Tang (2023).

The paper: “Can ChatGPT Forecast Stock Price Movements?”, arXiv 2304.07619 · paper ledger only, no real trades.

Afternoon round completed 2026-06-19 at 2:30 PM ET (on time) · 240 articles · 204 judged · database committed 2:52 PM ET.

Rounds fire 8:00 AM & 2:30 PM ET, every trading day

How a round works

Live numbers from the latest round: the afternoon round, fired Jun 19 at 2:30 PM ET.

331 collected → 240 kept (−91 dropped) → 204 judged → 612 verdicts → committed 2:52 PM ET
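The funnel above can be cross-checked against the per-step figures reported further down the page. This is only a sketch of that arithmetic: the variable names are illustrative, and the numbers are copied from this round's summary rather than recomputed from the database.

```python
# Figures copied from this round's summary; nothing here is measured anew.
collected, kept, judged, verdicts = 331, 240, 204, 612
dropped_by_reason = {"out of universe": 76, "no ticker": 15}
kept_by_path = {"market-wide feeds": 229, "targeted look-ups": 11}
panel_size = 3  # three models judge each headline

checks = {
    "collected - dropped == kept": collected - sum(dropped_by_reason.values()) == kept,
    "feeds + look-ups == kept": sum(kept_by_path.values()) == kept,
    "judged * panel == verdicts": judged * panel_size == verdicts,
}

for name, ok in checks.items():
    print(f"{name}: {'ok' if ok else 'MISMATCH'}")
```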

01 The clock fires

Two rounds every trading day: morning for overnight and pre-market news, afternoon for trading-day news. Never early; late is allowed and recorded; never twice. Fired 2:30 PM ET, on time.

Morning round — fires 8:00 AM ET, judges P3 (published after 4pm yesterday) + P1 (published before 6am today). Decisions feed today's 9:30 OPEN (orders by 9:28); collection window: since yesterday 14:30 ET.

Afternoon round — fires 2:30 PM ET, judges P2 (published 6am–4pm today). Decisions feed today's 16:00 CLOSE (orders by 15:58); collection window: since today 06:00 ET.
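The three publication periods can be read as a classification on the article's Eastern-time clock. The sketch below is one way to express it, not the project's code: `period_of` and `ROUND_FOR` are hypothetical names, the boundaries are assumed to be half-open (exactly 6:00 counts as P2, exactly 16:00 as P3), and weekends and holidays are ignored.

```python
from datetime import datetime, time, timezone
from zoneinfo import ZoneInfo

ET = ZoneInfo("America/New_York")

# Which round judges each period, as described above.
ROUND_FOR = {"P1": "morning", "P2": "afternoon", "P3": "morning"}


def period_of(published: datetime) -> str:
    """Return 'P1', 'P2', or 'P3' for a timezone-aware publish time."""
    local = published.astimezone(ET).time()
    if local < time(6, 0):
        return "P1"   # before 6am: judged by the same day's morning round
    if local < time(16, 0):
        return "P2"   # 6am to 4pm: judged by the afternoon round
    return "P3"       # after 4pm: judged by the next morning round


# A publish time given in UTC is converted to Eastern time first.
example = datetime(2026, 6, 19, 13, 0, tzinfo=timezone.utc)
print(period_of(example), ROUND_FOR[period_of(example)])
```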

An early round would truncate its window and silently miss news, so the gate forbids it. A late round still collects everything published before its target — the slip is recorded in the run ledger, and the analysis judges each day's fidelity. Schedule and gate: src/chatpredict/schedule.py · source available on request
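The gate rules (never early, late allowed and recorded, never twice) fit in a few lines. This is a minimal sketch under assumptions: `gate` and its arguments are hypothetical, and the real logic in src/chatpredict/schedule.py may differ.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple


def gate(now: datetime, target: datetime, already_ran: bool) -> Tuple[bool, Optional[timedelta]]:
    """Decide whether a round may fire; return (allowed, slip)."""
    if already_ran:
        return False, None        # never twice
    if now < target:
        return False, None        # never early: the window would be truncated
    return True, now - target     # late is allowed; the slip goes in the ledger


target = datetime(2026, 6, 19, 14, 30)
allowed, slip = gate(target + timedelta(minutes=7), target, already_ran=False)
print(allowed, slip)
```

Returning the slip instead of logging it inside the function keeps the decision separate from the ledger write.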

02 Collect the news

Market-wide feeds plus hot-list look-ups, pulled in parallel. The publish time is the law. 331 articles fetched.

Market-wide feeds deliver ticker-tagged articles for the whole US market at once:

Alpaca news (Benzinga wire) · 69 articles

Market-wide professional news wire; every article arrives tagged with its tickers.

Polygon.io news · 147 articles

Aggregated market news API, ticker-tagged (free tier lags ~1h — recorded as-is).

GlobeNewswire press releases · 9 articles

Official company press releases (earnings, deals, guidance) straight from the wire.

PR Newswire press releases · 4 articles

The other big press-release wire — includes the law-firm 'investor alert' spam we record raw and filter in analysis.

SEC EDGAR 8-K filings

Material-event filings companies are legally required to make — the ground-truth feed.

“What's hot” lists name stocks that should have news today — 67 stocks were flagged this round (every one recorded, including those where we then found nothing):

Top gainers & losers

Alpaca screener: the day's 50 biggest gainers + 50 biggest losers (EOD snapshot — its own freshness stamp is recorded every round).

Most-active stocks

Alpaca screener: the 30 highest-volume names — volume spikes often mean news.

Earnings calendar

Finnhub's calendar of companies reporting today/tomorrow — earnings days make news.

Targeted look-ups then query flagged stocks the feeds didn't cover:

Finnhub per-stock news

First stop for a targeted look-up on a hot-list stock the feeds missed.

Yahoo Finance per-stock news

Fallback look-up if Finnhub is rate-limited — one exhausted source never sinks a round.
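The fallback order for targeted look-ups can be sketched as a loop over sources that skips any source whose quota is exhausted. Everything named here is hypothetical (`RateLimited`, `look_up`, and the two stub fetchers); real Finnhub or Yahoo Finance clients are not shown.

```python
from typing import Callable, List, Optional, Sequence, Tuple


class RateLimited(Exception):
    """Raised by a (hypothetical) fetcher when its quota is exhausted."""


Fetcher = Callable[[str], List[dict]]


def look_up(ticker: str, sources: Sequence[Tuple[str, Fetcher]]) -> Tuple[Optional[str], List[dict]]:
    """Try each source in order; a rate-limited source is skipped, not fatal."""
    for name, fetch in sources:
        try:
            return name, fetch(ticker)
        except RateLimited:
            continue  # fall through to the next source
    return None, []   # every source exhausted: record "nothing found"


# Stub fetchers standing in for real API clients.
def finnhub_stub(ticker: str) -> List[dict]:
    raise RateLimited("quota exhausted")


def yahoo_stub(ticker: str) -> List[dict]:
    return [{"ticker": ticker, "title": "stub headline"}]


source, articles = look_up("XYZ", [("finnhub", finnhub_stub), ("yahoo", yahoo_stub)])
print(source, len(articles))
```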

Pipeline declared once in src/chatpredict/flow.py · source available on request — the runner builds from it and this page renders it, so the diagram cannot drift from what runs.

Market-wide news feeds: 229 articles kept (5 feeds, every article pre-tagged with its tickers)

Hot lists → targeted look-ups: 11 articles kept (67 stocks flagged; the uncovered ones looked up one by one)

03 Keep what qualifies

Exact ticker matches against 5,856 US common stocks; everything else is dropped and the reason logged. 240 articles kept (240 = 229 + 11).

An article is kept only if it maps to exactly one US common stock — via the source's own ticker tag, an official SEC identifier, or an explicit ticker in the text. We never fuzzy-match company names; that is how news ends up filed under the wrong company. Every rejected article is stored in the drop log with its reason — “no record” never means “quietly discarded”.

dropped	reason	meaning
76	out of universe	not a US common stock in our universe
15	no ticker	no exact stock could be identified
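A resolver along these lines would produce the two drop reasons in the table. It is a sketch under assumptions, not the contents of src/chatpredict/resolver.py: the `resolve` function, the tiny `universe` set, and the "(NASDAQ: XXXX)" text pattern are illustrative, SEC-identifier matching is left out, and filing an ambiguous article (several in-universe tickers) under "no ticker" is a guess.

```python
import re
from typing import Optional, Set, Tuple

# Explicit exchange-qualified tickers in text, e.g. "(NASDAQ: AAPL)".
TICKER_IN_TEXT = re.compile(r"\((?:NASDAQ|NYSE):\s*([A-Z]{1,5})\)")


def resolve(article: dict, universe: Set[str]) -> Tuple[Optional[str], Optional[str]]:
    """Return (ticker, None) if kept, or (None, drop_reason) if dropped."""
    candidates = set(article.get("tickers", []))      # the source's own tags
    if not candidates:
        candidates = set(TICKER_IN_TEXT.findall(article.get("title", "")))
    if not candidates:
        return None, "no ticker"
    matches = candidates & universe                   # exact matches only
    if not matches:
        return None, "out of universe"
    if len(matches) > 1:
        return None, "no ticker"                      # assumed: ambiguous is dropped
    return next(iter(matches)), None


universe = {"AAPL", "MSFT"}   # stand-in for the 5,856-stock universe
print(resolve({"title": "Apple (NASDAQ: AAPL) beats estimates"}, universe))
print(resolve({"title": "Acme Corp rallies"}, universe))
```

No company-name matching appears anywhere in the sketch: an article with no exact ticker is dropped with a reason rather than guessed at.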

Resolution: src/chatpredict/resolver.py · source available on request

04 Three models judge every headline

The paper's exact question, word for word, at temperature 0: YES, NO, or UNKNOWN. 204 headlines judged.

Each (stock × headline) is judged once per model, ever — at temperature 0 a re-ask returns the same answer. A rate-capped or failed model is retried next round; one that answered is never re-asked. The free models are paced to their providers' limits, which is why a round takes about 80 minutes — the panel runs in parallel, so the wall-clock is the slowest model. Tallies below are this round's.
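The "once per model, ever" rule amounts to selecting only the (stock, headline, model) triples that have no successful answer yet. The sketch below assumes a status map keyed by that triple; `pending_calls` and the key layout are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Key = Tuple[str, str, str]  # (ticker, article_id, model)


def pending_calls(pairs: Iterable[Tuple[str, str]], models: Iterable[str],
                  statuses: Dict[Key, str]) -> List[Key]:
    """List the model calls still owed: never asked, failed, or rate-capped."""
    models = list(models)
    return [
        (ticker, article_id, model)
        for ticker, article_id in pairs
        for model in models
        if statuses.get((ticker, article_id, model)) != "ok"  # "ok" is never re-asked
    ]


statuses = {
    ("XYZ", "a1", "model-a"): "ok",
    ("XYZ", "a1", "model-b"): "rate-capped",
}
print(pending_calls([("XYZ", "a1")], ["model-a", "model-b", "model-c"], statuses))
```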

gpt-4.1-nano

OpenAI · Paid anchor — the model that replicates the paper (the gpt-5 family rejects the paper's required temperature=0; verified live).

~paid tier, 0.2s between calls

13 yes · 17 no · 174 unknown

gemini-2.5-flash

Google · Free panel member — does the signal survive a different model family?

free tier, 10 calls/min (6.1s pacing)

38 yes · 30 no · 136 unknown

gpt-4o-mini

OpenAI · Third panel member — a second, distinct OpenAI model. Replaced Groq llama-3.1-8b-instant on 2026-06-15: that 8B model never emits NO (0/472 on day one, even on bankruptcy/fraud — verified it's the model, not our parser), so it could never short.

~paid tier, 0.2s between calls

24 yes · 74 no · 106 unknown

Clients and pacing: src/chatpredict/model_clients.py · source available on request
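The pacing figures determine a floor on the round's wall-clock time. The sketch below works that out for this round's 204 headlines; it assumes API latency is negligible next to the pacing delay, and it assumes the 6.1s figure is the 6.0s minimum for 10 calls/min plus a small safety margin.

```python
def min_interval(calls_per_minute: float) -> float:
    """Smallest gap between calls that respects a per-minute cap."""
    return 60.0 / calls_per_minute


# Seconds between calls for each panel member, as listed above.
pacing = {"gpt-4.1-nano": 0.2, "gemini-2.5-flash": 6.1, "gpt-4o-mini": 0.2}
headlines = 204

# Models run in parallel, so the slowest one sets the floor for the round.
per_model_seconds = {model: headlines * gap for model, gap in pacing.items()}
slowest = max(per_model_seconds, key=per_model_seconds.get)

print(f"minimum gap at 10 calls/min: {min_interval(10):.1f}s")
print(f"slowest model: {slowest}, about {per_model_seconds[slowest] / 60:.1f} minutes")
```

For this round the floor comes to roughly 21 minutes, which is in line with a 2:30 PM fire and a 2:52 PM commit; rounds with more headlines scale linearly.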

05 Everything stored raw

Verdicts, drops, no-news checks: all of it. The strategy is computed later, as an analysis over a complete record. 612 verdict rows.

A verdict row keeps the model's raw reply verbatim alongside the parsed answer, the relevance tag, and a status (ok / failed / rate-capped) — so every gap is explainable. Articles keep their raw source payloads; flagged-but-newsless stocks are recorded too. UNKNOWN and not-relevant answers are kept like everything else: no-signal is also a result. Nothing is filtered live.
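A verdict row of this kind could be stored as follows. The table and column names are hypothetical (the real schema is not shown on this page), and `parse_answer` is an assumed first-word parser, not the project's own.

```python
import sqlite3
from typing import Optional

SCHEMA = """
CREATE TABLE verdicts (
    ticker     TEXT NOT NULL,
    article_id TEXT NOT NULL,
    model      TEXT NOT NULL,
    raw_reply  TEXT,                      -- the model's reply, verbatim
    answer     TEXT CHECK (answer IN ('YES', 'NO', 'UNKNOWN')),
    relevance  TEXT,
    status     TEXT NOT NULL CHECK (status IN ('ok', 'failed', 'rate-capped')),
    PRIMARY KEY (ticker, article_id, model)
)
"""


def parse_answer(raw_reply: str) -> Optional[str]:
    """Read YES / NO / UNKNOWN from the first word; None if it is anything else."""
    words = raw_reply.strip().split()
    first = words[0].strip(".,:;!\"'").upper() if words else ""
    return first if first in {"YES", "NO", "UNKNOWN"} else None


def store(conn, ticker, article_id, model, raw_reply, status, relevance=None):
    """Insert one verdict row, keeping the raw reply next to the parsed answer."""
    conn.execute(
        "INSERT INTO verdicts VALUES (?, ?, ?, ?, ?, ?, ?)",
        (ticker, article_id, model, raw_reply, parse_answer(raw_reply or ""), relevance, status),
    )


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
store(conn, "XYZ", "a1", "model-a", "UNKNOWN\nNot enough detail.", "ok")
store(conn, "XYZ", "a1", "model-b", None, "rate-capped")   # the gap stays explainable
print(conn.execute("SELECT model, answer, status FROM verdicts").fetchall())
```

Keeping the raw reply means a parser bug can be fixed later and the answers re-derived without calling any model again.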

06 Committed to the record

The database is committed to a git repository: append-only history, every change timestamped. Committed 2:52 PM ET.

After every round the SQLite file is committed back to the repository — simultaneously the store, the backup, and the audit trail. If new code lands mid-round, the save step rebases and retries; if saving still fails, the data is parked as a downloadable artifact. This site is regenerated from that file on every commit and has no write access to anything. See the run ledger.
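The save step's rebase-and-retry behaviour can be sketched as a small loop. This is an assumption about the shape of the logic, not the project's workflow: `save_round`, the `git` helper, and the attempt count are hypothetical, and parking the artifact is left as a callback.

```python
import subprocess
from typing import Callable


def git(*args: str, cwd: str = ".") -> bool:
    """Run a git command; True on success. Not called at import time."""
    result = subprocess.run(["git", *args], cwd=cwd, capture_output=True, text=True)
    return result.returncode == 0


def save_round(push: Callable[[], bool], rebase: Callable[[], object],
               park: Callable[[], object], attempts: int = 3) -> str:
    """Push the committed database; rebase and retry if the remote moved."""
    for attempt in range(attempts):
        if push():
            return "committed"
        if attempt < attempts - 1:
            rebase()          # new code landed mid-round: replay our commit on top
    park()                    # last resort: keep the data as a downloadable artifact
    return "parked"


# In a real run the callbacks might wrap git, for example:
#   save_round(lambda: git("push"), lambda: git("pull", "--rebase"), upload_artifact)
# where upload_artifact is whatever the CI system provides.
```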

What the models said today

gpt-4.1-nano
Paid anchor

119 yes · 76 no · 590 unknown

gemini-2.5-flash
Free panel member

204 yes · 145 no · 436 unknown

gpt-4o-mini
Third panel member

170 yes · 249 no · 366 unknown

Where at least two models agreed on a relevant headline — 191 stocks:

from the verdicts × articles of Jun 19's 2 rounds — full table on News & Verdicts
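The "at least two models agreed" filter can be sketched as a vote count per headline. The function name is hypothetical, and treating only YES and NO as votes (so two UNKNOWNs do not count as agreement) is an assumption; the page does not say how UNKNOWN is handled here.

```python
from collections import Counter
from typing import Dict, Optional


def agreed_answer(answers: Dict[str, str], min_votes: int = 2) -> Optional[str]:
    """Return YES or NO if at least `min_votes` models gave that answer, else None."""
    votes = Counter(a for a in answers.values() if a in ("YES", "NO"))
    if not votes:
        return None
    answer, count = votes.most_common(1)[0]
    return answer if count >= min_votes else None


panel = {"gpt-4.1-nano": "UNKNOWN", "gemini-2.5-flash": "YES", "gpt-4o-mini": "YES"}
print(agreed_answer(panel))
```

With a three-model panel and a threshold of two, YES and NO can never both reach the threshold, so no tie-break is needed.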

The record so far

Trading days live: 5

Rounds run: 10

Articles collected: 6,340

Verdicts recorded: 17,175