MIT AI Risk Initiative
AI Incident Monitoring · A walk through the SORT framework

AI incident reports are climbing across the board.
What does that actually mean?

A rising line could reflect any combination of three forces — more AI being deployed, more reporting infrastructure picking up what was always there, or more harm per use. To address frontier-AI risks properly, the readings have to be separated. A new framework from Slattery et al. (2026) does that — and in doing so produces opposite verdicts on two harms that look superficially similar.

Step 01

Reports are climbing.

The chart on the right plots monthly AI incidents recorded in the AI Incident Database since 2016.

The shape of the line is unambiguous — but what it means is not.

Step 02

A climbing line has three competing readings.

The line might be rising because:

  • AI is being deployed more widely, so the raw count grows without any change in severity or in the AI itself;
  • Reporting infrastructure has improved, and journalists and researchers have become better at noticing AI-related harms that were already happening;
  • Each use of AI is now more likely to cause harm than it used to be.

These three readings require different responses to mitigate AI risks. Most likely all three are happening at once — the question is in what proportion, and the count alone cannot say.
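A toy decomposition makes the ambiguity concrete. The sketch below assumes, purely for illustration, that reported incidents are the product of exposure, harm per unit of exposure, and the share of harms that get reported. The multiplicative form, the function name, and every number are assumptions, not taken from Slattery et al. (2026).

```python
def reported_incidents(exposure: float, harm_rate: float, reporting_rate: float) -> float:
    """Toy model (an assumption): reports = exposure * harm per exposure * share reported."""
    return exposure * harm_rate * reporting_rate


# Hypothetical baseline year; all numbers are made up for illustration.
baseline = reported_incidents(exposure=1_000_000, harm_rate=1e-4, reporting_rate=0.1)

# Three different worlds, each doubling exactly one factor.
scenarios = {
    "more deployment": reported_incidents(2_000_000, 1e-4, 0.1),
    "better reporting": reported_incidents(1_000_000, 1e-4, 0.2),
    "more harm per use": reported_incidents(1_000_000, 2e-4, 0.1),
}

for label, count in scenarios.items():
    # Every scenario doubles the reported count relative to the baseline.
    print(f"{label}: {count / baseline:.1f}x baseline")
```

All three scenarios print the same doubling, which is why the raw count cannot distinguish between them.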

Step 03

Separate harm from exposure.

The three readings cannot be untangled at the level of “AI incidents in general.” The category is too broad — different harms have different exposure denominators, different reporting infrastructures, and different deployment curves.

Slattery et al. (2026) propose a framework that works at a narrower level, one specific harm at a time. It has four steps: define a precise monitoring question using the SORT framework, estimate harm and exposure independently, take their ratio, and classify the resulting trajectory of the risk.

Step 04

What this procedure produces.

The framework classifies a given AI risk into one of four quadrants, each describing a different current trajectory.

The framework's output is a single dot on a 2 × 2 grid. One axis tracks exposure (E) — is the population at risk growing or shrinking? The other tracks harm per unit exposure (Ĥ) — is each interaction more or less likely to cause harm than before?

  • Escalating — both growing. Urgent attention.
  • Mitigating — exposure growing, harm-rate falling. Monitor closely.
  • Concentrating — exposure shrinking, harm-rate growing. Targeted measures.
  • Receding — both shrinking or flat. Continue strategy.

By defining a specific harm, estimating the two trends separately, then placing them on the grid, we can develop a better view of how AI risks are developing, and why.
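The grid can be sketched as a small lookup. The function name `classify` and its two parameters are hypothetical, and treating a flat trend as "not growing" on either axis is an assumption of this sketch rather than a rule stated by the framework.

```python
def classify(exposure_change: float, harm_rate_change: float) -> str:
    """Map two relative changes (e.g. 0.40 for +40%) to a quadrant label.

    Assumption: a change of exactly zero ("flat") counts as not growing.
    """
    exposure_growing = exposure_change > 0
    harm_rate_growing = harm_rate_change > 0

    if exposure_growing and harm_rate_growing:
        return "Escalating"      # both growing
    if exposure_growing:
        return "Mitigating"      # exposure growing, harm rate not growing
    if harm_rate_growing:
        return "Concentrating"   # exposure not growing, harm rate growing
    return "Receding"            # both shrinking or flat


print(classify(0.10, 0.25))    # Escalating
print(classify(0.10, -0.05))   # Mitigating
print(classify(-0.10, 0.25))   # Concentrating
print(classify(-0.10, 0.0))    # Receding
```

A real analysis would likely want a tolerance band around zero so that small, noisy changes are not read as a trend.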

Step 05

A monitoring question has four parts.

SORT — Subject, Opportunity, Risk event, Timeframe — is the paper's structured analogue to PICO in evidence-based medicine. It forces analytical choices to be explicit rather than buried in framing.

Each box on the right holds one piece of the question. The boxes fill in one at a time as you scroll, using the case study of conversational AI and self-harm.

Step 06

Subject — who or what is at risk.

The subject is the population whose welfare is at stake. The choice has to be narrow enough to be measurable, broad enough to capture the phenomenon actually in question.

For this case: people living in the United States. The choice of country fixes the available denominators downstream — censuses, regulatory filings, survey instruments.

Example subjects: Workers in customer-service roles in California; hospital patients in NHS England trusts; drivers on US interstate highways.

Step 07

Opportunity — what creates the exposure.

Opportunity isolates the specific mechanism through which the subject is exposed to the harm: the precise interaction pattern that makes the risk event possible. Simply “people who use AI” would cast too wide a net.

Here: using conversational AI systems for emotional support. The narrower the opportunity, the tighter the proxy choices available to estimate exposure later.

Example opportunities: Being screened by an automated resume-filtering system during a job application; receiving a diagnosis assisted by a clinical decision-support tool; driving alongside a vehicle operating in autonomous mode.

Step 08

Risk event — the specific harm.

The risk event is the countable harm itself, phrased so that an incident report can be matched against it. The paper specifies: receiving responses that encourage, or fail to discourage, suicidal ideation or self-harm.

A vaguer phrasing — “AI causes mental health harms” — would inflate the number of partial matches and make the trend signal noisier.

Example risk events: Being rejected from consideration on the basis of a protected characteristic; receiving a missed or delayed diagnosis traceable to the tool's recommendation; being involved in a collision the autonomous system failed to avoid.

Step 09

Timeframe — the unit of comparison.

Timeframe defines the observation window. Per calendar year is the choice here: the underlying databases publish at yearly resolution, and year-on-year change is what the framework is trying to surface.

Example timeframes: per quarter, per fiscal year, or per month.

Step 10

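Assembled, the monitoring question combines the four choices made above:

  • Subject: people living in the United States;
  • Opportunity: using conversational AI systems for emotional support;
  • Risk event: receiving responses that encourage, or fail to discourage, suicidal ideation or self-harm;
  • Timeframe: per calendar year.

Read together as a single sentence, these four parts are the unit of analysis. Everything downstream (which databases to search, which proxies to allow, what counts as a full match) flows from its exact phrasing.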
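One way to keep the four SORT parts explicit is to store them as separate fields and only join them at the end. The class name, the field names, and the sentence template below are all hypothetical; the paper's own wording may differ. The values recombine the third example from each of the four preceding steps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MonitoringQuestion:
    """The four SORT parts, kept as separate fields."""
    subject: str
    opportunity: str
    risk_event: str
    timeframe: str

    def render(self) -> str:
        # Hypothetical sentence template, used only for illustration.
        return (
            f"Among {self.subject} who are {self.opportunity}, "
            f"how many end up {self.risk_event}, {self.timeframe}?"
        )


question = MonitoringQuestion(
    subject="drivers on US interstate highways",
    opportunity="driving alongside a vehicle operating in autonomous mode",
    risk_event="being involved in a collision the autonomous system failed to avoid",
    timeframe="per quarter",
)
print(question.render())
```

Because the dataclass is frozen, the question cannot be quietly altered partway through an analysis; changing any part means constructing a new question.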

Step 11

Classifying source reliability.

Answering a monitoring question means estimating two numbers across consecutive time periods — the harm associated with the risk event, and the exposure defined by the subject and opportunity. Both estimates rest on whatever sources the data environment makes available, and those sources vary widely in how directly they speak to the question.

The paper sorts estimation methods into four tiers by the strength of the underlying evidence.

Step 12

Harm, source one — the AI Incident Database.

With no authoritative single source for this monitoring question, the procedure begins at Tier 2 — combining proxy measures to construct bounds.

An LLM-assisted scan of the AIID returns 2 full matches in 2024 and 17 in 2025. Two matches in 2024 falls below the threshold for a reliable signal, so this database alone cannot resolve the trend. A second source is needed.

Step 13

Source two — OECD AIM carries the trend.

The OECD AI Incidents Monitor uses a different sourcing pipeline from the AIID, drawing on a broader set of news and regulatory feeds. Filtered for US-based incidents involving conversational AI and resulting in physical or psychological injury, the same LLM analysis yields 8 full matches in 2024 and 55 in 2025 — a roughly seven-fold increase in the match count.

The associated harm counts — the number of people affected per matched incident — also jump sharply, from a 9–17 range in 2024 to roughly the hundred-thousand range in 2025. This second number is driven by a small number of incidents involving large user populations (a single platform-level event can put hundreds of thousands into the affected count), so it's noisier than the match count and shouldn't be read as a clean signal of per-incident severity. The match count is the load-bearing trend signal here.
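The match-count comparison across the two databases can be sketched as follows. The function name and the `MIN_MATCHES` value are hypothetical: the text says two matches is too few for a reliable signal but does not state the threshold itself.

```python
from typing import Optional

# Hypothetical reliability threshold; the actual value is not given in the text.
MIN_MATCHES = 5


def fold_change(earlier: int, later: int, min_matches: int = MIN_MATCHES) -> Optional[float]:
    """Return later / earlier, or None when either count is too small to trust."""
    if min(earlier, later) < min_matches:
        return None
    return later / earlier


# Full-match counts for 2024 and 2025, as quoted in the text.
print(fold_change(2, 17))   # AIID: None (2024 count below the assumed threshold)
print(fold_change(8, 55))   # OECD AIM: 6.875, i.e. roughly seven-fold
```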

Step 14

An upper bound from a single proxy.

For an upper-bound estimate, the paper draws on OpenAI's own disclosure: approximately 0.15% of weekly active users engage in conversations indicating potential suicidal planning or intent — more than one million people per week, globally.

That number is not a lower-bound match count — it's a ceiling derived from a proxy proportion. The two kinds of evidence belong on different scales: the AIID and OECD AIM counts are floors built from confirmed reports, while the OpenAI figure is a roof scaled from a population-level rate.
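The ceiling is a proportion scaled to a population. The sketch below assumes roughly 800 million weekly active users; that figure does not appear in this text and is used only to show the arithmetic. The function name is likewise hypothetical.

```python
PROXY_SHARE = 0.0015               # 0.15% of weekly active users, as quoted above
WEEKLY_ACTIVE_USERS = 800_000_000  # assumption: not stated in this text


def ceiling_from_proxy(population: int, share: float) -> float:
    """Scale a population by a proxy proportion to get an upper-bound count."""
    return population * share


upper_bound = ceiling_from_proxy(WEEKLY_ACTIVE_USERS, PROXY_SHARE)
print(f"{upper_bound:,.0f} people per week")  # 1,200,000 people per week
```

Under that assumption the result is consistent with "more than one million people per week". It is a global figure, so it cannot be compared directly with the US-filtered match counts.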

Step 15

The Pew proxy, plus a scaling assumption.

For exposure, Pew Research data on ChatGPT use “to learn new things” and “for entertainment” by age group serve as proxies. The midpoint of those two shares becomes the point estimate; the individual shares form the lower and upper bounds.

To extend from ChatGPT to all conversational AI, the paper applies a market-share scalar: 80% at the point estimate, 90% and 70% for the upper and lower bounds.

Step 16

Exposure — 64M in 2024, 88M in 2025.

Combining the assumption stack with the Pew bucket data and the US census yields a central estimate of 64 million people in 2024 (plausible range 54–73M) and 88 million in 2025 (75–99M). Order of magnitude: 10⁸.

The trend is increasing — approximately 40% year on year. Confidence tier 2 · Medium: the bounds are derived from reasonable sources, the assumptions are explicit, and the directional reading is robust to the moves used to construct it.
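The year-on-year figure follows directly from the central estimates. The second check below, whether the two plausible ranges overlap, is one possible way to probe how robust the direction is; it is an assumption of this sketch, not a test the paper is stated to use.

```python
# Central estimate and plausible range, in millions of people (from the text).
exposure = {
    2024: {"low": 54, "central": 64, "high": 73},
    2025: {"low": 75, "central": 88, "high": 99},
}


def relative_change(earlier: float, later: float) -> float:
    """Relative change from earlier to later, e.g. 0.375 for +37.5%."""
    return (later - earlier) / earlier


growth = relative_change(exposure[2024]["central"], exposure[2025]["central"])
print(f"central growth: {growth:.1%}")  # central growth: 37.5%

# Conservative check: is even the lowest 2025 estimate above the highest 2024 one?
ranges_do_not_overlap = exposure[2025]["low"] > exposure[2024]["high"]
print(ranges_do_not_overlap)  # True
```

The central growth of 37.5% is what the text rounds to approximately 40%.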

Step 17

The chatbot case lands in the top-right.

Plot the chatbot case. OECD AIM's match count jumped roughly seven-fold between 2024 and 2025. Exposure rose ~40% over the same period. Harm grew faster than exposure, so harm-per-exposure — Ĥ — is rising. Exposure is also rising, so E is up.

Both arrows point up. The dot lands in the top-right quadrant: Escalating.
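The direction of Ĥ can be checked by dividing the growth in harm by the growth in exposure. The function name is hypothetical. The OECD AIM match count stands in for harm here, so the result indicates direction and rough size, not a true per-person harm rate.

```python
def harm_rate_fold_change(
    harm_earlier: float,
    harm_later: float,
    exposure_earlier: float,
    exposure_later: float,
) -> float:
    """Fold change in harm per unit exposure between two periods."""
    harm_ratio = harm_later / harm_earlier              # 55 / 8 = 6.875
    exposure_ratio = exposure_later / exposure_earlier  # 88 / 64 = 1.375
    return harm_ratio / exposure_ratio


# OECD AIM full matches (8, then 55) against central exposure in millions (64, then 88).
change = harm_rate_fold_change(8, 55, 64, 88)
print(change)  # 5.0
```

A value above 1 means harm grew faster than exposure, which is what places the dot in the upper half of the grid.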

Step 18

Verdict — Escalating.

Both the population at risk and the harm per unit exposure are growing. The framework's recommendation: urgent attention — expanded monitoring, active investigation into causal drivers, and possibly regulatory intervention.

The confidence tier is Low; tightening it would require either mandatory disclosure of conversational-AI use or a dedicated survey instrument. Both fall outside the current data environment.

Same framework · Different case · Different verdict

AV crashes are rising too. Why does the framework call them mitigating?

The same procedure — define a monitoring question, estimate harm and exposure separately, classify — is now applied to a second case. The numbers come from NHTSA's mandatory reporting and the Autonomous Vehicle Industry Association. Watch where the dot lands.

Step 19

Now apply the same framework to autonomous vehicles.

NHTSA's mandatory reporting puts the procedure at Tier 1 for harm: ADS (automated driving system) incidents rose from 526 in 2024 to 975 in 2025, an 85% increase, primarily driven by property-damage cases rather than injuries.

On its own, that headline would suggest the framework's most urgent classification. The chatbot dot from the previous section is ghosted for comparison.

Step 20

But exposure doubled in the same period.

The point estimate for AV exposure puts it at 78M miles in 2024 and 156M miles in 2025, drawn from the Autonomous Vehicle Industry Association's whole-year totals and Waymo's published ride velocity. Exposure roughly doubled — a 100% increase against the harm side's 85%.

Step 21

Verdict — Mitigating.

Exposure growth (≈100%) outpaces harm growth (≈85%), yielding a decreasing harm-per-exposure trend against rising exposure. Fewer incidents occur per million vehicle-miles than the year before.
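The per-mile rates behind that verdict can be worked out from the point estimates quoted above. This is a minimal sketch using only those point estimates; it ignores the uncertainty in the Tier 2 exposure figures, and the variable names are hypothetical.

```python
# Point estimates from the text.
ads_incidents = {2024: 526, 2025: 975}
million_miles = {2024: 78, 2025: 156}

# Incidents per million vehicle-miles in each year.
rate = {year: ads_incidents[year] / million_miles[year] for year in ads_incidents}
for year, value in rate.items():
    print(f"{year}: {value:.2f} incidents per million miles")
# 2024: 6.74 incidents per million miles
# 2025: 6.25 incidents per million miles

# Relative change in the rate from 2024 to 2025.
change = rate[2025] / rate[2024] - 1
print(f"change in rate: {change:.1%}")  # change in rate: -7.3%
```

The drop is modest, so the Mitigating verdict depends on how far the exposure estimates can be trusted.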

Same procedure. Same direction on raw harm. Opposite governance implication. The framework's value is that it makes the second number — the exposure denominator — visible enough to change the verdict.

The harm side is Tier 1 (NHTSA mandatory reporting), but exposure remains Tier 2 — AVIA and Waymo disclosures are the best available, not authoritative. Mandatory mile-reporting from AV operators would lift exposure to Tier 1 and tighten the verdict considerably.

What the framework reveals

Two harms moving in the same direction on the raw counts, two opposite trajectories.

Conversational AI and self-harm is classified as Escalating: both exposure and harm-per-exposure are rising. Autonomous-vehicle crashes are classified as Mitigating: exposure is rising faster than harm. The raw counts alone could not have told the difference between them.

The point of the framework isn't to settle the verdict. It's to make the assumption stack visible — the bound construction, the proxy choices, the confidence tier — so that policy makers and practitioners can argue about the moves, not just the conclusion.

Slattery et al. (2026) · Classification of AI incident trajectories