Flagship · Cross-DB

UNIQUE — only on NoSqlStudio

Ctrl+Alt+Shift+Q

Query Regressions

Every query gets a stable fingerprint (Oracle sql_id-style) that survives changing literal values, so "the same query" stays one line even when the filter values differ. NoSqlStudio learns a per-fingerprint baseline (EWMA/MAD), watches the live stream instance-wide, and the moment a query regresses it explains WHY across five levels — from before/after metrics down to an AI root-cause narrative. It is the only IDE that does this on MongoDB, Azure Cosmos DB (Mongo API) and AWS DocumentDB alike.

Where it lives in the app: Monitoring ▾ → Query Regressions

Why a DBA needs Query Regressions

A slow-query log tells you a statement was slow. It never tells you which query, whether it was always like that, what changed, or whether it even matters — so the DBA ends up grepping logs at 2am, correlating by hand, guessing. Query Regressions exists to end that.

It gives every query a stable identity (a fingerprint that survives changing values), learns what "normal" looks like for each one, and watches the whole instance live. The question stops being "is something slow?" and becomes "this exact query just got 74× slower, here's the plan that changed, and an index was dropped two minutes before."

That matters because the expensive incidents are the silent ones: an index dropped in a maintenance window, a deploy that changed a query shape, data growth that quietly crossed a threshold, a Cosmos bill that doubled. None of them page you — until customers do. This catches them the moment they regress and names the cause, across MongoDB, Cosmos and DocumentDB alike.

And it respects the DBA's reality: your laptop is not a server. You arm monitoring for the scope and window you choose; alerts reach you on Slack, Discord, Teams or the OS; history survives restarts; and the AI writes the root-cause summary you'd otherwise assemble by hand. Less log spelunking, faster MTTR, fewer 2am surprises.

What it does

Stable sql_id-style fingerprint per query shape — native queryHash on MongoDB, a client-side shape-hash on Cosmos/DocumentDB — so a query keeps one identity across different literal values.
Cross-engine by design: MongoDB / Atlas, Azure Cosmos DB (Mongo API) and AWS DocumentDB — the only tool that gives regression RCA on all three.
Instance-wide live capture: it sees every client's queries, so it catches the undeclared change made in a maintenance window — not just the queries you happen to run.
5-level root cause: (1) before/after metrics, (2) execution-plan diff, (3) index / cardinality probes, (4) DDL & deploy events correlated within ±5 min, (5) an AI narrative with your own LLM.
Classifies the cause: plan change, selectivity collapse, data growth, RU spike, or a brand-new slow collection scan.
Per-engine metric: p95 latency on MongoDB / DocumentDB, RU on Cosmos.
DBA-driven monitoring: you arm it on screen (whole instance or chosen databases, optionally scheduled); it runs as a Job Manager job with a live recording dot — your laptop is not a server.
Alerting on confirmed regressions: native OS notification + webhook (Slack, Discord and Microsoft Teams auto-detected from the URL), with a re-alert interval you set in seconds.
Everything persists: learned baselines and the alert history survive restarts (file + MongoDB mirror), so a regression found overnight is still there in the morning — filter by period, mark read/unread, delete.
One-click remediation: clear the plan cache on a plan change, acknowledge, snooze, or open the explain plan.

Step-by-step

1
Open Monitoring ▾ → Query Regressions
The tab opens with the alerts that previously fired for this connection, grouped by database. Nothing to configure to read history.
2
Arm monitoring for the scope you care about
Click Start and pick the whole instance or specific databases, optionally with a start/end schedule. It registers as a job in the Job Manager and keeps running independent of the tab.

Example — a "new slow query"

A first-seen collection scan over many docs is flagged immediately, no baseline needed. A filter on an unindexed field is the classic case:

// unindexed field -> COLLSCAN over the whole collection
db.orders.find({ promoCode: "BLACKFRIDAY" })

// → card: New slow query — shop.orders
//   plan: COLLSCAN  ·  docs examined 1,550,000  ·  p95 642 ms  ·  WARN

Example — a baseline regression

A query that was cheap against its learned baseline suddenly costs much more. Same fingerprint, only the plan/cost changed — e.g. an index was dropped so IXSCAN flips to COLLSCAN:

// same query, same fingerprint — only the plan changed
baseline (35 samples):  IXSCAN { userId:1, status:1 }   p95   2.5 ms
now:                    COLLSCAN                        p95 185.0 ms

// → card: Plan regression — shop.orders
//   delta 74x  ·  cause: plan-change  ·  CRITICAL
//   what changed nearby: dropIndex idx_user_status (2 min before)

5
Read the card + get the AI root cause (Level 5)
Severity, the fingerprint, Baseline → Now for latency and docs examined, the plan diff, the query shape, and what changed nearby. Then run the Level-5 analysis with your own LLM — it reads only the fact sheet and ends with the single most likely cause + a confidence score.
6
Fix it, and get alerted next time
Clear the plan cache or add the index; set up OS + Slack/Discord/Teams alerting once in Settings ▸ Alerting so the next regression reaches you wherever you are.

Real use cases

The silent regression

A nightly job dropped an index. At 2am a find() flipped IXSCAN → COLLSCAN on a 1.5M-doc collection. The card fired with the plan diff and the dropIndex event correlated 2 minutes before — root cause on one screen, no log spelunking.

Catch a bad deploy

A release changed a query shape so it stopped using the composite index. The regression confirmed within minutes and the Level-4 timeline pointed at the deploy marker — rolled back before on-call paged.

Cosmos bill spike

A query's RU cost tripled after a selectivity collapse. The RU before/after plus the AI narrative gave the partition/index fix before month-end billing.

Undeclared change during a maintenance window

Instance-wide capture flagged a regression in a database that wasn't part of the planned change — the out-of-scope safety banner surfaced it before it became an incident.

Triage from your phone

Confirmed regressions push to Slack / Discord / Teams with the namespace and delta; the re-alert interval keeps reminding while it stays broken, then goes quiet once it's fixed.

FAQ

How is this different from a slow-query log?

A slow log gives you isolated lines. Query Regressions gives a stable identity per query, a learned baseline, the cause classified across five levels, correlation with DDL/deploys, an AI narrative — plus alerting and history. It tells you what changed, not just that something was slow.

Does it really work on Cosmos DB and DocumentDB?

Yes — that's the point. MongoDB reads the slow log, Cosmos reads Azure diagnostic logs (RU as the metric), DocumentDB reads the CloudWatch profiler. Same fingerprint + RCA model on all three.

Do I have to keep the app open?

Monitoring runs while NoSqlStudio is open (tab open or not) and survives restart by reloading the persisted baselines. A headless 24/7 agent is a separate product on the roadmap.

Where do alerts go?

A native OS notification plus an optional webhook. One URL works for Slack, Discord or Microsoft Teams — the payload is adapted to the provider. The re-alert interval (in seconds) controls how often a still-regressed query re-notifies.

What does the AI step send to the LLM?

Only a compact fact sheet (before/after metrics, plan diff, index findings, cardinality, query shape). You bring your own provider/key via the AI Registry; nothing leaves until you run the analysis.

Ready to see it on your Cosmos account?Download NoSqlStudio Next → Operations Center