Test Manual — Query Regressions

A step-by-step guide to open, use, and validate Query Regressions — the stable-fingerprint performance-regression detector — across MongoDB, Cosmos DB, and DocumentDB.

Estimated time: ~25 minutes

Introduction

A step-by-step guide to help you open, use, and validate the Query Regressions screen — the performance-regression and anomaly detector built into NoSqlStudio. Follow from Step 1 to the end, in order. Each step tells you what to do and what you will see.

How to read this manual

Each step has:

  • Do: the exact action (click, type, or run a command).
  • See: what should happen on screen. This is your “pass / fail”.
  • Why: extra context, only when it helps (skip it if you're in a hurry).

What Query Regressions is

Concept

A stable fingerprint, like Oracle's sql_id

Every query shape gets a stable fingerprint that never changes between runs — the same idea as Oracle's sql_id. NoSqlStudio learns a rolling baseline for each fingerprint and raises an alert when the same query starts costing more: slower wall-clock, more documents examined, more RU, or a worse plan (for example IXSCAN → COLLSCAN).
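
A minimal sketch of how such a fingerprint could be derived, assuming the shape consists of the operation, the namespace, the filter field names, and the sort keys, with literal values stripped. The function names and the FNV-1a hash are illustrative assumptions, not NoSqlStudio's actual algorithm.

```javascript
// Hypothetical sketch: reduce a query to its shape, then hash the shape.
// Literal values are dropped so { userId: 1 } and { userId: 42 } share one identity.
function queryShape(op, ns, filter, sort) {
  return JSON.stringify({
    op,
    ns,
    filter: Object.keys(filter || {}).sort(), // field names only, order-independent
    sort: Object.entries(sort || {}).map(([key, dir]) => `${key}:${dir}`), // sort order matters
  });
}

// 32-bit FNV-1a, used here only as a stand-in for whatever hash the tool uses.
function fnv1a(text) {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16).padStart(8, '0');
}

function fingerprint(op, ns, filter, sort) {
  return fnv1a(queryShape(op, ns, filter, sort));
}

const firstRun = fingerprint('find', 'shop.orders', { userId: 1, status: 'open' }, { createdAt: -1 });
const secondRun = fingerprint('find', 'shop.orders', { status: 'paid', userId: 42 }, { createdAt: -1 });
console.log(firstRun === secondRun); // true: same fields and sort, different literals
```

Hashing a canonical string rather than the raw command is what keeps the identity stable when only parameter values change between runs.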

Concept

One screen, three engines

The same screen works on MongoDB, Azure Cosmos DB (Mongo API), and AWS DocumentDB. Capture is instance-wide — it sees slow queries from every client and every database, not just the ones you run from NoSqlStudio. That is what catches an undeclared change during a maintenance window.

Concept

Five levels of root-cause analysis

Each alert is explained in layers: L1 before/after metrics · L2 the plan diff · L3 investigators (indexes, collection stats, cardinality) · L4 correlated DDL/deploy events in a ±5 min window (“what changed right before this got slow”) · L5 an AI narrative.

Before you start

You can explore the whole UI with demo data and no database at all (Stages 1–3). Live capture (Stages 4–6) needs a connection, and the data source depends on the engine:

  1. MongoDB: reads the server slow-query log via getLog:'global'. Works on self-hosted replica sets and dedicated Atlas clusters. Not available on Atlas shared tiers (M0/M2/M5), which block getLog.
  2. Cosmos DB: reads diagnostic logs from Azure Log Analytics. Requires Log Analytics wired in Cloud Credentials (Cosmos Optimizer).
  3. DocumentDB: reads the DocDB profiler stream from Amazon CloudWatch Logs. Requires the profiler enabled at the cluster parameter group, exported to CloudWatch, and AWS credentials set in Cloud Credentials.
  4. A second tab where you can run shell commands (the Scratchpad or Mongo Shell) to generate slow queries.

Stage 1

Open the screen (no database needed)

Get the screen open and load the built-in demo so you can learn the layout before wiring a live source.

Step 1

Open Query Regressions

Do

In the toolbar, choose Monitoring ▼ → Query Regressions (keyboard shortcut Ctrl+Alt+Shift+Q).

See

A new tab opens with the Query Regressions header on the left, a master list, and a detail panel on the right.

Step 2

Load the demo data

Do

If the list is empty, click Load demo data in the empty state.

See

Two sample alerts appear under a SHOP group: a Plan regression — shop.orders and a Selectivity collapse — shop.events. The header shows a red 2 CRITICAL badge.

Why

The demo runs entirely in memory — no server, no permissions. It is the fastest way to learn what every section of a card means.

Step 3

Select an alert

Do

Click the Plan regression — shop.orders card in the list.

See

The detail panel fills in. The header shows a CRIT badge, the title, and below it the fingerprint (a stable hash), the namespace, and the occurrence count.

Stage 2

Anatomy of an alert

Read each section of the detail panel top to bottom — this is the same layout for every engine.

Step 4

WHAT CHANGED (Level 1)

See

A table with Signal · Baseline · Now · Δ. For MongoDB/DocumentDB the metric is p95 duration (ms); for Cosmos it is RU. A second row shows docs examined / returned. The “Now” and “Δ” columns are highlighted when they regressed (for example 2.6 → 197 = 75.8×).

Why

This is the headline: the same query is now far more expensive than its own learned baseline.
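
The Δ column can be reproduced by hand. The sketch below assumes a nearest-rank p95 and a simple ratio of current to baseline; both function names are hypothetical, and the manual does not state how NoSqlStudio computes its baseline.

```javascript
// Hypothetical sketch: nearest-rank 95th percentile over a list of samples.
function p95(samples) {
  const sorted = [...samples].sort((x, y) => x - y);
  const rank = Math.ceil(0.95 * sorted.length); // 1-based rank
  return sorted[Math.max(rank, 1) - 1];
}

// Ratio of the current value to the baseline, rounded to one decimal place.
function regressionFactor(baselineValue, currentValue) {
  return Math.round((currentValue / baselineValue) * 10) / 10;
}

console.log(regressionFactor(2.6, 197)); // 75.8, matching the example in the table
```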

Step 5

HOW IT CHANGED — the plan diff (Level 2)

See

The before/after plan, for example IXSCAN { userId:1, status:1 } → COLLSCAN. A dropped index or a bad plan-cache entry shows up here immediately.
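
A plan flip of this kind can be detected mechanically. The sketch below assumes the classic explain() layout, where each stage nests its child under inputStage or inputStages; the helper names and the rule itself are illustrative, not the product's diff logic.

```javascript
// Collect stage names from an explain()-style plan, outermost first.
function planStages(plan) {
  if (!plan) return [];
  const children = plan.inputStages || (plan.inputStage ? [plan.inputStage] : []);
  return [plan.stage, ...children.flatMap((child) => planStages(child))];
}

// Hypothetical rule: flag a flip from an index scan to a full collection scan.
function isScanRegression(beforePlan, afterPlan) {
  return planStages(beforePlan).includes('IXSCAN') && planStages(afterPlan).includes('COLLSCAN');
}

const indexedPlan = { stage: 'FETCH', inputStage: { stage: 'IXSCAN', keyPattern: { userId: 1, status: 1 } } };
const scanPlan = { stage: 'SORT', inputStage: { stage: 'COLLSCAN' } };
console.log(planStages(indexedPlan).join(' > ')); // FETCH > IXSCAN
console.log(planStages(scanPlan).join(' > '));    // SORT > COLLSCAN
console.log(isScanRegression(indexedPlan, scanPlan)); // true
```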

Step 6

WHY — root cause (Level 3)

See

Bullet findings from the investigators: which indexes exist on the collection, whether the plan changed, whether the plan-cache key changed. Each bullet is colored green (ok) or amber (suspect).

Step 7

WHAT CHANGED NEARBY (Level 4)

See

A timeline of DDL and deploy events within ±5 minutes of detection — for example 2m before · dropIndex · shop.orders · idx_user_status.

Why

This is the “what changed right before this got slow” answer. A dropped index two minutes before a plan flip to COLLSCAN is a near-certain root cause.
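
The ±5 minute correlation can be sketched as a time-window filter. The event fields (at, type, ns, detail), the timestamps, and the idx_ts index name below are made-up sample data, and nearbyEvents is a hypothetical helper.

```javascript
const WINDOW_MS = 5 * 60 * 1000; // the ±5 minute window described above

// Keep only events within the window around the detection time, oldest first.
function nearbyEvents(detectedAt, events, windowMs = WINDOW_MS) {
  const t0 = new Date(detectedAt).getTime();
  return events
    .map((e) => ({ ...e, offsetMs: new Date(e.at).getTime() - t0 }))
    .filter((e) => Math.abs(e.offsetMs) <= windowMs)
    .sort((x, y) => x.offsetMs - y.offsetMs);
}

const sampleEvents = [
  { at: '2024-01-01T10:58:00Z', type: 'dropIndex', ns: 'shop.orders', detail: 'idx_user_status' },
  { at: '2024-01-01T10:40:00Z', type: 'createIndex', ns: 'shop.events', detail: 'idx_ts' },
];

for (const e of nearbyEvents('2024-01-01T11:00:00Z', sampleEvents)) {
  const minutes = Math.round(Math.abs(e.offsetMs) / 60000);
  const side = e.offsetMs < 0 ? 'before' : 'after';
  console.log(`${minutes}m ${side} · ${e.type} · ${e.ns} · ${e.detail}`);
}
// 2m before · dropIndex · shop.orders · idx_user_status
```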

Step 8

QUERY SHAPE and AI Root Cause (Level 5)

See

The canonical shape (op, filter fields, sort fields), then an AI Analysis panel with a model picker and a Run analysis button.

Do

Click Run analysis.

See

The button glows while the model works, then a concise root-cause narrative and remediation appear, written from the fact sheet of this alert.

Step 9

Open the details modal

Do

Click the ··· button at the top-right of the detail panel.

See

A modal opens with a fixed header (severity + title + fingerprint) and a scrolling body: the real executed command, an Explain plan button, and an AI section.

Concept

The Command is the actual command captured from the slow-query source — not a reconstruction — so the Explain runs the real query shape.

Step 10

Suggested remediation

See

At the bottom: Clear plan cache (enabled for MongoDB plan-change alerts), Acknowledge, and Snooze 24h.

Do

Click Acknowledge, then Snooze 24h, and watch the alert's status change in the list.

Heads-up

Clear plan cache writes to the live server. It always asks for confirmation first — read the dialog before confirming, especially on production.
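
For reference, the comparable manual operation in mongosh is PlanCache.clear() on the affected collection. The wrapper below is a sketch only: clearPlanCache and its confirm callback are hypothetical stand-ins for the confirmation dialog, and nothing here is confirmed to be what the button runs internally.

```javascript
// Run in mongosh, where `db` is the current database handle.
// `confirm` is a hypothetical callback that returns true only when the user agrees.
function clearPlanCache(db, collectionName, confirm) {
  const target = `${db.getName()}.${collectionName}`;
  if (!confirm(`Clear the plan cache for ${target}?`)) return false;
  db.getCollection(collectionName).getPlanCache().clear(); // writes to the live server
  return true;
}

// Example (mongosh): clearPlanCache(db, 'orders', () => true)
```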

Stage 3

Scope and the undeclared-change guard

Step 11

Switch scope

Do

In the SCOPE bar, click a database chip (or All databases).

See

The list filters to the selected databases. All databases is the default and is what you want during a maintenance window.

Why

Detection always runs instance-wide. The scope only filters what you see — it never stops the engine from watching every database.

Step 12

The out-of-scope critical banner

See

If you narrow the scope and a critical regression fires in a database outside your filter, a warning banner appears: “N critical regression(s) in databases outside your filter”.

Why

This is the guard against the classic incident: a change says it touches database X, but a hidden script alters an undeclared database Y. The banner surfaces Y even when you weren't looking at it.
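
The relationship between the scope filter and the banner can be sketched as follows. The alert fields, the billing database, and applyScope are illustrative assumptions; only the banner wording comes from the screen.

```javascript
// Hypothetical sketch: scope filters what is shown, but critical alerts
// outside the scope are still counted for the warning banner.
// selectedDbs is an array of database names, or null for "All databases".
function applyScope(alerts, selectedDbs) {
  const inScope = (a) => selectedDbs === null || selectedDbs.includes(a.ns.split('.')[0]);
  const visible = alerts.filter(inScope);
  const hiddenCritical = alerts.filter((a) => !inScope(a) && a.severity === 'CRIT').length;
  const banner = hiddenCritical > 0
    ? `${hiddenCritical} critical regression(s) in databases outside your filter`
    : null;
  return { visible, banner };
}

const sampleAlerts = [
  { ns: 'shop.orders', severity: 'CRIT' },
  { ns: 'billing.invoices', severity: 'CRIT' },
  { ns: 'billing.audit', severity: 'WARN' },
];
console.log(applyScope(sampleAlerts, ['shop']).banner);
// 1 critical regression(s) in databases outside your filter
```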

Stage 4

Live capture on MongoDB

Now wire a real source. MongoDB needs no credential setup — it reads the server slow-query log directly.

Step 13

Connect a replica set or dedicated Atlas cluster

Do

Connect NoSqlStudio to a MongoDB replica set or a dedicated Atlas cluster, then open Query Regressions on that connection.

See

The header shows the live status as live (green).

Heads-up

Atlas shared tiers (M0/M2/M5) block the getLog admin command, so live status will read unavailable. Use a dedicated cluster or a self-hosted replica set.
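
To check by hand whether a connection can serve the slow-query log, you can issue getLog yourself from a shell tab. The helper below is a sketch: slowLogStatus is a hypothetical name, and the returned labels simply mirror the live and unavailable wording used on the screen.

```javascript
// Run in mongosh, where `db` is the current database handle.
// Returns 'live' when getLog succeeds and 'unavailable' when the server rejects it.
function slowLogStatus(db) {
  try {
    const res = db.adminCommand({ getLog: 'global' });
    return res && res.ok === 1 ? 'live' : 'unavailable';
  } catch (err) {
    return 'unavailable'; // for example, a tier that blocks getLog
  }
}

// Example (mongosh): print(slowLogStatus(db))
```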

Step 14

Generate an obviously-bad scan

Do

In a shell tab, run a slow collection scan on a field that has no index — sorting by an unindexed field forces a COLLSCAN over the whole collection:

use <database>
db.<collection>.find({}).sort({ <unindexedField>: -1 }).limit(25).toArray()

See

The query takes well over 100 ms and scans the whole collection.

Step 15

See the New slow query card

Do

Switch back to the Query Regressions tab and wait a few seconds (the slow log is polled every ~5 s).

See

A WARN card titled New slow query — <namespace> (code Q7) appears, with plan COLLSCAN and a high docs examined count.

Why

A never-seen query that is already a large COLLSCAN is flagged on first sight — no baseline needed. This is the guard for changes that land during a maintenance window.
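
The first-sight rule can be sketched as a check that needs no baseline. The threshold value and the entry fields below are assumptions for illustration; the manual does not state the real cut-off.

```javascript
// Hypothetical threshold; the real value is not documented here.
const MIN_DOCS_EXAMINED = 10000;

// Returns a WARN alert for a never-seen fingerprint that is already a large COLLSCAN.
function classifyFirstSight(entry, knownFingerprints) {
  if (knownFingerprints.has(entry.fingerprint)) return null; // already has a baseline
  const largeScan = entry.plan === 'COLLSCAN' && entry.docsExamined >= MIN_DOCS_EXAMINED;
  return largeScan ? { severity: 'WARN', code: 'Q7', ns: entry.ns } : null;
}

const seen = new Set(['aaaa1111']);
const entry = { fingerprint: 'bbbb2222', ns: 'shop.orders', plan: 'COLLSCAN', docsExamined: 250000 };
console.log(classifyFirstSight(entry, seen)); // { severity: 'WARN', code: 'Q7', ns: 'shop.orders' }
```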

Step 16

Prove the fingerprint is stable (dedup)

Do

Run the exact same query several more times.

See

No new card appears. The repeats fold into the same alert — same fingerprint = same identity.

Why

This is the whole point of the stable fingerprint: re-running a query never spams duplicate alerts. One query = one identity, like sql_id.
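
The dedup behavior amounts to keying alerts by fingerprint. A minimal sketch, assuming an in-memory Map and a hypothetical recordOccurrence helper:

```javascript
// Hypothetical sketch: one alert per fingerprint; repeats only bump a counter.
// Returns true when a new card would appear, false when the repeat is folded in.
function recordOccurrence(alerts, fingerprint, ns) {
  const existing = alerts.get(fingerprint);
  if (existing) {
    existing.occurrences += 1;
    return false;
  }
  alerts.set(fingerprint, { ns, occurrences: 1 });
  return true;
}

const alertStore = new Map();
console.log(recordOccurrence(alertStore, 'bbbb2222', 'shop.orders')); // true: first sight
console.log(recordOccurrence(alertStore, 'bbbb2222', 'shop.orders')); // false: folded into the same alert
console.log(recordOccurrence(alertStore, 'cccc3333', 'shop.orders')); // true: different shape, new alert
console.log(alertStore.get('bbbb2222').occurrences); // 2
```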

Step 17

Prove a different shape is a new alert

Do

Run a COLLSCAN with a different shape — sort by a different unindexed field:

use <database>
db.<collection>.find({}).sort({ <anotherUnindexedField>: -1 }).limit(25).toArray()

See

A second, distinct card appears with a different fingerprint. Different shape = new identity = new alert.

Stage 5

Live capture on Cosmos DB (RU mode)

Cosmos has no slow-query log to poll, so capture reads the diagnostic logs from Azure Log Analytics. The metric is RU, not milliseconds.

Step 18

Wire Log Analytics in Cloud Credentials

Do

On a Cosmos DB (Mongo API) connection, open the Cosmos Optimizer (Ctrl+Alt+Shift+O) → Configuration → Cloud Credentials, and fill the Azure section: the Cosmos account Resource ID and the Log Analytics workspace ID. Click Save.

See

A green confirmation appears (credentials saved … MetricsBridge promoted).

Heads-up

Log Analytics is a paid tier (ingestion + retention). It is the only path that sees queries from other clients — the free header path only sees what NoSqlStudio itself runs, which is not enough for the undeclared-change scenario.

Step 19

Confirm live status and generate a costly query

Do

Open Query Regressions on the Cosmos connection; confirm the status reads live. Then run a cross-partition or unindexed query that burns RU.

See

After Log Analytics ingests the row (minutes of lag), a card appears with the RU before/after as the headline metric.

Concept

Explain is not available on Cosmos RU mode — the modal says so and points you to the RU metric and the Index Policy pane instead.

Stage 6

Live capture on DocumentDB

DocumentDB does not support getLog, so capture reads the DocDB profiler stream from CloudWatch Logs. It needs the profiler enabled and AWS credentials.

Step 20

Enable the DocDB profiler

Do

In AWS, create a custom cluster parameter group with profiler = enabled and a low profiler_threshold_ms (for example 50), apply it (a reboot is required), and add profiler to the cluster's Log exports to CloudWatch.

See

Slow ops start landing in the CloudWatch log group /aws/docdb/<cluster>/profiler.

Step 21

Set AWS credentials in Cloud Credentials

Do

On the DocumentDB connection, open the Cosmos Optimizer (Ctrl+Alt+Shift+O) → Configuration → Cloud Credentials, and fill the AWS DocumentDB section: Region, DocDB cluster identifier, and optionally an AWS profile (leave empty to use the machine's default credential chain). Scroll down and click Save credentials.

See

The section shows a green CONFIGURED badge, and the aws-documentdb MetricsBridge is registered for this connection.

Concept

This pane is shared between AWS DocumentDB and Azure Cosmos and is per connection — fill the AWS section on the DocumentDB connection, the Azure section on a Cosmos connection. Each connection carries its own region/cluster, so multiple DocDB clusters in different regions are handled independently.

Step 22

Open the screen and generate a slow scan

Do

Open Query Regressions on the DocumentDB connection (confirm status live), then run a slow COLLSCAN in a shell tab:

use <database>
db.<collection>.find({}).sort({ <unindexedField>: -1 }).limit(25).toArray()

See

Within ~15 s a New slow query card appears, with plan COLLSCAN and a derived docs examined count.

Heads-up

DocumentDB does not emit docsExamined directly, so the value is derived from the profiler execStats scan stage — exact for unfiltered scans, a lower bound when a stage filter excludes rows. An efficient index-only count (IXONLYSCAN) is intentionally not flagged, even if it's a little slow — only COLLSCANs examining many docs are.

Stage 7

Persistence and behaviors

Step 23

Status survives a reopen

Do

Acknowledge or snooze an alert, then close and reopen the Query Regressions tab.

See

The acknowledged/snoozed status and your scope selection are still there.

Step 24

The same screen, every engine

See

Open the screen on a MongoDB, a Cosmos, and a DocumentDB connection in turn. The layout, the cards, the five RCA levels, and the AI panel are identical — only the headline metric (ms vs RU) and the availability of Explain differ.

Cleanup (when you're done)

Do

Drop any throwaway test collections you created:

use <database>
db.<collection>.drop()

Do

For DocumentDB, if the profiler was enabled only for this test, you can disable it on the parameter group and delete the CloudWatch log group to stop ingestion costs. For Cosmos, lower or disable Log Analytics diagnostic settings if you don't need ongoing capture.

See

Test data is gone and any paid cloud tiers you enabled for the test are wound down.

Summary of what you validated

Stage  Feature
1      Open Query Regressions and load demo data with no database
2      Read an alert: L1 metrics, L2 plan diff, L3 findings, L4 nearby events, L5 AI, details modal, remediation
3      Scope filter and the out-of-scope critical banner (undeclared-change guard)
4      Live MongoDB capture: New slow query, stable-fingerprint dedup, new shape = new alert
5      Live Cosmos DB capture via Log Analytics, with RU as the metric
6      Live DocumentDB capture via the CloudWatch profiler stream
7      Persistence of status/scope and one consistent screen across all three engines

If every step gave the expected “See”, Query Regressions is validated across MongoDB, Cosmos DB, and DocumentDB. Note the step number of any discrepancy so we can fix it.