Welcome to the Kusto County Courthouse, where data disputes are settled with the sharp edge of KQL! In this blog, you’re Attorney Query, a master of Kusto Query Language, defending the Network in the sensational case of “The Great Outage.” Use KQL skills to build your case. Let’s dive into the drama!
Grab the dataset for this blog post: https://github.com/rod-trent/KQL_Courtroom_Drama
Case Background: The Great Outage
The Network stands accused of causing a catastrophic server crash on July 1, 2025, leaving thousands of users disconnected. The prosecution claims negligence, but you, Attorney Query, believe the truth lies buried in the logs. Armed with KQL, you’ll interrogate the data, summon witnesses, and present evidence to prove the Network’s innocence.
Today’s hearing focuses on the KQL summarize
operator to uncover patterns in the evidence logs. The courtroom is buzzing, and Judge DataSlam is ready to bang the gavel. “Order in the table!” she roars. Let’s get to work.
The Courtroom Scene
The jury leans forward as you call your first witness: a server log named Loggy McLogface, sworn to “tell the truth, the whole truth, and nothing but the data.” Loggy’s testimony is a sprawling dataset of timestamps, error codes, and server IDs. The prosecution smirks, thinking the volume of data will overwhelm you. But you’ve got KQL.
Your goal: Prove the crash wasn’t the Network’s fault by showing a pattern of external requests overwhelming a specific server. You’ll use summarize
to group the data and reveal the truth.
KQL Skill: Summarizing the Evidence
The summarize
operator in KQL is like a detective’s notepad—it groups and aggregates data to reveal patterns. Here’s how it works:
TableName
| summarize Count=count() by ColumnName
This query counts occurrences of unique values in ColumnName
. You can add more aggregations, like sum()
, avg()
, or max()
, and group by multiple columns for deeper insights.
Let’s apply it to the evidence logs. Imagine a dataset called CourtEvidenceLogs
with columns: Timestamp
, ServerID
, ErrorCode
, and RequestSource
. You suspect external requests (from RequestSource == "External"
) spiked before the crash.
Here’s your query to summarize requests by source and server:
CourtEvidenceLogs
| where Timestamp >= datetime(2025-07-01 00:00:00) and Timestamp < datetime(2025-07-01 23:59:59)
| summarize RequestCount=count() by RequestSource, ServerID
| order by RequestCount desc
You present the results to the jury: Server SRV-042
handled 10,000 external requests in an hour, far above normal. The prosecution objects, but Judge DataSlam snaps, “Overruled! The data speaks!”
Interactive Challenge: Build Your Case
Here’s a sample CourtEvidenceLogs
dataset for you to query:
Challenge 1: Prove the Crash Cause
Write a KQL query to count errors by ServerID
and ErrorCode
for July 1, 2025. Hint: Use summarize
and filter by Timestamp
. Expected output might show SRV-042
with thousands of ERR-500
errors.
Challenge 2: Dig Deeper
Modify the query to filter for RequestSource == "External"
and summarize by hour using bin(Timestamp, 1h)
. Can you pinpoint the hour with the most external requests?
Try these queries in your KQL environment (e.g., Azure Data Explorer). Share your results in the comments!
Jury Verdict: Poll
What do the query results suggest about the crash?
A) External requests overwhelmed SRV-042
, causing the crash.
B) Internal errors on SRV-001
triggered the outage.
C) The data is inconclusive; more evidence needed.
Courtroom Wrap-Up
With summarize
, you’ve turned a chaotic dataset into compelling evidence. The jury is intrigued, and even Loggy McLogface winks from the witness stand. But the trial isn’t over. In the next post, we’ll use the join
operator to cross-examine user activity logs and server metrics. Will you uncover a hidden culprit?
“Court adjourned!” Judge DataSlam declares, slamming her gavel. Stay tuned for more KQL courtroom drama!
Have a KQL query you want to see in court? Drop it in the comments or share your challenge results. See you at the next hearing!