Must Learn KQL Part 3: Workflow

Chapter 3

May 31, 2023

This post is part of an ongoing series to educate about the simplicity and power of the Kusto Query Language (KQL). If you’d like the 90-second post-commercial recap that seems to be a standard part of every TV show these days…

The full series index (including code and queries) is located here:

https://aka.ms/MustLearnKQL

The book version (pdf) of this series is located here:

https://github.com/rod-trent/MustLearnKQL/tree/main/Book_Version

The book will be updated when each new part in this series is released.

…

As I noted in Part 2 of this Must Learn KQL series…

Even though the structure can deviate, understanding a common workflow of a KQL query can have powerful results and help you develop the logic needed to build your own workflows when its time to create your own queries.
Rod Trent, November 18, 2021

The workflow (some folks call it logic, others call it anatomy, even others call it something else) is a big step into wrapping your mind around how to produce a KQL query. Just like a developer, assigning uniform, repeatable steps ensures you’re not missing something and that your query results will produce the information you are looking to capture.

I tell customers all the time that its not necessary to be a pro at creating KQL queries. Its OK not to be a pro on day 1 and still be able to use tools like Microsoft Sentinel to monitor security for the environment. As long as you understand the workflow of the query and can comprehend it line-by-line, you’ll be fine. Because ultimately, the query is unimportant. Seriously. What’s important for our efforts as security folks is the results of the query. The results contain the critical information we need to understand if a threat exists and then – if it does exist – how that threat occurred from compromise to impact.

Now, those that go on to develop their own queries and own Sentinel Analytics Rules after becoming a KQL pro will be much more capable. And, that should be your goal, too. BUT, don’t get hung up on that. Again, its about the results.

We’ve made it so crazily easy to share KQL queries that its quite possible you may never have to create your own KQL query (aside: I highly doubt it, but COULD BE possible).

In a future post in this series, I’ll go over the actual interface you use to write and run the KQL queries in-depth, but suffice to say that almost every service in Azure has a Logs blade (option in the Azure portal interface/menu) to accommodate querying that service’s logs. This area provides for saving your queries, but also to share your queries.

Because of this built-in capability, many of our customers regularly share their creations with each other, other colleagues, to their own blogs and GitHub repos, and even to the official Microsoft Sentinel GitHub repository (https://aka.ms/ASGitHub). In Part 1 of this series, I supplied links to these and more. So, to prove my point…yes, it’s absolutely possible you might not have to write your own KQL query for a long time.

So, because of that, it becomes even more critical that you at least understand the workflow. Again, if you can read a query line-by-line and determine that the results will produce what you are looking for, you’re golden. If, through your newfound understanding, the query isn’t capable of producing your requirements, you can modify it by line instead of a wholesale adaption. This should be your first KQL goal: read queries.

Through this series, I’ll provide queries for you to use and get hands-on experience because I believe in learning by doing. We’ll be using the links in the Practice Environments section in Part 1 for the hands-on. But, focus initially more on the structure and logical workflow.

And, with that…

A Common KQL Workflow

To get started on the journey to learning KQL, let’s look at the standard workflow of a common search query. Not the search operator (I’ll talk about in the next post), but the search query. This is the query structure we use to search, locate information, and produce results.

The following represents the common workflow of a KQL search query.

P.S. I’ve enabled image linking in this post so you can click or tap to open the image in a larger view. So, you can open the image in a new window or new tab to better follow along.

Let’s break this query down by the steps.

The first step is to identify the table we want to query against. This table will contain the information that we’re looking for. In our example here, we’re querying the SecurityEvent table. The SecurityEvent table contains security events collected from windows machines by Microsoft Defender for Cloud or Microsoft Sentinel. For a full list of all services tables, see the Azure Monitor Logs table reference (also available in Part 1).
The pipe (|) character (the shifted key above the Enter key on most keyboards) is used to separate commands issued to the query engine. You can see here that each command is on its own line. It doesn’t have to be this way. A KQL query can actually be all one single line. For our efforts, and as a recommendation, I prefer each command on its own line. For me, it’s just neater and more organized which makes it easier to troubleshoot when a query fails or when I need to adjust the query to produce different results.
Next, we want to filter the data in some way. If I simply entered the table and ran that as its own, single query, it will run just fine. Doing that returns all rows and columns (up to a limit – which I believe is now 50,000 rows) of the data stored in the table. But our goal is getting exact data back. As an analyst looking for threats, we don’t want to have to sift through 50,000 rows of data. No, we want to look for specific things. The Where operator is one of the best ways to accomplish this. You can see here in the example that I’m filtering first by when the occurrence happened (TimeGenerated) and then (remember the pipe character – another line, another command) by a common Windows Event ID (4624 – successful login).
The next step in our workflow is to provide data aggregation. What do we want to do with this filtered data? In our case in the example, we want to create a count of the Accounts (usernames) that produced a successful login (EventID 4624) in the last 1 hour (TimeGenerated).
Next let’s tell the query engine how we want to order the results. Using the Order operator, I’m telling the query engine that when the results are displayed, I want it shown in alphabetical order by the Account column. The ‘asc’ in the query in the Order Data step is what produces this ordering. If we wanted descending order, we’d use ‘desc’. Don’t worry, we’ll dig deeper into each of these operators as we go along in the series.
Generally, the last thing that I’ll do with this search query is tell the query engine exactly what data I want displayed. The Project operator is a powerful command. We’ll dig deeper into this operator later in this series, but for our step here, I’m telling the query engine that after all my filtering, data aggregation, and ordering, I only want to display two columns in my results: Account and SuccessfulLogins

So, let’s recap what this query accomplished…

It searched our stored security events in the SecurityEvent table for all Accounts that had a successful login in the last hour and chose to display only the Account and number of successful logins per Account in alphabetical order.

7. Our search query output is exactly that:

See that? The Account column is in alphabetical order ascending and the SuccessfulLogons column shows how many times each Account successfully logged in.

If you need to, jump back through each step above until you get a good understanding of the workflow. Again, this is very common, and you’ll see this structure many times working with Microsoft Sentinel and Defender products. Remember, it’s about the results. If you can look at this example and get a good feel that you understand how the results were accomplished, line-by-line, you’re on your way.

I invite you, though, to take this example and copy/paste it into a Logs environment to test. You can have this query to play with it in your own Microsoft Sentinel environment or using the KQL Playground I provided as a resource in Part 1.

SecurityEvent 
| where TimeGenerated  > ago (1h)
| where EventID == 4624
| summarize count() by Account
| order by Account asc 
| project Account , SuccessfulLogons = count_

This query is also available from the GitHub repository for this blog series: https://aka.ms/MustLearnKQL

I’d like to share one extra tidbit with you that you might find helpful as you start testing this KQL query example in your own, or our, environment.

Every language (scripting, coding, querying) has the capability to add comments or comment-out code through special characters. When the query, scripting, or development engine locates these characters, it just skips them. KQL has this same type of character. The character for KQL is the double forwardslash, or //

When you start testing this post’s KQL query example, comment-out a line or two (put the double forwardslash at the beginning of the line) and rerun the query just to see how eliminating a single line can alter the results. You’ll find that this is an important technique as you start developing your own KQL queries. I’ll talk about this more later, too.

In the next post (Part 4) I’ll talk through another, yet just as powerful, way to search for information using KQL that is a top pocket tool for Threat Hunters.

And, then I’ll come back for Part 5 and show how to tie together both search methods to create the full operation of hunting to Analytics Rule. But, don’t worry, that’s not the end. I have no clue how many parts this series be. A lot of it depends on you.

Stay tuned…

Like this series so far? Share it with someone.

Need clarification on something? Hit me up on Twitter: @rodtrent

Must Learn KQL Part 3: Workflow

Chapter 3

A Common KQL Workflow

Discussion about this post