Must Learn KQL Part 7: Schema Talk

Chapter 7

May 31, 2023

This post is part of an ongoing series to educate about the simplicity and power of the Kusto Query Language (KQL). If you’d like the 90-second post-commercial recap that seems to be a standard part of every TV show these days…

The full series index (including code and queries) is located here:

https://aka.ms/MustLearnKQL

The book version (pdf) of this series is located here:

https://github.com/rod-trent/MustLearnKQL/tree/main/Book_Version

The book will be updated when each new part in this series is released.

…

Before jumping directly into talking through some common KQL operators and providing you example queries for hands-on learning (see the TOC) in the next part of this series, there’s some lingering discussion from the last post around the UI, but also how this relates to table schema. I wanted to keep this information separate from the rest and in its own area because it will help you determine where things exist in the tables and how to better pinpoint the data. You saw in Part 4 that it’s easy to find anything in the data. But as you start getting closer and closer to taking the knowledge to develop your very own Analytics Rules for Microsoft Sentinel, you want to take the learning from Part 5 and go just a tad bit further. This where an understanding of the schema becomes important.

The table schema is important. As with any data storage function or service, data is collected and stored – most times appropriately – in organized columns. I noted in Part 5 about the getschema operator for KQL that produces the list of all columns and their types.

In case you missed it or you forgot…

Example:

OfficeActivity
| getschema

Sample results:

As you can see in the results, getschema shows a lot of great information. It shows the actual column names that are important to know for what types of information can be found, but also note the DataType and ColumnType results. These tell us how to query the data – or, rather, the approach we need to take (the type of KQL operator) to query, extract, and manipulate the data.

Using just the information displayed in the screenshot example, I can see that I can use Part 5’s knowledge to show regular Exchange users that sent emails. The following example shows that.

OfficeActivity
| where UserType == "Regular"
| where OfficeWorkload == "Exchange" 
| where Operation == "Send"
| project UserId, UserDomain

Query example is located at: https://cda.ms/3pf

Note that not everything is as neatly stored and defined as the OfficeActivity table in the screenshot. I said earlier that most times data is stored neatly and orderly. There are exceptions and you need to be aware of these. In these cases, you’ll need to utilize some parsing functions of KQL to extract the data yourself. But let’s not focus on that here in this post. I promise, I’ll dig into that later in the series (see the TOC) just before creating your first Analytics Rule.

But fortunately, most times data is store neatly and orderly. This is where the Data Connectors come into play in Microsoft Sentinel. The parsing is done for you when an actual Data Connector is in play. The “parser” is part of the Data Connector or the Sentinel Solution. For those situations where an official Data Connector does not exist, you may be called on to create your own parser. Again, I’ll cover this later in this series, but I do want to call this out, as its important. So, for your efforts as you begin building your KQL knowledge, stick with the tables that are part of a Data Connector, otherwise you’ll bump off into unknown territory that can get miry fast.

OK…with this knowledge firmly in-hand, let’s jump back to the UI to talk about some areas in the console that help shortcut some of this activity.

Column Types

As shown in the screenshot example, there are various KQL column types. Again, knowing these date column types will alter your approach for querying specific columns. I don’t want to spend a lot of time here on this as to not start the varying levels of confusion. But I’ll include this here so I can refer back to it later on in the series.

The KQL column types are…

Basic
- int, long (numerical types)
- bool: true, false (logical operators)
- string: “example”, ‘example’
Time
- datetime: datetime(2016-11-20 22:30:15.4), now(), ago(4d)
- timespan: 2d, 20m, time(1.13:20:05.10), 100ms
Complex
- dynamic: JSON format

For anyone that’s worked with any query language or data format before, these are not uncommon or new. As I talked about in Part 2, KQL – the query language – was not designed to be difficult nor revolutionary. The revolutionary part is how it utilizes the power of the cloud (Azure) to accomplish sifting through mass seas of data quickly and efficiently. No, KQL – the query language – takes the best pieces of a lot of existing query languages. For example, anyone that’s worked with SQL Server, will have an easy time with KQL.

Back to the UI

The UI has an area that aids in organizing and customizing the table/schema view, but it also has capabilities to enable easier and quicker access to KQL query creation. In this post, I’m not going to focus heavily on areas 2-4. You should be able to figure out how to click through and use most of those on your own. And, while I’ll provide a quick overview of all the areas just now, I’ll circle back and focus on the Tables area. As you’re getting started learning KQL, this is the important area that will save you a lot of time learning to create your own queries.

UI Overview:

This is the Tables list. This is where you can find all the available tables for which you can create queries against. We’ll focus on this area just below.
This is the Queries list. This tab area contains a slew of pre-made KQL queries that you can spend hours and days executing, reverse engineering, and all other matters of query learning importance. These are separated by category types like Applications, Audit, Azure Monitor, Azure Resources, Containers, Databases, Desktop Analytics, IT & Management Tools, Network, Security, Virtual Machines, Windows Virtual Desktop, Workloads, and Others.
This is the Functions list. A Function is like a stored procedure in SQL, except in our case the query code is in KQL. This is a hugely useful component of KQL. I’ll cover this in-depth later in the series (see the TOC). Did you know that the Watchlist feature of Microsoft Sentinel relies heavily on a Function? If you access the Function tab in the UI, you’ll see the _GetWatchlist function.
The Filter tab. The Filter tab is absolutely awesome and delivers another shortcut method of developing your KQL queries. After running a query the Filter tab will contain a list of empty data columns that you can select to filter out of the query results. Once a column is selected and applied, you can see in the screenshot that the query is updated automatically with the where operator to use as the filter mechanism and then the query is rerun. The isempty() function is used, which, in itself is a powerful tool that we’ll talk about later in this series.

Schema Area Focus

I noted in Part 6 that everything that can be done in the UI we should eventually accomplish in the KQL query itself. That’s still the case here, but the UI provides some neat shortcuts that shouldn’t be overlooked.

First off, every Table in the list can be expanded to show the schema underneath. So, instead of always resorting to the getschema operator, you can expand the Table while you’re creating your queries to have a quick-glance reference list of what you can query against.
Secondly, if you hover your mouse cursor over a Table name, a new pop-up window displays that provides even more query shortcut value. Also of importance, notice that the pop-up will display the description of the table.
If you click the Use in editor option, the Table name will automatically be placed in the query window so you can start querying against the table.
The Useful links option links directly to the Azure Monitor Logs table reference that I provided as a resource in Part 1.
And, finally, the most excellent, super-cool shortcut is the capability to click and look at sample results from the table itself. Clicking on this will produce its own window similar to the following:

Incidentally, this most excellent, super-cool shortcut is actually a KQL query itself that uses the take operator that I’ll cover later in the series. In fact, it’s a take 10 similar to the following:

OfficeActivity
| take 10

This tells the query engine to display a random set of 10 records as a data sample. Because its random, every time it runs different data will display.

OK, now that we have all the concepts and UI functionality finally out of the way, it’s time to start building queries using the most common KQL operators. From this point on in the series, I’ll supply a KQL example based on an operator you can expect to use and see constantly in Microsoft Sentinel and our other security platform services. You should make it your intent to make use of the public KQL Playground I supplied in the Part 1 resources, or your own environment, to get hands-on with each operator I talk about.

You’ll see as I go along, I’ll take a simple query and start to build on it with each new part in this series. We’ll begin simple and end up with a pretty interesting, but more complex query than what we started with.

Stay tuned…

Must Learn KQL Part 7: Schema Talk

Chapter 7

Column Types

Back to the UI

Schema Area Focus

Discussion about this post