This post is part of an ongoing series to educate about the simplicity and power of the Kusto Query Language (KQL). If you’d like the 90-second post-commercial recap that seems to be a standard part of every TV show these days…
The full series index (including code and queries) is located here:
The book version (pdf) of this series is located here:
https://github.com/rod-trent/MustLearnKQL/tree/main/Book_Version
The book will be updated when each new part in this series is released.
…
The next couple parts/chapters in the Must Learn KQL series are shorter ones as we complete the series-within-the-series (parts 13-16) on manipulating the query results So, in this mini-series far we’ve talked about how to create custom columns with the Extend operator and shown how to display (or not display) specific data in the results using the Project operator. But what if you need to get even more granular with the data that is presented? This is where the Distinct operator comes in.
As you can imagine by the operator’s name, the Distinct operator delivers results based on a distinct combination of the columns you provide.
For example, if I have 100’s of computers stored in the table, each with their own combination of activity and data, but I only want to know each computer name, I would supply a KQL query similar to the following:
SecurityEvent //the table
| distinct Computer //show distinct computer names
Feel free to use the KQL Playground (https://aka.ms/LADemo) we’ve used throughout this series to try this query out yourself.
Look at the results differences between using just the Project operator against the Distinct operator. Notice the differences in simplicity of what is displayed along with the volume of what is displayed.
Distinct can also be used for more than one data column as is shown above and is generally intended (as I mentioned above) to produce a combination of the columns you provide. The beauty of the Distinct operator is that it allows you to get extremely precise in what is returned, which is hugely important when using KQL to perform security Hunting operations – which I’ll cover after we’ve achieved our goal in this series of creating our first Analytics Rule for Microsoft Sentinel (watch the TOC for details).
Before handing you off to a series of hands-on opportunity examples, let’s take some of our accumulated knowledge and apply it to this same scenario. What if we wanted to not only show the distinct computer names (as in the example above), but also needed to show how much activity each computer has been responsible for over the last 24 hours?
Don’t cheat! Think about it for a second. What operators would you combine with Distinct? How would you get a summarized count of distinct computers? <== there’s a hint in there
Remember, story problems and developing storylines are the basis for all security. That doesn’t just mean things like who did it and why, but also includes how to expose the data to show the story. That’s important.
OK…here’s how I would supply the result from the ask:
SecurityEvent //the table
| summarize count() by Computer //getting the count of computers
| distinct Computer, count_ //showing the distinct computer names, combined with the count of how many times each computer has been reported with some sort of activity
In the example, notice that I’ve used summarize to get a count of all Computers first, then combined the Computer data column with the count in the last line using Distinct.
Your results using the KQL Playground (https://aka.ms/LADemo) will be something similar to the following…
All YOU, baby!
The following series of KQL queries gives you a chance to get some hands-on experience with more of the Distinct operator and a chance to shine your growing expertise. While the following examples are not specifically security related, I believe you will find them interesting because KQL transcends one product, one workload, or one area of focus. It’s good to reiterate this. KQL is important to anyone working in Azure and will only become more critical as time progresses. You need to elbow nudge your colleagues over this. Your KQL learning in this Must Learn KQL series could lead to a whole new career path.
Take the following examples and run them in the KQL Playground (https://aka.ms/LADemo) and then prepare for the last part/chapter of this series-within-a-series on manipulating results.
P.S. If you’re just being introduced to this series, I entreat you to start at the beginning before digging into the examples – unless of course, you’re already a KQL master.
Perf //the table
| distinct Computer //find all the computers that are reporting performance data to Log Analytics
Perf //the table
| distinct ObjectName //Using the same performance example, finding all the object types that we have performance data for
Perf //the table
| distinct ObjectName, CounterName //We want to see all the metrics for each object, in this case CounterName
UpdateSummary //the table
| distinct Computer, WindowsUpdateSetting //We can get our Windows Update Settings for all servers we’re managing with the Update Management solution
UpdateSummary //the table
| distinct Computer, WindowsUpdateSetting, OsVersion, OldestMissingSecurityUpdateInDays //However, we’re not limited to just one or two fields. We can add more, in this example we’ll get our servers, their update setting, OS version and the oldest update they need in days
Update //the table
| where UpdateState == "Needed" //retrieve only those systems where updates are needed
| distinct Computer, KBID, Title //Finally, we can quickly build a report of systems needing updates, the KB number and title of the update