The KQL externaldata operator is a powerful tool for accessing data that lives outside the tenant where the query runs. This lets organizations bring together data from both local and external sources.
Want to dig deeper into the externaldata operator? Get your copy of The Definitive Guide to KQL from Microsoft Press
I’ve written a few KQL-based plugins recently that make use of the externaldata operator to source additional intelligence for Copilot for Security. One example is the Copilot for Security Plugin: MITRE ATT&CK Reference. This plugin queries a .csv file that resides in my GitHub repository to pull in data such as:
Tactic name
id
url
platforms
killchainphases
description
datasources
detection
The result is the ability to get information about MITRE ATT&CK tactics in relation to potential threats within the environment being serviced by Copilot for Security. This is valuable.
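For readers who haven’t used the operator before, here is a minimal sketch of what a query like this can look like. It is not the plugin’s actual query: the URL is a placeholder, the column names are my own labels for the fields listed above, and it assumes the .csv has a header row and stores the columns in that same order.

```kql
// Sketch only. The URL below is a placeholder, not the real plugin source.
// CSV columns are mapped by position, so the schema must match the file's column order.
let MitreReference = externaldata (
    Name: string,            // the "Tactic name" field
    id: string,
    url: string,
    platforms: string,
    killchainphases: string,
    description: string,
    datasources: string,
    detection: string
)
[
    h@"https://raw.githubusercontent.com/<owner>/<repo>/main/<file>.csv"
]
with (format="csv", ignoreFirstRecord=true);   // skip the header row
MitreReference
| where id == "T1012"
| project Name, id, url, description
```

Whatever the file at that URL contains at query time is what comes back, which is exactly why the next question matters.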
But there’s a big caveat: Can you trust the external data source?
Who has access to the external data source? Who can edit it? Much of the data may look valid, but is all of it? Could something have been injected to skew results, generate fake, harmful, or abusive responses, or do something even more nefarious?
Here’s an example of what I’m talking about and why you need to be super careful about the external data sources you access for intelligence.
I created a fake MITRE ATT&CK .csv file that looks good and has good data, except that I inserted a couple of rows of fake data. Then I updated the original MITRE ATT&CK plugin to point to the new, fake data file. Now (as shown in the image) when I engage the MITRE_Attack plugin to get information about a fake T1012.666, you can see that the response is not quite what you’d expect.
And the URL? Well, the actual link is obscured, so it doesn’t necessarily lead where the link text suggests.
The recommendation is that, if at all possible, you should control and maintain your own data sources instead of using externaldata. Or use a secure API if the data source offers it.
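If externaldata has to stay for a while, one partial safeguard is to sanity-check the rows before anything downstream consumes them. The sketch below uses a small inline datatable with made-up rows in place of the real file, and the two checks (an ID pattern and an expected URL prefix) are my own assumptions about what "valid" looks like. It reduces risk but does not replace controlling the source.

```kql
// Sketch only: made-up sample rows stand in for the external file.
let Rows = datatable (Name: string, id: string, url: string)
[
    "Query Registry", "T1012",     "https://attack.mitre.org/techniques/T1012/",
    "Made-up entry",  "T1012.666", "https://example.com/not-mitre"
];
Rows
// Assumed ID shape: "T" + 4 digits, optionally followed by "." + 3 digits.
| extend IdLooksValid = id matches regex @"^T\d{4}(\.\d{3})?$"
// Assumed rule: every reference link should live on attack.mitre.org.
| extend UrlLooksValid = url startswith "https://attack.mitre.org/"
// Keep only the rows that fail at least one check.
| where not(IdLooksValid and UrlLooksValid)
```

Only the second row is returned here. Notice that a made-up ID like T1012.666 passes the pattern check and is caught only by the URL rule, so a well-crafted fake row can still slip past shape checks like these.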
As one example, I recently migrated the data source for my The Definitive Guide to KQL from Microsoft Press plugin from my GitHub repo to a free Azure Data Explorer cluster.
For temporary purposes, this is a much better way to vet, control, and maintain the data source. If you need the data source longer term and still need to use the externaldata operator, consider moving the external data to an Azure storage account in the same tenant.
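Pointing externaldata at a storage account you own can be sketched like this. The storage account, container, file name, and SAS token are all placeholders, and the schema is the same assumed one based on the fields listed earlier.

```kql
// Sketch only. Every <...> value is a placeholder for your own storage details.
// The h@ prefix marks the string as obfuscated so the SAS token is kept out of query telemetry.
externaldata (
    Name: string,
    id: string,
    url: string,
    platforms: string,
    killchainphases: string,
    description: string,
    datasources: string,
    detection: string
)
[
    h@"https://<storageaccount>.blob.core.windows.net/<container>/<file>.csv?<SAS-token>"
]
with (format="csv", ignoreFirstRecord=true)
| take 10
```

The query looks almost identical to one that reads from a public repo. What changes is who can write to the file, which is now governed by your own tenant’s access controls.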