Harnessing the KQL let Operator for Testing and Learning
Explore how to use the let operator to craft datasets, why it’s useful, and provide practical examples to get you started.
If you're diving into Kusto Query Language (KQL) to analyze data in Azure Data Explorer, Microsoft Sentinel, or other Kusto-powered platforms, the let operator is a powerful tool to streamline your queries and enhance your learning experience. One of its most practical applications is creating synthetic datasets for testing and experimentation. In this blog post, we’ll explore how to use the let operator to craft datasets, why it’s useful, and provide practical examples to get you started.
Why Use the let Operator for Datasets?
The let operator in KQL allows you to define variables, which can hold scalar values, tabular results, or even functions. When it comes to testing and learning, creating synthetic datasets with let offers several benefits:
Controlled Data: You can design datasets with specific characteristics to test queries without relying on real data, which may be sensitive or incomplete.
Reusability: Store datasets as variables to reuse across multiple queries, saving time and reducing redundancy.
Learning Environment: Practice KQL concepts like joins, aggregations, or time-series analysis with predictable data you create.
Debugging: Test edge cases or specific scenarios to understand how KQL operators behave.
Let’s dive into how to use let to create datasets and see it in action.
Creating a Simple Dataset with let
The let operator can be paired with the datatable operator to create a tabular dataset. The datatable operator allows you to define a table with columns and rows directly in your query.
Here’s a basic example of creating a dataset to simulate user activity logs:
let UserActivity = datatable(UserId: string, Activity: string, Timestamp: datetime)
[
"U001", "Login", datetime(2025-06-25T08:00:00Z),
"U001", "FileAccess", datetime(2025-06-25T08:15:00Z),
"U002", "Login", datetime(2025-06-25T09:00:00Z),
"U002", "Logout", datetime(2025-06-25T09:30:00Z)
];
UserActivity
What’s happening here?
We define a variable UserActivity using let.
The datatable operator creates a table with three columns: UserId (string), Activity (string), and Timestamp (datetime).
We populate the table with four rows of data.
Finally, we output the UserActivity dataset.
This dataset can now be used to practice filtering, grouping, or time-based queries. For example, to count activities per user:
let UserActivity = datatable(UserId: string, Activity: string, Timestamp: datetime)
[
"U001", "Login", datetime(2025-06-25T08:00:00Z),
"U001", "FileAccess", datetime(2025-06-25T08:15:00Z),
"U002", "Login", datetime(2025-06-25T09:00:00Z),
"U002", "Logout", datetime(2025-06-25T09:30:00Z)
];
UserActivity
| summarize ActivityCount = count() by UserId
Output:
Generating Larger Datasets with range
For more complex testing, you might need a larger dataset. KQL’s range operator, combined with let and make-series, can help generate synthetic data programmatically. Here’s an example of creating a dataset with simulated temperature readings:
let TemperatureData = range Timestamp from ago(7d) to now() step 1h
| extend SensorId = "S001",
Temperature = rand() * 20 + 15; // Random temperature between 15°C and 35°C
TemperatureData
| summarize AvgTemperature = avg(Temperature) by SensorId, bin(Timestamp, 1d)
What’s happening here?
The range operator generates a series of timestamps from 7 days ago to now, with a 1-hour step.
We use extend to add a SensorId column and a Temperature column with random values between 15°C and 35°C.
The dataset is stored in the TemperatureData variable.
We then summarize the average temperature per sensor per day.
This approach is ideal for testing time-series analysis, aggregations, or anomaly detection without needing real sensor data.
Using let for Reusable Test Scenarios
The let operator shines when you want to create reusable datasets for multiple test scenarios. For example, suppose you’re learning about joins. You can create two datasets and experiment with different join types:
let Orders = datatable(OrderId: int, CustomerId: string, Amount: real)
[
1, "C001", 100.50,
2, "C002", 200.75,
3, "C001", 150.25
];
let Customers = datatable(CustomerId: string, Name: string)
[
"C001", "Alice",
"C003", "Bob"
];
Orders
| join kind=leftouter Customers on CustomerId
Output:
Here, the Orders and Customers datasets are stored as variables, making it easy to test inner, leftouter, or other join types without redefining the data.
Tips for Using let in Testing and Learning
Start Small: Begin with simple datasets to understand KQL operators, then scale up to more complex scenarios.
Simulate Real-World Data: Mimic the structure of your actual data (e.g., timestamps, IDs, metrics) to make your tests relevant.
Combine with Functions: Use let to define reusable functions that generate or transform datasets dynamically.
Document Your Datasets: Add comments to your queries to describe the dataset’s purpose and structure for future reference.
Experiment Freely: Since synthetic data isn’t real, you can test extreme cases or errors without consequences.
TLDR
The KQL let operator, combined with tools like datatable and range, is a game-changer for creating synthetic datasets for testing and learning. Whether you’re practicing aggregations, joins, or time-series analysis, let enables you to craft controlled, reusable datasets that accelerate your mastery of KQL. By experimenting with these techniques, you’ll gain confidence in writing queries and tackling real-world data challenges.
Try creating your own datasets with let and share your favorite use cases in the comments! Happy querying!