KQL Time Machine: Querying Historical Events to Learn Data Analysis
Learn how to use KQL operators like summarize, timechart, and pivot to reveal trends and patterns.
Welcome to the KQL Time Machine, where we travel through history using Kusto Query Language (KQL) to uncover insights from ancient datasets. In this first post, we’ll dive into the Silk Road trade records from the 8th century to analyze spice trade volumes. By exploring this dataset, you’ll learn how to use KQL operators like summarize
, timechart
, and pivot
to reveal trends and patterns, making data analysis both fun and educational.
Grab the dataset for this activity: https://github.com/rod-trent/SilkRoadTrades
The Dataset: Silk Road Spice Trade Records
Imagine a bustling caravan winding through the deserts of Central Asia, carrying saffron, cinnamon, and pepper. Our dataset, inspired by historical trade logs, contains records of spice shipments between 700–800 CE. Each entry includes:
TradeDate: When the shipment was recorded.
SpiceType: The type of spice (e.g., Saffron, Cinnamon, Pepper).
Volume: Quantity traded (in kilograms).
Origin: Where the spice came from (e.g., Persia, India, China).
Destination: Where it was headed (e.g., Byzantium, Samarkand).
This data is stored in a Kusto table called SilkRoadTrades
. Our goal is to analyze trade patterns and learn KQL along the way.
Grab the dataset and learn how to utilize it: https://github.com/rod-trent/SilkRoadTrades
Step 1: Exploring the Data with summarize
To get a sense of the dataset, let’s find the total volume of each spice traded. The summarize
operator in KQL is perfect for aggregating data.
SilkRoadTrades
| summarize TotalVolume = sum(Volume) by SpiceType
| order by TotalVolume desc
This query groups the data by SpiceType
, calculates the sum of Volume
for each spice, and sorts the results in descending order. The output might look like:
Pepper was the king of the Silk Road! This simple query introduces summarize
, a key operator for grouping and aggregating data.
Step 2: Visualizing Trends with timechart
Trade wasn’t constant—wars, weather, and demand affected volumes over time. Let’s visualize how saffron trade volumes changed year by year using the timechart
operator, which creates time-series visualizations.
SilkRoadTrades
| where SpiceType == "Saffron"
| summarize TotalVolume = sum(Volume) by bin(TradeDate, 1y)
| render timechart
Here, we:
Filter for
Saffron
usingwhere
.Group by year (
bin(TradeDate, 1y)
rounds dates to the nearest year).Sum the
Volume
for each year.Use
render timechart
to generate a line chart.
The chart might show a spike in saffron trade around 750 CE, perhaps due to a new trade agreement with Byzantium. The timechart
operator is ideal for spotting temporal trends.
Step 3: Comparing Origins with pivot
Which regions dominated the saffron trade? The pivot
operator reshapes data to compare values across categories. Let’s see how saffron volumes varied by origin and destination.
SilkRoadTrades
| where SpiceType == "Saffron"
| summarize TotalVolume = sum(Volume) by Origin, Destination
| pivot (sum(TotalVolume), Destination) by Origin
This query:
Filters for
Saffron
.Summarizes
Volume
byOrigin
andDestination
.Pivots the data to create a table where rows are
Origin
, columns areDestination
, and values are the summedTotalVolume
.
The result might look like:
Persia was a major saffron supplier to Byzantium! The pivot
operator makes it easy to compare relationships in the data.
Why KQL for Historical Data?
KQL’s strength lies in its simplicity and power for querying large datasets. By applying it to historical records, we can:
Uncover trends (e.g., spice trade booms).
Compare categories (e.g., origins vs. destinations).
Visualize patterns over time.
Plus, it’s fun to imagine merchants haggling over saffron while we query their records centuries later!
Try It Yourself
To follow along, you can simulate the SilkRoadTrades
table in Azure Data Explorer or a Kusto emulator. Create a table with columns for TradeDate
(datetime), SpiceType
(string), Volume
(real), Origin
(string), and Destination
(string). Populate it with sample data, then run the queries above.
For details on how to apply the dataset, see: https://github.com/rod-trent/SilkRoadTrades
Next time, we’ll sail into the 19th century with ship logs to analyze voyage durations using KQL’s join
and percentiles
operators. Until then, keep querying the past to uncover the future!