Time-Series Analysis with KQL: Unlocking Trends and Forecasting in Azure
Chasing Trends and Predicting Futures
Time-series analysis is a powerful technique for understanding trends, detecting anomalies, and forecasting future values based on historical data. In Azure, Kusto Query Language (KQL) provides robust tools like make-series and series_fit to perform time-series analysis efficiently. This blog post explores KQL’s time-series capabilities, demonstrating how to leverage these functions for trend analysis and forecasting in Azure Data Explorer (ADX) or other KQL-supported services like Azure Monitor.
What is Time-Series Analysis?
Time-series data consists of observations collected at regular time intervals, such as CPU usage, website traffic, or sensor readings. Time-series analysis helps uncover patterns (e.g., trends, seasonality) and predict future values. KQL’s time-series functions simplify these tasks by transforming raw data into aggregated series and applying mathematical models for analysis.
Key KQL Time-Series Functions
KQL offers several functions tailored for time-series analysis. The most important ones include:
make-series: Aggregates data into a time-series format, creating a sequence of values over fixed time intervals.
series_fit_line: Fits a linear regression line to a time-series to identify trends.
series_fit_2lines: Detects changes in trends by fitting two linear regression lines.
series_periods_detect: Identifies seasonality or periodic patterns in the data.
series_outliers: Detects anomalies in a time-series.
series_decompose: Separates a time-series into trend, seasonal, and residual components.
In this post, we’ll focus on make-series and series_fit functions to perform trend analysis and forecasting.
Setting Up the Environment
To follow along, you’ll need access to an Azure Data Explorer cluster or a KQL-compatible service like Azure Monitor. For this example, we’ll use a sample dataset containing server performance metrics (e.g., CPU usage) stored in a table called ServerMetrics. The table has columns for Timestamp, ServerName, and CPUUsage.
Step 1: Creating a Time-Series with make-series
The make-series operator aggregates data into a time-series by grouping values over fixed time intervals. Let’s create a time-series of average CPU usage per hour for a specific server.
ServerMetrics
| where ServerName == "Server01"
| make-series AvgCPU = avg(CPUUsage) default=0 on Timestamp from ago(7d) to now() step 1h
Explanation:
make-series: Creates a time-series.
AvgCPU: The name of the series, calculated as the average of CPUUsage.
default=0: Fills missing values with 0.
on Timestamp: Specifies the time column.
from ago(7d) to now(): Defines the time range (last 7 days).
step 1h: Sets the time interval to 1 hour.
Output:
The query returns a single row with two columns:
Timestamp: An array of timestamps (e.g., [2025-05-26T00:00:00Z, 2025-05-26T01:00:00Z, ...]).
AvgCPU: An array of average CPU usage values (e.g., [45.2, 50.1, ...]).
This time-series is ready for further analysis.
Step 2: Trend Analysis with series_fit_line
To identify the overall trend in CPU usage, we can use series_fit_line, which fits a linear regression line to the time-series.
ServerMetrics
| where ServerName == "Server01"
| make-series AvgCPU = avg(CPUUsage) default=0 on Timestamp from ago(7d) to now() step 1h
| extend Trend = series_fit_line(AvgCPU)
Explanation:
series_fit_line(AvgCPU): Fits a linear regression line to the AvgCPU series.
The output includes the slope, intercept, and fitted values, which describe the trend.
Output:
The Trend column contains a dynamic object with properties like:
slope: Indicates the direction and steepness of the trend (positive = increasing, negative = decreasing).
intercept: The y-intercept of the regression line.
line_fit: An array of fitted values for visualization.
To visualize the trend, you can plot the AvgCPU and Trend.line_fit arrays in ADX’s built-in charting tools.
Step 3: Detecting Trend Changes with series_fit_2lines
If you suspect a change in the trend (e.g., a sudden increase in CPU usage), use series_fit_2lines to fit two linear regression lines and identify the breakpoint.
ServerMetrics
| where ServerName == "Server01"
| make-series AvgCPU = avg(CPUUsage) default=0 on Timestamp from ago(7d) to now() step 1h
| extend TwoTrends = series_fit_2lines(AvgCPU)
Explanation:
series_fit_2lines(AvgCPU): Fits two regression lines and detects the point where the trend changes.
Output:
The TwoTrends column includes:
breakpoint: The index in the series where the trend changes.
line1_fit and line2_fit: Arrays of fitted values for the two lines.
This is useful for pinpointing events like system upgrades or failures that alter resource usage patterns.
Step 4: Forecasting with Linear Regression
While KQL doesn’t have a dedicated forecasting function, you can extend the linear regression from series_fit_line to predict future values. Here’s an example:
ServerMetrics
| where ServerName == "Server01"
| make-series AvgCPU = avg(CPUUsage) default=0 on Timestamp from ago(7d) to now() step 1h
| extend Trend = series_fit_line(AvgCPU)
| project Timestamp, AvgCPU, TrendSlope = Trend.slope, TrendIntercept = Trend.intercept
| extend Forecast = TrendSlope * array_length(Timestamp) + TrendIntercept
Explanation:
Trend.slope and Trend.intercept: Extracted from series_fit_line.
array_length(Timestamp): Represents the next time step for forecasting.
Forecast: Calculates the predicted value for the next hour.
This simple approach assumes the trend continues linearly. For more complex forecasting, consider combining KQL with Azure Machine Learning.
Step 5: Visualizing Results
Azure Data Explorer’s visualization pane lets you plot time-series data. To visualize the actual CPU usage and the fitted trend:
ServerMetrics
| where ServerName == "Server01"
| make-series AvgCPU = avg(CPUUsage) default=0 on Timestamp from ago(7d) to now() step 1h
| extend Trend = series_fit_line(AvgCPU)
| project Timestamp, AvgCPU, TrendLine = Trend.line_fit
| render timechart
This query generates a line chart with AvgCPU and TrendLine, making it easy to spot trends.
Practical Applications
KQL’s time-series capabilities are invaluable for:
Monitoring: Track resource usage (e.g., CPU, memory) and detect anomalies.
Capacity Planning: Forecast future resource needs based on historical trends.
Anomaly Detection: Identify unusual spikes or drops in metrics.
Business Insights: Analyze time-based KPIs like website traffic or sales.
Tips for Effective Time-Series Analysis
Choose the Right Time Interval: Use a step size in make-series that matches your analysis needs (e.g., 1m for real-time monitoring, 1d for long-term trends).
Handle Missing Data: Use the default parameter in make-series to fill gaps appropriately.
Combine with Other Functions: Use series_outliers or series_periods_detect for deeper insights.
Scale with ADX: For large datasets, leverage ADX’s distributed architecture to process millions of records quickly.
TLDR
KQL’s time-series functions, such as make-series and series_fit, provide a powerful and accessible way to perform trend analysis and forecasting in Azure. By transforming raw data into time-series and applying regression models, you can uncover insights and make data-driven decisions. Whether you’re monitoring infrastructure, analyzing business metrics, or predicting future trends, KQL and Azure Data Explorer are your go-to tools for time-series analysis.