top of page
Writer's pictureGeordie M. Hines

Kusto Query Language (KQL) [Knowledge Base]

Article below is intended to be a repository for KQL knowledge sharing and documentation of this query language used in Azure. Please use this Log Analytics demo environment to test any of the concepts explained.


Definition

Kusto Query Language is a powerful tool to explore your data and discover patterns, identify anomalies and outliers, create statistical modeling, and more. The query uses schema entities that are organized in a hierarchy similar to SQL's: databases, tables, and columns.


A Kusto query is a read-only request to process data and return results. The request is stated in plain text, using a data-flow model that is easy to read, author, and automate. Kusto queries are made of one or more query statements


When working in Sentinel and Log analytics is important to start by identifying the sections or areas you need to know in order to use while working with logs, investigating or when testing queries. See the image below to have a better understanding of those sections:

The most common kind of query statement is a tabular expression statement, which means both its input and output consist of tables or tabular datasets. Tabular statements contain zero or more operators, each of which starts with a tabular input and returns a tabular output. Operators are sequenced by a | (pipe). Data flows, or is piped, from one operator to the next. The data is filtered or manipulated at each step and then fed into the following step.


Now that the concept and sections under logs in Sentinel were explained, let's jump into the syntax and examples of some Commands, Operators and Functions.

 

KQL Column Types


Basic

  • ­int, long

  • null: empty

  • ­bool: true, false

  • ­string: “example”, ‘example’

Time

  • ­datetime: datetime(2016-11-20 22:30:15.4), now(), ago(4d)

  • ­timespan: 2d, 20m, time(1.13:20:05.10), 100ms

Complex

  • ­dynamic: JSON format

 

Statement


’let’

Let statements bind names to expressions.


Example:

let starttime = 14d;
let endtime = 1d;
let timeframe = 1h;
let TotalEventsThreshold = 5;
  • Declare global ‘variable’ or reuse of ‘variables’.

  • Reuse of query runs

  • Declare functions

  • Declare dynamic table

 

Commands


‘where’

Filters a table to the subset of rows that satisfy a predicate.

Syntax:

T | where Predicate

Examples:

SecurityEvent
| where TimeGenerated > ago(1d)

SecurityEvent
| where * contains “Kusto”
  • String predicates: ==, has, has_any, contains, startswith, endswith, matches regex, etc

  • Numeric/Date predicates: ==, !=, <, >, <=, >=

  • Empty predicates: isempty(), notempty(), isnull(), notnull()


‘limit’

Return up to the specified number of rows.

Syntax:

T | limit <number>

Example:

SecurityEvent 
| limit 5
  • Sort is not guaranteed to be preserved.

  • Consistent result is not guaranteed (when running the same query twice)

  • Very useful when trying out new queries.

  • Default limit is 10,000.


‘count’

Returns the number of records in the input record set.

Syntax:

T | count

Example:

SecurityEvent
| where TimeGenerated > ago(1h)
| where EventID == 4624
| count

‘summarize’

Produces a table that aggregates the content of the input table.

Syntax:

T | summarize Aggregation [by Group Expression]

Example:

SecurityEvent
| where TimeGenerated > ago(1h)
| where EventID == 4624
| summarize count() by AccountType, Computer
  • Simple aggregation functions: count(), sum(), avg(), min(), max(),

  • Advanced functions: arg_min(), arg_max(), percentiles(), makelist(), countif()

arg_min(), arg_max(): returns the extreme value

Example:

Supplies 
| summarize arg_min(Price, Supplier) by Product 
//Cheapest supplier of each product

percentiles(): returns the value at the percentile

Example:

CallDetailRecords 
| summarize percentile(Duration, 95) by continent 
//The value of Duration that is larger than 95% of the sample set 

makelist(), makeset(): returns a list of all values/distinct values respectively

Example:

PageViewLog 
| summarize countries=make_set(country) by continent

‘summarize’: bin and time series

A very useful summarize operation is creating time series:

Example:

SecurityEvent 
| summarize count() by bin(TimeGenerated, 1h) 
| render timechart
  • Other time measurements: 1h, 5d, 10m (defaults to 1h)

  • Can create multiple legends by aggregating additional field


‘extend’

Create calculated columns and append them to the result set.

Syntax:

 T | extend ColumnName [= Expression] [, ...]

Example:

SecurityEvent
| extend ComputerNameLength = strlen(Computer)
  • The new added column is not indexed.

  • To only change a column name, use ‘project-rename’.

  • Useful function for in ‘extend’: iff, extract

‘project’

Select the columns to include, rename or drop, and insert new computed columns.

Syntax:

T | project ColumnName [= Expression] [, ...]

Example:

SecurityEvent
| project TimeGenerated, Computer
  • ‘| project-away’ – Removed specified column/s.

  • ‘| project-rename’ – Rename specified column/s.


‘distinct’

Produces a table with the distinct combination of the provided columns of the input table.

Syntax:

T | distinct Column1, Column2

Example:

SecurityEvent 
| distinct Computer
 

Operators


’search’

Searches a text pattern in multiple tables and columns.


’iff’

Returns the value of ifTrue if predicate evaluates to true, or the value of ifFalse otherwise.

Syntax:

iff(predicate, ifTrue, ifFalse)

Example:

StormEvents
| extend Rain = iff((EventType in ("Heavy Rain", "Flash Flood", "Flood")), "Rain event", "Not rain event")
| project State, EventId, EventType, Rain

‘order by’ - ‘sort by’ - ‘top’

Order by: Sort the rows of the input table into order by one or more columns.

Top: returns the top values after sort. Faster and can sort by expression


Syntax:

T | sort by column [asc | desc] [nulls first | nulls last]
T | top NumberOfRows by Expression [asc | desc] [nulls first | nulls last]

Example:

Table 
| order by country asc, price desc
  • Don’t assume order by default


‘union’

Takes two or more tables and returns the rows of all of them.

Example:

SecurityEvent 
| union (SecurityAlert | where Severity > 3)
  • kind=inner(common columns), outer (all columns- default)

  • Supports wildcard to union multiple tables (union Security*)

  • Can union between tables from different clusters (or workspaces)


‘join’

Merge the rows of two tables to form a new table by matching values of the specified column(s) from each table.

Syntax:

LeftTable | join [JoinParameters] ( RightTable ) on Attributes

Example:

SecurityEvent
| join (SecurityAlert | where Severity > 3) on Account

’mv-expand’

Expands multi-value dynamic arrays or property bags into multiple records.

mv-expand can be described as the opposite of the aggregation operators that pack multiple values into a single dynamic-typed array or property bag, such as summarize ... make-list() and make-series. Each element in the (scalar) array or property bag generates a new record in the output of the operator.

Syntax:

T | mv-expand [bagexpansion=(bag | array)] [with_itemindex=IndexColumnName] ColumnName [to typeof( Typename)] [, ColumnName ...] [limit Rowlimit]

T | mv-expand [bagexpansion=(bag | array)] Name = ArrayExpression [to typeof(Typename)] [, [Name =] ArrayExpression [to typeof(Typename)] ...] [limit Rowlimit]

Example:


 

Functions


’extract’

Get a match for a regular expression from a text string.

Syntax:

extract(regex, captureGroup, text [, typeLiteral])

Example:

extract("x=([0-9.]+)", 1, "hello x=45.6|wo") == "45.6“

’parse_json’ / ’todynamic’

Interprets a string as a JSON value and returns the value as dynamic. If possible, the value is converted into relevant data types. For strict parsing with no data type conversion, use extract() or extractjson() functions.

Syntax:

parse_json(json) todynamic(json)

Example:

In the following example, when context_custom_metrics is a string that looks like this:

JSON {"duration":{"value":118.0,"count":5.0,"min":100.0,"max":150.0,"stdDev":0.0,"sampledValue":118.0,"sum":118.0}}

T
| extend d=parse_json(context_custom_metrics) 
| extend duration_value=d.duration.value, duration_min=d["duration"]["min"]
 

Where to go from here?





961 views0 comments

Comments


bottom of page