Home > Enterprise >  How to extract the ids from the 99th percentile of results for each day
How to extract the ids from the 99th percentile of results for each day

Time:03-31

I have a data set in Kusto and I would like to gather the ids from each day where their latency is greater than the 99th percentile for that day.

e.g I have a table like this:

id latency time
a 23.4 1/2/2022
c 1.4 1/2/2022
b 25.6 1/3/2022
d 2.3 1/3/2022

and I want the result table to just be

time id
1/2/2022 a
1/3/2022 b

where a and b come from the 99th percentile of latency times for those days

What I would like to work:

let 99p= T
| summarize percentiles(latency, 99) by bin(time, 1d);
T
| where startofday(time) == 99p.time
| where latency > 99p.latency
| project time, id

Thanks for any insight, pretty new to kusto

CodePudding user response:

// This is not a part of the solution, only generation of a sample data set
let T = materialize (range id from 1 to 1000 step 1 | extend ['time'] = ago(rand()*10d), latency = round(rand()*100,2));
// The solution starts here
T | extend day = bin(['time'], 1d) | as T2
| summarize percentiles(latency, 99) by day
| join kind=inner T2 on day 
| where latency >= percentile_latency_99
| summarize make_list(id) by day
day list_id
2022-03-20T00:00:00Z [298]
2022-03-21T00:00:00Z [642]
2022-03-22T00:00:00Z [200,504]
2022-03-23T00:00:00Z [726,885]
2022-03-24T00:00:00Z [590,975]
2022-03-25T00:00:00Z [107]
2022-03-26T00:00:00Z [836]
2022-03-27T00:00:00Z [68]
2022-03-28T00:00:00Z [527,859]
2022-03-29T00:00:00Z [172,874]
2022-03-30T00:00:00Z [274]

Fiddle

  • Related