I am using Azure AKS for machine learning model deployment, and it automatically deploys models weekly.
AKS is now generating higher costs for Log Analytics data ingestion.
We are working to optimize the data ingestion into Log Analytics.
I have two nodes in AKS.
We managed to reduce some of the data ingestion, but when I checked the ingestion for the past 24 hours today, it had increased again. When I list the nodes that produce billable data ingestion, the result shows one extra entry named 'deprecated field: see http://aka'.
The query and its result are below for reference.
Query:
find where TimeGenerated > ago(24h) project _BilledSize, Computer
| extend computerName = tolower(tostring(split(Computer, '.')[0]))
| where computerName != ""
| summarize TotalVolumeBytes=sum(_BilledSize) by computerName
Query result:
computerName                         TotalVolumeBytes
aks-agentpool-28198374-vmss000007    232,567,315
aks-agentpool-28198374-vmss000001    617,340,843
deprecated field: see http://aka     129,052
Here aks-agentpool-28198374-vmss000007 and aks-agentpool-28198374-vmss000001 are my nodes,
but I have no idea what 'deprecated field: see http://aka' is.
I do not handle the ML model deployment myself, so I asked the ML team, and they do not know where this entry comes from either.
I went through many documents and queries but could not find out what it is or how to get rid of it.
Can anyone explain what this entry in my node list is and how I can stop it?
CodePudding user response:
The value you are getting, http://aka, is probably part of a link to some Microsoft documentation. You are truncating it, though, when you do tolower(tostring(split(Computer, '.')[0])), because split(Computer, '.')[0] keeps only the text before the first dot.
Try adding Computer to your summarize clause so that you can get the full link:
find where TimeGenerated > ago(24h) project _BilledSize, Computer
| extend computerName = tolower(tostring(split(Computer, '.')[0]))
| where computerName != ""
| summarize TotalVolumeBytes=sum(_BilledSize) by computerName, Computer
Once you do this, you will discover that the value of the Computer field is Deprecated field: see http://aka.ms/LA-Usage
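The truncation can also be reproduced outside Log Analytics. Below is a minimal Python sketch that mimics the tolower(tostring(split(Computer, '.')[0])) expression from the query; the helper name computer_name is hypothetical and only serves the illustration, it is not part of KQL or any Azure SDK.

```python
def computer_name(computer: str) -> str:
    """Mimic the KQL expression tolower(tostring(split(Computer, '.')[0])).

    Hypothetical helper for illustration only: keep the text before the
    first dot and lower-case it.
    """
    return computer.split(".")[0].lower()


# A node name without a dot passes through unchanged.
print(computer_name("aks-agentpool-28198374-vmss000001"))
# prints "aks-agentpool-28198374-vmss000001"

# The deprecation message is cut at the first dot, which sits inside the URL.
print(computer_name("Deprecated field: see http://aka.ms/LA-Usage"))
# prints "deprecated field: see http://aka"
```

This is why the extra row looks like a broken link: the first dot in the message belongs to the URL, so everything after http://aka is dropped.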
Visiting that link tells you that the Usage table contains "Hourly usage data for each table in the workspace." In the column reference table you can also see that a number of columns are deprecated, for example "Computer".
In other words, the Computer field is being deprecated and will later be removed from the Usage table. It is still there, but its value will always be Deprecated field: see http://aka.ms/LA-Usage until the field is removed permanently.
This just means that the Computer field in the Usage table should no longer be used and will be removed. The Usage table alone can no longer be used to determine which computer incurred which cost.
You can query the Usage table with the following query and see that the Computer field always contains the deprecation message.
Usage
| summarize sum(_BilledSize) by Computer, SourceSystem, DataType, Type
EDIT:
If you want to find which resource contributes most to the cost of your workspace, rather than which resource generates the most logs (the two are not necessarily the same, since some data is ingested for free in Log Analytics), add _IsBillable as a filter to your query to exclude entries that are not billed:
find where TimeGenerated > ago(24h) project _BilledSize, Computer, _IsBillable
| extend computerName = tolower(tostring(split(Computer, '.')[0]))
| where computerName != "" and _IsBillable == true
| summarize TotalVolumeBytes=sum(_BilledSize) by computerName
Since the entries in the Usage table where the Computer field holds the deprecation message are not billable, they will no longer show up.