matplotlib - network bytes y axis to human readable

Time:03-25

I'm trying to create some auto-generated reports from networking equipment.

The sensors send the data in bytes, as expected with networking equipment. The "issue" I'm having is that I would like the y-axis to show the values either strictly in megabytes (MB) or, even better, in kB/MB/GB depending on the average stream size of the set.

For reference, when I look at this through the GUI (Graphite), this specific query renders with the y-axis in Mb.

matplotlib, on the other hand, treats the values as plain floats and therefore defaults to scientific notation.


# A small sample of the data
# Typically handling ~8400 rows per iteration
    [
        [
            973920.8333330154,
            '2022-03-01 00:00:00'
        ],
        [
            993574,
            '2022-03-02 00:00:00'
        ],
        [
            964486.5,
            '2022-03-03 00:00:00'
        ],
        [
            972345.4166669846,
            '2022-03-04 00:00:00'
        ],
        [
            972770.9166665077,
            '2022-03-05 00:00:00'
        ],
        [
            961019.75,
            '2022-03-06 00:00:00'
        ],
        [
            957663.9166665077,
            '2022-03-07 00:00:00'
        ],
        [
            957588.5,
            '2022-03-08 00:00:00'
        ],
        [
            959108,
            '2022-03-09 00:00:00'
        ],
        [
            959551.8333334923,
            '2022-03-10 00:00:00'
        ]
    ]

This is my first time working with plotting, so there might be a clever function for this that I have not found.

All ideas are welcome.

Update:

I have not yet been able to figure out the custom major formatter suggested by Mr. T.

I have a prototype that converts the DataFrame values to MB before plotting, but I still need to test whether the mean is a good fit for my purpose.

    import pandas as pd
    import numpy as np


    def sizeof_fmt(num):
        # Divide by 1024 until the value fits the current unit
        for x in ['bytes', 'KB', 'MB', 'GB', 'TB']:
            if num < 1024.0:
                return "%3.1f_%s" % (num, x)
            num /= 1024.0
        # 1024 TB or more: report petabytes instead of returning None
        return "%3.1f_%s" % (num, 'PB')


    df = pd.DataFrame(
        [[2097152, '2020-01-02 00:00:00'],
         [2831155.2, '2020-01-02 00:00:00'],
         [1077936128, '2020-01-02 00:00:00'],
         [np.nan, '2020-01-02 00:00:00'],
         [np.nan, '2020-01-02 00:00:00']])
    df.plot()


    # The unit is the part after the underscore, e.g. 'MB'
    operator = sizeof_fmt(df[0].mean()).split('_')[1]
    print(sizeof_fmt(df[0].mean()))
    print(operator)

    if operator == 'MB':
        # Convert the byte column to megabytes
        df[[0]] = df[[0]].div(1024*1024)

    print(df)

CodePudding user response:

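Your function is fine, but you never apply it to the axis as a formatter. A FuncFormatter callback takes two arguments, the tick value and the tick position, so the signature needs a second parameter. You can do it like this: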

    from matplotlib import pyplot as plt
    import matplotlib.ticker as tkr
    import numpy as np
    import pandas as pd

    # FuncFormatter callbacks receive the tick value x and the tick position pos
    def sizeof_fmt(x, pos):
        if x < 0:
            return ""
        for x_unit in ['bytes', 'kB', 'MB', 'GB', 'TB']:
            if x < 1024.0:
                return "%3.1f %s" % (x, x_unit)
            x /= 1024.0
        # 1024 TB or more: fall back to petabytes instead of returning None
        return "%3.1f %s" % (x, 'PB')

    # sample data
    df = pd.DataFrame(
            [[12097152, '2020-01-02 00:00:00'],
             [31155.2, '2020-01-02 10:00:00'],
             [np.nan, '2020-01-02 12:00:00'],
             [1234, '2020-01-03 04:00:00'],
             [1877936128, '2020-01-03 05:10:00']])

    # Parse the timestamps so they can serve as the x-axis
    df[1] = pd.to_datetime(df[1])
    ax = df.plot(x=1, y=0)

    # Format the y tick labels as human-readable sizes
    ax.yaxis.set_major_formatter(tkr.FuncFormatter(sizeof_fmt))
    plt.show()


The advantage of the formatter is that the tick labels stay in the correct format when you zoom in, because they are recomputed from the new tick values.
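The formatter above chooses a unit for each tick separately. If a single unit for the whole axis is preferred, chosen from the average of the data as the question describes, a formatter can be built around that unit. The following is only a sketch under that assumption: the names pick_unit and make_fixed_unit_formatter are hypothetical helpers, not part of matplotlib, and the mean is just one possible way to choose the unit.

```python
import math

UNITS = ['bytes', 'kB', 'MB', 'GB', 'TB']


def pick_unit(values):
    """Hypothetical helper: choose one unit from the mean of the non-NaN values.

    Returns a (unit_name, divisor) pair, e.g. ('MB', 1048576.0).
    """
    clean = [v for v in values if not math.isnan(v)]
    mean = sum(clean) / len(clean) if clean else 0.0
    divisor = 1.0
    for unit in UNITS[:-1]:
        if abs(mean) < divisor * 1024.0:
            return unit, divisor
        divisor *= 1024.0
    # Anything of 1024 GB or more is reported in TB
    return UNITS[-1], divisor


def make_fixed_unit_formatter(values):
    """Hypothetical helper: build a (x, pos) callback that always uses one unit."""
    unit, divisor = pick_unit(values)

    def fmt(x, pos=None):
        return "%.1f %s" % (x / divisor, unit)

    return fmt


# Usage with values in the range of the question's prototype data
fmt = make_fixed_unit_formatter([2097152, 2831155.2, float('nan')])
print(fmt(2097152, 0))  # 2.0 MB
```

The returned callback has the same (x, pos) signature as sizeof_fmt, so it could be wrapped the same way, for example tkr.FuncFormatter(make_fixed_unit_formatter(df[0])). The trade-off is that labels no longer switch units when zooming far in or out.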


You could also write your own Locator to ensure that ticks always fall on multiples of 1024, but I have never looked into this.
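As a rough illustration of what such a Locator would have to compute, here is one possible tick-placement rule in plain Python. It is an assumption about what "multiples of 1024" could mean in practice: the step is a power of two, so once it reaches 1024 every tick is a whole multiple of 1024. The names binary_tick_step and binary_ticks are hypothetical and this has not been tested inside matplotlib.

```python
import math


def binary_tick_step(vmin, vmax, max_ticks=8):
    """Hypothetical helper: smallest power-of-two step with at most max_ticks ticks."""
    max_ticks = max(2, max_ticks)
    span = vmax - vmin
    if span <= 0:
        return 1
    step = 1
    # Doubling passes through 1024, 1024**2, ... so large steps are
    # always whole multiples of 1024.
    while span / step > max_ticks - 1:
        step *= 2
    return step


def binary_ticks(vmin, vmax, max_ticks=8):
    """Hypothetical helper: tick positions inside [vmin, vmax] on multiples of the step."""
    step = binary_tick_step(vmin, vmax, max_ticks)
    tick = math.ceil(vmin / step) * step
    ticks = []
    while tick <= vmax:
        ticks.append(tick)
        tick += step
    return ticks


# Usage: a 0 to 4 MiB axis with at most five ticks
print(binary_ticks(0, 4 * 1024 ** 2, max_ticks=5))
# [0, 1048576, 2097152, 3145728, 4194304]
```

To try this on a plot, the function could be called from the tick_values(vmin, vmax) method of a matplotlib.ticker.Locator subclass and installed with ax.yaxis.set_major_locator. Combined with the formatter above, the labels would then come out as round numbers such as 1.0 MB and 2.0 MB.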
