format index at stacked bar plot

Time:03-31

How can I set up a useful index in minutes and hours?

csv_file = dir_path + "/stacktest.csv"
with open(csv_file, newline='') as csv_file:
    data = pd.read_csv(csv_file, sep=',')
df = pd.DataFrame(data)
df = df[['seconds', 'marker', 'data1', 'data2', 'data3']]
df['seconds'] = df['seconds'].astype(str)
df = df.set_index('seconds')
dfStacked = df[['data1', 'data2']]

ax = dfStacked.plot(kind='bar', stacked=True, alpha=0.5)
import matplotlib.dates as mdates
majorFmt = mdates.DateFormatter('%H:%M')
ax.xaxis.set_major_locator(mdates.MinuteLocator(interval=30))
ax.xaxis.set_major_formatter(majorFmt)
plt.plot(df.index, df['data3'], linestyle='solid', color='blue', alpha=0.4, label='data3')
plt.show()

If I remove the DateFormatter, something still seems to be wrong with the index.

By changing the lines to:

#majorFmt = mdates.DateFormatter('%H:%M')
ax.xaxis.set_major_locator(mdates.MinuteLocator(interval=60*60))
#ax.xaxis.set_major_formatter(majorFmt)

The x-axis then shows the labels [121, 377, ...]. 121 is a seconds value, so a tick is placed at about 2 minutes even though the interval is set to 60*60.

Example data

seconds,marker,data1,data2,data3,data4
0,B,0,0,0,0
59,C,42000,8000,369000,0
74,B,42000,8000,369000,283041
121,B,42000,8000,369000,283041
179,B,42000,8000,369000,283041
239,B,42000,8000,369000,283041
304,B,42000,8000,369000,283041
360,B,42000,8000,369000,283041
377,A,42000,8000,369000,283041
420,B,42000,8000,369000,283041
493,B,42000,8000,369000,283041
540,B,42000,8000,369000,283041
600,B,42000,8000,369000,283041
659,B,42000,8000,369000,283041
719,B,64000,8000,412000,283041
780,B,64000,8000,412000,283041
840,B,64000,8000,412000,283041
880,A,64000,8000,412000,283041
900,B,64000,8000,412000,283041
961,B,64000,8000,412000,283041
1020,B,64000,8000,412000,283041
1079,B,64000,8000,412000,283041
1141,B,64000,8000,412000,283041
1200,B,64000,8000,412000,283041
1260,B,64000,8000,412000,283041
1320,B,64000,8000,412000,283041
1365,A,64000,8000,412000,283041
1382,B,64000,8000,412000,283041
1440,B,64000,8000,412000,283041
1498,B,64000,8000,412000,283041
1559,B,64000,8000,412000,283041
1621,B,64000,8000,412000,283041
1679,B,64000,8000,412000,283041
1740,B,64000,8000,412000,283041
1800,B,42000,8000,369000,283041
1830,A,42000,8000,369000,283041
1867,B,42000,8000,369000,283041
1921,B,42000,8000,369000,283041
1979,B,42000,8000,369000,283041
2040,B,42000,8000,369000,283041
2099,B,42000,8000,369000,283041
2159,B,42000,8000,369000,283041
2220,B,42000,8000,369000,283041
2272,A,42000,8000,369000,283041
2288,B,42000,8000,369000,283041
2341,B,42000,8000,369000,283041
2400,B,42000,8000,369000,283041
2460,B,42000,8000,369000,283041
2520,B,42000,8000,369000,283041
2579,B,42000,8000,369000,283041
2640,B,42000,8000,369000,283041
2700,B,42000,8000,369000,283041
2720,A,42000,8000,369000,283041
2759,B,42000,8000,369000,283041
2833,B,28000,14000,248000,260096
2880,B,28000,14000,248000,247808
2940,B,14000,28000,124000,123904
3000,B,0,42000,0,0
3060,B,0,42000,0,0
3120,B,0,42000,0,0
3136,A,0,42000,0,0
3180,B,0,42000,0,0
3251,B,0,42000,0,0
3267,D,0,42000,0,0
3300,B,0,42000,0,0
3359,B,0,42000,0,0
3419,B,0,42000,0,0

CodePudding user response:

You could write a custom formatter to show the numeric seconds as hours and minutes.

To draw the bar plot with a numeric x-axis, matplotlib's bar() can be used. The widths of the bars vary; they can be calculated from the differences between successive seconds values. The code below draws the bars touching each other. Setting an edge color (ax.bar(..., ec='white', lw=1)) would show a small separation.
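As a small illustration of the width calculation, here is a sketch that uses only the first five seconds values from the example data. Reusing the previous width for the last bar is an arbitrary choice, since the last row has no successor to measure against.

```python
import numpy as np

# First five values of the 'seconds' column from the example data
seconds = np.array([0, 59, 74, 121, 179])

# Each bar spans from its own timestamp to the next one
widths = np.diff(seconds)

# The last bar has no successor, so repeat the previous width
widths = np.append(widths, widths[-1])

print(widths)  # [59 15 47 58 58]
```

With align='edge', each bar then starts at its own seconds value and ends where the next bar begins.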

import matplotlib.pyplot as plt
from matplotlib.ticker import MultipleLocator
import pandas as pd
import numpy as np

def hms_formatter(x, pos):
    seconds = int(x)
    minutes = seconds // 60
    seconds %= 60
    hours = minutes // 60
    minutes %= 60
    if hours == 0:
        return f'{minutes:2d}:{seconds:02d}'
    else:
        return f'{hours:2d}:{minutes:02d}:{seconds:02d}'

df = pd.read_csv(...)
fig, ax = plt.subplots(figsize=(15, 5))
bottom = 0
widths = np.diff(df['seconds'])
widths = np.append(widths, widths[-1])
for col in ['data1', 'data2', 'data3']:
    ax.bar(df['seconds'], df[col], bottom=bottom, width=widths,
           align='edge', label=col)
    bottom += df[col]

ax.plot(df['seconds'], df['data3'], linestyle='solid', color='crimson', lw=3, alpha=0.4, label='data3 (unstacked)')

ax.margins(x=0.01)
ax.xaxis.set_major_locator(MultipleLocator(10 * 60))
ax.xaxis.set_major_formatter(hms_formatter)
ax.legend()
plt.tight_layout()
plt.show()
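To see what such a formatter puts on the axis, the conversion can be tried on a few tick values without drawing anything. The following is a compact variant sketch using divmod; the name format_hms and the unpadded output are illustrative choices, not part of the code above.

```python
def format_hms(x, pos=None):
    """Turn a tick value in seconds into 'M:SS' or 'H:MM:SS'."""
    minutes, seconds = divmod(int(x), 60)
    hours, minutes = divmod(minutes, 60)
    if hours == 0:
        return f'{minutes}:{seconds:02d}'
    return f'{hours}:{minutes:02d}:{seconds:02d}'

# Multiples of 10 minutes, as MultipleLocator(10 * 60) would place them,
# plus two values past the one-hour mark
for tick in (0, 600, 3000, 3600, 4200):
    print(tick, '->', format_hms(tick))
# 0 -> 0:00
# 600 -> 10:00
# 3000 -> 50:00
# 3600 -> 1:00:00
# 4200 -> 1:10:00
```

The function takes the same (x, pos) arguments that matplotlib passes to tick formatters, so it could be handed to ax.xaxis.set_major_formatter in the same way as hms_formatter.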

[Figure: stacked bars via matplotlib, custom hhmmss formatter]
