Matplotlib confuses hours and days

Time:04-28

I'm trying to plot a pandas DataFrame with timestamps as the index and int or float columns:

Time Current_Gen1 Current_Gen2 Current_Gen3 Current_100A_110V Current_100A_220V Current_shore_power Gens_sum
2022-04-01 19:00:00 44.011111 0.0 42.833333 2.000000 19.055556 -0.066667 86.844444
2022-04-01 20:00:00 44.522222 0.0 43.616667 2.000000 17.861111 -0.155556 88.138889
2022-04-01 21:00:00 45.155556 0.0 44.705556 2.005556 18.616667 -0.161111 89.861111
2022-04-01 22:00:00 45.972222 0.0 45.372222 2.000000 19.133333 -0.277778 91.344444
2022-04-01 23:00:00 45.677778 0.0 45.755556 2.000000 20.744444 -0.377778 91.433333

I downsampled it to hourly means to have fewer points to display.

I want to display a major grid for the days and a minor grid for the hours.

If I keep the default settings, I get this graph:

import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv(r"data.csv", index_col=['Time'], parse_dates=['Time'])
df2 = df.resample('H').mean()

fig_gen_sum, axe = plt.subplots()
df2.plot(ax=axe, y='Gens_sum', grid=True)

default graph

First issue: I don't understand why one date is shown with the full description.

And when I use a formatter to show the days:

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as dates

df = pd.read_csv(r"data.csv", index_col=['Time'], parse_dates=['Time'])
df2 = df.resample('H').mean()

fig_gen_sum, axe = plt.subplots()
df2.plot(ax=axe, y='Gens_sum', grid=True)

xax = axe.get_xaxis()
xax.set_major_formatter(dates.DateFormatter('%d-%m'))
xax.set_major_locator(dates.DayLocator())

day formatter graph

zoomed formatter graph

Second issue: I get a graph where days are counted as hours, if I read it right.

I've tried different configurations, but I always get weird results.

I'm guessing the issue is with the formatting of my timestamps, but I can't find what is wrong.

CodePudding user response:

@tdy gave the answer in a comment on the post.

The solution is to set the option x_compat=True (Time is already the index here, so no x argument is needed):

df2.plot(ax=axe, y='Gens_sum', grid=True, x_compat=True)

As specified in the pandas documentation:

Pandas includes automatic tick resolution adjustment for regular frequency time-series data. For limited cases where pandas cannot infer the frequency information (e.g., in an externally created twinx), you can choose to suppress this behavior for alignment purposes.

I can then get this result:

Wanted result
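A quick way to see what the option changes is to compare the width of the x-axis in both modes. The sketch below uses synthetic hourly data (the values and the names demo, ax_default and ax_compat are made up for illustration, not taken from the question) and a non-interactive backend. As far as I can tell, pandas' default time-series plotting puts hourly data on an axis counted in hours, while x_compat=True leaves the axis in matplotlib's own date numbers, which are counted in days and are what DayLocator expects. That would explain why days looked like hours:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Synthetic hourly series standing in for df2['Gens_sum'] (illustrative values only)
idx = pd.date_range("2022-04-01 19:00", periods=72, freq="h")
demo = pd.DataFrame({"Gens_sum": [float(i) for i in range(72)]}, index=idx)

fig, (ax_default, ax_compat) = plt.subplots(2, 1)
demo.plot(ax=ax_default, y="Gens_sum")                # pandas' own time-series axis
demo.plot(ax=ax_compat, y="Gens_sum", x_compat=True)  # plain matplotlib date axis

# Width of each x-axis in its own units
default_left, default_right = ax_default.get_xlim()
compat_left, compat_right = ax_compat.get_xlim()
default_span = default_right - default_left
compat_span = compat_right - compat_left

print("default axis span:", default_span)   # expected to be on the order of the number of hours
print("x_compat axis span:", compat_span)   # expected to be on the order of the number of days
```

The same three days of data should span roughly 24 times more axis units in the default mode, so a locator that steps in units of one day lands on every hour instead.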

Everything else, including major and minor ticks, works from there:

fig_gen_sum, axe = plt.subplots()
df2.plot(ax=axe, y='Gens_sum', grid=True, x_compat=True)

xax = axe.get_xaxis()

xax.set_major_locator(dates.DayLocator())
xax.set_major_formatter(dates.DateFormatter('%y/%m/%d'))
xax.set_minor_locator(dates.HourLocator(interval=5))
xax.set_minor_formatter(dates.DateFormatter('%H h'))

Gives:

Multi-ticks graph
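For anyone without the original data.csv, here is a self-contained sketch of the same approach. The data is synthetic (one made-up reading per minute over three days), and the grid styling and the 6-hour minor interval are my own choices rather than part of the accepted fix. It also draws separate major and minor grids, which is what the question asked for:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to use plt.show()
import matplotlib.dates as dates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for data.csv: one reading per minute over three days (made-up values)
time = pd.date_range("2022-04-01 19:00", periods=3 * 24 * 60, freq="min", name="Time")
df = pd.DataFrame({"Gens_sum": 90 + np.sin(np.arange(len(time)) / 200.0)}, index=time)

# Downsample to hourly means ("h" is the current spelling of the hourly alias; older code uses "H")
df2 = df.resample("h").mean()

fig_gen_sum, axe = plt.subplots()
df2.plot(ax=axe, y="Gens_sum", x_compat=True)  # x_compat keeps matplotlib's date units

xax = axe.get_xaxis()
xax.set_major_locator(dates.DayLocator())               # one major tick per day
xax.set_major_formatter(dates.DateFormatter("%d-%m"))
xax.set_minor_locator(dates.HourLocator(interval=6))    # minor ticks at 00, 06, 12 and 18 h
xax.set_minor_formatter(dates.DateFormatter("%H h"))

# Heavier grid lines on day boundaries, lighter dotted ones on the minor hour ticks
axe.grid(which="major", linewidth=1.2)
axe.grid(which="minor", linestyle=":", linewidth=0.5)

major_ticks = xax.get_majorticklocs()
```

Call fig_gen_sum.savefig(...) or plt.show() to look at the result. The major tick positions in major_ticks should all fall on midnight when converted back with dates.num2date.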
