How to make the timeseries axis ticks and labels match the data points

Time:12-01

I've managed to bypass this issue by manually setting my xticks. However, I would like to understand the reason behind this 'issue'.

I have the following data:

[image: table of the average basket data]

As you can see, I've calculated the average basket for our weekly campaign. However, when I plot it, the first couple of dates on the x-axis are changed.

plt.figure(figsize=(16,6))

plt.plot(pd.to_datetime(avg_basket_pbm["trunc_usage_date"]), avg_basket_pbm["average_base_spend"], label="Monday Coupon Users")
plt.plot(pd.to_datetime(avg_basket_reg["TRUNC(EV.EFFECTIVE_DAY,'DD')"]), avg_basket_reg["average_base_spend"], label="Regular Coupon Users")
# plt.xticks(pd.to_datetime(avg_basket_pbm["trunc_usage_date"],format="%Y-%m-%d"), labels = avg_basket_pbm["trunc_usage_date"])
plt.title("Average Basket", weight = "bold", fontsize=14)
plt.legend()


This seems to affect only the first couple of dates; the later ones remain unchanged.

Reproducible Example:

from matplotlib import pyplot as plt
import numpy as np
import pandas as pd

dates = ['2021-10-11 00:00:00',
 '2021-10-18 00:00:00',
 '2021-10-25 00:00:00',
 '2021-11-01 00:00:00',
 '2021-11-08 00:00:00',
 '2021-11-15 00:00:00',
 '2021-11-22 00:00:00']


stamps = pd.to_datetime(dates)
vals = np.random.random(7)

plt.figure(figsize=(14,6))
plt.plot(stamps, vals)

CodePudding user response:

Matplotlib does not try to match the ticks exactly to your data because in most situations that just looks bad. To illustrate, I adapted your example and added just four datapoints and forced the xticks to match the data:

from matplotlib import pyplot as plt
import numpy as np
import pandas as pd

dates = ['2021-10-11 00:00:00',
 '2021-10-18 00:00:00',
 '2021-10-25 00:00:00',
 '2021-11-01 00:00:00',
 '2021-11-08 00:00:00',
 '2021-11-09 00:00:00', # added
 '2021-11-10 00:00:00', # added
 '2021-11-11 00:00:00', # added
 '2021-11-12 00:00:00', # added
 '2021-11-15 00:00:00',
 '2021-11-22 00:00:00']


stamps = pd.to_datetime(dates)
vals = np.random.random(11)

plt.figure(figsize=(14,6))
plt.xticks(stamps, labels = dates)
plt.plot(stamps, vals)


The tick labels overlap and are unreadable. If you comment out the plt.xticks line, matplotlib is free to "do its own thing" and place the ticks where it considers them "reasonable". In this specific case, that appears to be "weekly, starting from the 1st of the month". With your dates, the October ticks would then fall on the 15th and 22nd rather than on your data points, while from 1 November onward they happen to coincide with your Mondays. If you have a different opinion/algorithm for what counts as "reasonable" tick positions, you have to place them manually.
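As a rough sketch of placing the ticks yourself without listing every date by hand, a weekday-based locator can be used. The example below assumes, as in the question's data, that every data point falls on a Monday; the random values and the variable names default_locator and tick_dates are only for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# The seven Mondays from the question's reproducible example
stamps = pd.date_range("2021-10-11", periods=7, freq="7D")
vals = np.random.default_rng(0).random(7)  # illustrative values only

fig, ax = plt.subplots(figsize=(14, 6))
ax.plot(stamps, vals, marker="o")

# The locator matplotlib chose on its own for the datetime axis
default_locator = ax.xaxis.get_major_locator()
print(type(default_locator).__name__)

# Replace it with one tick per Monday and a fixed date format
ax.xaxis.set_major_locator(mdates.WeekdayLocator(byweekday=mdates.MO))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m-%d"))

# Tick positions converted back to calendar dates for inspection
tick_dates = [d.date() for d in mdates.num2date(ax.get_xticks())]
print(tick_dates)
```

A locator keeps working if the date range later grows, whereas plt.xticks(stamps) pins the ticks to one specific set of dates. Which of the two is preferable depends on whether the data is guaranteed to stay on a regular weekday.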

CodePudding user response:

  • The correct way to plot a pandas.DataFrame is with pandas.DataFrame.plot

    • When xticks=df1.date1 or xticks=df2.date2 is specified, only the dates from that one DataFrame are used as ticks on the x-axis.
      • The dates in the two DataFrames are not the same, so some points from the other DataFrame will not have a tick.
    ax = df1.plot(x='date1', marker='o', figsize=(12, 5))
    df2.plot(x='date2', marker='o', ax=ax, xticks=df1.date1, rot=90, grid=True)
    
    ax.set_title("Average Basket", weight="bold", fontsize=14)
    


    • Combine the dates from both DataFrames to use as xticks
    # combine the dates to be used for xticks
    xticks = df1.date1.tolist() + df2.date2.tolist()
    
    # plot the dataframe
    ax = df1.plot(x='date1', marker='o', figsize=(12, 5))
    df2.plot(x='date2', marker='o', ax=ax, xticks=xticks, rot=90, grid=True)
    
    ax.set_title("Average Basket", weight="bold", fontsize=14)
    


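The answer does not show how df1 and df2 were built, so the following is only a minimal sketch with made-up data. It combines the dates from both frames with Index.union, so that a date present in both appears once and the result is sorted, and then applies them as ticks. For simplicity it uses plain matplotlib calls rather than DataFrame.plot; the column names date1 and date2 follow the answer, while the values are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-ins for the two DataFrames used in the answer
df1 = pd.DataFrame({
    "date1": pd.to_datetime(["2021-10-11", "2021-10-18", "2021-10-25", "2021-11-01"]),
    "average_base_spend": [20.0, 22.5, 21.0, 23.5],
})
df2 = pd.DataFrame({
    "date2": pd.to_datetime(["2021-10-13", "2021-10-18", "2021-10-27", "2021-11-03"]),
    "average_base_spend": [18.0, 19.5, 19.0, 20.5],
})

# Sorted, de-duplicated dates from both frames (2021-10-18 is shared)
xticks = pd.DatetimeIndex(df1["date1"]).union(pd.DatetimeIndex(df2["date2"]))

fig, ax = plt.subplots(figsize=(12, 5))
ax.plot(df1["date1"], df1["average_base_spend"], marker="o", label="df1")
ax.plot(df2["date2"], df2["average_base_spend"], marker="o", label="df2")

# Put one tick on every date that occurs in either frame
ax.set_xticks(mdates.date2num(xticks.to_pydatetime()))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m-%d"))
ax.tick_params(axis="x", rotation=90)
ax.grid(True)
ax.set_title("Average Basket", weight="bold", fontsize=14)
ax.legend()
```

With many data points the combined tick list can become crowded, which is the readability trade-off described in the first answer.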