How to make a time series plot and mark the 5 days with the highest sales in Python pandas?

Time:06-01

I have a pandas DataFrame like the one below:

  • date_col - in "datetime64" format

  • sales - in "int64" format

    date_col sales
    2019-01-05 100
    2019-03-20 500
    2019-04-28 290
    ... ...

I need to create a time series plot and mark, in a separate color, the 5 days with the highest sales.

Currently I have the following code:

df['sales'].plot(linewidth=1.5,
                 grid = True,
                 marker="o",
                 linestyle="-",
                 markersize=4,
                 label="Daily sales",
                 color = "steelblue")
plt.xlabel("date")
plt.ylabel("sales")
plt.legend()
plt.show()

It gives this result: [image]

As the final result, I need something like this:

  • vertical lines representing 5 days with the highest sales
  • date in year month day format for those 5 days with the highest sales

How can I do that in Python? How can I modify my code, or is there another way to do it?

@mrCopiCat I used your code and got the result below. Why? [image]

CodePudding user response:

You can combine matplotlib's ax.vlines with ax.annotate. Here is an example that marks the 5 maxima. I used plain integers for the dates to keep the example simple, but it should work with your datetime values too:

import numpy as np
import pandas as pd
import random
import matplotlib.pyplot as plt

# init figure and axis
fig = plt.figure()
ax = fig.add_subplot(111)
ax.set(title="maximum sales ...")

# sample df
data = {'date_col': list(range(20)), 'sales': [random.randint(1, 30) for _ in range(20)]}
df = pd.DataFrame.from_dict(data).set_index('date_col')

# plotting the df
df.plot(ax=ax)

# adding the lines
num_max = 5 # change this if you want more or less points

# iterrows() yields (index, row) pairs; take the scalar sales value from each row
for xmax, row in df.sort_values(by='sales', ascending=False).iloc[:num_max].iterrows():
    ymax = row['sales']
    print(xmax, ymax)
    ax.vlines(xmax, 0, ymax, color="tab:red")
    ax.annotate(f'{xmax}', xy=(xmax, ymax), xytext=(xmax, ymax + 1), color="tab:red")

# setting size limit and plotting
ax.set_ylim(0,40) # change or remove that too
plt.show()

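For real datetime values, a minimal sketch of the same idea could look like the code below. It assumes the column names date_col and sales from the question. The helper mark_top_sales, its parameters, and the random sample data are made up for illustration. It uses Series.nlargest to pick the top days and strftime to print each date in year-month-day format.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def mark_top_sales(df, n=5, date_col="date_col", value_col="sales"):
    """Plot daily sales and highlight the n days with the highest sales.

    Hypothetical helper; column names default to those in the question.
    """
    series = df.set_index(date_col)[value_col].sort_index()
    top = series.nlargest(n)  # n largest values, indexed by their dates

    fig, ax = plt.subplots(figsize=(10, 4))
    ax.plot(series.index, series.values, linewidth=1.5, marker="o",
            markersize=4, color="steelblue", label="Daily sales")

    # One vertical line per top day, from y=0 up to that day's sales
    ax.vlines(top.index, 0, top.values, color="tab:red", label=f"Top {n} days")

    # Label each line with its date in year-month-day format
    for day, value in top.items():
        ax.annotate(day.strftime("%Y-%m-%d"), xy=(day, value),
                    xytext=(0, 5), textcoords="offset points",
                    ha="center", color="tab:red")

    ax.set_xlabel("date")
    ax.set_ylabel("sales")
    ax.grid(True)
    ax.legend()
    return ax, top


# Made-up sample data standing in for the real DataFrame
rng = np.random.default_rng(0)
dates = pd.date_range("2019-01-01", periods=120, freq="D")
df = pd.DataFrame({"date_col": dates,
                   "sales": rng.integers(100, 500, size=120)})

ax, top = mark_top_sales(df)
```

Call plt.show() afterwards to display the figure. If several of the top days are close together, the date labels may overlap; rotating them (rotation=90 in ax.annotate) is one way around that.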
