The dataframe is a snapshot of what I'm working with. What I want to achieve is a line chart of the monthly sales of the different brands. Months with '-' before the number indicates the months prior up to 12 months prior and months without '-' before the number indicates months after also up to 12 months. On the line chart, all the '-' months(prior months) will be on the left and the months after will be on the right with 0 as center. Count_Toy_Brand_A is the count of different Toy_Brands at company A whereas Count_Toy_Brand_B is the count of different TOY_Brands at company B. In total I'm going to end up with three different line chart for A,B,C with two lines each labelled Company A and Company B. I was wondering if this can be done in pandas. Any help would be appreciated. Attached is a snapshot of the output I'm looking for.
import pandas as pd
import numpy as np
df = pd.DataFrame({
'toy_brand_A':['A', 'B', 'C'],
'count_toy_brand_A':[11, 5, 16],
'month-0_sales_A':[50, 12, 6],
'month-1_sales_A':[8,7, 6],
'month-2_sales_A':[8,7, 6],
'month1_sales_A':[20, 40, 18],
'month2_sales_A':[8,7, 6],
'count_toy_brand_B':[15, 25, 4],
'month-0_sales_B':[10, 8, 61],
'month-1_sales_B':[3,5, 90],
'month-2_sales_B':[4,7, 1],
'month1_sales_B':[30, 20, 18],
'month2_sales_B':[8,7, 6],
})
CodePudding user response:
Try to reformat the data first:
# Removes unnecessary columns and sets new indices.
cols = df.columns[~df.columns.str.contains('count')]
df = df[cols]
df.rename(columns={'toy_brand_A': 'toy_brand'}, inplace=True)
df.set_index('toy_brand', inplace=True)
df = df.T
df['company'] = 0
df.loc[df.index.str.contains('_A'), 'company'] = 'A'
df.loc[df.index.str.contains('_B'), 'company'] = 'B'
df.index = df.index.str.replace('_A', '')
df.index = df.index.str.replace('_B', '')
Output:
toy_brand A B C company
month-0_sales 50 12 6 A
month-1_sales 8 7 6 A
month-2_sales 8 7 6 A
month1_sales 20 40 18 A
month2_sales 8 7 6 A
month-0_sales 10 8 61 B
Plot:
import matplotlib.pyplot as plt
import seaborn as sns
from matplotlib import pyplot as plt
plt.style.use('ggplot')
plt.figure(figsize=(10, 8))
for i, brand in enumerate(['A', 'B', 'C']):
ax = plt.subplot(2, 2, i 1)
data = df[[brand, 'company']].reset_index()
s = sns.lineplot(data=data, x='index', y=brand, hue='company', ax=ax)
s.set_xticklabels(a.index, rotation=45, ha='right')
s.set_xlabel('months', fontsize=20)
s.set_ylabel(f'Sales brand {brand}', fontsize=20)
s.set_title('My Title')
plt.tight_layout()
plt.show()