Bar plot for multidimensional columns using pandas

Time:04-21

I want to plot my dataframe (df) as a single bar plot grouped by the time columns, where each bar shows the value_counts() result for one letter that appears in that column.

Expected output

(image: expected output)

date,00:00:00,01:00:00,02:00:00,03:00:00,04:00:00
2002-02-01,Y,Y,U,N,N
2002-02-02,U,N,N,N,N
2002-02-03,N,N,N,N,N
2002-02-04,N,N,N,N,N
2002-02-05,N,N,N,N,N

For a single time column, I can do this:

import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv('df.csv')
# count how often each letter occurs in one time column
counts = df['04:00:00'].value_counts()
counts.plot(kind='bar')
plt.show()

How can I plot all the columns on the same bar plot, as shown in the expected output?

CodePudding user response:

One possible solution is:

pd.DataFrame({t: df[t].value_counts() for t in df.columns if t != "date"}).T.plot.bar()
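The one-liner above can be unpacked into steps, which makes the intermediate table easier to inspect. This is only a sketch: it assumes the sample data from the question is loaded from a string, and the names csv_text, counts and per_hour, the fillna(0) step and the axis labels are additions for illustration, not part of the original answer.

```python
from io import StringIO

import pandas as pd

# Sample data copied from the question.
csv_text = """date,00:00:00,01:00:00,02:00:00,03:00:00,04:00:00
2002-02-01,Y,Y,U,N,N
2002-02-02,U,N,N,N,N
2002-02-03,N,N,N,N,N
2002-02-04,N,N,N,N,N
2002-02-05,N,N,N,N,N"""

df = pd.read_csv(StringIO(csv_text))

# One value_counts() Series per time column; pandas aligns them on the letters.
counts = pd.DataFrame(
    {col: df[col].value_counts() for col in df.columns if col != "date"}
)

# A letter that never occurs in a column shows up as NaN; treat it as zero.
counts = counts.fillna(0).astype(int)

# Transpose so that each row is a time column and each column is a letter.
per_hour = counts.T

# One group of bars per time column, one bar per letter.
ax = per_hour.plot.bar(rot=0)
ax.set_xlabel("time")
ax.set_ylabel("count")
```

Calling plt.show() from matplotlib.pyplot afterwards displays the figure when the code is run as a script.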

CodePudding user response:

Here is an approach via seaborn's catplot:

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
from io import StringIO

df_str = '''date,00:00:00,01:00:00,02:00:00,03:00:00,04:00:00
2002-02-01,Y,Y,U,N,N
2002-02-02,U,N,N,N,N
2002-02-03,N,N,N,N,N
2002-02-04,N,N,N,N,N
2002-02-05,N,N,N,N,N'''

df = pd.read_csv(StringIO(df_str))
df_long = df.set_index('date').melt(var_name='hour', value_name='kind')
g = sns.catplot(kind='count', data=df_long, x='kind', palette='mako',
                col='hour', col_wrap=5, height=3, aspect=0.5)
for ax in g.axes.flat:
    ax.set_xlabel(ax.get_title())  # use the title as xlabel
    ax.grid(True, axis='y')
    ax.set_title('')
    if len(ax.get_ylabel()) == 0:
        sns.despine(ax=ax, left=True)  # remove left axis for interior subplots
        ax.tick_params(axis='y', size=0)
plt.tight_layout()
plt.show()

sns.catplot with counts
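To check the numbers behind the melted table without drawing anything, the long-format data can be cross-tabulated with pandas. This is a rough sketch rather than part of the answer above: it reuses the question's sample data, and the names csv_text and table are made up for the example.

```python
from io import StringIO

import pandas as pd

# Sample data copied from the question.
csv_text = """date,00:00:00,01:00:00,02:00:00,03:00:00,04:00:00
2002-02-01,Y,Y,U,N,N
2002-02-02,U,N,N,N,N
2002-02-03,N,N,N,N,N
2002-02-04,N,N,N,N,N
2002-02-05,N,N,N,N,N"""

df = pd.read_csv(StringIO(csv_text))

# Long format: one row per (hour, kind) observation, 5 dates x 5 hours.
df_long = df.set_index("date").melt(var_name="hour", value_name="kind")

# Rows are hours, columns are letters, cells are counts (0 where a letter is absent).
table = pd.crosstab(df_long["hour"], df_long["kind"])
print(table)
```

If a single set of axes is preferred over one subplot per hour, table.plot.bar() should draw the same counts as grouped bars.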
