Group a dataframe by a specific id and then plot it against another column

Time:02-11

I have the dataframe shown below, and I have grouped its columns by "specific_id". However, I need to plot the data against the "time" column (ideally with the time sorted as well).

Here is the dataframe I have,

import pandas as pd
import numpy as np
df = pd.DataFrame()
df['time'] = ['2019-01-07 09:38:30', '2020-01-08 09:38:30', '2021-01-07 09:38:30', 
'2020-01-07 09:38:30']
df['specific_id'] = ['d', 'd', 'f', 'f']
df['c1'] = [2, 3,7, 5]
df['c2'] = [0, 5, 10, 3]

df

I grouped the dataframe with the following code:

df_sticked = df.filter(regex=r'c\d+', axis=1) \
.groupby(df['specific_id']).apply(np.ravel).apply(pd.Series) \
.rename(lambda x: f"c{x + 2}", axis=1).reset_index().fillna(0)
df_sticked

However, when I plot the data, I cannot get the time to show on the x-axis.

import matplotlib.pyplot as plt
%matplotlib inline
dfforplot = df_sticked.iloc[:, 1:-1]
dfforplot.T.plot(figsize=(20,11), legend=False)
plt.show()

Could you please help me? Thanks

CodePudding user response:

I simplified the steps so that the values are plotted against time:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

df = pd.DataFrame()
df['time'] = ['2019-01-07 09:38:30', '2020-01-08 09:38:30', '2021-01-07 09:38:30', 
'2020-01-07 09:38:30']
df['specific_id'] = ['d', 'd', 'f', 'f']
df['c1'] = [2, 3,7, 5]
df['c2'] = [0, 5, 10, 3]

df['c'] = df.filter(regex=r'c\d+', axis=1).values.tolist()
df = df[['specific_id', 'time', 'c']]
df = df.explode('c')
items = [x for _, x in df.groupby('specific_id')]

for item in items:
    plt.plot(item['time'].values,
             item['c'].values,
             label=item['specific_id'].iloc[0])
plt.legend(loc="upper left")
plt.show()
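
If the time also needs to be sorted, one possible variation is sketched below. It is not the code above but an alternative under a few assumptions: the "time" strings are parsed with pd.to_datetime so that matplotlib draws a real date axis, the value columns are reshaped with melt instead of explode, and each (specific_id, column) pair gets its own line. The names long_df, "column" and "value" are made up for this illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    'time': ['2019-01-07 09:38:30', '2020-01-08 09:38:30',
             '2021-01-07 09:38:30', '2020-01-07 09:38:30'],
    'specific_id': ['d', 'd', 'f', 'f'],
    'c1': [2, 3, 7, 5],
    'c2': [0, 5, 10, 3],
})

# Parse the strings into timestamps so the x-axis is a date axis,
# not a set of unordered text labels.
df['time'] = pd.to_datetime(df['time'])

# Long format: one row per (time, specific_id, column) value.
# 'column' and 'value' are illustrative names, not from the question.
long_df = df.melt(id_vars=['time', 'specific_id'],
                  var_name='column', value_name='value')

# Sort so every line is drawn in chronological order.
long_df = long_df.sort_values(['specific_id', 'column', 'time'])

fig, ax = plt.subplots(figsize=(10, 5))
for (sid, col), grp in long_df.groupby(['specific_id', 'column']):
    ax.plot(grp['time'], grp['value'], marker='o', label=f"{sid} / {col}")

ax.set_xlabel('time')
ax.set_ylabel('value')
ax.legend(loc='upper left')
fig.autofmt_xdate()  # rotate the date labels so they do not overlap
plt.show()
```

Sorting before plotting matters here because the rows for id "f" are stored out of order (2021 before 2020); without the sort the line would double back on itself.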