I have a big data.frame with roughly 100 columns and try to plot all the time-series in one graph. Is there an easy way to deal with it, without specifying every y-axis manually?
This would be a simple example with these time-series: 02K W, 03K W, and 04K W:
import pandas as pd
import matplotlib.pyplot as plt
df1 = pd.DataFrame({
'Date':['2021-01-01', '2021-01-02', '2021-01-03', '2021-01-04', '2021-01-05'],
'index':[0, 1, 2, 3, 4],
'02K W':[3.5, 0.1, 3, 'nan', 0.2],
'03K W':[4.2, 5.2, 2.5, 3.0, 0.6],
'04K W':[1.5, 2.6, 8.2, 4.2, 5.3]})
df1['Date'] = pd.to_datetime(df1['Date'])
df1 = df1.set_index('index')
So far, I manually specify all y-axis to plot the individual time-series.
plt.plot(df1['Date'], df1['02K W'])
plt.plot(df1['Date'], df1['03K W'])
plt.plot(df1['Date'], df1['04K W'])
Is there a more elegant way to specify the relevant columns for the plot? Thank you very much for your suggestions :)
CodePudding user response:
import pandas as pd
import matplotlib.pyplot as plt
df1 = pd.DataFrame({
'Date':['2021-01-01', '2021-01-02', '2021-01-03', '2021-01-04', '2021-01-05'],
'index':[0, 1, 2, 3, 4],
'02K W':[3.5, 0.1, 3, 'nan', 0.2],
'03K W':[4.2, 5.2, 2.5, 3.0, 0.6],
'04K W':[1.5, 2.6, 8.2, 4.2, 5.3]})
df1['Date'] = pd.to_datetime(df1['Date'])
df1 = df1.set_index('index')
for col in df1.colums[1:]:
plt.plot(df1['Date'], df1[col])
CodePudding user response:
You can melt
your columns and use seaborn.lineplot
:
import seaborn as sns
sns.lineplot(data=df1.replace('nan', float('nan')).melt(id_vars=['Date']),
x='Date', y='value', hue='variable'
)
output:
CodePudding user response:
Is there a more elegant way to specify the relevant columns for the plot?
Use DataFrame.plot
with Date
as the index and filter by the desired columns
:
columns = ['02K W', '03K W', '04K W']
df1.set_index('Date')[columns].plot()
Note that you have a string 'nan'
in your sample data. If this is true in your real data, you should convert it to a real np.nan
, e.g., with pd.to_numeric
or DataFrame.replace
.