PyPlot line plot changing color by column value

Time:07-15

I have a dataframe with a structure similar to the following example.

import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({'x': ['2008-01-01', '2008-01-02', '2008-01-03', '2008-01-04'], 'y': [1, 2, 3, 6],
                   'group_id': ['OBSERVED', 'IMPUTED', 'OBSERVED', 'IMPUTED'], 'color': ['blue', 'red', 'blue', 'red']})
df['x'] = pd.to_datetime(df['x'])

That is, some of the y values are observed and others are imputed:

           x  y  group_id color
0 2008-01-01  1  OBSERVED  blue
1 2008-01-02  2   IMPUTED   red
2 2008-01-03  3  OBSERVED  blue
3 2008-01-04  6   IMPUTED   red

How do I create a single line that changes color based on group_id? (The color column is uniquely determined by group_id, as in this example.)

I have tried the following two approaches (one of them is commented out):

df_grp = df.groupby('group_id')
fig, ax = plt.subplots(1)
for id, data in df_grp:
    #ax.plot(data['x'], data['y'], label=id, color=data['color'].unique().tolist()[0])
    data.plot('x', 'y', label=id, ax=ax)
plt.legend()
plt.show()

However, the plot is not

  1. a single line.
  2. colored correctly by each segment.


CodePudding user response:

The code below colors each segment by the row it starts from (forward-looking colors). The key is to get the data into the right shape in the dataframe so that the plotting is easy; you can print(df) after the manipulation to see what was done. For every row except the last, I added the x and y of the next row as additional columns. I also drew a marker in the resulting color at each point so that you can tell whether it is red or blue. Note that the dates in the x column should be in ascending order.

import matplotlib.pyplot as plt

# Add x_end and y_end columns holding the x and y of the next row
df['x_end'] = df['x'].shift(-1)
df['y_end'] = df['y'].shift(-1)

# The last row has no next row: reuse its own x and y so its marker gets the right color
df['x_end'] = df['x_end'].fillna(df['x'])
df['y_end'] = df['y_end'].fillna(df['y'])

# Draw one segment per row, colored by that row's color column
for row in df.itertuples(index=False):
    plt.plot([row.x, row.x_end], [row.y, row.y_end], marker='o', c=row.color)
plt.show()

[Figure: output plot]
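
As an alternative to drawing one plt.plot call per row, the same forward-looking coloring can probably be done with Matplotlib's LineCollection, which draws all segments as a single artist. This is only a sketch and not part of the answer above: the helper names make_example_df, build_segments and plot_colored_line are made up for illustration, and the legend is built from proxy Line2D handles on the assumption that one entry per group_id is wanted.

```python
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.collections import LineCollection
from matplotlib.lines import Line2D


def make_example_df():
    """Rebuild the example dataframe from the question."""
    df = pd.DataFrame({'x': ['2008-01-01', '2008-01-02', '2008-01-03', '2008-01-04'],
                       'y': [1, 2, 3, 6],
                       'group_id': ['OBSERVED', 'IMPUTED', 'OBSERVED', 'IMPUTED'],
                       'color': ['blue', 'red', 'blue', 'red']})
    df['x'] = pd.to_datetime(df['x'])
    return df


def build_segments(df):
    """Return an (n-1, 2, 2) array: one [start, end] pair of (x, y) points per segment."""
    # LineCollection needs numeric coordinates, so convert dates to Matplotlib date numbers.
    x = mdates.date2num(df['x'].to_numpy())
    y = df['y'].to_numpy(dtype=float)
    points = np.column_stack([x, y])
    return np.stack([points[:-1], points[1:]], axis=1)


def plot_colored_line(df, ax):
    """Draw one connected line on ax; each segment takes the color of its starting row."""
    segments = build_segments(df)
    colors = df['color'].iloc[:-1].tolist()
    collection = LineCollection(segments, colors=colors)
    ax.add_collection(collection)
    # Markers for every point, colored by the point's own row.
    ax.scatter(segments_x(df), df['y'], c=df['color'].tolist(), zorder=3)
    # One legend entry per group, built from proxy artists.
    groups = df.drop_duplicates('group_id')
    handles = [Line2D([], [], color=c, marker='o', label=g)
               for g, c in zip(groups['group_id'], groups['color'])]
    ax.legend(handles=handles)
    ax.xaxis_date()        # show the numeric x values as dates
    ax.autoscale_view()
    return collection


def segments_x(df):
    """Numeric x positions matching the coordinates used in build_segments."""
    return mdates.date2num(df['x'].to_numpy())


if __name__ == "__main__":
    fig, ax = plt.subplots()
    plot_colored_line(make_example_df(), ax)
    plt.show()
```

As in the loop-based version, the rows are assumed to be sorted by x in ascending order, and the last row contributes only a marker because no segment starts from it.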
