How to plot many lines from stacked dataframe column in one plot? [python]

Time:02-11

I have a dataframe that looks like this:

                timestamp       Value       Color
--------------------------------------------------
 0    2018-03-04 07:11:08          34         Red
 1    2018-03-04 07:11:09          34         Red
 2    2018-03-04 07:11:10          35         Red
 3    2018-03-04 07:11:12          36         Red
 4    2018-03-04 07:11:14          24         Red
 5    2018-03-04 07:11:15          34         Red
... 
55    2018-03-04 07:12:17          34        Blue
56    2018-03-04 07:12:18          35        Blue
57    2018-03-04 07:12:19          36        Blue
58    2018-03-04 07:12:20          37        Blue
59    2018-03-04 07:12:21          35        Blue
60    2018-03-04 07:12:22          32        Blue

Over the course of 60 seconds a value is recorded for each timestamp, but the rows are split between two colors, Red and Blue. The dataframe therefore holds two time series that occur one after the other and do not overlap. I want to plot them while ignoring the timestamps: treat each color as an array of ordered values, ignore any time skips, assume equally spaced intervals, and assume both curves start at the same time. I simply want the Red curve and the Blue curve on the same chart. How can this be done in Python? I am trying simply

plt.plot(Blue, Red)

Though I am not sure how to account for the x-axis, which I simply want to be seconds.

CodePudding user response:

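import numpy
import pandas
from matplotlib import pyplot

df = pandas.DataFrame({
    # the same ten timestamps, once per color ('15min' replaces the deprecated '15T' alias)
    'times': list(pandas.date_range('2020-01-01', periods=10, freq='15min'))
             + list(pandas.date_range('2020-01-01', periods=10, freq='15min')),
    'colors': ['red'] * 10 + ['blue'] * 10,
    'value': numpy.random.randint(0, 255, 20),
})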

gives us something like your dataframe

                 times colors  value
0  2020-01-01 00:00:00    red    224
1  2020-01-01 00:15:00    red     47
2  2020-01-01 00:30:00    red     25
3  2020-01-01 00:45:00    red    211
4  2020-01-01 01:00:00    red     18
5  2020-01-01 01:15:00    red    119
6  2020-01-01 01:30:00    red     52
7  2020-01-01 01:45:00    red    246
8  2020-01-01 02:00:00    red     54
9  2020-01-01 02:15:00    red    156
10 2020-01-01 00:00:00   blue     42
11 2020-01-01 00:15:00   blue     55
12 2020-01-01 00:30:00   blue    151
13 2020-01-01 00:45:00   blue    236
14 2020-01-01 01:00:00   blue    207
15 2020-01-01 01:15:00   blue    165
16 2020-01-01 01:30:00   blue    131
17 2020-01-01 01:45:00   blue    199
18 2020-01-01 02:00:00   blue    247
19 2020-01-01 02:15:00   blue     61

we can pivot this using

df2 = df.pivot(index='times', columns=['colors'], values=['value'])

which gives us (the numbers differ from the table above because the values are random)

                        value     
colors               blue  red
times                         
2020-01-01 00:00:00    70  225
2020-01-01 00:15:00   162   78
2020-01-01 00:30:00   188   37
2020-01-01 00:45:00   134  234
2020-01-01 01:00:00    46   73
2020-01-01 01:15:00    76   60
2020-01-01 01:30:00   143   61
2020-01-01 01:45:00   150  198
2020-01-01 02:00:00    82  159
2020-01-01 02:15:00   127   94

now we can easily just plot it...

df2.plot()
pyplot.show()

you can drop the value part of the column name with

df2 = df2.droplevel(0,axis=1)
df2.plot()
pyplot.show()

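Note that pivoting on times lines the curves up only because both colors share the same timestamps in this sample. In the question's data the colors were recorded one after the other, so a pivot on the timestamp would leave the two curves side by side rather than on top of each other. One way around that is to number the rows within each color and pivot on that number instead. This is only a sketch: the `position` column and the `to_wide` and `plot_wide` helpers are names made up here, and the sample timestamps are invented to mimic the question's layout.

```python
import pandas as pd

def to_wide(df):
    """Return one column per color, indexed by row position within that color."""
    out = df.copy()
    # cumcount numbers the rows of each group 0, 1, 2, ... in their existing order
    out['position'] = out.groupby('colors').cumcount()
    return out.pivot(index='position', columns='colors', values='value')

def plot_wide(wide):
    """Draw every column of the reshaped frame as its own line."""
    import matplotlib.pyplot as plt  # imported here so the reshaping works without it
    ax = wide.plot()
    ax.set_xlabel('sample number')
    plt.show()

# made-up sample: red is recorded first, blue afterwards, with no overlap in time
df = pd.DataFrame({
    'times': pd.to_datetime([
        '2018-03-04 07:11:08', '2018-03-04 07:11:09', '2018-03-04 07:11:10',
        '2018-03-04 07:12:17', '2018-03-04 07:12:18', '2018-03-04 07:12:19',
    ]),
    'colors': ['red'] * 3 + ['blue'] * 3,
    'value': [34, 34, 35, 34, 35, 36],
})
wide = to_wide(df)
```

Calling `plot_wide(wide)` then draws both curves starting at position 0. If the colors have different numbers of rows, the shorter column is padded with NaN, which the plot simply leaves out.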

The other option is to plot each color individually

BLUE = df[df['colors'] == 'blue']
RED = df[df['colors'] == 'red']
pyplot.plot(BLUE['times'],BLUE['value'])
pyplot.plot(RED['times'],RED['value'])
pyplot.show()

you could also use pandas groupby (probably don't do this one :P )

def plot_it(group):
    # apply passes each color's rows in as a single DataFrame
    pyplot.plot(group['times'], group['value'])
df.groupby('colors').apply(plot_it)
pyplot.show()

but really the "right" way to handle it is probably the first option (to pivot it to the shape you want)

---- Edit (based on comments) ----

if you don't want the timestamps and just want to treat each color as a list of y values, use range as your x

BLUE = df[df['colors'] == 'blue']
RED = df[df['colors'] == 'red']
pyplot.plot(range(len(BLUE)),BLUE['value'])
pyplot.plot(range(len(RED)),RED['value'])
pyplot.show()
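With more than two colors, the same thing can be written as a loop over the groups, which also makes it easy to give each line a legend label. A rough sketch, assuming the `colors` and `value` column names from above; `values_by_color` and `plot_by_color` are hypothetical helper names, and passing the label as the line color only works because 'red' and 'blue' happen to be valid matplotlib color names.

```python
import pandas as pd

def values_by_color(df):
    """Map each color to its values as a plain list, in row order."""
    return {color: group['value'].tolist()
            for color, group in df.groupby('colors')}

def plot_by_color(df):
    """Draw one labelled line per color against the sample position."""
    import matplotlib.pyplot as plt  # imported here so the grouping works without it
    for color, values in values_by_color(df).items():
        # x is just 0, 1, 2, ... so the timestamps are ignored entirely
        plt.plot(range(len(values)), values, label=color, color=color)
    plt.xlabel('sample number')
    plt.legend()
    plt.show()

# made-up sample with a different number of rows per color
df = pd.DataFrame({
    'colors': ['red'] * 3 + ['blue'] * 2,
    'value': [34, 34, 35, 34, 35],
})
series = values_by_color(df)
```

`plot_by_color(df)` then shows the chart. Since each line gets its own range, the colors do not need to have the same number of rows.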