How to groupby a column string and plot each subset and then save each plot as a PNG? [python]

Time:05-04

I have a dataframe with a column of string categories ("ID"), a timestamp column, and a values column. I want to group the dataframe by category and, for each subset, plot the values against the timestamps. I am using Python.

Here is the code I am using:

for ID, group in df.set_index('timestamp').groupby('ID'):
    group.plot(kind='line', y='Values', ylabel = 'Value')

This code groups the dataframe by each "ID" string and creates a separate plot for each subset. Great. I want to save each of the produced plots as a PNG file, and I see that to do this I need to use .savefig(). However, when I try to run:

for ID, group in df.set_index('timestamp').groupby('ID'):
    group.plot(kind='line', y='Values', ylabel = 'Value')
    plt.savefig('Path/to/my/folder/test.png')

or

for ID, group in df.set_index('timestamp').groupby('ID'):
    group.plot(kind='line', y='Values', ylabel = 'Value')
plt.savefig('Path/to/my/folder/test.png')

there is only one 'test.png' file produced, and it is for the last plot in the grouped for loop.

Intuitively, I thought that to stay within the grouped loop (sorry, I am not sure what else to call this) I should try running:

for ID, group in df.set_index('timestamp').groupby('ID'):
    group.plot(kind='line', y='Values', ylabel = 'Value').savefig('Path/to/my/folder/test.png')

but just tacking on ".savefig()" to the end of the "group.plot()" line did not work and just returned:

AttributeError: 'AxesSubplot' object has no attribute 'savefig'

How can I augment my code so that I can loop through each string category in my dataframe's "ID" column, produce a plot for each subset, and then save it as a .PNG file?

CodePudding user response:

Save each file with a different name. You're currently overwriting "test.png" at every iteration:

import matplotlib.pyplot as plt

for ID, group in df.set_index('timestamp').groupby('ID'):
    group.plot(kind='line', y='Values', ylabel='Value')
    # Put the ID in the file name so each plot gets its own file
    plt.savefig(f'Path/to/my/folder/test_{ID}.png')
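
A fuller sketch of the same idea, assuming the dataframe has the 'ID', 'timestamp', and 'Values' columns from the question. The function name save_group_plots, the sample data, and the temporary output folder are made up for illustration. It also shows one way around the AttributeError: DataFrame.plot returns a matplotlib Axes, which has no savefig(), but the Figure that owns it does.

```python
import tempfile
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd


def save_group_plots(df, out_dir):
    """Save one line plot per ID into out_dir and return the file paths.

    Hypothetical helper, not part of pandas or matplotlib.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for ID, group in df.set_index('timestamp').groupby('ID'):
        # plot() returns an Axes; saving is done through its parent Figure
        ax = group.plot(kind='line', y='Values', ylabel='Value', title=str(ID))
        fig = ax.get_figure()
        path = out_dir / f'test_{ID}.png'  # unique name per group
        fig.savefig(path)
        plt.close(fig)  # close the figure so open figures do not accumulate
        saved.append(path)
    return saved


# Made-up sample data, only for illustration
df = pd.DataFrame({
    'ID': ['a', 'a', 'b', 'b'],
    'timestamp': pd.to_datetime(
        ['2022-01-01', '2022-01-02', '2022-01-01', '2022-01-02']
    ),
    'Values': [1.0, 2.0, 3.0, 4.0],
})
paths = save_group_plots(df, tempfile.mkdtemp())
```

If the ID strings can contain characters that are not valid in file names (slashes, for example), they would need to be cleaned up before being put into the path.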