Home > Blockchain >  Is it possible to create a line graph in pptx-python from a dataframe?
Is it possible to create a line graph in pptx-python from a dataframe?

Time:04-12

I was wondering if it's possible to take a dataframe that you created in python and have the values automatically populate in a PowerPoint line graph using the pptx-python package? I've looked at all of the examples that I could find and it seems you have to manually input the values that you want (see code below).

chart_data = ChartData()
chart_data.categories = ['Q1 Sales', 'Q2 Sales', 'Q3 Sales']
chart_data.add_series('West',    (32.2, 28.4, 34.7))
chart_data.add_series('East',    (24.3, 30.6, 20.2))
chart_data.add_series('Midwest', (20.4, 18.3, 26.2))

x, y, cx, cy = Inches(2), Inches(2), Inches(6), Inches(4.5)
chart = slide.shapes.add_chart(
    XL_CHART_TYPE.LINE, x, y, cx, cy, chart_data
).chart

chart.has_legend = True
chart.legend.include_in_layout = False
chart.series[0].smooth = True

Is possible to automatically transfer what I have in my dataframe into the chart_data.categories and chart_data.add_series() sections so I don't have to type it all out? Or is it always manual entry?

CodePudding user response:

Sure it's possible. See how you set chart_data.categories to a list? That can be any list you want, so all you need is to figure out how to get that list out of your dataframe. Similarly for each series and its data -- how can you get the series name and the data therein out of your dataframe? There's no way for us to tell, since you forgot to include your dataframe in your question.

Suppose your dataframe looks like so:

         Q1 Sales  Q2 Sales  Q3 Sales
West         32.2      28.4      34.7
East         24.3      30.6      20.2
Midwest      20.4      18.3      26.2

You can get the list of column names with

>>> df.columns.to_list()
['Q1 Sales', 'Q2 Sales', 'Q3 Sales']

You can get the row names with

>>> df.index.to_list()
['West', 'East', 'Midwest']

And for a given index in the dataframe, you can get the values of all columns with:

>>> index = "West"
>>> df.loc[index, :].to_list()
[32.2, 28.4, 34.7]

So, you can do:

chart_data.categories = df.columns.to_list()
for serie in df.index: # Didn't need to convert to list here because a for loop iterates over any iterable
    chart_data.add_series(serie, df.loc[serie, :].to_list())

If add_series needs its second argument to be a tuple, you can convert the row to a tuple instead of a list:

for serie in df.index: 
    chart_data.add_series(serie, tuple(df.loc[serie, :]))
  • Related