How can I plot two lines in the same graph from Excel using Pandas and Matplotlib?-CodePudding

I want to read an Excel file, sum the values for the years 2021, 2020 and 2019 for the locations from the same region (region B) and then create a graph with two lines (for region A and B) which will show how the values for both regions have changed during the years.

I tried with this code:

import numpy
import pandas as pd
import matplotlib.pyplot as plt

excel_file_path = "Testfile.xlsx"
df = pd.read_excel(excel_file_path)

df_region = df.groupby(['Region']).sum()

x = ["2021", "2020", "2019"]
y1 = df_region["Values2021"]
y2 = df_region["Values2020"]
y3 = df_region["Values2019"]

fig = plt.figure(figsize=(20,5))

plt.plot(x, y1, color = 'red', label = "A")
plt.plot(x, y2, color = 'blue', label = "B")

plt.legend(loc='best')

plt.show()

But it isn't working for me - I get the following error:

"Exception has occurred: ValueError x and y must have same first dimension, but have shapes (3,) and (2,)"

And this is my Excel file:

Location    Region    Values2021   Values2020    Values2019
Location1   A         720,00       680,00        554,00
Location2   B         340,00       360,00        389,00
Location3   B         320,00       230,00        287,00

What do I need to do to get the result that I want? Any help would be greatly appreciated.

CodePudding user response：

I think you meant to define your y1 and y2 a little differently.

x = ["2021", "2020", "2019"]

fig = plt.figure(figsize=(20,5))
colors = ['red', 'blue']

for i, region in enumerate(df_region.index):
    y = df_region.loc[region, :]
    plt.plot(x, y, color = colors[i], label = region)

plt.legend(loc='best')

plt.show()

which plots AND reads the region names from the DataFrame.