I have the following code:
import pandas as pd
import matplotlib as mlp
import matplotlib.pyplot as plt
import csv
df = pd.read_csv (r'C:\Users\User\Desktop\Dataset2.csv', index_col=0)
print (df)
dataframe1 = df.sort_values('ums', ascending = False)
fig = plt.figure(figsize=(20,5))
ax1 = fig.add_subplot(1,2,1)
ax2 = fig.add_subplot(1,2,2)
ax1.bar(dataframe1.index, dataframe1['ums'])
ax1.set_xticklabels(dataframe1.index, rotation=60, horizontalalignment = 'right', fontsize = '12')
ax1.set_title('Title', fontsize = '22')
ax1.set_ylabel('Text')
plt.show()
It should read the .csv file named "Dataset2" but every time I execute the code I keep getting "Exception has occurred: KeyError 'ums' File "C:\Users\User\Desktop\datafile2.py", line 10, in dataframe1 = df.sort_values('ums', ascending = False)".
The column in my .csv file has exactly the same name. Here is what the first lines of my file look like:
nr port country ums
1 Port1 Australia 47.03
2 Port2 USA 37.47
What can I do to fix this? Any help is appreciated.
CodePudding user response:
I can't reproduce your issue.
This is my code; I just simplified your matplotlib code:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('Dataset2.csv')
print(df)
dataframe1 = df.sort_values('ums', ascending=False)
names = dataframe1['port']
values = dataframe1['ums']
plt.figure(figsize=(20, 5))
plt.plot(names, values)
plt.show()
Maybe it is because you are using a raw string when defining the dataset path?
Try removing the r (with a path that contains no backslashes, as here)
and the index_col, since by default pandas read_csv
treats the first row as the header:
df = pd.read_csv('Dataset2.csv')
this is the result from my code
CodePudding user response:
CSV files usually use a comma (,)
as the default separator between fields, so the content of your file should look like:
nr,port,country,ums
1,Port1,Australia,47.03
2,Port2,USA,37.47
Or specify the separator explicitly as commented by @wwii:
pd.read_csv(r'C:\Users\User\Desktop\Dataset2.csv', index_col=0, sep=r'\s+')
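To see why the separator matters, here is a minimal sketch. It assumes the file really is whitespace-separated, as the lines shown in the question suggest, and uses an in-memory string (via io.StringIO) as a stand-in for Dataset2.csv. With the default comma separator the whole header line ends up as a single column name, so 'ums' cannot be found; splitting on runs of whitespace gives the expected columns.

```python
import io

import pandas as pd

# In-memory stand-in for Dataset2.csv (assumption: the real file is
# whitespace-separated, exactly like the lines shown in the question).
raw = "nr port country ums\n1 Port1 Australia 47.03\n2 Port2 USA 37.47\n"

# Default separator is a comma: the header is read as one single column name.
df_default = pd.read_csv(io.StringIO(raw))
print(df_default.columns.tolist())

# Split on runs of whitespace instead; 'nr' becomes the index.
df_ws = pd.read_csv(io.StringIO(raw), index_col=0, sep=r"\s+")
print(df_ws.columns.tolist())

# Now the sort from the question works.
sorted_df = df_ws.sort_values("ums", ascending=False)
print(sorted_df)
```

If the first print shows one long column name instead of four separate ones, the separator is the likely cause of the KeyError.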
CodePudding user response:
Have you tried specifying the column with the by keyword, as follows:
df.sort_values(by=['ums'], ascending=False)
Additionally, if that doesn't work, try removing the index_col=0
to see if that makes a difference.
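As a quick check of both suggestions, here is a small sketch. It assumes a comma-separated copy of the data from the question, held in an in-memory string rather than the real file. It prints the column names pandas actually parsed with and without index_col=0, and compares the positional and by= forms of sort_values.

```python
import io

import pandas as pd

# Comma-separated stand-in for the file (assumption: same data as in the question).
raw = "nr,port,country,ums\n1,Port1,Australia,47.03\n2,Port2,USA,37.47\n"

with_index = pd.read_csv(io.StringIO(raw), index_col=0)
without_index = pd.read_csv(io.StringIO(raw))

# index_col=0 moves 'nr' into the index; 'ums' stays a regular column either way.
print(with_index.columns.tolist())
print(without_index.columns.tolist())

# The positional and keyword forms of sort_values select the same column.
by_position = with_index.sort_values("ums", ascending=False)
by_keyword = with_index.sort_values(by=["ums"], ascending=False)
print(by_position.equals(by_keyword))
```

If 'ums' does not appear in the printed column list for your own file, the problem is in how the file is parsed rather than in the sort_values call.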