I'm working on an assignment and was given a file called 'population.csv'.
The file lists the population of each country by year. Using pandas, I want to get the 20 largest populations for a given column (year).
import pandas as pd
df = pd.read_csv('population.csv', sep='\t')
print(df.nlargest(20,columns='2018'))
I am getting the following KeyError:
Traceback (most recent call last):
File "C:\Users\nacho\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\pandas\core\indexes\base.py", line 3621, in get_loc
return self._engine.get_loc(casted_key)
File "pandas\_libs\index.pyx", line 136, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\index.pyx", line 163, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\hashtable_class_helper.pxi", line 5198, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas\_libs\hashtable_class_helper.pxi", line 5206, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: '2018'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "c:\Users\nacho\Desktop\Personal\Fordham\Senior\Spring 2022\CompSci\Labs\Lab 8\lab8.py", line 7, in <module>
print(df.nlargest(5,columns='2018'))
File "C:\Users\nacho\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\pandas\core\frame.py", line 6684, in nlargest
return algorithms.SelectNFrame(self, n=n, keep=keep, columns=columns).nlargest()
File "C:\Users\nacho\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\pandas\core\algorithms.py", line 1137, in nlargest
return self.compute("nlargest")
File "C:\Users\nacho\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\pandas\core\algorithms.py", line 1274, in compute
dtype = frame[column].dtype
File "C:\Users\nacho\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\pandas\core\frame.py", line 3505, in __getitem__
indexer = self.columns.get_loc(key)
File "C:\Users\nacho\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\pandas\core\indexes\base.py", line 3623, in get_loc
raise KeyError(key) from err
KeyError: '2018'
CodePudding user response:
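Never mind. The file is actually separated by commas. My professor said it was separated by \t when it wasn't. With sep='\t', pandas found no tabs, so it read each line as a single column, and no column named '2018' existed.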
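To illustrate the separator problem, here is a minimal sketch. It assumes a small made-up sample in place of the real population.csv (the country names and numbers are hypothetical) and reads it from memory with io.StringIO. Printing df.columns is a quick way to check whether the file was split the way you expected.

```python
import io
import pandas as pd

# Hypothetical sample standing in for population.csv (values are made up).
CSV_TEXT = """Country,2017,2018
A,100,110
B,300,290
C,200,250
"""

# Wrong separator: the data contains no tabs, so each line becomes one column.
wrong = pd.read_csv(io.StringIO(CSV_TEXT), sep="\t")
print(list(wrong.columns))  # one column whose name is the whole header line

# Correct separator (a comma is also the default for read_csv).
right = pd.read_csv(io.StringIO(CSV_TEXT), sep=",")
print(list(right.columns))

# With the columns parsed properly, nlargest can find the '2018' column.
top2 = right.nlargest(2, columns="2018")
print(top2)
```

If the column list shows a single long name containing commas, the separator passed to read_csv is probably wrong.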
CodePudding user response:
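You can select the 20 rows with the largest values in a specific year column by sorting on that column first and then taking the first 20 rows:
top20 = df.sort_values('2018', ascending=False).head(20)
print("Top 20 rows of the DataFrame by the 2018 column: ")
print(top20)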
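If you need this for several years, one option is a small helper. This is only a sketch: top_n_by_year is a hypothetical name, not a pandas function, and the sample DataFrame is made up. It checks that the column exists first, so a wrong separator or a mistyped year produces a message that lists the columns pandas actually found.

```python
import pandas as pd

def top_n_by_year(df: pd.DataFrame, year: str, n: int = 20) -> pd.DataFrame:
    """Return the n rows with the largest values in the given year column."""
    if year not in df.columns:
        # Listing the available columns makes separator mistakes easy to spot.
        raise KeyError(
            f"Column {year!r} not found; available columns: {list(df.columns)}"
        )
    # Sort descending by the year column, then keep the first n rows.
    return df.sort_values(year, ascending=False).head(n)

# Hypothetical sample data (values are made up).
df = pd.DataFrame(
    {"Country": ["A", "B", "C", "D"], "2018": [110, 290, 250, 50]}
)
top3 = top_n_by_year(df, "2018", n=3)
print(top3)
```

For a numeric column, df.nlargest(n, columns=year) selects the same rows and could be used in place of the sort.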