The following is my CSV file, named runs.csv:
PLAYER NAME, TEST 1, TEST 2, TEST 3, TEST 4, TEST 5
Sachin Tendulkar, 167, 134, 108, 100, 89
Rohit Sharma, 147, 78, 101, 36, 23
Mayank Aggarwal, 230, 143, 67, 90, 21
Virendar Sehwag, 75, 44, 12, 8, 98
M.S. Dhoni, 176, 234, 106, 86, 33
Yuvraj Singh, 445, 239, 123, 215, 67
KL Rahul, 290, 128, 76, 111, 336
Kapil Dev, 104, 87, 65, 90, 200
Sunil Gavaskar, 202, 103, 65, 21, 460
K. Srikanth, 222, 110, 97, 34, 02
Mahendar Amarnath, 12, 43, 87, 267, 341
Ajinkya Rahane, 123, 38, 01, 17, 66
The following is my program, written in a Jupyter notebook:
import pandas as pd
cols = ["PLAYER NAME", "TEST 4"]
runs = pd.read_csv("runs.csv")
print(runs[cols])
I am getting the following error:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-28-1eb63cfedd09> in <module>
2 cols = ["PLAYER NAME", "TEST 4"]
3 runs = pd.read_csv("runs.csv")
----> 4 print(runs[cols])
~\anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
2906 if is_iterator(key):
2907 key = list(key)
-> 2908 indexer = self.loc._get_listlike_indexer(key, axis=1, raise_missing=True)[1]
2909
2910 # take() does not accept boolean indexers
~\anaconda3\lib\site-packages\pandas\core\indexing.py in _get_listlike_indexer(self, key, axis, raise_missing)
1252 keyarr, indexer, new_indexer = ax._reindex_non_unique(keyarr)
1253
-> 1254 self._validate_read_indexer(keyarr, indexer, axis, raise_missing=raise_missing)
1255 return keyarr, indexer
1256
~\anaconda3\lib\site-packages\pandas\core\indexing.py in _validate_read_indexer(self, key, indexer, axis, raise_missing)
1302 if raise_missing:
1303 not_found = list(set(key) - set(ax))
-> 1304 raise KeyError(f"{not_found} not in index")
1305
1306 # we skip the warning on Categorical
KeyError: "['TEST 4'] not in index"
I don't know what the problem is with my code. I think it's written correctly, but I'm still getting an error. Please help me resolve this. Thanks in advance.
CodePudding user response:
If you check the column names:
runs.columns
You'll notice that there's a leading space in front of each of your 'TEST' column names, because the header row in the CSV has a space after every comma:
Index(['PLAYER NAME', ' TEST 1', ' TEST 2', ' TEST 3', ' TEST 4', ' TEST 5'], dtype='object')
You'll need to either clear the white space by applying the strip() string method:
runs.columns = runs.columns.str.strip()
or change your cols variable to:
cols = ["PLAYER NAME", " TEST 4"]
to match the column name.
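As a minimal sketch of the diagnosis and the strip() fix, the snippet below reads a shortened copy of the file from memory instead of from disk. The `csv_text`, `raw_columns`, and `selected` names are only for illustration and are not part of the original program.

```python
import io
import pandas as pd

# A shortened in-memory copy of runs.csv (header plus the first two rows).
csv_text = (
    "PLAYER NAME, TEST 1, TEST 2, TEST 3, TEST 4, TEST 5\n"
    "Sachin Tendulkar, 167, 134, 108, 100, 89\n"
    "Rohit Sharma, 147, 78, 101, 36, 23\n"
)

runs = pd.read_csv(io.StringIO(csv_text))

# The parsed names keep the space that follows each comma in the header.
raw_columns = list(runs.columns)
print(raw_columns)

# Remove leading and trailing whitespace from every column name.
runs.columns = runs.columns.str.strip()

# The original selection now finds both columns.
selected = runs[["PLAYER NAME", "TEST 4"]]
print(selected)
```

Stripping the names once, right after reading the file, means the rest of the notebook can use the names as they appear in the CSV header.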
CodePudding user response:
You can either do this:
runs = pd.read_csv("runs.csv", sep=r",\s*", engine="python")
Or this:
runs.columns = runs.columns.str.strip()
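A third option, not mentioned above, is the `skipinitialspace` parameter of `read_csv`, which drops the spaces that follow each delimiter while the file is being parsed. The sketch below compares it with the regex separator on a shortened in-memory copy of the file; `csv_text`, `runs_skip`, and `runs_regex` are illustrative names.

```python
import io
import pandas as pd

# A shortened in-memory copy of runs.csv (header plus the first two rows).
csv_text = (
    "PLAYER NAME, TEST 1, TEST 2, TEST 3, TEST 4, TEST 5\n"
    "Sachin Tendulkar, 167, 134, 108, 100, 89\n"
    "Rohit Sharma, 147, 78, 101, 36, 23\n"
)

# Option A: ignore spaces that come right after each comma.
runs_skip = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)

# Option B: treat "comma plus any whitespace" as the separator.
# A regex separator needs the slower Python parsing engine.
runs_regex = pd.read_csv(io.StringIO(csv_text), sep=r",\s*", engine="python")

cols = ["PLAYER NAME", "TEST 4"]
print(runs_skip[cols])
print(runs_regex[cols])
```

Both options clean the header and the data cells in one step. `skipinitialspace=True` keeps the default C parser, so it is likely the better choice for large files.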