This is a pandas question.
Try running this in a Jupyter Notebook:
In [1]: import pandas as pd
df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['cobra', 'viper', 'sidewinder'],
                  columns=['max_speed', 'shield'])
df
In [2]: df.pop('shield') # Return as series.
In [3]: pd.DataFrame(df.pop('shield')) # Return as DataFrame.
Then swap the last two cells, so they run in this order:
In[1]
In[3]
In[2]
Why does the third cell to run always raise an error?
I often run into this kind of error. Is it a cache issue? Redundancy? Why does this error occur?
CodePudding user response:
I think your code raises the expected error:
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['cobra', 'viper', 'sidewinder'],
                  columns=['max_speed', 'shield'])
Here DataFrame.pop takes the column shield out of the original DataFrame, returns it as a Series, and drops it from the original:
a = df.pop('shield') # Return as series.
print (a)
cobra 2
viper 5
sidewinder 8
Name: shield, dtype: int64
So there is no shield column left in df after pop:
print (df)
max_speed
cobra 1
viper 4
sidewinder 7
So getting the column shield a second time fails, because it no longer exists in df:
b = pd.DataFrame(df.pop('shield')) # Return as DataFrame.
print (b)
KeyError: 'shield'
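If the goal is only to look at the column as a Series or as a DataFrame, plain indexing avoids the problem because it does not modify df. A minimal sketch, assuming you do not actually need the column removed (the names as_series and as_frame are just for illustration):

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['cobra', 'viper', 'sidewinder'],
                  columns=['max_speed', 'shield'])

as_series = df['shield']     # single brackets: returns a Series
as_frame = df[['shield']]    # list of labels: returns a one-column DataFrame

# df still has both columns, so these lines can be re-run in any order.
print(list(df.columns))
```

Since nothing is dropped, the cells can be executed in either order without a KeyError.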
CodePudding user response:
pop removes the column from the DataFrame and returns it as a Series.
Once the column has been popped, a second pop on the same DataFrame fails, whichever cell runs first.
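To see that this is about the state of df rather than any cache, you can check the columns between the two calls. A small sketch using the same df as in the question (the flag second_pop_failed is only there for illustration):

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['cobra', 'viper', 'sidewinder'],
                  columns=['max_speed', 'shield'])

first = df.pop('shield')              # succeeds and removes the column from df
print('shield' in df.columns)         # the column is gone now

second_pop_failed = False
try:
    df.pop('shield')                  # same call again on the mutated df
except KeyError:
    second_pop_failed = True          # nothing left to pop

print(second_pop_failed)
```

In a notebook, df keeps this modified state until the cell that creates it is run again.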
CodePudding user response:
In the 2nd line you are removing shield from df.
In the third line you are popping it again.
Remove the 2nd line and the code will work as expected:
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['cobra', 'viper', 'sidewinder'],
                  columns=['max_speed', 'shield'])
pd.DataFrame(df.pop('shield'))
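If you want both forms and do want the column removed, one option is to pop once, keep the result in a variable, and convert that. A rough sketch, assuming Series.to_frame is acceptable in place of the pd.DataFrame(...) call (shield and shield_df are names made up for this example):

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['cobra', 'viper', 'sidewinder'],
                  columns=['max_speed', 'shield'])

shield = df.pop('shield')        # pop exactly once; this is a Series
shield_df = shield.to_frame()    # one-column DataFrame built from the Series

print(type(shield).__name__)
print(type(shield_df).__name__)
```

After this, df only has max_speed, and shield and shield_df can be reused as often as needed without touching df again.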