As shown below, for duplicated names, name must keep its first occurrence and team its last.
How can I accomplish this with .drop_duplicates() or otherwise?
name team ...
0 john a ...
1 mike b ...
2 john c
↓
name team ...
0 john c ...
1 mike b ...
-- Additional note about comments --
.groupby('name').agg({'team': 'last', 'country': 'first'})
With the code above, if the first row of country is NaN, a value that is not the first is returned, as shown below.
Is this because NaN is ignored?
Even when first is specified and the first value is NaN, NaN must still be retained.
name team country ...
0 john a Nan ...
1 mike b Brazil ...
2 john c Canada ...
↓
name team country ...
0 john c Canada ...
1 mike b Brazil ...
CodePudding user response:
You can use .groupby():
df.groupby('name').agg({'team': 'last'})
Be aware that the value returned per name depends on the row order of your dataframe.
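For the follow-up about NaN: the 'first' and 'last' aggregations skip missing values, which is why Canada comes back instead of NaN. Below is a minimal sketch of one way to keep the NaN, assuming the sample frame from the question (rebuilt here by hand, with the elided columns left out). It takes the first row of each group positionally with a lambda instead of 'first', and passes sort=False so that names stay in order of first appearance.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; the "..." columns are omitted.
df = pd.DataFrame({
    "name": ["john", "mike", "john"],
    "team": ["a", "b", "c"],
    "country": [np.nan, "Brazil", "Canada"],
})

# 'first' returns the first non-null value per group, so NaN is skipped.
skipped = df.groupby("name", sort=False)["country"].first()

# sort=False keeps groups in order of first appearance (john, then mike);
# as_index=False returns 'name' as a regular column.
result = df.groupby("name", sort=False, as_index=False).agg(
    team=("team", "last"),
    # Take the first row positionally so a leading NaN is retained.
    country=("country", lambda s: s.iloc[0]),
)
print(result)
```

Note that 'last' skips NaN in the same way, so if team can contain missing values and they should be kept, lambda s: s.iloc[-1] would be the matching replacement.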