I have two DataFrames, df1 and df2.
For rows that share the same day and id, I want to compare the rank:
- if rank in df1 < rank in df2, keep the type 'D' row in df1 and the type 'A' row in df2.
- if rank in df1 > rank in df2, keep the type 'A' row in df1 and the type 'D' row in df2.
df1:
id | day | rank | type |
---|---|---|---|
1 | 25/01 | 22 | D |
1 | 25/01 | 22 | A |
5 | 25/01 | 66 | D |
5 | 25/01 | 66 | A |
10 | 26/01 | 55 | D |
df2:
id | day | rank | type |
---|---|---|---|
1 | 25/01 | 58 | D |
1 | 25/01 | 58 | A |
5 | 25/01 | 10 | D |
5 | 25/01 | 10 | A |
10 | 26/01 | 100 | D |
10 | 26/01 | 100 | A |
Output df1:
id | day | rank | type |
---|---|---|---|
1 | 25/01 | 22 | D |
5 | 25/01 | 66 | A |
10 | 26/01 | 55 | D |
Output df2:
id | day | rank | type |
---|---|---|---|
1 | 25/01 | 58 | A |
5 | 25/01 | 10 | D |
10 | 26/01 | 100 | A |
I have the code:
if df1.groupby(["id", "Date"])['Rank'] > df2.groupby(["id", "Date"])['Rank']:
    df1 = df1[(df1['Type' == 'A'])]
    df2 = df2[(df2['Type' == 'D'])]
else:
    df1 = df1[(df1['Type' == 'D'])]
    df2 = df2[(df2['Type' == 'A'])]
But it is giving me the following error:
TypeError: '>' not supported between instances of 'SeriesGroupBy' and 'SeriesGroupBy'
How can I fix this?
Thank you!
CodePudding user response:
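df1.groupby(["id", "Date"])['Rank']
is a SeriesGroupBy object. It describes how the rows are grouped but holds no computed values yet, which is why it does not support '>'.
Aggregate each group down to a single value first, for example with first(), since the rank is identical within each group in your data. The result is an ordinary Series indexed by (id, Date), and two such Series can be compared element-wise as long as both contain exactly the same groups:
df1.groupby(["id", "Date"])['Rank'].first() > df2.groupby(["id", "Date"])['Rank'].first()
Note that this comparison returns a boolean Series with one entry per group, so it cannot be used directly as the condition of a plain if statement. Use it as a mask instead.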
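As a small illustration of comparing the ranks per group, here is a minimal sketch. It assumes the lowercase column names from the tables in the question ('id', 'day', 'rank') rather than the capitalised ones in the snippet, and it assumes both frames contain exactly the same (id, day) groups; otherwise pandas refuses to compare the two Series.

```python
import pandas as pd

df1 = pd.DataFrame({"id": [1, 1, 5, 5], "day": ["25/01"] * 4, "rank": [22, 22, 66, 66]})
df2 = pd.DataFrame({"id": [1, 1, 5, 5], "day": ["25/01"] * 4, "rank": [58, 58, 10, 10]})

# Aggregating turns each SeriesGroupBy into a regular Series indexed by (id, day).
r1 = df1.groupby(["id", "day"])["rank"].first()
r2 = df2.groupby(["id", "day"])["rank"].first()

# Both Series share the same index here, so the comparison is element-wise.
df1_higher = r1 > r2
print(df1_higher)
```

The result has one boolean per (id, day) pair, which can then be used to decide which type to keep for that pair.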
CodePudding user response:
import pandas as pd

df1 = pd.DataFrame(
    {
        'id': [1, 1, 5, 5],
        'day': ['25/01', '25/01', '25/01', '25/01'],
        'rank': [22, 22, 66, 66],
        'type': ['D', 'A', 'D', 'A'],
    }
)
df2 = pd.DataFrame(
    {
        'id': [1, 1, 5, 5],
        'day': ['25/01', '25/01', '25/01', '25/01'],
        'rank': [58, 58, 10, 10],
        'type': ['D', 'A', 'D', 'A'],
    }
)
# Keep one row per (id, day, rank) in each frame.
df1 = df1.drop_duplicates(['id', 'day', 'rank'])
df2 = df2.drop_duplicates(['id', 'day', 'rank'])
# Rows are paired by position, so both frames must list the same
# (id, day) pairs in the same order.
for i in range(df1.shape[0]):
    if df1.iloc[i]['rank'] > df2.iloc[i]['rank']:
        df1.iloc[i, 3] = 'A'  # column 3 is 'type'
        df2.iloc[i, 3] = 'D'
    else:
        df1.iloc[i, 3] = 'D'
        df2.iloc[i, 3] = 'A'
print(df1)
print(df2)
This will print what you want:
   id    day  rank type
0   1  25/01    22    D
2   5  25/01    66    A
   id    day  rank type
0   1  25/01    58    A
2   5  25/01    10    D
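If the two frames are not guaranteed to list the same (id, day) pairs in the same order, matching on the keys may be safer than matching by position. The following is only a sketch of that alternative, using the full data from the question. The helper one_rank_per_key and the columns rank1, rank2, keep1 and keep2 are names made up for this example. It also filters the existing rows instead of overwriting type, and it assumes that on equal ranks df1 keeps 'D' and df2 keeps 'A', which the question does not specify.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({
    "id": [1, 1, 5, 5, 10],
    "day": ["25/01", "25/01", "25/01", "25/01", "26/01"],
    "rank": [22, 22, 66, 66, 55],
    "type": ["D", "A", "D", "A", "D"],
})
df2 = pd.DataFrame({
    "id": [1, 1, 5, 5, 10, 10],
    "day": ["25/01", "25/01", "25/01", "25/01", "26/01", "26/01"],
    "rank": [58, 58, 10, 10, 100, 100],
    "type": ["D", "A", "D", "A", "D", "A"],
})

keys = ["id", "day"]


def one_rank_per_key(df, name):
    # Hypothetical helper: one row per (id, day) holding that group's rank.
    return df.groupby(keys, as_index=False)["rank"].first().rename(columns={"rank": name})


# Put both ranks side by side for every (id, day) present in both frames.
ranks = one_rank_per_key(df1, "rank1").merge(one_rank_per_key(df2, "rank2"), on=keys)

# Higher rank keeps 'A', the other keeps 'D' (ties: df1 'D', df2 'A', an assumption).
df1_higher = ranks["rank1"] > ranks["rank2"]
ranks["keep1"] = np.where(df1_higher, "A", "D")
ranks["keep2"] = np.where(df1_higher, "D", "A")

# Attach the type to keep to each original row, then filter on it.
out1 = df1.merge(ranks[keys + ["keep1"]], on=keys)
out1 = out1[out1["type"] == out1["keep1"]].drop(columns="keep1").reset_index(drop=True)

out2 = df2.merge(ranks[keys + ["keep2"]], on=keys)
out2 = out2[out2["type"] == out2["keep2"]].drop(columns="keep2").reset_index(drop=True)

print(out1)
print(out2)
```

Because the merges are inner joins, an (id, day) pair that appears in only one frame is dropped from both outputs; whether that is desirable depends on the data.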