Python TypeError: '>' not supported between instances of 'SeriesGroupBy' and

Time:11-05

I have two DataFrames, df1 and df2.

For rows with the same day and id, I want to compare the ranks:

  • if rank in df1 < rank in df2, keep the type 'D' row in df1 and the type 'A' row in df2.
  • if rank in df1 > rank in df2, keep the type 'A' row in df1 and the type 'D' row in df2.

df1:

id day rank type
1 25/01 22 D
1 25/01 22 A
5 25/01 66 D
5 25/01 66 A
10 26/01 55 D

df2:

id day rank type
1 25/01 58 D
1 25/01 58 A
5 25/01 10 D
5 25/01 10 A
10 26/01 100 D
10 26/01 100 A

Output df1:

id day rank type
1 25/01 22 D
5 25/01 66 A
10 26/01 55 D

Output df2:

id day rank type
1 25/01 58 A
5 25/01 10 D
10 26/01 100 A

I have the code:

if df1.groupby(["id", "Date"])['Rank'] > df2.groupby(["id", "Date"])['Rank']:
    df1 = df1[(df1['Type' == 'A'])]
    df2 = df2[(df2['Type' == 'D'])]
else:
    df1 = df1[(df1['Type' == 'D'])]
    df2 = df2[(df2['Type' == 'A'])]

But it is giving me the following error:

TypeError: '>' not supported between instances of 'SeriesGroupBy' and 'SeriesGroupBy'

So how can I fix this?

Thank you!

CodePudding user response:

df1.groupby(["id", "Date"])['Rank']

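is a SeriesGroupBy object. It describes a grouping that has not been evaluated yet, not a set of values, so it cannot be compared with >.

According to the pandas groupby documentation, you need to apply an aggregation such as first(), max() or mean() to get one value per group. The result is a Series indexed by (id, Date), and two such Series with the same groups can be compared element-wise:

df1.groupby(["id", "Date"])['Rank'].first() > df2.groupby(["id", "Date"])['Rank'].first()

This gives a boolean Series with one entry per group, so it cannot be used directly as the condition of a single if statement.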
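A SeriesGroupBy has to be reduced to one value per group before it can be compared. Here is a minimal sketch on the sample data. It assumes the column names from the tables in the question (id, day, rank) rather than the Date and Rank used in the question's code, and that each (id, day) group holds a single rank value, so first() is enough to reduce it. The names rank1, rank2 and df1_is_higher are only illustrative.

```python
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({
    "id": [1, 1, 5, 5, 10],
    "day": ["25/01", "25/01", "25/01", "25/01", "26/01"],
    "rank": [22, 22, 66, 66, 55],
    "type": ["D", "A", "D", "A", "D"],
})
df2 = pd.DataFrame({
    "id": [1, 1, 5, 5, 10, 10],
    "day": ["25/01", "25/01", "25/01", "25/01", "26/01", "26/01"],
    "rank": [58, 58, 10, 10, 100, 100],
    "type": ["D", "A", "D", "A", "D", "A"],
})

# Aggregating turns each SeriesGroupBy into a Series indexed by (id, day)
rank1 = df1.groupby(["id", "day"])["rank"].first()
rank2 = df2.groupby(["id", "day"])["rank"].first()

# Element-wise comparison: one boolean per (id, day) group.
# Both Series must contain exactly the same groups for this to work.
df1_is_higher = rank1 > rank2
print(df1_is_higher)
```

For the sample data the comparison is False for id 1, True for id 5 and False for id 10.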

CodePudding user response:

import pandas as pd

df1 = pd.DataFrame(
    {
        'id': [1, 1, 5, 5],
        'day': ['25/01', '25/01', '25/01', '25/01'],
        'rank': [22, 22, 66, 66],
        'type': ['D', 'A', 'D', 'A'],
    }
)

df2 = pd.DataFrame(
    {
        'id': [1, 1, 5, 5],
        'day': ['25/01', '25/01', '25/01', '25/01'],
        'rank': [58, 58, 10, 10],
        'type': ['D', 'A', 'D', 'A'],
    }
)

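# Keep one row per (id, day, rank); its type is overwritten below
df1 = df1.drop_duplicates(['id', 'day', 'rank'])
df2 = df2.drop_duplicates(['id', 'day', 'rank'])

# Assumes row i of df1 and row i of df2 refer to the same id and day
for i in range(df1.shape[0]):
    if df1.iloc[i]['rank'] > df2.iloc[i]['rank']:
        df1.iloc[i, 3] = 'A'  # column 3 is 'type'
        df2.iloc[i, 3] = 'D'
    else:
        df1.iloc[i, 3] = 'D'
        df2.iloc[i, 3] = 'A'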

print(df1)
print(df2)

This will print what you want:

   id    day  rank type
0   1  25/01    22    D
2   5  25/01    66    A

   id    day  rank type
0   1  25/01    58    A
2   5  25/01    10    D
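The loop above pairs rows by position, which works for this sample but depends on both frames listing the same (id, day) groups in the same order. Below is a sketch of an alternative that lines the ranks up by key with merge and then filters the original rows instead of overwriting the type column. It assumes one rank value per (id, day) in each frame. The names keys, ranks, keep and keep_matching are made up for illustration. The question does not say what should happen when the ranks are equal, so ties are treated like the else branch of the loop (df1 keeps 'D', df2 keeps 'A').

```python
import numpy as np
import pandas as pd

# Full sample data from the question, including id 10
df1 = pd.DataFrame({
    "id": [1, 1, 5, 5, 10],
    "day": ["25/01", "25/01", "25/01", "25/01", "26/01"],
    "rank": [22, 22, 66, 66, 55],
    "type": ["D", "A", "D", "A", "D"],
})
df2 = pd.DataFrame({
    "id": [1, 1, 5, 5, 10, 10],
    "day": ["25/01", "25/01", "25/01", "25/01", "26/01", "26/01"],
    "rank": [58, 58, 10, 10, 100, 100],
    "type": ["D", "A", "D", "A", "D", "A"],
})

keys = ["id", "day"]

# One rank per (id, day) from each frame, joined on the keys
r1 = df1.groupby(keys)["rank"].first().rename("rank1").reset_index()
r2 = df2.groupby(keys)["rank"].first().rename("rank2").reset_index()
ranks = r1.merge(r2, on=keys)  # inner join: only groups present in both frames

# Decide which type each frame should keep for every group
higher = ranks["rank1"] > ranks["rank2"]
ranks["keep1"] = np.where(higher, "A", "D")
ranks["keep2"] = np.where(higher, "D", "A")


def keep_matching(df, wanted):
    """Return the rows of df whose type equals the wanted type for their group."""
    merged = df.merge(wanted, on=keys)
    matches = merged["type"] == merged["keep"]
    return merged[matches].drop(columns="keep").reset_index(drop=True)


out1 = keep_matching(df1, ranks[keys + ["keep1"]].rename(columns={"keep1": "keep"}))
out2 = keep_matching(df2, ranks[keys + ["keep2"]].rename(columns={"keep2": "keep"}))

print(out1)
print(out2)
```

Because the rows are filtered rather than relabelled, a group only appears in the result if the frame actually contains a row of the wanted type.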