Home > Mobile >  group by ID, then check the last row in two specific columns if it's equal to value
group by ID, then check the last row in two specific columns if it's equal to value

Time:04-18

so I have dataFrame which is this:

ID Age name
1 12 m
1 13 n
1 14 r
1 15 r
2 11 l
2 22 f
2 33 f
2 55 f
3 150 o
3 160 j
3 170 s
3 180 o
4 10 k
4 20 k
4 30 k
4 40 k

I want first to group by ID, then check the last row on the (name) column on the grouped by dataFrame, if the last row is equal let's say 'r' such as group (1) in our example , then go and check the last row in the age column of group(1) as well, if it's greater than specific number let's say that name ('r') which is the last row in the (name) column in group(1), it's last row (age) should be >= 15, if this condition of the age was satsified, create another colum saying that the whole group is satsified or else unsatsified. the same thing again for the other last row of each group in the columns [names] and [ages].

so these are the other letters (name) and thier (age) conditions:

in grouped by [ID] dataFrame, if the last row (name) is (r) the last row (age) should be >= 15, if the the last row (name) is (f) the last row (age) should be >= 55, if the last row (name) is (o) the last row (age) should be >= 182 ,if the last row (name) is (k) the last row (age) should be >= 50.

this is the last out put I am looking for:

ID Age name check_condition
1 12 m satisfied
1 13 n satisfied
1 14 r satisfied
1 15 r satisfied
2 11 l satisfied
2 22 f satisfied
2 33 f satisfied
2 55 f satisfied
3 150 o unsatisfied
3 160 j unsatisfied
3 170 s unsatisfied
3 180 o unsatisfied
4 10 k unsatisfied
4 20 k unsatisfied
4 30 k unsatisfied
4 40 k unsatisfied

How can I do this by pandas python ?

CodePudding user response:

You can try group by then get the last row value of each group with group.iloc[-1][col]

d = {'r': 15, 'f': 55, 'o': 182, 'k': 50}

df['check_condition'] = df['ID'].map(
    df.groupby('ID').apply(lambda group:
                           'satisfied' if group.iloc[-1]['Age'] >= d.get(group.iloc[-1]['name'])
                           else 'unsatisfied')
)
    ID  Age name check_condition
0    1   12    m       satisfied
1    1   13    n       satisfied
2    1   14    r       satisfied
3    1   15    r       satisfied
4    2   11    l       satisfied
5    2   22    f       satisfied
6    2   33    f       satisfied
7    2   55    f       satisfied
8    3  150    o     unsatisfied
9    3  160    j     unsatisfied
10   3  170    s     unsatisfied
11   3  180    o     unsatisfied
12   4   10    k     unsatisfied
13   4   20    k     unsatisfied
14   4   30    k     unsatisfied
15   4   40    k     unsatisfied
  • Related