Home > Blockchain >  Python: How to query data from ranges in another dataframe using pandas
Python: How to query data from ranges in another dataframe using pandas

Time:03-06

Have a df_subject_tracker has columns subject , marks_greater , marks_lower and marks_to_be_given and df_students_marks with columns name subject , rol , marks

using marks column in df_students_marks dataframe how to query the marks which falls in between marks_greater and marks_lower.

if marks fall between them we have to assign the marks_to_be_given column as alloted in df_marks_final dataframe. if marks does'nt fall under these ranges then we have to give alloted as 0

df_subject_tracker

subject      marks_greater       marks_lower    marks_to_be_given

english            50                  100              20
english            150                 200              40
english            250                 300              60

social             50                  100              30
social             150                 200              50
social             250                 300              70

df_students_marks

name     subject          rol   marks

mark     english          123   75
mark     social           123   75
tom     english          124   20
tom     social           124   289

df_marks_final

name     subject          rol   marks   alloted

mark     english          123   75        20
mark     social           123   75        20
tom     english          124   20         0
tom     social           124   289        70

tried this :

df_subject_tracker.loc[(df_subject_tracker['marks_greater'] >= int(marks_greater)) & (df_subject_tracker['marks_greater'] <=int(marks_lower)),'alloted'] = int(marks_to_be_given)

CodePudding user response:

You can merge your two dataframes then keep only rows where marks is between marks_greater and marks_lower.

# df1 <- df_subject_tracker
# df2 <- df_students_marks
df2['alloted'] = (
    df2.reset_index().merge(df1, on='subject', how='left')
       .query("marks.between(marks_greater, marks_lower, inclusive='left')")
       .set_index('index')['marks_to_be_given'].reindex(df2.index, fill_value=0)
)
print(df2)

# Output
   name  subject  rol  marks  alloted
0  mark  english  123     75       20
1  mark   social  123     75       30
2   tom  english  124     20        0
3   tom   social  124    289       70

Step-by-step:

>>> out = df2.reset_index().merge(df1, on='subject', how='left')
    index  name  subject  rol  marks  marks_greater  marks_lower  marks_to_be_given
0       0  mark  english  123     75             50          100                 20
1       0  mark  english  123     75            150          200                 40
2       0  mark  english  123     75            250          300                 60
3       2   tom  english  124     20             50          100                 20
4       2   tom  english  124     20            150          200                 40
5       2   tom  english  124     20            250          300                 60
6       1  mark   social  123     75             50          100                 30
7       1  mark   social  123     75            150          200                 50
8       1  mark   social  123     75            250          300                 70
9       3   tom   social  124    289             50          100                 30
10      3   tom   social  124    289            150          200                 50
11      3   tom   social  124    289            250          300                 70

>>> out = out.query("marks.between(marks_greater, marks_lower, inclusive='left')")
    index  name  subject  rol  marks  marks_greater  marks_lower  marks_to_be_given
0       0  mark  english  123     75             50          100                 20
6       1  mark   social  123     75             50          100                 30
11      3   tom   social  124    289            250          300                 70

>>> out = out.set_index('index')['marks_to_be_given'].reindex(df2.index,fill_value=0)
0    20
1    30
2     0
3    70
Name: marks_to_be_given, dtype: int64
  • Related