Home > Mobile >  Pandas: Generate a new df from 3 dfs based on conditions and comparisons
Pandas: Generate a new df from 3 dfs based on conditions and comparisons

Time:08-16

I have a large dataset split into three dfs.

df1 contains grided (lat-long) and vegetation type data.

df1:

col      row    Type    TypeNo
-125    42.5    BF        4
-125    42.5    MTF       8
-125    42.5    TCF       9
-125    42.5    TDF       10
-125    42.5    SG        13
-125    42.5    AS        15
-125    42.5    XFW       19
-125    42.5    TBEF      33
-125    32.5    BF        4
-125    32.5    MTF       8
-125    32.5    TCF       9
-125    32.5    TDF       10
-125    32.5    SG        13
-125    32.5    AS        15
-125    32.5    XFW       19
-125    32.5    TBEF      33

df2 contains temperature data for each grid.

df2:

col     row     Tmin    Tmax
-125    42.5    7.1     36.7
-125    32.5    7.1     16.7

df3 contains temperature limits for each vegetation type in df1.

df3:

Type    TypeNo  limit_Tmin  limit_Tmax
BF        4          -60    21
MTF       8           -5    23
TCF       9          -40    10
TDF      10          -30    30
SG       13          -20    38
AS       15          -20    38
XFW      19          -40    15
TBEF     33          -40    20

I am trying to apply the temperature limits to df1 such that for every vegetation type:

if (df2: Tmax > df3: limit_Tmax) then vegetation is False, else, True

This has to be done in each grid.

An example output would be df1 and a new column with True/False, i.e.:

col      row    Type    TypeNo     Possible
-125    42.5    BF        4         False
-125    42.5    MTF       8         False
-125    42.5    TCF       9         False
-125    42.5    TDF       10        False
-125    42.5    SG        13        True
-125    42.5    AS        15        True 
-125    42.5    XFW       19        False   
-125    42.5    TBEF      33        False
-125    32.5    BF        4         True
-125    32.5    MTF       8         True
-125    32.5    TCF       9         False
-125    32.5    TDF       10        True
-125    32.5    SG        13        True
-125    32.5    AS        15        True
-125    32.5    XFW       19        False
-125    32.5    TBEF      33        True

I am new pandas and need help on how to do this. Thank you!

CodePudding user response:

First we use merge to put all the data into a new data frame, then we define a new column for df1 based off the values of the new dataframe.

#pd is pandas
import pandas as pd
#Merge df1 and df2
dft = pd.merge(df1, df2, on=["col", "row"], how="left")
#Merge result of that with df3
dft = pd.merge(dft, df3, on=["Type", "TypeNo"], how="left")
#Create new column and put it in df1
df1["Possible"] = dft["Tmax"] <= dft["limit_Tmax"]

CodePudding user response:

just bunch of simple joins to get your desired result:

df_data = df1.merge(df3 , on=[ 'Type', 'TypeNo'],how='left')
df_data = df_data.merge(df2 , on=[ 'col', 'row'],how='left')
df1['possible'] = df_data['Tmax'] <= df_data['limit_Tmax']
print(df1)

output:

>>>
    col   row  Type  TypeNo  possible
0  -125  42.5    BF       4     False
1  -125  42.5   MTF       8     False
2  -125  42.5   TCF       9     False
3  -125  42.5   TDF      10     False
4  -125  42.5    SG      13      True
5  -125  42.5    AS      15      True
6  -125  42.5   XFW      19     False
7  -125  42.5  TBEF      33     False
8  -125  32.5    BF       4      True
9  -125  32.5   MTF       8      True
10 -125  32.5   TCF       9     False
11 -125  32.5   TDF      10      True
12 -125  32.5    SG      13      True
13 -125  32.5    AS      15      True
14 -125  32.5   XFW      19     False
15 -125  32.5  TBEF      33      True
  • Related