I am trying to find a solution that uses the pandas and/or numpy libraries to replace the loop below. I want to merge rows where Track is equal across the two dataframes and the Location in merged_df falls between the Start and End values of df3 (these appear as df1 and df2 in the code and tables below).
I'm sure there is a way using the pandas merge function, but I can't work out how to do it between a range.
df1_length = len(df1.axes[0])
df2_length = len(df2.axes[0])
for j in range(df1_length):
    for k in range(df2_length):
        if (df1.at[j, 'Track'] == df2.at[k, 'Track'] and
                df1.at[j, 'Location'] >= df2.at[k, 'Start'] and
                df1.at[j, 'Location'] <= df2.at[k, 'End']):
            df1.at[j, 'Label'] = df2.at[k, 'Label']
            if df2.at[k, 'Label'] == 'Curve':
                df1.at[j, 'Superelevation'] = df2.at[k, 'Superelevation']
                df1.at[j, 'Curve Radius'] = df2.at[k, 'Curve Radius']
            break
df1:
Track | Location |
---|---|
Up | 1234 |
Up | 2354 |
Up | 4521 |
Up | 8654 |
Up | 9876 |
df2:
Track | Start | End | Label | Superelevation | Curve Radius | Direction |
---|---|---|---|---|---|---|
Up | 0 | 2000 | Curve | 60 | 3200 | R |
Up | 3000 | 4600 | Transition | | | |
Up | 9500 | 10000 | Curve | 35 | 900 | L |
Down | 0 | 9999 | Curve | 20 | 1700 | L |
output:
Track | Location | Label | Superelevation | Curve Radius | Direction |
---|---|---|---|---|---|
Up | 1234 | Curve | 60 | 3200 | R |
Up | 2354 | NaN | NaN | NaN | NaN |
Up | 4521 | Transition | NaN | NaN | NaN |
Up | 8654 | NaN | NaN | NaN | NaN |
Up | 9876 | Curve | 35 | 900 | L |
CodePudding user response:
You can use merge() to inner join merged_df and df3 on the Track column and then filter the rows.
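merged_df = merged_df.merge(df3, on='Track', how='inner')
# Keep only the rows whose Location lies inside [Start, End], then drop the range columns.
in_range = (merged_df['Location'] >= merged_df['Start']) & (merged_df['Location'] <= merged_df['End'])
merged_df = merged_df.loc[in_range].reset_index(drop=True).drop(columns=['Start', 'End'])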
output:
> Track Location Label Superelevation Curve Radius Direction
> 0 Up 1234 Curve 60 3200 R
> 1 Up 9876 Tang
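Note that an inner join followed by a filter drops the locations that fall in no range, whereas the expected output in the question keeps them as NaN rows. Below is a minimal sketch of one way to keep them, assuming the sample data from the question. `range_merge` and the `_row` helper column are hypothetical names of my own, not part of pandas.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question's tables (blank cells become NaN/None).
df1 = pd.DataFrame({
    "Track": ["Up"] * 5,
    "Location": [1234, 2354, 4521, 8654, 9876],
})
df2 = pd.DataFrame({
    "Track": ["Up", "Up", "Up", "Down"],
    "Start": [0, 3000, 9500, 0],
    "End": [2000, 4600, 10000, 9999],
    "Label": ["Curve", "Transition", "Curve", "Curve"],
    "Superelevation": [60, np.nan, 35, 20],
    "Curve Radius": [3200, np.nan, 900, 1700],
    "Direction": ["R", None, "L", "L"],
})


def range_merge(left, right):
    """Hypothetical helper: attach `right` columns to `left` where Track
    matches and Start <= Location <= End; unmatched rows keep NaN."""
    # Tag each left row so it can be re-attached after filtering.
    tagged = left.assign(_row=np.arange(len(left)))
    # Pair every left row with every right row on the same track.
    pairs = tagged.merge(right, on="Track", how="inner")
    # Keep only the pairs whose location falls inside the range (inclusive).
    hits = pairs[pairs["Location"].between(pairs["Start"], pairs["End"])]
    # If ranges overlap, keep a single match per left row.
    hits = hits.drop_duplicates("_row")
    extra = hits.drop(columns=["Track", "Location", "Start", "End"])
    # Left join back so rows without any matching range survive as NaN.
    return tagged.merge(extra, on="_row", how="left").drop(columns="_row")


result = range_merge(df1, df2)
print(result)
```

The intermediate join has one row per (location, range) pair on the same track, so for large frames it can get big before the filter trims it down.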
CodePudding user response:
import numpy as np
import pandas as pd

# Pair every df1 row with every df2 row on the same track.
df_merged = df1.merge(df2, on='Track')
# True where Location lies inside [Start, End] (both ends inclusive).
in_range = (df_merged['Location'] >= df_merged['Start']) & (df_merged['Location'] <= df_merged['End'])
# Blank out the df2 attributes for pairs outside the range.
for col in ['Label', 'Superelevation', 'Curve Radius', 'Direction']:
    df_merged[col] = np.where(in_range, df_merged[col], np.nan)
# Drop the non-matching pairs, then right-join df1 to restore unmatched locations as NaN rows.
df_filtered = df_merged.dropna(subset=['Label'])
df_final = pd.merge(df_filtered, df1, on=['Location', 'Track'], how='right')
df_final = df_final.drop(columns=['Start', 'End'])
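Since the question also mentions numpy, here is a rough sketch of the same lookup done with broadcasting instead of a merge. It assumes the sample data from the question, a default 0..n-1 index on df2, and that the full len(df1) x len(df2) comparison matrix fits in memory. The variable names are my own.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question's tables (blank cells become NaN/None).
df1 = pd.DataFrame({
    "Track": ["Up"] * 5,
    "Location": [1234, 2354, 4521, 8654, 9876],
})
df2 = pd.DataFrame({
    "Track": ["Up", "Up", "Up", "Down"],
    "Start": [0, 3000, 9500, 0],
    "End": [2000, 4600, 10000, 9999],
    "Label": ["Curve", "Transition", "Curve", "Curve"],
    "Superelevation": [60, np.nan, 35, 20],
    "Curve Radius": [3200, np.nan, 900, 1700],
    "Direction": ["R", None, "L", "L"],
})

# Compare every df1 row with every df2 row at once: shape (len(df1), len(df2)).
loc = df1["Location"].to_numpy()[:, None]
same_track = df1["Track"].to_numpy()[:, None] == df2["Track"].to_numpy()[None, :]
in_range = (loc >= df2["Start"].to_numpy()) & (loc <= df2["End"].to_numpy())
match = same_track & in_range

# Position of the first matching df2 row per df1 row; -1 marks "no match".
first = np.where(match.any(axis=1), match.argmax(axis=1), -1)

cols = ["Label", "Superelevation", "Curve Radius", "Direction"]
# -1 is not a label in df2's index, so reindex fills those rows with NaN.
picked = df2[cols].reindex(first).reset_index(drop=True)
result = pd.concat([df1.reset_index(drop=True), picked], axis=1)
print(result)
```

Taking the first match per row mirrors the `break` in the original loop when ranges overlap.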