I am currently working on a project where I match as well as merge two excel documents by meter indication (km). The two dataframes contain the same type of data but are slightly different. My solution to this problem was to divide each dataframe by 100 to more easily match the columns. However, as I predicted there will be occasions where this doesn't work, making this a temporary solution.
I have two dataframes:
1.
Meter_indication Fuel1
1180784 275
1181278 280
1181791 300
1182285 280
1182801 300
1183295 280
1183717 250
Meter_indication Fuel2
1180785 278
1181282 282
1181800 310
1182401 282
1182824 320
1183310 215
1183727 250
As you can see these dataframes are containing the same type of information but are slightly off. I have previously used the method of merge merged_df = df1filt2.merge(df2filt, on='Meter_indication')
which does only merge values that perfectly match. However, in this scenario this method is irrelevant.
My desired output is:
Meter_indication Fuel1 Fuel2
1180784 275 278
1181278 280 282
1181791 300 310
1182285 280 282
1182801 300 320
1183295 280 215
1183717 250 250
As you can see, the dataframes have merged on "Meter_indication" and have merged by finding the closest value compared to itself.
I have looked around for others with similar problems and have tried many different suggestions such as
Desired graph output:
With the data (used for graph) sometimes looks like this:
CodePudding user response:
merge_asof()
does work. matches your expected output- have not made assumptions about sorting, so did
sort_values()
- for transparency include right Meter_indication in output
df1 = pd.read_csv(io.StringIO("""Meter_indication Fuel1
1180784 275
1181278 280
1181791 300
1182285 280
1182801 300
1183295 280
1183717 250"""), sep="\s ")
df2 = pd.read_csv(io.StringIO("""Meter_indication Fuel2
1180785 278
1181282 282
1181800 310
1182401 282
1182824 320
1183310 215
1183727 250"""), sep="\s ")
pd.merge_asof(
df1.sort_values("Meter_indication"),
df2.sort_values("Meter_indication").assign(mi=lambda d: d["Meter_indication"]),
on="Meter_indication",
direction="nearest",
)
Meter_indication | Fuel1 | Fuel2 | mi |
---|---|---|---|
1180784 | 275 | 278 | 1180785 |
1181278 | 280 | 282 | 1181282 |
1181791 | 300 | 310 | 1181800 |
1182285 | 280 | 282 | 1182401 |
1182801 | 300 | 320 | 1182824 |
1183295 | 280 | 215 | 1183310 |
1183717 | 250 | 250 | 1183727 |