I have two dataframes both consisting of a 1 column with 62 values each:
Distance_obs = [
0.0
0.9084
2.1931
2.85815
3.3903
3.84815
4.2565
4.6287
4.97295
5.29475
5.598
5.8856
6.15975
6.4222
6.67435
6.9173
7.152
7.37925
7.5997
7.8139
8.02235
8.22555
8.42385
8.61755
8.807
8.99245
9.17415
9.35235
9.5272
9.6989
9.86765
10.0335
10.1967
10.3574
10.5156
10.6714
10.825
10.9765
11.1259
11.2732
11.4187
11.5622
11.7041
11.8442
11.9827
12.1197
12.2552
12.3891
12.5216
12.6527
12.7825
12.9109
13.0381
13.1641
13.2889
13.4126
13.5351
13.6565
13.7768
13.8961
14.0144
14.0733
]
and
Cell_mid = [0.814993
1.96757
2.56418
3.04159
3.45236
3.8187
4.15258
4.46142
4.75013
5.02221
5.28026
5.52624
5.76172
5.98792
6.20588
6.41642
6.62027
6.81802
7.01019
7.19723
7.37952
7.55742
7.73122
7.9012
8.0676
8.23063
8.39049
8.54736
8.70141
8.85277
9.00159
9.14798
9.29207
9.43396
9.57374
9.71152
9.84736
9.98136
10.1136
10.2441
10.373
10.5003
10.626
10.7503
10.8732
10.9947
11.1149
11.2337
11.3514
11.4678
11.5831
11.6972
11.8102
11.9222
12.0331
12.143
12.2519
12.3599
12.4669
12.573
12.6782
12.7826
]
I want to run each element in Distance_obs
to run through the values in Cell_mid
and find the corresponding index of nearest value which matches the element.
I have been trying using the following:
for i in Distance_obs:
Required_value = (np.abs(Cell_mid - i)).idxmin()
But I get
error: ufunc 'subtract' did not contain a loop with signature matching types (dtype('<U32'), dtype('<U32')) -> dtype('<U32')
CodePudding user response:
One way to do this, could be as follows:
- Use
pd.merge_asof
, passing "nearest" to thedirection
parameter. - Now, from the merged result select column
Cell_mid
, and useSeries.map
with apd.Series
where the values and index of its originaldf
(here:df2
) are swapped.
df['Cell_mid_index'] = pd.merge_asof(df, df2,
left_on='Distance_obs',
right_on='Cell_mid',
direction='nearest')\
['Cell_mid'].map(pd.Series(df2['Cell_mid'].index.values,
index=df2['Cell_mid']))
print(df.head())
Distance_obs Cell_mid_index
0 0.00000 0
1 0.90840 0
2 2.19310 1
3 2.85815 3
4 3.39030 4
So, at the intermediate step, we had a merged df
like this:
print(pd.merge_asof(df, df2, left_on='Distance_obs',
right_on='Cell_mid', direction='nearest').head())
Distance_obs Cell_mid
0 0.00000 0.814993
1 0.90840 0.814993
2 2.19310 1.967570
3 2.85815 3.041590
4 3.39030 3.452360
And then with .map
we are retrieving the appropriate index values from df2
.
Data used
import pandas as pd
Distance_obs = [0.0, 0.9084, 2.1931, 2.85815, 3.3903, 3.84815, 4.2565,
4.6287, 4.97295, 5.29475, 5.598, 5.8856, 6.15975, 6.4222,
6.67435, 6.9173, 7.152, 7.37925, 7.5997, 7.8139, 8.02235,
8.22555, 8.42385, 8.61755, 8.807, 8.99245, 9.17415, 9.35235,
9.5272, 9.6989, 9.86765, 10.0335, 10.1967, 10.3574, 10.5156,
10.6714, 10.825, 10.9765, 11.1259, 11.2732, 11.4187, 11.5622,
11.7041, 11.8442, 11.9827, 12.1197, 12.2552, 12.3891, 12.5216,
12.6527, 12.7825, 12.9109, 13.0381, 13.1641, 13.2889, 13.4126,
13.5351, 13.6565, 13.7768, 13.8961, 14.0144, 14.0733]
df = pd.DataFrame(Distance_obs, columns=['Distance_obs'])
Cell_mid = [0.814993, 1.96757, 2.56418, 3.04159, 3.45236, 3.8187, 4.15258,
4.46142, 4.75013, 5.02221, 5.28026, 5.52624, 5.76172, 5.98792,
6.20588, 6.41642, 6.62027, 6.81802, 7.01019, 7.19723, 7.37952,
7.55742, 7.73122, 7.9012, 8.0676, 8.23063, 8.39049, 8.54736,
8.70141, 8.85277, 9.00159, 9.14798, 9.29207, 9.43396, 9.57374,
9.71152, 9.84736, 9.98136, 10.1136, 10.2441, 10.373, 10.5003,
10.626, 10.7503, 10.8732, 10.9947, 11.1149, 11.2337, 11.3514,
11.4678, 11.5831, 11.6972, 11.8102, 11.9222, 12.0331, 12.143,
12.2519, 12.3599, 12.4669, 12.573, 12.6782, 12.7826]
df2 = pd.DataFrame(Cell_mid, columns=['Cell_mid'])