I have two different DataFrames that look something like this:
Lat | Lon |
---|---|
28.13 | -87.62 |
28.12 | -87.65 |
...... | ...... |
Calculated_Dist_m |
---|
34.5 |
101.7 |
.............. |
The first DataFrame (`df`, consisting of the `Lat` and `Lon` columns) has just over 1000 rows. The second DataFrame (`new_calc_dist`, consisting of the `Calculated_Dist_m` column) has over 30000 rows. I want to determine the new longitude and latitude coordinates using the `Lat`, `Lon`, and `Calculated_Dist_m` columns. Here is the code I've tried:
import numpy as np
r_earth = 6371000  # mean Earth radius in meters
new_lat = df['Lat'] + (new_calc_dist['Calculated_Dist_m'] / r_earth) * (180/np.pi)
new_lon = df['Lon'] + (new_calc_dist['Calculated_Dist_m'] / r_earth) * (180/np.pi) / np.cos(df['Lat'] * np.pi/180)
When I run the code, however, it only gives me new values for certain index values and NaNs for the rest. I'm not sure how to write the code so that a new longitude and latitude point is calculated for every one of the 30000+ distances, starting from each of the initial 1000 longitude and latitude points. Any suggestions?
EDIT
Here would be some sample outputs. Note that these are not exact figures, but give the idea.
Lat | Lon |
---|---|
28.13 | -87.62 |
28.12 | -87.65 |
28.12 | -87.63 |
..... | ...... |
Calculated_Dist_m |
---|
34.5 |
101.7 |
28.6 |
30.8 |
76.5 |
................. |
And so the sample output would be:
Lat | Lon |
---|---|
28.125 | -87.625 |
28.15 | -87.61 |
28.127 | -87.623 |
28.128 | -87.623 |
28.14 | -87.615 |
28.115 | -87.655 |
28.14 | -87.64 |
28.117 | -87.653 |
28.118 | -87.653 |
28.15 | -87.645 |
28.115 | -87.635 |
28.14 | -87.62 |
28.115 | -87.613 |
28.117 | -87.633 |
28.118 | -87.633 |
...... | ....... |
Again, these are just random outputs (I tried getting the exact calculations, but could not get it to work). But overall, this gives an idea of what would be wanted: taking the coordinates from the first dataframe and calculating new coordinates based on each of the calculated distances from the second dataframe.
CodePudding user response:
If I understood correctly, and assuming `df1` and `df2` as input, you can perform a cross merge to get all combinations of `df1` and `df2` rows, then apply your computation (here as new columns `Lat2`/`Lon2`):
import numpy as np
# every row of df1 paired with every row of df2
df = df1.merge(df2, how='cross')
r_earth = 6371000
df['Lat2'] = df['Lat'] + (df['Calculated_Dist_m'] / r_earth) * (180/np.pi)
df['Lon2'] = df['Lon'] + (df['Calculated_Dist_m'] / r_earth) * (180/np.pi) / np.cos(df['Lat'] * np.pi/180)
output:
Lat Lon Calculated_Dist_m Lat2 Lon2
0 28.13 -87.62 34.5 28.130310 -87.619648
1 28.13 -87.62 101.7 28.130915 -87.618963
2 28.13 -87.62 28.6 28.130257 -87.619708
3 28.13 -87.62 30.8 28.130277 -87.619686
4 28.13 -87.62 76.5 28.130688 -87.619220
5 28.12 -87.65 34.5 28.120310 -87.649648
6 28.12 -87.65 101.7 28.120915 -87.648963
7 28.12 -87.65 28.6 28.120257 -87.649708
8 28.12 -87.65 30.8 28.120277 -87.649686
9 28.12 -87.65 76.5 28.120688 -87.649220
10 28.12 -87.63 34.5 28.120310 -87.629648
11 28.12 -87.63 101.7 28.120915 -87.628963
12 28.12 -87.63 28.6 28.120257 -87.629708
13 28.12 -87.63 30.8 28.120277 -87.629686
14 28.12 -87.63 76.5 28.120688 -87.629220
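For context on the NaNs in the question: pandas aligns two Series on their index labels before doing arithmetic, so any label present in only one of them yields NaN. The sketch below illustrates that on small made-up frames and then repeats the cross merge end to end. The sample values and the names `R_EARTH_M`, `aligned`, and `pairs` are chosen for illustration only, and `how="cross"` assumes pandas 1.2 or newer.

```python
import numpy as np
import pandas as pd

R_EARTH_M = 6371000  # mean Earth radius in meters, as in the question

# Small illustrative inputs (assumed values, not the asker's real data)
df1 = pd.DataFrame({"Lat": [28.13, 28.12], "Lon": [-87.62, -87.65]})
df2 = pd.DataFrame({"Calculated_Dist_m": [34.5, 101.7, 60.0]})

# Series arithmetic aligns on index labels: label 2 exists only in df2,
# so the result has a NaN at that label.
aligned = df1["Lat"] + np.degrees(df2["Calculated_Dist_m"] / R_EARTH_M)
n_missing = int(aligned.isna().sum())

# The cross merge pairs every point with every distance, so both columns
# share one index and nothing is lost to alignment.
pairs = df1.merge(df2, how="cross")
offset_deg = np.degrees(pairs["Calculated_Dist_m"] / R_EARTH_M)
pairs["Lat2"] = pairs["Lat"] + offset_deg
pairs["Lon2"] = pairs["Lon"] + offset_deg / np.cos(np.radians(pairs["Lat"]))

print(n_missing)
print(pairs)
```

With 1000 points and 30000 distances the merged frame has 30 million rows, so memory use is worth checking before running this on the full data.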
CodePudding user response:
In case you just want the result as two 2D arrays (without repeating the inputs; this is still O(m*n) in memory, but about 2/5 of what the cross-join result needs, since it stores 2 output values per pair instead of 5 columns):
r_earth = 6371000
z = 180 / np.pi * new_calc_dist['Calculated_Dist_m'].values / r_earth
lat = df['Lat'].values
lon = df['Lon'].values
new_lat = lat[:, None] + z
# scale the longitude offset by cos(latitude in radians), as in the question
new_lon = lon[:, None] + z / np.cos(lat[:, None] * np.pi / 180)
Example:
df = pd.DataFrame([[28.13, -87.62], [28.12, -87.65]], columns=['Lat', 'Lon'])
new_calc_dist = pd.DataFrame([[34.5], [101.7], [60.0]], columns=['Calculated_Dist_m'])
# result of above
>>> new_lat
array([[28.13031027, 28.13091461, 28.13053959],
       [28.12031027, 28.12091461, 28.12053959]])
>>> new_lon
array([[-87.61964818, -87.61896289, -87.61938813],
       [-87.64964821, -87.64896298, -87.64938819]])
If you do want those results as `DataFrame`s:
kwargs = dict(index=df.index, columns=new_calc_dist.index)
new_lat = pd.DataFrame(new_lat, **kwargs)
new_lon = pd.DataFrame(new_lon, **kwargs)
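If a long table (one row per point/distance pair) is more convenient than two wide grids, the broadcast arrays can be flattened afterwards. The following is a minimal sketch under stated assumptions: `offset_grid` is a hypothetical helper name, the inputs are small made-up frames, and the longitude offset is divided by the cosine of the latitude in radians, following the formula in the question.

```python
import numpy as np
import pandas as pd

def offset_grid(points, dists, r_earth=6371000):
    """Return (new_lat, new_lon) arrays of shape (len(points), len(dists))."""
    # Distance in meters converted to an angular offset in degrees
    z = np.degrees(dists["Calculated_Dist_m"].to_numpy() / r_earth)
    lat = points["Lat"].to_numpy()[:, None]  # column vector, broadcasts against z
    lon = points["Lon"].to_numpy()[:, None]
    new_lat = lat + z
    new_lon = lon + z / np.cos(np.radians(lat))
    return new_lat, new_lon

# Illustrative inputs (assumed values)
points = pd.DataFrame({"Lat": [28.13, 28.12], "Lon": [-87.62, -87.65]})
dists = pd.DataFrame({"Calculated_Dist_m": [34.5, 101.7, 60.0]})

new_lat, new_lon = offset_grid(points, dists)

# ravel() walks the grid row by row (point outer, distance inner), which is
# the same order MultiIndex.from_product generates.
long_form = pd.DataFrame(
    {"Lat2": new_lat.ravel(), "Lon2": new_lon.ravel()},
    index=pd.MultiIndex.from_product(
        [points.index, dists.index], names=["point", "dist"]
    ),
)
print(long_form)
```

The grids stay as plain NumPy arrays until the last step, so the long table only needs to be built when something downstream actually requires it.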