Home > Back-end >  Get clean Array output from pyproj
Get clean Array output from pyproj

Time:08-22

(if you don't need all details, please see short explanation below...) What I want to do: I'm currently trying to convert a bunch of coordinates from epsg:4326 to epsg: 25832. The conversion part works perfectly fine and I was able to save my array to excel with the help of pandas. Problem: I am now trying to add the old coordinates to the array for reference but I can't seem to get a clean output that I can save. My Code:

import pyproj as pp
import numpy as np
import os

print('Whats the name of the file you are trying to convert (e.g. "coordinates.xlsx")?')
xlsx_file_name = "Mappe1.xlsx" #input()

try:
    import_arr = pd.read_excel(os.path.join(os.getcwd(), xlsx_file_name), sheet_name='Tabelle1')
except:
    print(f'Were not able to find a file named {xlsx_file_name}. Please restart and make sure it is written correctly.')

try:
    lat_long_arr = np.array(import_arr)
except:
    print(f'Were not able to import the data from {xlsx_file_name}. Please make sure it is formated properly.')

xy_arr = []

#epsg:4326 to epsg:25832
epsg = pp.Transformer.from_crs(4326, 25832)

#THIS IS THE PART I STRUGGLE WITH-----------------------------
for i in range(len(lat_long_arr)):
        xy_arr.extend([epsg.transform(lat_long_arr[i,0], lat_long_arr[i,1]), lat_long_arr[i,0], lat_long_arr[i,1]])
#-------------------------------------------------------------

#print(xy_arr)

try:
    pd.DataFrame(xy_arr).to_excel('converted_coordinates.xlsx')
except:
    print('The file "converted_coordinates.xlsx" is already existing and could not be overwritten')

My Otput: [(562146.0879263151, 5407897.591536528), 48.82095040000001, 9.8466309, (562228.5649670326, 5407950.483323202), 48.8214179, 9.8477624, ...] Those are two pairs of coordinates. The first ones in the brakets are converted and the two numbers after them are the refference. Question: Does someone know how I can get rid of the brakets or put the other values inside the brakets? Keep in mind that I need the array to be saved with pd.DataFrame(xy_arr).to_excel('converted_coordinates.xlsx'). Thanks for any suggestions!

Short Version: I extend an array like this: array.extend([epsg.transform(x, y), lat, long]) and get an output e.g. like this [(562146.0879263151, 5407897.591536528), 48.82095040000001, 9.8466309]. I now need the array formated in a way to save it with pandas: pd.DataFrame(array).to_excel('save.xlsx')

CodePudding user response:

We can simplify your code using vectorized numpy operations, eliminating the for-loop.

Let's use your provided two sets of coordinates in the lat_long_arr array:

import numpy as np
import pyproj as pp

lat_long_arr = np.array([[48.82095040000001, 9.8466309], [48.8214179, 9.8477624]])

print(lag_long_arr)
# array([[48.8209504,  9.8466309],
#        [48.8214179,  9.8477624]])

Now, we can pass the entire columns of the array to the epsg.transform() method, as it accepts arrays, not just scalars:

converted_arr = epsg.transform(lag_long_arr[:, 0], lag_long_arr[:, 1])

print(converted_arr)
# (array([562146.08792632, 562228.56496703]),
#  array([5407897.59153653, 5407950.4833232 ]))

The result is a tuple of arrays, and we just need to manipulate it to get it into the right format:

print(np.vstack(converted_arr).T)
# array([[ 562146.08792632, 5407897.59153653],
#        [ 562228.56496703, 5407950.4833232 ]])

We can join the resulting array with the original array and pass the resulting 4-column array to the pd.DataFrame() constructor. You can use pandas options to display float values to the desired precision:

pd.set_option('display.float_format', lambda x: '%.8f' % x)

res = pd.DataFrame(
    np.hstack((np.vstack(converted_arr).T, lat_long_arr)),
    columns=['converted_x', 'converted_y', 'x', 'y']
)

print(res)
#       converted_x        converted_y            x            y
# 0 562146.08792632   5407897.59153653  48.82095040   9.84663090
# 1 562228.56496703   5407950.48332320  48.82141790   9.84776240

CodePudding user response:

https://gis.stackexchange.com/questions/334271/converting-large-data-with-lat-and-long-into-x-and-y

from pyproj import Transformer

trans = Transformer.from_crs(4326, 25832)
xx, yy = trans.transform(import_arr["lat"].values, import_arr["lon"].values)
import_arr["x"] = xx
import_arr["y"] = yy
  • Related