Let's say I have this column in Pandas:
df['GPS'][0]: 0 '39.21,38.6;39.23,38.68;39.26,38.68'
I would like to split the column into:
Xcoord1 Ycoord1 Xcoord2,Ycoord2, Xcoord3, Ycoord3
37.21 38.6 37.23 38.68 37.26 38.68
My approach is to first split the column using:
df['GPS_list']=df['GPS'].apply(lambda x: x.split(';'))
df['GPS_list']:
['39.21,38.6','39.23,38.68','39.26,38.68']
Now I would need to split based on , to separate the x and y value which I am not sure how to do for each row. Can I use apply function work here?
Next, I would need to convert each of those values from string to int.
Is there an elegant way to do this in a step or two?
I am new to Python and Pandas so any nudge in the right direction is helpful.
CodePudding user response:
Example
df = pd.DataFrame(['39.21,38.6;39.23,38.68;39.26,38.68'], columns=['GPS'])
df
GPS
0 39.21,38.6;39.23,38.68;39.26,38.68
Code
col1 = ['Xcoord1', 'Ycoord1', 'Xcoord2', 'Ycoord2', 'Xcoord3', 'Ycoord3']
df['GPS'].str.split(r'[,;]', expand=True).set_axis(col1, axis=1)
result
:
Xcoord1 Ycoord1 Xcoord2 Ycoord2 Xcoord3 Ycoord3
0 39.21 38.6 39.23 38.68 39.26 38.68
CodePudding user response:
It If you always have the same number of coordinates, a simple str.split
will work:
out = (df['GPS'].str.split('[,;]', expand=True)
.set_axis(['Xcoord1', 'Ycoord1', 'Xcoord2', 'Ycoord2', 'Xcoord3', 'Ycoord3'], axis=1)
)
If you have an arbitrary number of pairs, you can use:
out = (df['GPS'].str.split(';', expand=True).stack()
.str.split(',', expand=True)
.set_axis(['Xcoord', 'Ycoord'], axis=1).unstack()
.sort_index(level=1, axis=1)
)
out.columns = out.columns.map(lambda x: f'{x[0]}{x[1] 1}')
Output:
Xcoord1 Ycoord1 Xcoord2 Ycoord2 Xcoord3 Ycoord3
0 39.21 38.6 39.23 38.68 39.26 38.68