Obtaining two values from one column in a pandas DataFrame

I have the following CSV file, which I am reading into a pandas DataFrame:

Timestamp, UTC, id, loc, spd
001, 12z, q20, "52, 13", 320
002, 13z, a32, "53, 12", 321
003, 14z, q32, "54, 11", 321
004, 15`, a43, "55, 10", 330

I am extracting the data as follows:

import pandas as pd
import matplotlib.pyplot as plt

fname = "data.csv"
data = pd.read_csv(fname,sep=",", header=None, skiprows=1)
data.columns = ["Timestamp", "UTC", "Callsign", "Position", "Speed", "Direction"]

t = data["Timestamp"]
utc = data["UTC"]
acid = data["Callsign"]
pos = data["Position"]
spd = data["Speed"]

However, each row of the Position column holds two values. I would like the first value of each position in one list and the second value in another, as follows:

latitude = [52,53,54,55]
longitude = [13,12,11,10]

How do I select this using the pandas dataframe?

CodePudding user response：

Use Series.str.strip with Series.str.split if you need two new columns, then cast to float:

data[['lat','lon']] = (data["Position"].str.strip('"')
                                       .str.split(r',\s+', expand=True)
                                       .astype(float))
print(data)
   Timestamp  UTC Callsign  Position  Speed   lat   lon
0          1  12z      q20  "52, 13"    320  52.0  13.0
1          2  13z      a32  "53, 12"    321  53.0  12.0
2          3  14z      q32  "54, 11"    321  54.0  11.0
3          4  15`      a43  "55, 10"    330  55.0  10.0

If you need two lists:

lat, lon = (data["Position"].str.strip('"')
                            .str.split(r',\s+', expand=True)
                            .astype(float)
                            .to_numpy()
                            .T.tolist())
print(lat, lon)
[52.0, 53.0, 54.0, 55.0] [13.0, 12.0, 11.0, 10.0]
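
As a possible alternative, the quotes can be handled at read time instead of being stripped afterwards. The sketch below assumes the file looks exactly like the sample in the question, with a space after every comma. In that layout the quote is not the first character of its field, so the parser may not treat "52, 13" as one quoted value unless skipinitialspace=True is passed. The in-memory csv_text and the five column names are illustrative assumptions, not the asker's actual file.

```python
import io

import pandas as pd

# Sample data copied from the question; io.StringIO stands in for data.csv.
csv_text = '''Timestamp, UTC, id, loc, spd
001, 12z, q20, "52, 13", 320
002, 13z, a32, "53, 12", 321
003, 14z, q32, "54, 11", 321
004, 15`, a43, "55, 10", 330
'''

# skipinitialspace=True drops the blank after each delimiter, so the
# quoted position is read as a single field with the quotes removed.
data = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)
data.columns = ["Timestamp", "UTC", "Callsign", "Position", "Speed"]

# Split "52, 13" on the comma plus any following whitespace, then cast.
coords = data["Position"].str.split(r",\s*", expand=True).astype(float)
latitude = coords[0].tolist()
longitude = coords[1].tolist()

print(latitude)
print(longitude)
```

If integer values are preferred, as in the lists shown in the question, astype(int) can be used in place of astype(float) for this sample.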

CodePudding user response：

We can use str.extract here, followed by a cast:

data[["lat", "lng"]] = data["Position"].str.extract(r'(-?\d+(?:\.\d+)?),\s*(-?\d+(?:\.\d+)?)')
data["lat"] = data["lat"].astype(float)
data["lng"] = data["lng"].astype(float)
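
To get the two plain lists asked for in the question, the extracted columns can be converted with tolist(). The following is a minimal, self-contained sketch of the str.extract approach. The small DataFrame is built by hand from the question's sample values (quotes included) and is only an assumption about what the Position column contains.

```python
import pandas as pd

# Hand-built stand-in for the Position column, quotes kept as in the sample.
data = pd.DataFrame({"Position": ['"52, 13"', '"53, 12"', '"54, 11"', '"55, 10"']})

# Two capture groups: an optionally signed number, a comma, optional
# whitespace, then a second optionally signed number.
pattern = r'(-?\d+(?:\.\d+)?),\s*(-?\d+(?:\.\d+)?)'

coords = data["Position"].str.extract(pattern).astype(float)
coords.columns = ["lat", "lng"]

latitude = coords["lat"].tolist()
longitude = coords["lng"].tolist()

print(latitude)
print(longitude)
```

Because the pattern matches only the numbers, the surrounding quotes do not need to be stripped first.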