Home > Software engineering >  Split a Pandas column into multiple columns
Split a Pandas column into multiple columns

Time:01-24

I have some coordinates in a csv file, formatted as [51.447084, -0.081564] that i'm reading into a pandas dataframe.

I want to split them into 2 separate columns.

I've tried

df[[f'Column {i}' for i in range(2)]] = df['event_location'].tolist()

which didn't work as my output was:

     event_location             Column 0           Column 1
0  [51.447084, -0.081564]  [51.447084, -0.081564]  [51.447084, -0.081564]
1    [51.447084, -0.081564]  [51.447084, -0.081564]  [51.447084, -0.081564]

This also didn't work

df[['lat', 'long']] = df['event_location'].str.split(',', expand=True)

Output:

            event_location        lat       long
0  [51.447084, -0.081564]  [51.447084, -0.081564]
1    [51.447084, -0.081564]  [51.447084, -0.081564]

I then thought to look at them in a list and I think this is my problem as each pair is one list item.

['[51.447084, -0.081564]', '[51.447084, -0.081564]']

Any ideas?

CodePudding user response:

You need:

df[['lat', 'lon']] = (df['event_location'].str.strip('[]')
                      .str.split(',', expand=True).astype(float)
                     )

Or:

df[['lat', 'lon']] = (df['event_location'].str.extract('(-?\d .?\d*),(-?\d .?\d*)')
                      .astype(float)
                     )

Output:

        event_location       lat      lon
0  [51.447084, -0.081564]  51.447084 -0.081564
1    [51.447084, -0.081564]  51.447084 -0.081564
  • Related