Unlisting lists in a DataFrame column

I have a DataFrame column in which each value is a list containing two lists:

coordinates
----
[[36.2046069345455, 23.466756], [56.678766, 45.1405656576776]]
[[46.2034534576765, 56.877879], [34.207049, 18.1565655652422]]
[[41.3223449567164, 34.645445], [78.206545, 66.1402362184811]]
[[23.2046069887887, 87.234223], [76.212123, 15.3943493949348]]
[[33.9685958954948, 78.454555], [32.765666, 23.4685489900090]]
[[12.7665776555654, 45.987878], [43.787786, 45.3494893404820]]

I want to split it into four separate columns. I tried:

df['coordinates'] = df['coordinates'].apply(lambda x: ' '.join(dict.fromkeys(x).keys()))

But it returns

TypeError: unhashable type: 'list'

Any idea on how to solve it?

CodePudding user response：

Assuming that each value of 'coordinates' consists of one list containing two lists with two values, you can use something like this:

import pandas as pd

df = pd.DataFrame({
    "coordinates": [
        [[36.2046069345455, 23.466756], [56.678766, 45.1405656576776]],
        [[46.2034534576765, 56.877879], [34.207049, 18.1565655652422]],
        [[41.3223449567164, 34.645445], [78.206545, 66.1402362184811]],
        [[23.2046069887887, 87.234223], [76.212123, 15.3943493949348]],
        [[33.9685958954948, 78.454555], [32.765666, 23.4685489900090]],
        [[12.7665776555654, 45.987878], [43.787786, 45.3494893404820]]
    ]
})

# .str[i].str[j] picks element j of inner list i from every row
pd.concat([df['coordinates'].str[i].str[j].rename(f'coordinates_{i}{j}') for i in [0, 1] for j in [0, 1]], axis=1)

------------------------------------------------------------------------
    coordinates_00  coordinates_01  coordinates_10  coordinates_11
0   36.204607       23.466756       56.678766       45.140566
1   46.203453       56.877879       34.207049       18.156566
2   41.322345       34.645445       78.206545       66.140236
3   23.204607       87.234223       76.212123       15.394349
4   33.968596       78.454555       32.765666       23.468549
5   12.766578       45.987878       43.787786       45.349489
------------------------------------------------------------------------
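
If the one-liner is hard to read, the same idea can be written as a dict comprehension. This is only a sketch, assuming every cell holds exactly two pairs of values; the name `out` and the shortened two-row sample are illustrative, not part of the question. It relies on the `.str` accessor indexing into lists stored in an object column, not only into strings.

```python
import pandas as pd

# Shortened sample, same shape as the data in the question
df = pd.DataFrame({
    "coordinates": [
        [[36.2046069345455, 23.466756], [56.678766, 45.1405656576776]],
        [[46.2034534576765, 56.877879], [34.207049, 18.1565655652422]],
    ]
})

# First .str[i] selects inner list i, second .str[j] selects value j in it
out = pd.DataFrame({
    f"coordinates_{i}{j}": df["coordinates"].str[i].str[j]
    for i in (0, 1)
    for j in (0, 1)
})
print(out)
```

The column order follows the order of the loops, so `coordinates_00` and `coordinates_01` come from the first pair and `coordinates_10` and `coordinates_11` from the second.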

Alternative solution that is even shorter and uses .apply:

from itertools import chain

pd.DataFrame(df['coordinates'].apply(lambda x: list(chain.from_iterable(x))).to_dict()).T

You can then just rename the columns as you want.
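
As a rough sketch of the renaming step, the column names can also be passed in when the frame is built from the flattened rows. The names `x1`, `y1`, `x2`, `y2` are placeholders I made up, and the sample is cut down to two rows.

```python
from itertools import chain

import pandas as pd

df = pd.DataFrame({
    "coordinates": [
        [[36.2046069345455, 23.466756], [56.678766, 45.1405656576776]],
        [[46.2034534576765, 56.877879], [34.207049, 18.1565655652422]],
    ]
})

# Flatten each [[a, b], [c, d]] cell into [a, b, c, d]
flat = df["coordinates"].apply(lambda x: list(chain.from_iterable(x)))

# Build the frame from the flattened rows and name the columns right away
# (hypothetical column names, pick whatever fits your data)
result = pd.DataFrame(flat.tolist(), index=df.index,
                      columns=["x1", "y1", "x2", "y2"])
print(result)
```

Passing `index=df.index` keeps the original row labels, which matters if you later want to join the new columns back onto `df`.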

CodePudding user response：

Probably not the prettiest solution but this should do the trick:

import pandas as pd

coordinates = [
    [[36.2046069345455, 23.466756], [56.678766, 45.1405656576776]],
    [[46.2034534576765, 56.877879], [34.207049, 18.1565655652422]],
    [[41.3223449567164, 34.645445], [78.206545, 66.1402362184811]],
    [[23.2046069887887, 87.234223], [76.212123, 15.3943493949348]],
    [[33.9685958954948, 78.454555], [32.765666, 23.4685489900090]],
    [[12.7665776555654, 45.987878], [43.787786, 45.3494893404820]]]

df = pd.DataFrame({"coordinates": coordinates})

# Split each cell into its two inner lists once, then expand each list into two columns
pairs = pd.DataFrame(df["coordinates"].to_list(), columns=['c12', 'c34'])
df[["c1", "c2"]] = pd.DataFrame(pairs['c12'].to_list(), columns=['c1', 'c2'])
df[["c3", "c4"]] = pd.DataFrame(pairs['c34'].to_list(), columns=['c3', 'c4'])
del df["coordinates"]

print(df)

          c1         c2         c3         c4
0  36.204607  23.466756  56.678766  45.140566
1  46.203453  56.877879  34.207049  18.156566
2  41.322345  34.645445  78.206545  66.140236
3  23.204607  87.234223  76.212123  15.394349
4  33.968596  78.454555  32.765666  23.468549
5  12.766578  45.987878  43.787786  45.349489
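
If every cell really has the same 2x2 shape, a NumPy reshape might be a more compact way to get the same four columns. This is a different technique from the one above, shown only as a sketch under that assumption; `arr` and `flat_df` are names I made up, and the sample is cut down to two rows.

```python
import numpy as np
import pandas as pd

coordinates = [
    [[36.2046069345455, 23.466756], [56.678766, 45.1405656576776]],
    [[46.2034534576765, 56.877879], [34.207049, 18.1565655652422]],
]
df = pd.DataFrame({"coordinates": coordinates})

# Stack the nested lists into a (rows, 2, 2) array,
# then flatten each row into 4 values
arr = np.array(df["coordinates"].tolist()).reshape(len(df), -1)

flat_df = pd.DataFrame(arr, columns=["c1", "c2", "c3", "c4"], index=df.index)
print(flat_df)
```

This will not work as written if some cells have a different number of values, so check the data first.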