I have a column in which each value is a list containing two lists:
coordinates
----
[[36.2046069345455, 23.466756], [56.678766, 45.1405656576776]]
[[46.2034534576765, 56.877879], [34.207049, 18.1565655652422]]
[[41.3223449567164, 34.645445], [78.206545, 66.1402362184811]]
[[23.2046069887887, 87.234223], [76.212123, 15.3943493949348]]
[[33.9685958954948, 78.454555], [32.765666, 23.4685489900090]]
[[12.7665776555654, 45.987878], [43.787786, 45.3494893404820]]
I want to split it into four separate columns. I tried:
df['coordinates'] = df['coordinates'].apply(lambda x: ' '.join(dict.fromkeys(x).keys()))
But it returns
TypeError: unhashable type: 'list'
Any idea on how to solve it?
CodePudding user response:
Assuming that each value of 'coordinates' consists of one list containing two lists with two values, you can use something like this:
import pandas as pd
df = pd.DataFrame({
    "coordinates": [
        [[36.2046069345455, 23.466756], [56.678766, 45.1405656576776]],
        [[46.2034534576765, 56.877879], [34.207049, 18.1565655652422]],
        [[41.3223449567164, 34.645445], [78.206545, 66.1402362184811]],
        [[23.2046069887887, 87.234223], [76.212123, 15.3943493949348]],
        [[33.9685958954948, 78.454555], [32.765666, 23.4685489900090]],
        [[12.7665776555654, 45.987878], [43.787786, 45.3494893404820]]
    ]})
# For each position (i, j), take element j of inner list i from every row
# and name the resulting column coordinates_{i}{j}.
pd.concat(
    [df['coordinates'].str[i].str[j].rename(f'coordinates_{i}{j}')
     for i in [0, 1] for j in [0, 1]],
    axis=1,
)
------------------------------------------------------------------------
coordinates_00 coordinates_01 coordinates_10 coordinates_11
0 36.204607 23.466756 56.678766 45.140566
1 46.203453 56.877879 34.207049 18.156566
2 41.322345 34.645445 78.206545 66.140236
3 23.204607 87.234223 76.212123 15.394349
4 33.968596 78.454555 32.765666 23.468549
5 12.766578 45.987878 43.787786 45.349489
------------------------------------------------------------------------
Alternative solution that is even shorter and uses .apply:
from itertools import chain
pd.DataFrame(df['coordinates'].apply(lambda x: list(chain.from_iterable(x))).to_dict()).T
You can then just rename the columns as you want.
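As a minimal sketch of that renaming step: the flattened lists can also be passed straight to the DataFrame constructor together with the column names. The names x1, y1, x2, y2 are hypothetical (they are not from the question), and only the first two rows of the sample data are used here.

```python
from itertools import chain

import pandas as pd

df = pd.DataFrame({
    "coordinates": [
        [[36.2046069345455, 23.466756], [56.678766, 45.1405656576776]],
        [[46.2034534576765, 56.877879], [34.207049, 18.1565655652422]],
    ]
})

# Flatten each [[a, b], [c, d]] cell into [a, b, c, d].
flat = [list(chain.from_iterable(cell)) for cell in df["coordinates"]]

# Column names are hypothetical; choose whatever suits your data.
result = pd.DataFrame(flat, columns=["x1", "y1", "x2", "y2"], index=df.index)
print(result)
```

Passing index=df.index keeps the original row labels, which matters if the frame does not use the default 0..n-1 index.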
CodePudding user response:
Probably not the prettiest solution but this should do the trick:
import pandas as pd
coordinates = [
    [[36.2046069345455, 23.466756], [56.678766, 45.1405656576776]],
    [[46.2034534576765, 56.877879], [34.207049, 18.1565655652422]],
    [[41.3223449567164, 34.645445], [78.206545, 66.1402362184811]],
    [[23.2046069887887, 87.234223], [76.212123, 15.3943493949348]],
    [[33.9685958954948, 78.454555], [32.765666, 23.4685489900090]],
    [[12.7665776555654, 45.987878], [43.787786, 45.3494893404820]]]
df = pd.DataFrame({"coordinates": coordinates})
# First split each cell into its two inner lists (c12 and c34) ...
pairs = pd.DataFrame(df["coordinates"].to_list(), columns=['c12', 'c34'])
# ... then expand each inner list into two scalar columns.
df[["c1", "c2"]] = pd.DataFrame(pairs['c12'].to_list(), columns=['c1', 'c2'])
df[["c3", "c4"]] = pd.DataFrame(pairs['c34'].to_list(), columns=['c3', 'c4'])
del df["coordinates"]
print(df)
> c1 c2 c3 c4
0 36.204607 23.466756 56.678766 45.140566
1 46.203453 56.877879 34.207049 18.156566
2 41.322345 34.645445 78.206545 66.140236
3 23.204607 87.234223 76.212123 15.394349
4 33.968596 78.454555 32.765666 23.468549
5 12.766578 45.987878 43.787786 45.349489
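For reference, the TypeError in the question comes from dict.fromkeys, which tries to use each element of the cell as a dictionary key. Here the elements are themselves lists, and lists are not hashable. A minimal sketch reproducing this with a single cell (the variable names cell and message are only for illustration):

```python
# One cell of the 'coordinates' column: a list containing two lists.
cell = [[36.2046069345455, 23.466756], [56.678766, 45.1405656576776]]

try:
    # Each element of `cell` would become a dict key, but the elements are lists.
    dict.fromkeys(cell)
    message = None
except TypeError as exc:
    message = str(exc)

print(message)
```

Note that even with a flat list of floats, ' '.join would still fail because it expects strings, so that approach is not a good fit for splitting numeric values into columns.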