I am trying to split one variable (column) of a pandas DataFrame into several. Row by row, the values of that variable look like this (there are other variables in my df, and I want to keep them as they are):
Variable1
[('Max'), ('15'), ('Place'), ('FB'), ('27 Aug 2022')]
[('Mily', ), ('Place'), ('Google'), ('22 Aug 2022')]
[('Mishika'), ('Place', ), ('London'), ('22 Aug 2022')]
Ideally, I want to split Variable1 into as many variables as there are elements in each list, row-wise. The resulting frame should look like the following:
V1 V2 V3 V4 V5
Max 15 Place FB 27 Aug 2022
Mily Place Google 22 Aug 2022
Mishika Place London 22 Aug 2022
I know the value types in each variable will be a mess, but that is not an issue for me. I have this as of now:
for l in df['Variable1'].split('\),\s\('):
    df['var_l'] = l
Which is of course not correct.
CodePudding user response:
Considering the given DataFrame:
import pandas as pd
df = pd.DataFrame({'Variable1': [["(Max)", "(15)", "(Place)", "(FB)", "(27 Aug 2022)"],
                                 ["(Mily, )", "(Place)", "(Google)", "(22 Aug 2022)"],
                                 ["(Mishika)", "(Place, )", "(London)", "(22 Aug 2022)"]]})
>>> print(df)
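One possible way to continue from this frame is sketched below; it is not necessarily the approach the answer had in mind. It assumes each cell holds a Python list of strings as defined above, strips the parentheses, commas and spaces from every element, and then expands the lists into columns. The helper name `clean` and the column names `V1`..`V5` (taken from the question) are illustrative choices.

```python
import pandas as pd

df = pd.DataFrame({'Variable1': [["(Max)", "(15)", "(Place)", "(FB)", "(27 Aug 2022)"],
                                 ["(Mily, )", "(Place)", "(Google)", "(22 Aug 2022)"],
                                 ["(Mishika)", "(Place, )", "(London)", "(22 Aug 2022)"]]})

def clean(items):
    # Hypothetical helper: drop leading/trailing parentheses, commas and spaces.
    return [item.strip("(), ") for item in items]

cleaned = df['Variable1'].apply(clean)

# One column per list position; shorter rows are padded on the right with missing values.
expanded = pd.DataFrame(cleaned.tolist(), index=df.index)
expanded.columns = [f"V{i + 1}" for i in range(expanded.shape[1])]

# Keep any other columns of df and replace Variable1 with the new ones.
result = df.drop(columns='Variable1').join(expanded)
print(result)
```

Rows with fewer elements are filled from the left, so the missing values end up in the last columns rather than in the position where an element is "missing" semantically.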
CodePudding user response:
If I understood you correctly, you want something like this?
import pandas as pd
# Note: ('Max') is just the string 'Max'; parentheses without a trailing comma do not create a tuple.
variable1 = [('Max'), ('15'), ('Place'), ('FB'), ('27 Aug 2022')]
variable2 = [('Mily'), ('Place'), ('Google'), ('22 Aug 2022')]
# Build a one-row frame from each list, with one column per element.
df = pd.DataFrame([variable1])
df2 = pd.DataFrame([variable2])
# Stack the rows; the shorter row is padded with NaN on the right.
df3 = pd.concat([df, df2], axis=0)
print(df3.head())
The output is:
      0      1       2            3            4
0   Max     15   Place           FB  27 Aug 2022
0  Mily  Place  Google  22 Aug 2022          NaN
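The same idea can be applied to a whole column at once while keeping the other columns, which is what the question asks for. The following is only a sketch: it assumes the cells of `Variable1` hold actual Python lists (not their string representation), the `Other` column is a made-up stand-in for the columns that should stay untouched, and `unwrap` is a hypothetical helper for the one-element tuples such as `('Mily', )`.

```python
import pandas as pd

# Hypothetical frame: 'Other' stands in for the columns that should stay as they are.
df = pd.DataFrame({
    'Other': [1, 2, 3],
    'Variable1': [
        [('Max'), ('15'), ('Place'), ('FB'), ('27 Aug 2022')],
        [('Mily', ), ('Place'), ('Google'), ('22 Aug 2022')],
        [('Mishika'), ('Place', ), ('London'), ('22 Aug 2022')],
    ],
})

def unwrap(item):
    # ('Mily', ) is a real one-element tuple, while ('Max') is a plain string.
    return item[0] if isinstance(item, tuple) else item

# Build a list of rows, then let pandas pad the shorter rows with missing values.
rows = [[unwrap(item) for item in row] for row in df['Variable1']]
expanded = pd.DataFrame(rows, index=df.index)
expanded.columns = [f"V{i + 1}" for i in range(expanded.shape[1])]

# Put the new columns next to the untouched ones.
result = pd.concat([df.drop(columns='Variable1'), expanded], axis=1)
print(result)
```

Passing `index=df.index` keeps the new columns aligned with the original rows, so this also works when the frame does not have a default 0..n-1 index.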