I want to split a pandas Series of tuples into multiple columns on the fly. The dummy data can be generated with the code below:
import numpy as np
import pandas as pd
df = pd.DataFrame(data={'a':[1, 2, 3, 4, 5, 6]})
df['b'] = pd.Series([(np.nan, 1), ('AB', 10), ('CD', 1), (3, 1), (4, 1), ('NA', 1)]).str[0]
df['c'] = pd.Series([(np.nan, 1), ('AB', 10), ('CD', 1), (3, 1), (4, 1), ('NA', 1)]).str[1]
How can I create columns b and c in one line of code?
I tried the code below, but it doesn't split the tuples as required.
df[['b', 'c']] = pd.Series([(np.nan, 1), ('AB', 10), ('CD', 1), (3, 1), (4, 1), ('NA', 1)]).str[0:2]
CodePudding user response:
Assign s.str[0] and s.str[1] to the two columns:
df = pd.DataFrame(data={'a':[1, 2, 3, 4, 5, 6]})
s = pd.Series([(np.nan, 1), ('AB', 10), ('CD', 1), (3, 1), (4, 1), ('NA', 1)])
df['b'], df['c'] = s.str[0], s.str[1]
Or create a two-column DataFrame:
s = pd.Series([(np.nan, 1), ('AB', 10), ('CD', 1), (3, 1), (4, 1), ('NA', 1)])
df[['b', 'c']] = pd.DataFrame(s.tolist(), index=df.index)
print(df)
   a    b   c
0  1  NaN   1
1  2   AB  10
2  3   CD   1
3  4    3   1
4  5    4   1
5  6   NA   1
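As for why the attempt in the question does not split anything: as far as I can tell, .str[0:2] slices each tuple instead of selecting two positions, so the result is still a single Series of tuples. A minimal sketch, assuming a reasonably recent pandas; the names sliced and parts are only illustrative and not part of the original code:

```python
import numpy as np
import pandas as pd

s = pd.Series([(np.nan, 1), ('AB', 10), ('CD', 1), (3, 1), (4, 1), ('NA', 1)])

# A slice through .str is applied to every element, so each 2-tuple comes back as a 2-tuple
sliced = s.str[0:2]
print(type(sliced).__name__)
print(sliced.iloc[1])

# Turning the Series into a list of tuples lets DataFrame spread them over two columns
parts = pd.DataFrame(s.tolist(), index=s.index, columns=['b', 'c'])
print(parts.shape)
```

The slice keeps one object per row, while the list of tuples gives the DataFrame constructor one value per column.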
The same two approaches written as one-liners:
df['b'], df['c'] = pd.Series([(np.nan, 1), ('AB', 10), ('CD', 1), (3, 1), (4, 1), ('NA', 1)]).str[0], pd.Series([(np.nan, 1), ('AB', 10), ('CD', 1), (3, 1), (4, 1), ('NA', 1)]).str[1]
df[['b', 'c']] = pd.DataFrame(pd.Series([(np.nan, 1), ('AB', 10), ('CD', 1), (3, 1), (4, 1), ('NA', 1)]).tolist(), index=df.index)
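If the tuples are available as a plain Python list, unpacking with zip(*...) is another way to fill both columns in one line. This is not one of the approaches above, just a sketch under that assumption; the name pairs is hypothetical:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(data={'a': [1, 2, 3, 4, 5, 6]})
pairs = [(np.nan, 1), ('AB', 10), ('CD', 1), (3, 1), (4, 1), ('NA', 1)]

# zip(*pairs) transposes the six 2-tuples into two 6-tuples,
# which are then unpacked into the two new columns
df['b'], df['c'] = zip(*pairs)

print(df)
```

This avoids building the Series twice, at the cost of requiring every tuple to have exactly two elements.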