So I have for example a DataFrame with the two columns:
col1 col2
['1', '2', '3'] ['A', 'B', 'C']
['4', '5', '6'] ['D', 'E', 'F']
etc.
I would like to get a third column with:
col3
[['1', 'A'], ['2', 'B'], ['3', 'C']]
[['4', 'D'], ['5', 'E'], ['6', 'F']]
etc
I have tried to use apply and combine it with a lambda function like this:
df['col3'] = df.apply(lambda x: [y,z] for y,z in zip(x['col1'], x['col2']), axis=1)
But this only gives the error:
SyntaxError: Generator expression must be parenthesized
Can someone help me?
CodePudding user response:
In your solution, add [] around the expression so the lambda returns a list comprehension:
df['col3'] = df.apply(lambda x: [[y,z] for y,z in zip(x['col1'], x['col2'])], axis=1)
print (df)
col1 col2 col3
0 [1, 2, 3] [A, B, C] [[1, A], [2, B], [3, C]]
1 [4, 5, 6] [D, E, F] [[4, D], [5, E], [6, F]]
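As a rough illustration of why the brackets matter: without them, Python reads the argument as a bare generator expression followed by another argument (axis=1), which it rejects before anything runs. The sketch below only compiles the two expressions as strings; f and x are placeholder names, not part of the original code.

```python
# f and x are placeholder names; the sources are only compiled, never executed.
bad = "f(lambda x: [y, z] for y, z in zip(x['col1'], x['col2']), axis=1)"
good = "f(lambda x: [[y, z] for y, z in zip(x['col1'], x['col2'])], axis=1)"


def compiles(source):
    """Return True if source parses as a Python expression."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False


print(compiles(bad))   # False
print(compiles(good))  # True
```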
Or use nested list comprehension:
df['col3'] = [[[a, b] for a, b in zip(*x)] for x in zip(df['col1'], df['col2'])]
print (df)
col1 col2 col3
0 [1, 2, 3] [A, B, C] [[1, A], [2, B], [3, C]]
1 [4, 5, 6] [D, E, F] [[4, D], [5, E], [6, F]]
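For reference, here is a minimal self-contained sketch that rebuilds the sample frame and checks that both approaches give the same result. The data is assumed to look like the example in the question, and the names via_apply and via_comprehension are made up for this illustration.

```python
import pandas as pd

# Sample data mirroring the question (an assumption about the real frame).
df = pd.DataFrame({
    "col1": [["1", "2", "3"], ["4", "5", "6"]],
    "col2": [["A", "B", "C"], ["D", "E", "F"]],
})

# Row-wise apply: the outer brackets make the lambda return a list of pairs.
via_apply = df.apply(
    lambda x: [[y, z] for y, z in zip(x["col1"], x["col2"])], axis=1
)

# Plain nested list comprehension over the two columns, no apply involved.
via_comprehension = [
    [[a, b] for a, b in zip(*x)] for x in zip(df["col1"], df["col2"])
]

# Both approaches should produce the same nested lists.
print(via_apply.tolist() == via_comprehension)

df["col3"] = via_apply
print(df)
```

The list comprehension avoids the per-row overhead of apply, so it is usually the faster of the two on larger frames.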
CodePudding user response:
IIUC you could use list(map(list,zip(*r))) and result_type='reduce' in apply:
df['col3'] = df.apply(lambda r: list(map(list,zip(*r))),
                      result_type='reduce', axis=1)
output:
col1 col2 col3
0 [1, 2, 3] [A, B, C] [[1, A], [2, B], [3, C]]
1 [4, 5, 6] [D, E, F] [[4, D], [5, E], [6, F]]
If you want to limit the processing to a subset of columns:
cols = ['col1', 'col2']
df['col3'] = df[cols].apply(lambda r: list(map(list,zip(*r))),
                            result_type='reduce', axis=1)
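To see what list(map(list,zip(*r))) does to a single row, independent of pandas, here is a small sketch. The tuple row stands in for the values apply would pass for one row; it is a hypothetical name used only for this example.

```python
# One row's values, standing in for what apply(axis=1) passes to the lambda.
row = (["1", "2", "3"], ["A", "B", "C"])

# zip(*row) pairs up elements position by position and yields tuples.
pairs_as_tuples = list(zip(*row))

# map(list, ...) converts each tuple into a list, matching the desired output.
pairs_as_lists = list(map(list, zip(*row)))

print(pairs_as_tuples)  # [('1', 'A'), ('2', 'B'), ('3', 'C')]
print(pairs_as_lists)   # [['1', 'A'], ['2', 'B'], ['3', 'C']]
```

Because zip(*row) unpacks every value in the row, restricting the frame to the columns of interest first (as with df[cols] above) keeps any other columns out of the pairing.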