I have some data that looks like this:
import pandas as pd

d = {'id': ["A", "A", "A", "A", "A", "A", "B", "B", "B", "B", "B", "B"],
     'month': [1, 1, 1, 1, 2, 2, 1, 1, 2, 2, 2, 2],
     'week': [1, 2, 3, 4, 1, 2, 1, 2, 1, 2, 3, 4]}
example_df = pd.DataFrame(data=d)
I want to group by id and create a new column based on the contents of month and week, but I get this error: KeyError: 'month'. Here is my attempt:
example_df['final_score'] = (
example_df.groupby(['id'])
.transform(lambda x: 'converted' if ((x['month'] == 1) &
(x['week'].isin([3,4].any())))
else 'not_converted')
)
Does anyone know what's going on here?
CodePudding user response:
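groupby.transform handles only one column at a time: the function receives each column as a separate Series, so x['month'] is looked up in that column's index and raises the KeyError. Use groupby.transform('any') to build a per-group mask and combine it with numpy.where: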
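import numpy as np

# m1: rows where month is 1
m1 = example_df['month'].eq(1)
# m2: True for every row of an id that has at least one week in [3, 4]
m2 = example_df['week'].isin([3, 4]).groupby(example_df['id']).transform('any')
example_df['final_score'] = np.where(m1 & m2, 'converted', 'not_converted')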
Output:
    id  month  week    final_score
0    A      1     1      converted
1    A      1     2      converted
2    A      1     3      converted
3    A      1     4      converted
4    A      2     1  not_converted
5    A      2     2  not_converted
6    B      1     1      converted
7    B      1     2      converted
8    B      2     1  not_converted
9    B      2     2  not_converted
10   B      2     3  not_converted
11   B      2     4  not_converted
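To see why the original lambda fails, it can help to record what transform actually passes to the function. The following is a minimal sketch, not part of the answer above: the helper name spy and the list seen are made up for illustration, and the exact calling pattern may vary between pandas versions.

```python
import pandas as pd

example_df = pd.DataFrame({
    "id": list("AAAAAABBBBBB"),
    "month": [1, 1, 1, 1, 2, 2, 1, 1, 2, 2, 2, 2],
    "week": [1, 2, 3, 4, 1, 2, 1, 2, 1, 2, 3, 4],
})

# Names of the Series objects handed to the function (hypothetical helper state).
seen = []

def spy(obj):
    # Only record Series inputs; pandas may also try the function on a
    # whole sub-DataFrame as a fast path, which is ignored here.
    if isinstance(obj, pd.Series):
        seen.append(obj.name)
    # Return the input unchanged so the result keeps the group's shape.
    return obj

example_df.groupby("id").transform(spy)
print(sorted(set(seen)))
```

Each recorded name is a single column label, which is why indexing x['month'] inside the lambda looks for a row labelled 'month' in that column and fails.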
CodePudding user response:
What result do you want? Is it maybe one you can get with this line of code?
import numpy as np
# Parenthesize each condition: & binds more tightly than ==.
example_df['final_score'] = np.where((example_df['month'] == 1) & example_df['week'].isin([3, 4]), "converted", "not_converted")
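In Python, & binds more tightly than ==, so combined conditions are easy to get wrong without parentheses. One way to sidestep this is to name each mask before combining them. A small sketch, assuming the example_df from the question; the variable names is_month_1 and is_week_3_or_4 are illustrative only:

```python
import numpy as np
import pandas as pd

example_df = pd.DataFrame({
    "id": list("AAAAAABBBBBB"),
    "month": [1, 1, 1, 1, 2, 2, 1, 1, 2, 2, 2, 2],
    "week": [1, 2, 3, 4, 1, 2, 1, 2, 1, 2, 3, 4],
})

# Build each boolean mask on its own line, then combine them with &.
is_month_1 = example_df["month"] == 1
is_week_3_or_4 = example_df["week"].isin([3, 4])

# Rows where both masks are True get "converted"; all others "not_converted".
labels = np.where(is_month_1 & is_week_3_or_4, "converted", "not_converted")
example_df["final_score"] = labels
print(example_df)
```

This checks the condition row by row, without any grouping by id.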
CodePudding user response:
I don't see why you need a groupby and transform for this, given that your transform is not dependent on the grouping. A simple apply like this should work:
example_df['final_score'] = example_df.apply(
    lambda x: 'converted' if (x['month'] == 1) and (x['week'] in [3, 4])
    else 'not_converted',
    axis=1)
Output:
    id  month  week    final_score
0    A      1     1  not_converted
1    A      1     2  not_converted
2    A      1     3      converted
3    A      1     4      converted
4    A      2     1  not_converted
5    A      2     2  not_converted
6    B      1     1  not_converted
7    B      1     2  not_converted
8    B      2     1  not_converted
9    B      2     2  not_converted
10   B      2     3  not_converted
11   B      2     4  not_converted
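The answers in this thread read the question in two ways: a row-wise check (this row has month 1 and week 3 or 4) and a group-wise check (this row has month 1 and its id has some week 3 or 4). A rough sketch that puts both masks side by side, assuming the example_df from the question; the names row_mask, group_mask and comparison are illustrative:

```python
import pandas as pd

example_df = pd.DataFrame({
    "id": list("AAAAAABBBBBB"),
    "month": [1, 1, 1, 1, 2, 2, 1, 1, 2, 2, 2, 2],
    "week": [1, 2, 3, 4, 1, 2, 1, 2, 1, 2, 3, 4],
})

in_week = example_df["week"].isin([3, 4])
is_month_1 = example_df["month"] == 1

# Row-wise: both conditions must hold on the same row.
row_mask = is_month_1 & in_week

# Group-wise: the week condition only has to hold somewhere within the id.
group_mask = is_month_1 & in_week.groupby(example_df["id"]).transform("any")

comparison = pd.DataFrame({"row_wise": row_mask, "group_wise": group_mask})
print(comparison.sum())  # number of rows each reading marks as converted
```

Which mask is appropriate depends on whether the score is meant to describe each row or each id as a whole.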