I want to select all rows of every id that has two rows with Type 'E'. An id can have many rows with Type 'S', but normally it has only one row with Type 'E'.
For example, ID 1114 has two rows with Type 'E', so I want to show all rows of 1114.
dataframe 1:
id    date        origine  destination  horaire A  horaire B  Type  Other data
1112  2021-03-11  Paris    Marseille    10:00      14:00      A     ..
1112  2021-03-11  Paris    Marseille    10:00      14:00      E     ..
1112  2021-03-11  Paris    Marseille    10:00      14:00      S     ..
1112  2021-03-11  Paris    Lyon         10:00      12:00      S     ..
1112  2021-03-11  Paris    Marseille    10:00      14:00      S     ..
1112  2021-03-11  Paris    Marseille    10:00      14:00      C     ..
1114  2021-05-11  Paris    Bordeaux     09:00      13:00      A     ..
1114  2021-05-11  Paris    Bordeaux     09:00      13:00      E     ..
1114  2021-05-11  Paris    Bordeaux     10:00      14:00      S     ..
1114  2021-05-11  Paris    Bordeaux     10:20      14:00      E     ..
1114  2021-05-11  Paris    Bordeaux     10:00      14:00      S     ..
1114  2021-05-11  Paris    Bordeaux     10:00      14:00      S     ..
1114  2021-05-11  Paris    Bordeaux     10:00      14:00      S     ..
1114  2021-05-11  Paris    Bordeaux     10:00      14:00      C     ..
Expected output:
id    date        origine  destination  horaire A  horaire B  Type  Other data
1114  2021-05-11  Paris    Bordeaux     09:00      13:00      A     ..
1114  2021-05-11  Paris    Bordeaux     09:00      13:00      E     ..
1114  2021-05-11  Paris    Bordeaux     10:00      14:00      S     ..
1114  2021-05-11  Paris    Bordeaux     10:20      14:00      E     ..
1114  2021-05-11  Paris    Bordeaux     10:00      14:00      S     ..
1114  2021-05-11  Paris    Bordeaux     10:00      14:00      S     ..
1114  2021-05-11  Paris    Bordeaux     10:00      14:00      S     ..
1114  2021-05-11  Paris    Bordeaux     10:00      14:00      C     ..
I wrote this code:
mask = df.groupby(['date','Id']).apply(lambda x: x['Type'].value_counts())
data_set = df[((df['Type']=='E').isin(mask.index[mask > 1]))]
data_set
But my output is empty.
CodePudding user response:
To count the number of E values per group, create a helper column tmp (True where Type is 'E') and count the True values with sum:
df = (df[df.assign(tmp = df['Type']=='E')
.groupby(['date','Id'])['tmp'].transform('sum').gt(1)])
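
As a rough check, the same idea can be run on a small frame. The sample below is a hypothetical reduction of the question's data to the Id, date and Type columns (the other columns are assumed not to matter for the filter), and the names e_count and result are only illustrative. It is a sketch, not the only way to do it. As far as I can tell, the attempt in the question comes back empty because df['Type']=='E' is a boolean Series, so isin compares True/False against the index labels of mask and never finds a match.

```python
import pandas as pd

# Hypothetical sample mirroring the question's data (relevant columns only).
df = pd.DataFrame({
    "Id":   [1112] * 6 + [1114] * 8,
    "date": ["2021-03-11"] * 6 + ["2021-05-11"] * 8,
    "Type": ["A", "E", "S", "S", "S", "C",
             "A", "E", "S", "E", "S", "S", "S", "C"],
})

# For each row, the number of 'E' rows in its (date, Id) group.
e_count = (df["Type"].eq("E")
             .groupby([df["date"], df["Id"]])
             .transform("sum"))

# Keep every row of the groups that contain more than one 'E'.
result = df[e_count.gt(1)]
print(result)
```

Because transform returns a value for every original row, the mask lines up with df and can be used directly for boolean indexing, with no helper column left behind in the result.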