I'm working with a pandas DataFrame that contains two columns: one for the image ID and one for its label.
train_data.head()
Image_id Label
0 id_004wknd7qd.jpg blast
1 id_004wknd7qd_rgn.jpg blast
2 id_005sitfgr2.jpg brown
3 id_005sitfgr2_rgn.jpg brown
4 id_00stp9t6m6.jpg blast
There are two types of images, RGBs and RGNs: the first row is an RGB and the second row is an RGN. I want to split them into two separate DataFrames. When I use the following line to select the RGBs, it produces the following output:
train_rgbs = train_data.apply(lambda x : x if 'rgn' not in x.Image_id else None, axis = 1).dropna()
train_rgbs.head()
Image_id Label
0 id_004wknd7qd.jpg blast
2 id_005sitfgr2.jpg brown
4 id_00stp9t6m6.jpg blast
6 id_012zxewnhx.jpg blast
8 id_0186qwq2at.jpg healthy
But if I invert the condition in the lambda to select the RGNs, the output changes completely:
train_rgns = train_data.apply(lambda x : x if 'rgn' in x.Image_id else None, axis = 1).dropna()
train_rgns.head()
1 Image_id id_004wknd7qd_rgn.jpg
Label ...
3 Image_id id_005sitfgr2_rgn.jpg
Label ...
5 Image_id id_00stp9t6m6_rgn.jpg
Label ...
7 Image_id id_012zxewnhx_rgn.jpg
Label ...
9 Image_id id_0186qwq2at_rgn.jpg
Label ...
dtype: object
Why does the output change when selecting the RGNs, so that it is no longer a DataFrame as in the first case? Thanks in advance.
CodePudding user response:
train_rgbs = train_data[train_data.Image_id.apply(lambda x: 'rgn' not in x)]
train_rgns = train_data[train_data.Image_id.apply(lambda x: 'rgn' in x)]
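As for why the two calls in the question differ: as far as I can tell, apply with axis=1 decides how to assemble the result by looking at what the function returns for the first row. Here the first row is an RGB, so the first lambda returns a Series first and pandas builds a DataFrame, while the second lambda returns None first and pandas falls back to a Series of objects, each of which is a whole row. This may vary between pandas versions, so treat it as a likely explanation rather than a guarantee. Either way, a boolean mask avoids the problem. Below is a minimal sketch using the vectorized str.contains instead of apply; the sample frame is rebuilt from the head() output in the question, and the is_rgn name is only for illustration.

```python
import pandas as pd

# Toy data rebuilt from the head() output shown in the question.
train_data = pd.DataFrame({
    "Image_id": [
        "id_004wknd7qd.jpg",
        "id_004wknd7qd_rgn.jpg",
        "id_005sitfgr2.jpg",
        "id_005sitfgr2_rgn.jpg",
        "id_00stp9t6m6.jpg",
    ],
    "Label": ["blast", "blast", "brown", "brown", "blast"],
})

# Boolean mask: True for rows whose file name contains "_rgn".
# regex=False treats the pattern as a plain substring.
is_rgn = train_data["Image_id"].str.contains("_rgn", regex=False)

train_rgns = train_data[is_rgn]    # rows where the mask is True
train_rgbs = train_data[~is_rgn]   # rows where the mask is False
```

Both results are regular DataFrames that keep the original index; call reset_index(drop=True) on them if you want the rows renumbered from zero.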