pandas DataFrame apply method changes its output depending on the lambda function

Time:08-05

I'm working with a pandas DataFrame that contains two columns: one for the image ID and one for its label.

train_data.head()
               Image_id  Label
0      id_004wknd7qd.jpg  blast
1  id_004wknd7qd_rgn.jpg  blast
2      id_005sitfgr2.jpg  brown
3  id_005sitfgr2_rgn.jpg  brown
4      id_00stp9t6m6.jpg  blast

There are two types of images, RGB and RGN: the first row is an RGB image and the second row is an RGN image. I want to split them into two separate DataFrames. When I use the following line to select the RGBs, I get the output below:

train_rgbs = train_data.apply(lambda x : x if 'rgn' not in x.Image_id else None, axis = 1).dropna()

train_rgbs.head()
            Image_id    Label
0  id_004wknd7qd.jpg    blast
2  id_005sitfgr2.jpg    brown
4  id_00stp9t6m6.jpg    blast
6  id_012zxewnhx.jpg    blast
8  id_0186qwq2at.jpg  healthy

But if I change the condition in the lambda function, the output changes completely:

train_rgns = train_data.apply(lambda x : x if 'rgn' in x.Image_id else None, axis = 1).dropna()
train_rgns.head()
1    Image_id    id_004wknd7qd_rgn.jpg
Label       ...
3    Image_id    id_005sitfgr2_rgn.jpg
Label       ...
5    Image_id    id_00stp9t6m6_rgn.jpg
Label       ...
7    Image_id    id_012zxewnhx_rgn.jpg
Label       ...
9    Image_id    id_0186qwq2at_rgn.jpg
Label       ...
dtype: object

Why does the output change when I select the RGNs, instead of being a DataFrame as in the first case? Thanks in advance.

CodePudding user response:

train_rgbs = train_data[train_data.Image_id.apply(lambda x: 'rgn' not in x)]
train_rgns = train_data[train_data.Image_id.apply(lambda x: 'rgn' in x)]
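As for why the two calls differ: as far as I understand, when DataFrame.apply is used with axis=1, pandas decides the shape of the result from what the function returns for the first row. In the RGB case row 0 returns a Series, so the result is assembled into a DataFrame and the None rows become all-NaN rows that dropna() removes. In the RGN case row 0 returns None, so pandas seems to fall back to an object Series whose elements are the row Series themselves. This may vary between pandas versions, so treat it as a likely explanation rather than a guarantee. Filtering with a boolean mask avoids the problem entirely. A minimal sketch, assuming the sample rows from the question and using the vectorized str.contains instead of a lambda (the name is_rgn is only illustrative):

```python
import pandas as pd

# Sample rows copied from the question's train_data.head() output.
train_data = pd.DataFrame({
    "Image_id": [
        "id_004wknd7qd.jpg", "id_004wknd7qd_rgn.jpg",
        "id_005sitfgr2.jpg", "id_005sitfgr2_rgn.jpg",
        "id_00stp9t6m6.jpg",
    ],
    "Label": ["blast", "blast", "brown", "brown", "blast"],
})

# Boolean mask: True where the file name contains the literal text "rgn".
# regex=False treats the pattern as a plain substring, not a regular expression.
is_rgn = train_data["Image_id"].str.contains("rgn", regex=False)

# Indexing with a boolean mask always returns a DataFrame,
# regardless of which rows happen to match.
train_rgns = train_data[is_rgn]
train_rgbs = train_data[~is_rgn]

print(train_rgns)
print(train_rgbs)
```

Both results keep the original index labels, so you can call reset_index(drop=True) on them if you want each frame to start again from 0.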