df looks like this:
description and keybenefits (14) | brand_cooltouch (1711) | brand_easylogic (1712) |
---|---|---|
Lorem Ipsum cooltouch Lorem Ipsum | | |
Lorem Ipsum easylogic Lorem Ipsum | | |
Lorem Ipsum Lorem Ipsum | | |
What I want: when the column description and keybenefits (14) contains the value 'cooltouch', the column brand_cooltouch (1711) needs to be set to 1 (int). When the column description and keybenefits (14) contains the value 'easylogic', the column brand_easylogic (1712) needs to be set to 1 (int).
Output that I want:
description and keybenefits (14) | brand_cooltouch (1711) | brand_easylogic (1712) |
---|---|---|
Lorem Ipsum cooltouch Lorem Ipsum | 1 | |
Lorem Ipsum Lorem Ipsum easylogic | | 1 |
Lorem Ipsum Lorem Ipsum | | |
Any help is very much appreciated.
CodePudding user response:
One can use pandas.Series.str.contains.
For the string cooltouch, do the following:
df['brand_cooltouch (1711)'] = df['description and keybenefits (14)'].str.contains('cooltouch', case=False).astype(int)
[Out]:
description and keybenefits (14) brand_cooltouch (1711) brand_easylogic (1712)
0 Lorem Ipsum cooltouch Lorem Ipsum 1 None
1 Lorem Ipsum easylogic Lorem Ipsum 0 None
2 Lorem Ipsum Lorem Ipsum 0 None
For the string easylogic, do the following:
df['brand_easylogic (1712)'] = df['description and keybenefits (14)'].str.contains('easylogic', case=False).astype(int)
[Out]:
description and keybenefits (14) brand_cooltouch (1711) brand_easylogic (1712)
0 Lorem Ipsum cooltouch Lorem Ipsum 1 0
1 Lorem Ipsum easylogic Lorem Ipsum 0 1
2 Lorem Ipsum Lorem Ipsum 0 0
Notes:
case=False makes the match case-insensitive.
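If there are more than two brands, the same str.contains call can be put in a loop. The following is only a sketch: the brand_columns mapping and the desc_col variable are names made up for illustration, and the sample frame just mirrors the one in the question. regex=False treats the keyword as a literal substring, and na=False counts a missing description as "no match" so that astype(int) does not fail.

```python
import pandas as pd

desc_col = "description and keybenefits (14)"

# Sample data mirroring the question.
df = pd.DataFrame({
    desc_col: [
        "Lorem Ipsum cooltouch Lorem Ipsum",
        "Lorem Ipsum easylogic Lorem Ipsum",
        "Lorem Ipsum Lorem Ipsum",
    ]
})

# Hypothetical mapping: keyword to look for -> column to fill.
brand_columns = {
    "cooltouch": "brand_cooltouch (1711)",
    "easylogic": "brand_easylogic (1712)",
}

for keyword, column in brand_columns.items():
    # Boolean mask: True where the description contains the keyword.
    mask = df[desc_col].str.contains(keyword, case=False, regex=False, na=False)
    # True/False -> 1/0; this creates the column or overwrites an existing one.
    df[column] = mask.astype(int)

print(df)
```

Adding another brand then only means adding one more entry to the mapping.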
CodePudding user response:
You can use np.where. I'd suggest filling all cells where the condition is not met with NaN or 0. Here is a solution using np.nan:
import numpy as np
df["brand_cooltouch (1711)"] = np.where(df["description and keybenefits (14)"].str.contains("cooltouch"), 1, np.nan)
df["brand_easylogic (1712)"] = np.where(df["description and keybenefits (14)"].str.contains("easylogic"), 1, np.nan)
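One thing to keep in mind with the np.nan variant: NaN is a float, so the resulting column is float (1.0 / NaN) rather than int. A small sketch of the difference, assuming a sample frame in which one description is missing; the column name brand_cooltouch_int is made up for the comparison.

```python
import numpy as np
import pandas as pd

desc_col = "description and keybenefits (14)"

# Sample data; the second description is deliberately missing.
df = pd.DataFrame({
    desc_col: [
        "Lorem Ipsum cooltouch Lorem Ipsum",
        None,
        "Lorem Ipsum Lorem Ipsum",
    ]
})

# na=False turns missing descriptions into False instead of NaN,
# so the mask is a clean boolean array for np.where.
mask = df[desc_col].str.contains("cooltouch", na=False)

# Filling with np.nan gives a float column.
df["brand_cooltouch (1711)"] = np.where(mask, 1, np.nan)

# Filling with 0 keeps an integer column (hypothetical column name).
df["brand_cooltouch_int"] = np.where(mask, 1, 0)

print(df.dtypes)
```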
CodePudding user response:
Use Series.str.contains:
df['brand_cooltouch (1711)'] = df['description and keybenefits (14)'].str.contains("cooltouch").astype(int)
Output
description and keybenefits (14) brand_cooltouch (1711) brand_easylogic (1712)
0 Lorem Ipsum cooltouch Lorem Ipsum 1 NaN
1 Lorem Ipsum easylogic Lorem Ipsum 0 NaN
2 Lorem Ipsum Lorem Ipsum 0 NaN
If you do not want the non-matching rows to show 0, you could also do something like this (note that it stores the strings '1' and '' rather than integers):
df.loc[df['description and keybenefits (14)'].str.contains("cooltouch"), ['brand_cooltouch (1711)']] = '1'
df.loc[~df['description and keybenefits (14)'].str.contains("cooltouch"), ['brand_cooltouch (1711)']] = ''
Output
description and keybenefits (14) brand_cooltouch (1711) brand_easylogic (1712)
0 Lorem Ipsum cooltouch Lorem Ipsum 1 NaN
1 Lorem Ipsum easylogic Lorem Ipsum NaN
2 Lorem Ipsum Lorem Ipsum NaN
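If the 1 should stay an integer while the other cells remain empty, one possible alternative (not part of the answer above, just a sketch) is pandas' nullable "Int64" dtype, which can hold missing values next to integers. The sample frame mirrors the question; desc_col is a name made up for illustration.

```python
import pandas as pd

desc_col = "description and keybenefits (14)"

# Sample data mirroring the question.
df = pd.DataFrame({
    desc_col: [
        "Lorem Ipsum cooltouch Lorem Ipsum",
        "Lorem Ipsum easylogic Lorem Ipsum",
        "Lorem Ipsum Lorem Ipsum",
    ]
})

mask = df[desc_col].str.contains("cooltouch", na=False)

# Convert True/False to nullable integers 1/0, then keep only the
# matching rows; every other row becomes <NA>.
df["brand_cooltouch (1711)"] = mask.astype("Int64").where(mask)

print(df)
```

The matching row then holds the integer 1 and the remaining rows hold a missing value instead of 0 or an empty string.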