Does a DataFrame column contain a string? If so, return that string in a new column

Time:12-18

I have a DataFrame and I want to create a new column: if a given string exists in a specific column, the new column's value should be that string plus the three characters that follow it.

Example:

In this example I want to search for the string "Note". If that string exists in the note column, the new column should contain "Note" plus whatever is in the next three characters after it.

Before:

id  partNumber  note
1   a1b33       apples
2   hhgh5667    banana, Note 55, and pineapples
3   hhgh5667    Note 1A, and blueberries
4   09890ii     blackberries

After:

id  partNumber  note                             Note_number
1   a1b33       apples                           NA
2   hhgh5667    banana, Note 55, and pineapples  Note 55
3   hhgh5667    Note 1A, and blueberries         Note 1A
4   09890ii     blackberries                     NA

CodePudding user response:

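You can use a regular expression with str.extract to capture everything from Note to just before the next comma. The lazy `.*?` stops at the first comma after Note, the raw string avoids an invalid-escape warning, and expand=False returns a Series instead of a one-column DataFrame. Rows with no match get NaN.

df['Note_number'] = df['note'].str.extract(r'(Note.*?)(?=,)', expand=False)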

Output

   id partNumber                             note Note_number
0   1      a1b33                           apples         NaN
1   2   hhgh5667  banana, Note 55, and pineapples     Note 55
2   3   hhgh5667         Note 1A, and blueberries     Note 1A
3   4    09890ii                     blackberries         NaN
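
For anyone who wants to reproduce this, here is a minimal self-contained sketch. The DataFrame is rebuilt from the sample rows in the question. The second column, `Note_fixed`, is a hypothetical name and an assumption on my part: it reads the question literally ("Note" plus exactly the next three characters) rather than relying on a trailing comma, so it may suit data where the note is not always followed by a comma.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "partNumber": ["a1b33", "hhgh5667", "hhgh5667", "09890ii"],
    "note": [
        "apples",
        "banana, Note 55, and pineapples",
        "Note 1A, and blueberries",
        "blackberries",
    ],
})

# Comma-delimited variant: "Note" up to (not including) the next comma.
df["Note_number"] = df["note"].str.extract(r"(Note.*?)(?=,)", expand=False)

# Fixed-width variant (assumption): "Note" plus exactly the next three characters.
# "Note_fixed" is an illustrative column name, not part of the original answer.
df["Note_fixed"] = df["note"].str.extract(r"(Note.{3})", expand=False)

print(df)
```

On the sample rows both patterns give the same result. They differ once a note is not followed by a comma (the first returns NaN) or the value after "Note" is not exactly three characters long (the second captures too little or too much).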