Filter rows matched string patterns (specific string integer another specific string) in Pandas-CodePudding

Given a DataFrame df:

   id                                        items
0   1  CRB Spot Index: Grease: First 2 Months: YoY
1   2                     CRB Spot Index: Textiles
2   3                    CRB Spot Index: Livestock
3   4      Korea: Export Value: First 10 Days: YoY
4   5      Korea: Export Value: First 20 Days: YoY
5   6      Korea: Import Value: First 20 Days: YoY
6   7      Korea: Import Value: First 10 Days: YoY
7   8    Vietnam: Import Value: First 2 Weeks: MoM
8   9    Vietnam: Import Value: First 3 Weeks: MoM

I want to filter items which match patterns: First integer Days or First integer Weeks, then create a new column to indicate them:

   id                                        items indicator
0   1  CRB Spot Index: Grease: First 2 Months: YoY       NaN
1   2                     CRB Spot Index: Textiles       NaN
2   3                    CRB Spot Index: Livestock       NaN
3   4      Korea: Export Value: First 10 Days: YoY         Y
4   5      Korea: Export Value: First 20 Days: YoY         Y
5   6      Korea: Import Value: First 20 Days: YoY         Y
6   7      Korea: Import Value: First 10 Days: YoY         Y
7   8    Vietnam: Import Value: First 2 Weeks: MoM         Y
8   9    Vietnam: Import Value: First 3 Weeks: MoM         Y

How could I do that using Pandas?

I use df.loc[(df['items'].str.contains('First|Days', na=False)) & (df['items'].str.contains('First|Weeks', na=False)), 'indicators']='Y', it doesn't generate an expected result as shown above, also maybe there are other more concise answers.

CodePudding user response：

Check regex for match First, then \s for space(s), then \d for integers, \s for again space(s) and last Days|Weeks for Days or Weeks:

df.loc[df['items'].str.contains('First\s \d \s Days|Weeks', na=False), 'indicators']='Y'
print (df)
   id                                        items indicators
0   1  CRB Spot Index: Grease: First 2 Months: YoY        NaN
1   2                     CRB Spot Index: Textiles        NaN
2   3                    CRB Spot Index: Livestock        NaN
3   4      Korea: Export Value: First 10 Days: YoY          Y
4   5      Korea: Export Value: First 20 Days: YoY          Y
5   6      Korea: Import Value: First 20 Days: YoY          Y
6   7      Korea: Import Value: First 10 Days: YoY          Y
7   8    Vietnam: Import Value: First 2 Weeks: MoM          Y
8   9    Vietnam: Import Value: First 3 Weeks: MoM          Y