Filter rows matching a string pattern (specific string, integer, another specific string) in Pandas

Time:02-24

Given a DataFrame df:

   id                                        items
0   1  CRB Spot Index: Grease: First 2 Months: YoY
1   2                     CRB Spot Index: Textiles
2   3                    CRB Spot Index: Livestock
3   4      Korea: Export Value: First 10 Days: YoY
4   5      Korea: Export Value: First 20 Days: YoY
5   6      Korea: Import Value: First 20 Days: YoY
6   7      Korea: Import Value: First 10 Days: YoY
7   8    Vietnam: Import Value: First 2 Weeks: MoM
8   9    Vietnam: Import Value: First 3 Weeks: MoM

I want to flag the items that match the pattern First <integer> Days or First <integer> Weeks, and create a new column to indicate them:

   id                                        items indicator
0   1  CRB Spot Index: Grease: First 2 Months: YoY       NaN
1   2                     CRB Spot Index: Textiles       NaN
2   3                    CRB Spot Index: Livestock       NaN
3   4      Korea: Export Value: First 10 Days: YoY         Y
4   5      Korea: Export Value: First 20 Days: YoY         Y
5   6      Korea: Import Value: First 20 Days: YoY         Y
6   7      Korea: Import Value: First 10 Days: YoY         Y
7   8    Vietnam: Import Value: First 2 Weeks: MoM         Y
8   9    Vietnam: Import Value: First 3 Weeks: MoM         Y

How could I do that using Pandas?

I tried df.loc[(df['items'].str.contains('First|Days', na=False)) & (df['items'].str.contains('First|Weeks', na=False)), 'indicators']='Y', but it doesn't produce the expected result shown above. There may also be a more concise approach.

CodePudding user response:

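Use a regex that matches First, then \s+ for one or more whitespace characters, \d+ for one or more digits, \s+ again, and finally (?:Days|Weeks) for either Days or Weeks. The group is needed so that the alternation applies only to the unit; without it, Weeks alone anywhere in the string would match:

df.loc[df['items'].str.contains(r'First\s+\d+\s+(?:Days|Weeks)', na=False), 'indicators'] = 'Y'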
print (df)
   id                                        items indicators
0   1  CRB Spot Index: Grease: First 2 Months: YoY        NaN
1   2                     CRB Spot Index: Textiles        NaN
2   3                    CRB Spot Index: Livestock        NaN
3   4      Korea: Export Value: First 10 Days: YoY          Y
4   5      Korea: Export Value: First 20 Days: YoY          Y
5   6      Korea: Import Value: First 20 Days: YoY          Y
6   7      Korea: Import Value: First 10 Days: YoY          Y
7   8    Vietnam: Import Value: First 2 Weeks: MoM          Y
8   9    Vietnam: Import Value: First 3 Weeks: MoM          Y
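
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. It rebuilds the sample DataFrame from the question and applies the same str.contains approach; the variable names pattern and mask are only illustrative, and the non-capturing group (?:...) is my choice to keep pandas from warning about match groups.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": range(1, 10),
    "items": [
        "CRB Spot Index: Grease: First 2 Months: YoY",
        "CRB Spot Index: Textiles",
        "CRB Spot Index: Livestock",
        "Korea: Export Value: First 10 Days: YoY",
        "Korea: Export Value: First 20 Days: YoY",
        "Korea: Import Value: First 20 Days: YoY",
        "Korea: Import Value: First 10 Days: YoY",
        "Vietnam: Import Value: First 2 Weeks: MoM",
        "Vietnam: Import Value: First 3 Weeks: MoM",
    ],
})

# "First", whitespace, one or more digits, whitespace, then Days or Weeks.
pattern = r"First\s+\d+\s+(?:Days|Weeks)"

# Boolean mask of matching rows; na=False treats missing values as non-matches.
mask = df["items"].str.contains(pattern, na=False)

# Assigning through .loc creates the column; unmatched rows stay NaN.
df.loc[mask, "indicators"] = "Y"
print(df)
```

The attempt in the question also flags row 0, because 'First|Days' and 'First|Weeks' are both satisfied by the word First alone, so "First 2 Months" passes both checks.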