List Comprehension is not None AttributeError: 'NoneType' object has no attribute 'gr

Time:11-04

Date,Amount,Subcategory,Memo,
29/10/2021,953.76,DIRECTDEP,Stripe Payments UK    STRIPE       BGC,
29/10/2021,-1260.44,FT,DIESEL INJECTORS U    TRANSFER          FT,
29/10/2021,-509.15,FT,TNT                   002609348          FT,

Above is some accounts data that I need to group, and later apply labels to.

First I tried this: df['Suppliers'] = [re.search(r'\b[a-zA-Z]{3,}\b', item).group(0) for item in df['Memo'] if item is not None]

But I get AttributeError: 'NoneType' object has no attribute 'group'. I understand that this is because the pattern was not found in some of the data.

So I tried removing the .group(0), and I get a match object for each item, e.g. <re.Match object; span=(0, 6), match='Stripe'>

Question: I am not sure why if item is not None doesn't skip the items where no match is found, or why, if a match object is returned, I can't access it with .group(0).

I have figured out a solution with a loop, but I would really like to understand what the problem is with the list comp approach.


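import re

my_list = []
for item in df['Memo']:
    match = re.search(r'\b[a-zA-Z]{3,}\b', item)
    try:
        my_list.append(match.group(0).lower())
    except AttributeError:
        # re.search returned None: no word of three or more letters
        my_list.append('na')

# Assign once, after the loop, so the column is built from the full list
df['Suppliers'] = my_list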

CodePudding user response:

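The condition if item is not None tests the item itself (the memo string), not the result of re.search(r'\b[a-zA-Z]{3,}\b', item). Every memo is a string, so nothing is filtered out, and .group(0) is still called on None wherever the search finds no match.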
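If you would rather keep the list comprehension, here is a minimal sketch of one way to do it, assuming Python 3.8+ for the := operator. It tests the result of the search rather than the item, and it uses a conditional expression instead of a trailing if filter so the output keeps one entry per memo (a filter would shorten the list and it would no longer line up with the DataFrame rows). The memos list and the third memo are made up for illustration; the .lower() and the 'na' fallback mirror the loop in the question.

```python
import re

PATTERN = re.compile(r'\b[a-zA-Z]{3,}\b')

memos = [
    'Stripe Payments UK    STRIPE       BGC',
    'TNT                   002609348          FT',
    '0026 09',  # hypothetical memo with no word of three or more letters
]

# The condition is evaluated first, so m is bound before m.group(0) runs.
# When the search returns None the expression falls through to 'na'.
suppliers = [
    m.group(0).lower() if (m := PATTERN.search(item)) else 'na'
    for item in memos
]

print(suppliers)  # ['stripe', 'tnt', 'na']
```

This also covers the second part of the question: without .group(0) the comprehension returns a mix of match objects and None, and the printed <re.Match ...> was simply one of the items that did match.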

Just use Series.str.extract directly:

df['Suppliers'] = df['Memo'].str.extract(r'\b([a-zA-Z]{3,})\b')

Note that the pattern needs a pair of unescaped parentheses to form a capturing group when you use it with Series.str.extract.

If you want the string na for the cases where no match was found, add .fillna:

df['Suppliers'] = df['Memo'].str.extract(r'\b([a-zA-Z]{3,})\b').fillna('na')
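As a quick check, here is a small self-contained sketch run on two memos from the question plus one made-up memo that has no match. The expand=False argument is my addition, not part of the answer above: with a single capturing group it makes str.extract return a Series rather than a one-column DataFrame.

```python
import pandas as pd

df = pd.DataFrame({'Memo': [
    'Stripe Payments UK    STRIPE       BGC',
    'DIESEL INJECTORS U    TRANSFER          FT',
    '0026 09',  # hypothetical memo with no word of three or more letters
]})

# Rows without a match come back as NaN, which fillna replaces with 'na'.
df['Suppliers'] = (
    df['Memo']
    .str.extract(r'\b([a-zA-Z]{3,})\b', expand=False)
    .fillna('na')
)

print(df['Suppliers'].tolist())  # ['Stripe', 'DIESEL', 'na']
```

Chain .str.lower() after .fillna('na') if you want lowercase names like the loop version produces.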