This is the relevant line of code:
dfc['Category'] = dfc['Description'].str.findall('Split|Transfer|REF', flags=re.IGNORECASE)
It looks in the column description for Split, Transfer or REF and returns them if they appear. For the ones with none of those words in the description, it leaves the column blank and I am really struggling to find a way to fill it in. I have tried converting from object to string or replacing with 0 or with NaN. This doesn't produce errors but the columns remain blank. I have looked at these solutions) too and none work.
Any assistance appreciated. Full example code:
import pandas as pd
import openpyxl
import datetime as dt
import numpy as np
dff = pd.DataFrame({'Date': ['20221003', '20221005'],
'Tran Type': ['BOOK TRANSFER CREDIT', 'ACH DEBIT'],
'Debit Amount': [0.00, -220000.00],
'Credit Amount': [182.90, 0.0],
'Description': ['BOOK CREDIT FROM ACCOUNT 98743447', 'USREF2448 ACH OFFSET'],
'Amount': [-220000.00, 182.90]})
import re
dff['Category'] = dff['Description'].str.findall('Ref|BCA|Fund|Transfer', flags=re.IGNORECASE)
In the example, I want the Category column to show "Other" or something in the first row and REF in the second (which it does).
Is there anything I can add to the findall line that will give a default value of Other for non-matches? Or any way to replace the blank cells in the dataframe afterwards? Any help appreciated.
CodePudding user response:
A proposition using pandas.Series.str.join
to flatten the list of the matches then pandas.DataFrame.replace
to replace empty strings (0 match) with NaN
values.
dff['Category'] = (
dff['Description'].str.findall('(Ref|BCA|Fund|Transfer)', flags=re.IGNORECASE)
.str.join(", ")
.replace(to_replace="", value=np.NaN)
)
# Output :
print(dff)
Date Tran Type Debit Amount Credit Amount Description Amount Category
0 20221003 BOOK TRANSFER CREDIT 0.0 182.9 BOOK CREDIT FROM ACCOUNT 98743447 -220000.0 NaN
1 20221005 ACH DEBIT -220000.0 0.0 USREF2448 DACH OFFSET 182.9 REF
You can change the value
parameter to "Other"
, "Missing Value"
of whatever you want.