I need help with a pandas split using a regex. I'm getting the error ValueError: Columns must be same length as key.
My column of data looks like this:
PURCHASE AUTHORIZED ON 03/30 UOFU BOOKSTORE 1 …
PURCHASE AUTHORIZED ON 03/29 WM SUPERC Wal-Mart Sup …
PURCHASE AUTHORIZED ON 03/29 KFC/AW #526 …
PURCHASE AUTHORIZED ON 03/31 UU VISITOR PARKING …
ATM WITHDRAWAL AUTHORIZED ON 04/03 Main Street …
My code is:
df[['Auth_date', 'Description']] = df['Description'].str.split('(?<=\d{2}\d{2}).', regex=True)
The desired result would be:
Auth_date                           Description
PURCHASE AUTHORIZED ON 03/30        UOFU BOOKSTORE 1 …
PURCHASE AUTHORIZED ON 03/29        WM SUPERC Wal-Mart Sup …
PURCHASE AUTHORIZED ON 03/29        KFC/AW #526 …
PURCHASE AUTHORIZED ON 03/31        UU VISITOR PARKING …
ATM WITHDRAWAL AUTHORIZED ON 04/03  Main Street …
CodePudding user response:
Given:
Description
0 PURCHASE AUTHORIZED ON 03/30 UOFU BOOKSTORE 1 …
1 PURCHASE AUTHORIZED ON 03/29 WM SUPERC Wal-Mar...
2 PURCHASE AUTHORIZED ON 03/29 KFC/AW #526 …
3 PURCHASE AUTHORIZED ON 03/31 UU VISITOR PARKING …
4 ATM WITHDRAWAL AUTHORIZED ON 04/03 Main Street …
Doing:
df[['Auth_date', 'Description']] = df['Description'].str.split(r'(?<=\d{2}/\d{2}).', expand=True, regex=True)
print(df)
Output:
                Description                           Auth_date
0        UOFU BOOKSTORE 1 …        PURCHASE AUTHORIZED ON 03/30
1  WM SUPERC Wal-Mart Sup …        PURCHASE AUTHORIZED ON 03/29
2             KFC/AW #526 …        PURCHASE AUTHORIZED ON 03/29
3      UU VISITOR PARKING …        PURCHASE AUTHORIZED ON 03/31
4             Main Street …  ATM WITHDRAWAL AUTHORIZED ON 04/03
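Works fine for me. There are two differences from your code. First, expand=True makes str.split return a DataFrame with one column per piece; without it you get a single column of lists, which cannot be assigned to two columns and raises the ValueError. Second, the / in the lookbehind lets the pattern match dates such as 03/30.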
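If it helps, here is a self-contained sketch of the same split that you can run as-is. The sample rows are copied from the question. Two details are my own additions rather than part of the code above: n=1, on the assumption that you only ever want to split at the first date, and \s in place of . so that only whitespace after the date is consumed.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "Description": [
        "PURCHASE AUTHORIZED ON 03/30 UOFU BOOKSTORE 1 …",
        "PURCHASE AUTHORIZED ON 03/29 KFC/AW #526 …",
        "ATM WITHDRAWAL AUTHORIZED ON 04/03 Main Street …",
    ]
})

# Split on the whitespace that directly follows an MM/DD date.
# n=1 (an assumption, not in the original answer) stops after the first
# match, so a second date later in the text cannot create a third column.
# expand=True returns a DataFrame with one column per piece.
parts = df["Description"].str.split(
    r"(?<=\d{2}/\d{2})\s", n=1, expand=True, regex=True
)

# Two columns on the right-hand side, two keys on the left.
df[["Auth_date", "Description"]] = parts
print(df)
```

If some rows might contain no date at all, parts will hold None in its second column for those rows, so it is worth checking parts.shape and parts.isna() before assigning.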