1st Desired Output:
Final Desired Output:
Thanks to all who take a look.
import pandas as pd
df = pd.read_excel('ActionNetwork.xlsx')
df['Matchups']= df['Scheduled'].apply(lambda x: x.split('PM | AM')[-1])
print(df)
CodePudding user response:
Since you want to put the matchups back into the same df
, the teams need to go into separate columns (otherwise it will throw an error about uneven row counts).
Instead of multiple splits, use a single str.extract
with the following patterns:
.*[AM|PM]
-- match everything up toAM
orPM
(but don't capture it)([0-9] [a-zA-Z ] )
-- capture 1 numbers and 1 letters/spaces (away team)([0-9] [a-zA-Z ] )
-- capture 1 numbers and 1 letters/spaces (home team)
pattern = r'.*[AM|PM] ([0-9] [a-zA-Z ] )([0-9] [a-zA-Z ] )'
df[['Away', 'Home']] = df['Scheduled'].str.extract(pattern)
# Scheduled Away Home
# 0 Fri 10/29, 11:30 PM 113NavyNAVY114TulsaTLSA 113NavyNAVY 114TulsaTLSA
# 1 Sat 10/30, 2:00 AM 114UNLVUNLV116NevadaNEV 114UNLVUNLV 116NevadaNEV
# 2 Sat 10/30, 11:00 PM 110Ole MissMISS118AuburnAUB 110Ole MissMISS 118AuburnAUB