Home > Blockchain >  How do I Split a Cell With Multiple Delimiters and/or Strings?
How do I Split a Cell With Multiple Delimiters and/or Strings?

Time:10-28

I scraped table data from enter image description here

1st Desired Output:

enter image description here

Final Desired Output:

enter image description here

Thanks to all who take a look.

import pandas as pd

df = pd.read_excel('ActionNetwork.xlsx')
df['Matchups']= df['Scheduled'].apply(lambda x: x.split('PM | AM')[-1])
print(df)

CodePudding user response:

Since you want to put the matchups back into the same df, the teams need to go into separate columns (otherwise it will throw an error about uneven row counts).

Instead of multiple splits, use a single str.extract with the following patterns:

  • .*[AM|PM] -- match everything up to AM or PM (but don't capture it)
  • ([0-9] [a-zA-Z ] ) -- capture 1 numbers and 1 letters/spaces (away team)
  • ([0-9] [a-zA-Z ] ) -- capture 1 numbers and 1 letters/spaces (home team)
pattern = r'.*[AM|PM] ([0-9] [a-zA-Z ] )([0-9] [a-zA-Z ] )'
df[['Away', 'Home']] = df['Scheduled'].str.extract(pattern)

#                                          Scheduled             Away          Home
# 0      Fri 10/29, 11:30 PM 113NavyNAVY114TulsaTLSA      113NavyNAVY  114TulsaTLSA
# 1       Sat 10/30, 2:00 AM 114UNLVUNLV116NevadaNEV      114UNLVUNLV  116NevadaNEV
# 2  Sat 10/30, 11:00 PM 110Ole MissMISS118AuburnAUB  110Ole MissMISS  118AuburnAUB
  • Related