Can I make a Python if condition using Regex on Pandas column to see if it contains something and th-CodePudding

Say I have data like this:

	Account
1	Kevin (1234567)
2	Buzz (7896345)
3	Snakes (5438761)
4	Marv
5	Harry (9083213)

I want to use an if condition to check whether an account number appears at the end of the name in the Account column. If it does, split the account number off and put it in a new column; if not, skip that row and move on to the next Account.

Something like this, although it does not work:

dataset.loc[dataset.Account.str.contains(r'\(d ')], 'new'=dataset.Account.str.split('',n=1, expand=True)
dataset["Account_Number"] = new[1]

CodePudding user response：

Try:

df["Account number"] = df["Account"].str.extract(r"\((\d+)\)$")
df["Account"] = df["Account"].str.replace(r"\s*\(\d+\)$", "", regex=True)
print(df)

Prints:

  Account Account number
1   Kevin        1234567
2    Buzz        7896345
3  Snakes        5438761
4    Marv            NaN
5   Harry        9083213
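
To see what that pattern does one value at a time, here is a minimal sketch using Python's standard `re` module on sample values from the question. The helper name `split_account` and the constant `ACCOUNT_RE` are made up for illustration and are not part of pandas.

```python
import re

# Optional whitespace, then "(digits)" anchored to the end of the string.
ACCOUNT_RE = re.compile(r"\s*\((\d+)\)$")

def split_account(value):
    """Return (name, number); number is None when there is no trailing "(digits)".

    Hypothetical helper for illustration only.
    """
    match = ACCOUNT_RE.search(value)
    if match is None:
        # No account number at the end: leave the name untouched.
        return value, None
    # Everything before the match is the name; group 1 holds the digits.
    return value[:match.start()], match.group(1)

samples = ["Kevin (1234567)", "Buzz (7896345)", "Marv"]
results = [split_account(s) for s in samples]
print(results)
```

The `$` anchor is what restricts the match to the end of the value, so a name with no trailing number, such as "Marv", is returned unchanged.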

CodePudding user response：

Here is one way to do it:

# split the account on "(" and create two columns
df[['Account', 'Account Number']] = df['Account'].str.split('(', expand=True)

# strip the trailing space left on the name by the split
df['Account'] = df['Account'].str.strip()

# replace the trailing ")" with an empty string
df['Account Number'] = df['Account Number'].str.replace(r'\)', '', regex=True).str.strip()
df

        Account     Account Number
0   1   Kevin              1234567
1   2   Buzz               7896345
2   3   Snakes             5438761
3   4   Marv                  None
4   5   Harry              9083213
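
If an explicit "only when it matches" step is wanted, as the question describes, a boolean mask from `str.contains` can stand in for the if condition. The following is a sketch, assuming the column holds plain strings shaped like the sample data; the sample frame and the name `has_number` are invented for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {"Account": ["Kevin (1234567)", "Buzz (7896345)", "Snakes (5438761)", "Marv", "Harry (9083213)"]}
)

# True for rows that end in "(digits)"; no capture group is needed for the check.
has_number = df["Account"].str.contains(r"\(\d+\)$", regex=True)

# Only rows where the mask is True are modified; other rows get NaN in the new
# column and keep their original Account value.
df.loc[has_number, "Account Number"] = (
    df.loc[has_number, "Account"].str.extract(r"\((\d+)\)$", expand=False)
)
df.loc[has_number, "Account"] = (
    df.loc[has_number, "Account"].str.replace(r"\s*\(\d+\)$", "", regex=True)
)
print(df)
```

The mask works on the whole column at once, so no Python-level loop or per-row `if` is needed.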