Say I have data like this:
Account | |
---|---|
1 | Kevin (1234567) |
2 | Buzz (7896345) |
3 | Snakes (5438761) |
4 | Marv |
5 | Harry (9083213) |
I want to check whether an account number appears at the end of the name in the Account column. If it does, split the account number off and put it in a new column; if not, skip that row and go on to the next account.
Something like this, although it does not work:
dataset.loc[dataset.Account.str.contains(r'\(d ')], 'new'=dataset.Account.str.split('',n=1, expand=True)
dataset["Account_Number"] = new[1]
CodePudding user response:
Try:
df["Account number"] = df["Account"].str.extract(r"\((\d+)\)$")
df["Account"] = df["Account"].str.replace(r"\s*\(\d+\)$", "", regex=True)
print(df)
Prints:
  Account Account number
1   Kevin        1234567
2    Buzz        7896345
3  Snakes        5438761
4    Marv            NaN
5   Harry        9083213
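For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. The sample DataFrame is rebuilt from the question, and the 1..5 index is an assumption about how the data was loaded. Note that the extracted account numbers are strings, not integers.

```python
import pandas as pd

# Sample data rebuilt from the question; the 1..5 index is an assumption.
df = pd.DataFrame(
    {
        "Account": [
            "Kevin (1234567)",
            "Buzz (7896345)",
            "Snakes (5438761)",
            "Marv",
            "Harry (9083213)",
        ]
    },
    index=range(1, 6),
)

# Capture the digits inside a trailing "(...)"; expand=False returns a Series.
# Rows without a trailing account number get NaN.
df["Account number"] = df["Account"].str.extract(r"\((\d+)\)$", expand=False)

# Strip the trailing " (digits)" from the name; other rows are left unchanged.
df["Account"] = df["Account"].str.replace(r"\s*\(\d+\)$", "", regex=True)

print(df)
```

No explicit if condition is needed: rows that do not match the pattern simply get NaN in the new column and keep their original name.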
CodePudding user response:
Here is one way to do it:
# split the account on ( and create two columns
df[['Account','Account Number']]= df['Account'].str.split('(', expand=True)
#replace the trailing ) with empty string
df['Account Number']=df['Account Number'].str.replace(r'\)','', regex=True ).str.strip()
df
Account Account Number
0 1 Kevin 1234567
1 2 Buzz 7896345
2 3 Snakes 5438761
3 4 Marv None
4 5 Harry 9083213
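One caveat with splitting on "(": it assumes each value contains at most one opening parenthesis. As a hedged alternative, the sketch below does the split in a single pass with named groups in str.extract. The third sample row and the group names name and number are made up for illustration and are not from the question.

```python
import pandas as pd

# The third row is a hypothetical value with an extra pair of parentheses.
df = pd.DataFrame(
    {"Account": ["Kevin (1234567)", "Marv", "Buzz (old) (7896345)"]}
)

# name: everything before an optional trailing "(digits)"
# number: the digits inside that trailing pair, NaN when absent
pattern = r"^(?P<name>.*?)\s*(?:\((?P<number>\d+)\))?$"
parts = df["Account"].str.extract(pattern)

df["Account"] = parts["name"]
df["Account Number"] = parts["number"]

print(df)
```

Only a trailing group made up entirely of digits is treated as the account number, so earlier parentheses stay part of the name.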