Split the last string after a delimiter without knowing the number of delimiters, into a new column

I have a column in a data frame whose cells contain several words separated by dots. I can never know how many words a cell holds, so what I want is the last word, the one after the last dot.

Column X:
www.xyz.hgk.lmo.de.eu.price
www.amazon.us.stock
www.hhhh.com.price
www.ebay.eu.stock
www.mmm.price

Desired column values:
price
stock
price
stock
price

My trial:

x2['Desired Column Values'] = x2['Column X'].str.split('.').str[2]

But this is not correct, because a fixed index only works when every cell has the same number of '.' and I can't know that number in advance.

CodePudding user response：

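You can use rsplit to split once from the right, then take the last element (in recent pandas versions the split count has to be passed by keyword, as n=1):

df['Column X'].str.rsplit('.', n=1).str[-1]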

Equivalently, you can apply Python's own str.rsplit to each value:

df['Column X'].apply(lambda x: x.rsplit('.',1)[-1])

Alternatively, you can extract a regex pattern:

df['Column X'].str.extract(r'([^.]+)$', expand=False)

Output:

0    price
1    stock
2    price
3    stock
4    price
Name: Column X, dtype: object
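
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample column from the question and checks that the three approaches agree. The frame name df follows the answer above; the variable names via_rsplit, via_apply and via_regex are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'Column X': [
        'www.xyz.hgk.lmo.de.eu.price',
        'www.amazon.us.stock',
        'www.hhhh.com.price',
        'www.ebay.eu.stock',
        'www.mmm.price',
    ]
})

# 1) Split once from the right, then take the last piece.
via_rsplit = df['Column X'].str.rsplit('.', n=1).str[-1]

# 2) Same idea with plain Python string methods, applied value by value.
via_apply = df['Column X'].apply(lambda s: s.rsplit('.', 1)[-1])

# 3) Capture the trailing run of non-dot characters with a regex.
via_regex = df['Column X'].str.extract(r'([^.]+)$', expand=False)

# Store the result in a new column, as the question asks.
df['Desired Column Values'] = via_rsplit
print(df['Desired Column Values'].tolist())
# ['price', 'stock', 'price', 'stock', 'price']
```

One difference worth knowing: if a value ends with a dot, the rsplit versions return an empty string for it, while the regex version returns NaN because nothing matches.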

CodePudding user response：

x = 'www.xyz.hgk.lmo.de.eu.price'
x_list = x.split('.')
print(x_list[-1])
# str.split takes a separator (sep) and returns a list of the pieces between each '.';
# indexing that list with [-1] gives the last piece, however many pieces there are.

Adapt this approach and apply it to each value in your column.
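
As a rough sketch of what that adaptation could look like in plain Python: the helper name last_field is made up for illustration, and it uses str.rpartition instead of split, which cuts at the last separator only and so avoids building the whole list.

```python
def last_field(text, sep='.'):
    """Return the part of text after the final sep, or all of text if sep is absent."""
    # rpartition cuts at the LAST occurrence only and returns (head, sep, tail).
    return text.rpartition(sep)[2]


# Values copied from the question.
values = [
    'www.xyz.hgk.lmo.de.eu.price',
    'www.amazon.us.stock',
    'www.hhhh.com.price',
    'www.ebay.eu.stock',
    'www.mmm.price',
]

result = [last_field(v) for v in values]
print(result)
# ['price', 'stock', 'price', 'stock', 'price']
```

With the data frame from the question, the same helper could presumably be used as x2['Column X'].apply(last_field), assuming every cell holds a string.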