I have a column in a data frame that contains multiple words separated by dots, but I can never know how many words a cell holds. What I want is the last word, the one after the last dot.
Column X:
www.xyz.hgk.lmo.de.eu.price
www.amazon.us.stock
www.hhhh.com.price
www.ebay.eu.stock
www.mmm.price
Desired column values:
price
stock
price
stock
price
My trial:
x2['Desired Column Values'] = x2['Column X'].str.split('.').str[2]
but this is not correct because I can't know the number of '.' in each cell
CodePudding user response:
You can use rsplit with n=1 to split once from the right, then extract the last element:
df['Column X'].str.rsplit('.', n=1).str[-1]
Equivalently, you can apply Python's own str.rsplit to each value:
df['Column X'].apply(lambda x: x.rsplit('.',1)[-1])
Alternatively, you can extract a regex pattern (one or more non-dot characters at the end of the string):
df['Column X'].str.extract(r'([^.]+)$', expand=False)
Output:
0    price
1    stock
2    price
3    stock
4    price
Name: Column X, dtype: object
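For reference, here is a minimal runnable sketch that puts the three approaches side by side on the sample data from the question. The variable names (via_rsplit, via_apply, via_extract) are only illustrative, and the sketch assumes a pandas version in which n is passed to str.rsplit as a keyword argument.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Column X": [
        "www.xyz.hgk.lmo.de.eu.price",
        "www.amazon.us.stock",
        "www.hhhh.com.price",
        "www.ebay.eu.stock",
        "www.mmm.price",
    ]
})

col = df["Column X"]

# 1. Split once from the right, then take the last piece.
via_rsplit = col.str.rsplit(".", n=1).str[-1]

# 2. Same idea with plain Python string methods, applied row by row.
via_apply = col.apply(lambda s: s.rsplit(".", 1)[-1])

# 3. Regex: one or more non-dot characters anchored at the end of the string.
via_extract = col.str.extract(r"([^.]+)$", expand=False)

# Store one of the results in the new column named in the question.
df["Desired Column Values"] = via_rsplit
print(df["Desired Column Values"].tolist())
```

All three give the same values here, so the choice is mostly a matter of taste; the vectorised .str variants are usually preferable to apply on large columns.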
CodePudding user response:
x = 'www.xyz.hgk.lmo.de.eu.price'
x_list = x.split('.')
print(x_list[-1])
# x.split takes a parameter called sep and returns a list of the words separated by '.'; [-1] then accesses the last element of that list.
Adapt this approach and apply it to your column.
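As a rough sketch of how that could be adapted to several values at once, the snippet below wraps the split in a small helper and runs it over the sample strings from the question. The helper name last_segment is hypothetical, chosen only for this illustration.

```python
def last_segment(text, sep="."):
    """Return the part of text after the last sep (the whole text if sep is absent)."""
    # split() returns every piece; [-1] picks the final one regardless of how many there are.
    return text.split(sep)[-1]


# Sample values copied from the question.
values = [
    "www.xyz.hgk.lmo.de.eu.price",
    "www.amazon.us.stock",
    "www.hhhh.com.price",
    "www.ebay.eu.stock",
    "www.mmm.price",
]

result = [last_segment(v) for v in values]
print(result)
```

On a pandas column the same idea can be written as df['Column X'].str.split('.').str[-1], which differs from the attempt in the question only in using the index -1 instead of 2.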