extract number of ranking position in pandas dataframe

Time:11-28

I have a pandas dataframe with a column named ranking_pos. All the rows of this column look like this: #123 of 12,216.

The output I need is only the number of the ranking, so for this example: 123 (as an integer).

How do I extract the number after the # and get rid of the of 12,216?

Currently the dtype of the column is object. Simply converting it to integer with .astype() doesn't work because of the other characters.

CodePudding user response:

You can use .str.extract:

df['ranking_pos'].str.extract(r'#(\d+)').astype(int)

or you can use .str.split():

df['ranking_pos'].str.split(' of ').str[0].str.replace('#', '').astype(int)
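For reference, here is a minimal end-to-end sketch of the extract approach. The sample rows and the new column name `rank` are made up for illustration. It also assumes that ranks above 999 might carry a thousands separator (e.g. #1,045), which the question does not confirm; if they never do, the plain `\d+` pattern is enough.

```python
import pandas as pd

# Hypothetical sample data shaped like the column in the question.
df = pd.DataFrame(
    {"ranking_pos": ["#123 of 12,216", "#7 of 12,216", "#1,045 of 12,216"]}
)

# Capture the digits (and any thousands separators) right after '#'.
# expand=False returns a Series instead of a one-column DataFrame.
rank = df["ranking_pos"].str.extract(r"#([\d,]+)", expand=False)

# Drop the separators, then convert to integers.
df["rank"] = rank.str.replace(",", "", regex=False).astype(int)

print(df["rank"].tolist())  # [123, 7, 1045]
```

Writing to a new column keeps the original strings around, which might be handy if you later want the total (the number after "of") as well.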

CodePudding user response:

df["ranking_pos"] = df["ranking_pos"].str.replace(r"#(\d+).*", r"\1", regex=True).astype(int)