Get a value from a DataFrame based on a similar string

Time:12-05

I want to get the value of a cell in a DataFrame based on a string that is not identical to the stored value but very similar. This is the DataFrame:


        Teams       GP  Pts
0   Liverpool       15  44
1   Chelsea         15  35
2   Manchester C.   15  32
3   West Ham Utd    15  28
4   Manchester Utd  14  24
5   Leicester City  14  22
6   Watford         15  20
7   Aston Villa     14  19
8   Crystal Palace  14  19
9   Arsenal         14  17
10  Brentford       14  17
11  Everton         14  17
12  Newcastle Utd   15  17
13  Brighton        15  14
14  Burnley         14  14
15  Southampton     15  14
16  Leeds Utd       14  13
17  Tottenham       13  13
18  Wolverhampton   15  12
19  Norwich City    14  8

Code

hometeam = 'Manchester City'
pts_man_city = df[df.Teams == hometeam].iloc[0]['Pts']

But I got IndexError: single positional indexer is out-of-bounds

CodePudding user response:

You can use thefuzz.process (previously fuzzywuzzy):

# pip install thefuzz
from thefuzz import process

hometeam = 'Manchester City'

best = process.extractOne(hometeam, df['Teams'])[0]
df.loc[df['Teams'].eq(best), 'Pts'].iloc[0]

output: 32

CodePudding user response:

We need to find similar strings. Ok, let's do it!

from difflib import SequenceMatcher

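def similar(a, b):
    # Similarity ratio between 0.0 (nothing in common) and 1.0 (identical)
    return SequenceMatcher(None, a, b).ratio()

alpha = 0.75  # minimum similarity to count as a match
hometeam = 'Manchester City'
# Index label of the first team whose similarity to hometeam exceeds alpha
idx = df['Teams'].apply(lambda x: x if similar(x, hometeam) > alpha else None).dropna().index[0]

df.loc[idx, 'Pts']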

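Adjust the alpha threshold for your task. Note that index[0] takes the first row above the threshold, which is not necessarily the closest match.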
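If several names clear the threshold (with this table, 'Manchester C.', 'Manchester Utd' and 'Leicester City' all score above 0.75 against 'Manchester City'), it may be safer to pick the highest-scoring name rather than the first one. A minimal sketch of that idea, using only the standard library; the helper name best_match and the plain list of names are assumptions made for illustration and are not part of the answer above:

```python
from difflib import SequenceMatcher

def best_match(query, candidates, alpha=0.75):
    """Return the candidate most similar to query, or None if no
    candidate reaches the alpha threshold (hypothetical helper)."""
    # Score every candidate against the query string
    scored = [(SequenceMatcher(None, query, c).ratio(), c) for c in candidates]
    if not scored:
        return None
    # Keep the highest-scoring candidate instead of the first one above alpha
    score, name = max(scored, key=lambda pair: pair[0])
    return name if score >= alpha else None

# Team names copied from the question's DataFrame (subset)
teams = ['Liverpool', 'Chelsea', 'Manchester C.', 'West Ham Utd',
         'Manchester Utd', 'Leicester City', 'Norwich City']

best = best_match('Manchester City', teams)
print(best)
```

With a DataFrame, the names could be passed as df['Teams'].tolist(), and the result used for an exact lookup such as df.loc[df['Teams'].eq(best), 'Pts'].iloc[0], after checking that best is not None.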

CodePudding user response:

The code below returns the row for a specific team, but only when the name matches exactly:

df.loc[df['Teams'] == hometeam]
