I want to get the value of a cell in a DataFrame based on a string that is not equal to, but very similar to, the value stored there. This is the DataFrame:
Teams GP Pts
0 Liverpool 15 44
1 Chelsea 15 35
2 Manchester C. 15 32
3 West Ham Utd 15 28
4 Manchester Utd 14 24
5 Leicester City 14 22
6 Watford 15 20
7 Aston Villa 14 19
8 Crystal Palace 14 19
9 Arsenal 14 17
10 Brentford 14 17
11 Everton 14 17
12 Newcastle Utd 15 17
13 Brighton 15 14
14 Burnley 14 14
15 Southampton 15 14
16 Leeds Utd 14 13
17 Tottenham 13 13
18 Wolverhampton 15 12
19 Norwich City 14 8
Code
hometeam = 'Manchester City'
pts_man_city = df[df.Teams == hometeam].iloc[0]['Pts']
But I got IndexError: single positional indexer is out-of-bounds
CodePudding user response:
You can use thefuzz.process (previously fuzzywuzzy):
# pip install thefuzz
from thefuzz import process
hometeam = 'Manchester City'
best = process.extractOne(hometeam, df['Teams'])[0]
df.loc[df['Teams'].eq(best), 'Pts'].iloc[0]
output: 32
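If installing a third-party package is not an option, a similar lookup can be sketched with difflib.get_close_matches from the standard library. This is a different matcher than thefuzz, so scores and tie-breaking may differ; the plain lists below are a stand-in for df['Teams'].tolist() and df['Pts'].tolist(), and the cutoff of 0.6 is just the function's default, not a tuned value.

```python
from difflib import get_close_matches

# Stand-in for df['Teams'].tolist() and df['Pts'].tolist() (first rows of the table)
teams = ["Liverpool", "Chelsea", "Manchester C.", "West Ham Utd",
         "Manchester Utd", "Leicester City"]
points = [44, 35, 32, 28, 24, 22]

hometeam = "Manchester City"

# n=1 keeps only the closest candidate; cutoff drops anything too dissimilar
matches = get_close_matches(hometeam, teams, n=1, cutoff=0.6)

if matches:
    best = matches[0]
    pts = points[teams.index(best)]
else:
    # Nothing was similar enough, so there is no value to look up
    best, pts = None, None

print(best, pts)
```

With a real DataFrame, the matched name can be used the same way as above: df.loc[df['Teams'].eq(best), 'Pts'].iloc[0].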
CodePudding user response:
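We need to find similar strings. OK, let's do it!
from difflib import SequenceMatcher

def similar(a, b):
    # Similarity ratio between 0 and 1; 1.0 means the strings are identical
    return SequenceMatcher(None, a, b).ratio()

hometeam = 'Manchester City'
alpha = 0.75  # minimum similarity to count as a match
# Index label of the first team whose similarity exceeds alpha
idx = df['Teams'].apply(lambda x: x if similar(x, hometeam) > alpha else None).dropna().index[0]
df.loc[idx, 'Pts']
Adjust the alpha parameter for your task. Note that index[0] picks the first team above the threshold, which is not necessarily the closest match.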
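A threshold alone can let several teams through: against 'Manchester City', both 'Manchester C.' and 'Manchester Utd' score above 0.75. A minimal sketch of a variant that picks the highest-scoring row instead of the first one is shown below. It assumes the column names Teams and Pts from the question and rebuilds only the first few rows of the table for illustration.

```python
from difflib import SequenceMatcher

import pandas as pd

# First rows of the table from the question, rebuilt for illustration
df = pd.DataFrame({
    "Teams": ["Liverpool", "Chelsea", "Manchester C.", "West Ham Utd", "Manchester Utd"],
    "GP": [15, 15, 15, 15, 14],
    "Pts": [44, 35, 32, 28, 24],
})


def similar(a, b):
    # Similarity ratio between 0 and 1; 1.0 means the strings are identical
    return SequenceMatcher(None, a, b).ratio()


hometeam = "Manchester City"
alpha = 0.75  # minimum similarity to accept a match

# Score every team name against the query, then take the best-scoring row
scores = df["Teams"].apply(lambda name: similar(name, hometeam))
best_idx = scores.idxmax()

if scores[best_idx] > alpha:
    pts = df.loc[best_idx, "Pts"]
else:
    # No team is similar enough; avoid returning an unrelated row
    pts = None

print(df.loc[best_idx, "Teams"], pts)
```

Checking the best score against alpha keeps the lookup from silently returning an unrelated team when nothing in the column resembles the query.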
CodePudding user response:
The code below returns the row for a specific team, but only when the name matches exactly:
df.loc[df['Teams'] == hometeam]