I have a problem with my script which generates KeyError: 'One or more row labels was not found'
df1 = pd.DataFrame({'Destcode' : ['A','B','C','D','E','F','G'],
'City A' : ['Available','Available','Available','Available','Not Available','Not Available','Available'],
'City B' : ['Not Available','Available','Not Available','Available','Not Available','Not Available','Available'],
'City C' : ['Available','Available','Not Available','Available','Not Available','Available','Available']})
df2 = pd.DataFrame({'Destcode' : ['C','F','G','D','E'],
'Origin' : ['City A','City C','City A','City B','City D']})
so, i have 2 DataFrame.
DataFrame 1
df1
Destcode City A City B City C
0 A Available Not Available Available
1 B Available Available Available
2 C Available Not Available Not Available
3 D Available Available Available
4 E Not Available Not Available Not Available
5 F Not Available Not Available Available
6 G Available Available Available
DataFrame 2
df2
Destcode Origin
0 C City A
1 F City C
2 G City A
3 D City B
4 E City D
And I run this script
df2['Cek Available'] = df1.set_index('Destcode').lookup(df2.Destcode, df2.Origin)
And I get an error enter image description here
I know the problem is in Origin City D which is not in DataFrame df1.
How do you do that if the data you are looking for does not exist, it will return the value "Not Available"? Please help me to solve this problem
Destcode Origin Cek Available
0 C City A ?
1 F City C ?
2 G City A ?
3 D City B ?
4 E City D ?
CodePudding user response:
You can use try/except
and a for
loop:
cek = []
df1 = df1.set_index('Destcode')
for c, o in zip(df2['Destcode'], df2['Origin']):
try:
x = df1.loc[c,o]
except:
x = np.nan
cek.append(x)
df2['Cek Available'] = cek
Output:
Destcode Origin Cek Available
0 C City A Available
1 F City C Available
2 G City A Available
3 D City B Available
4 E City D NaN
CodePudding user response:
Data is usually easier to work with when it's in tidy data format. Melt df1
and join.
df3 = df1.melt(id_vars="Destcode", var_name="Origin")
df2.merge(df3, on=["Destcode", "Origin"], how="left")
results in
Destcode Origin value
0 C City A Available
1 F City C Available
2 G City A Available
3 D City B Available
4 E City D NaN
CodePudding user response:
I'd use get
and melt
:
x = df1.melt(id_vars='Destcode', var_name='Origin').set_index(['Destcode', 'Origin']).squeeze()
df2['Cek Available'] = df2.apply(lambda y: x.get(tuple(y), np.nan), axis=1)
Output:
>>> df2
Destcode Origin Cek Available
0 C City A Available
1 F City C Available
2 G City A Available
3 D City B Available
4 E City D NaN
>>>