I found this link, and my problem is somewhat similar.
Say I have:
x = ['the', 'the', 'and', 'a', 'apple', 'heart', 'heart']
y = {'words': ['the', 'belt', 'computer', 'heart','and'],'values':[3,2,1,1,4]}
Using the suggestion in the link above, I got this:
import pandas as pd
df = pd.DataFrame.from_dict(y)
items = set(df['words'])
found = [i for i in x if i in items]
print(found)
The result is: ['the', 'the', 'and', 'heart', 'heart']
I want to get the corresponding value for each matched word, but I am stuck. The result I want is this:
[3,3,4,1,1]
Any thoughts on how to achieve this? Would greatly appreciate it.
CodePudding user response:
This is not the most efficient approach, since it scans the whole 'words' column once for every matched word in x, but it should do:
found = [df.loc[df['words'] == i, 'values'].iloc[0] for i in x if i in items]
CodePudding user response:
You don't need pandas. First rework your dictionary to have the words as keys, then use a comprehension:
# Name both lists explicitly instead of relying on the order of the keys in y.
y2 = dict(zip(y['words'], y['values']))
[y2[i] for i in x if i in y2]
Output: [3,3,4,1,1]
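For reference, here is the same idea as a self-contained sketch that can be run as is. The helper name lookup_values is made up for illustration and is not part of any library; the data is the sample from the question.

```python
# Sample data from the question.
x = ['the', 'the', 'and', 'a', 'apple', 'heart', 'heart']
y = {'words': ['the', 'belt', 'computer', 'heart', 'and'],
     'values': [3, 2, 1, 1, 4]}


def lookup_values(words, table):
    """Return the value of each word in `words` that appears in table['words'].

    `lookup_values` is a hypothetical helper name used only for this sketch.
    """
    # Build the word -> value mapping once, so each lookup is a dict access.
    mapping = dict(zip(table['words'], table['values']))
    # Keep the order (and duplicates) of `words`; skip words not in the mapping.
    return [mapping[w] for w in words if w in mapping]


result = lookup_values(x, y)
print(result)  # [3, 3, 4, 1, 1]
```

Building the mapping once means each word costs a single dictionary lookup, instead of a scan over the 'words' column per word.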
The (much less efficient) equivalent in pandas is below. Note that map yields NaN for words that are not found, so the values come back as floats:
s = df.set_index('words')['values']
pd.Series(x).map(s).dropna()
Output:
0    3.0
1    3.0
2    4.0
5    1.0
6    1.0
dtype: float64
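If you want a plain list of integers, as in the question, rather than a float Series, one possible follow-up is to cast after dropping the NaN rows. This is only a sketch using the sample data from the question; the names matched and values are made up for illustration.

```python
import pandas as pd

# Sample data from the question.
x = ['the', 'the', 'and', 'a', 'apple', 'heart', 'heart']
y = {'words': ['the', 'belt', 'computer', 'heart', 'and'],
     'values': [3, 2, 1, 1, 4]}

df = pd.DataFrame(y)

# Series of values indexed by word (the words must be unique for map to work).
s = df.set_index('words')['values']

# map() gives NaN for unmatched words, which forces a float dtype;
# drop those rows first, then cast back to int.
matched = pd.Series(x).map(s).dropna().astype(int)

values = matched.tolist()
print(values)  # [3, 3, 4, 1, 1]
```

The index of matched (0, 1, 2, 5, 6) still records the positions in x where a word was found, which can be handy if you need to line the values up with the original list.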