I have a dataframe and a dictionary. The dictionary maps each key to a list of integers, with different values per key. One column of the dataframe holds those keys. I have been trying to speed up my scripts by vectorizing where possible; right now I am looping over the dataframe with iterrows. How could I vectorize this?
Dictionary: Dict = {'Words1':[20,10],'Words2':[10,20],'Words3':[30,50]}
Dataframe:
Col1 Col2
'Words1' Yes
'Words2' No
'Words3' Yes
I tried:
df['Col3'] = Dict[df['Col1']][1]
Since you are laughing, you know it did not work.
What I want to do is:
Take the second value in the dictionary and put it in Col3 as per the key in Col1.
I will then take that Col3 and compute a final number in Col4 based on whether Col2 is Yes or No:
a) If it's Yes, I will add to the number in Col3.
b) If it's No, I will subtract from the number in Col3.
I really appreciate the help.
CodePudding user response:
Using the following setup:
import pandas as pd
df = pd.DataFrame({
'Col1':['Words1','Words2','Words3'],
'Col2':['Yes','No','Yes'],
'vals':[1,2,3]})
my_dict = {'Words1':[20,10],'Words2':[10,20],'Words3':[30,50]}
val = 3
The first part can be solved using map with a preprocessed dict that keeps only the second value of each list:
reduced_dict = { k : my_dict[k][1] for k in my_dict}
df['Col3'] = df['Col1'].map(reduced_dict)
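As a quick check of the lookup step, here is a minimal self-contained sketch using the sample data from the question. The 'Words4' row is a hypothetical addition (not in the original data), included only to show that map leaves NaN for keys that are missing from the dict.

```python
import pandas as pd

# Sample data from the question, plus a hypothetical 'Words4' row
# whose key is deliberately absent from the dictionary.
df = pd.DataFrame({
    'Col1': ['Words1', 'Words2', 'Words3', 'Words4'],
    'Col2': ['Yes', 'No', 'Yes', 'No'],
})
my_dict = {'Words1': [20, 10], 'Words2': [10, 20], 'Words3': [30, 50]}

# Keep only the second element of each list, then look up every key at once.
reduced_dict = {k: v[1] for k, v in my_dict.items()}
df['Col3'] = df['Col1'].map(reduced_dict)

# Rows with a known key get 10, 20, 50; the unknown key becomes NaN.
col3 = df['Col3'].tolist()
```

If missing keys are possible in real data, they probably need to be handled (for example with fillna) before doing arithmetic on Col3.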
The second part can be cheesed by mapping Col2 to a sign (+1 for Yes, -1 for No) and multiplying that sign by val, the amount to add or subtract:
df['Col4'] = df['Col3'] + df.Col2.map({'Yes':1,'No':-1})*val
Note that this also works if the value to add/subtract is a column of the dataframe:
df['Col4'] = df['Col3'] + df.Col2.map({'Yes':1,'No':-1})*df.vals
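Putting both steps together, the whole thing might look like the sketch below, run on the sample data with val = 3. The np.where variant is an alternative swapped in for comparison, not part of the approach above, and the column name Col4_alt is made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Col1': ['Words1', 'Words2', 'Words3'],
    'Col2': ['Yes', 'No', 'Yes'],
})
my_dict = {'Words1': [20, 10], 'Words2': [10, 20], 'Words3': [30, 50]}
val = 3  # amount to add or subtract

# Step 1: vectorized lookup of the second list element for each key.
df['Col3'] = df['Col1'].map({k: v[1] for k, v in my_dict.items()})

# Step 2: turn Yes/No into +1/-1 and apply it to val.
sign = df['Col2'].map({'Yes': 1, 'No': -1})
df['Col4'] = df['Col3'] + sign * val

# Alternative (hypothetical Col4_alt column): choose add or subtract per row.
df['Col4_alt'] = np.where(df['Col2'].eq('Yes'), df['Col3'] + val, df['Col3'] - val)

col4 = df['Col4'].tolist()          # [13, 17, 53]
col4_alt = df['Col4_alt'].tolist()  # [13, 17, 53]
```

The sign-map version leaves NaN for any Col2 value other than Yes or No, whereas np.where silently treats everything that is not Yes as a subtraction, so which one fits depends on how clean Col2 is.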