The original df contains millions of rows. Here's a small sample:
df_sample = pd.DataFrame(
{'keyword': {0: 756, 1: 756, 2: 13586, 3: 1694, 4: 13586}}
)
df_sample
Now we have two lists:
list_a = [756, 13586, 1694]
list_b = [1.44, 4.55, 10]
And I need the following output:
df_output = pd.DataFrame(
{'keyword': {0: 756, 1: 756, 2: 13586, 3: 1694, 4: 13586},
'Standard_Deviation_keyword': {0: 1.44, 1: 1.44, 2: 4.55, 3: 10, 4: 4.55}}
)
df_output
I guess the solution would be something like:
def key_std(df):
    # Collect one value per row for the new Standard_Deviation_keyword column.
    values = []
    for x in df['keyword']:
        # If x is in list_a, take the value at the same index in list_b.
        values.append(list_b[list_a.index(x)] if x in list_a else None)
    df['Standard_Deviation_keyword'] = values
    return df
CodePudding user response:
zip the lists to create a mapping dict, then use Series.map to substitute the values:
df['std'] = df['keyword'].map(dict(zip(list_a, list_b)))
keyword std
0 756 1.44
1 756 1.44
2 13586 4.55
3 1694 10.00
4 13586 4.55
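For a self-contained check, here is a minimal sketch that rebuilds the sample frame from the question and applies the same mapping. The variable name mapping is only illustrative, and the column name Standard_Deviation_keyword is taken from the question's expected output rather than from this answer.

```python
import pandas as pd

# Sample data from the question.
df_sample = pd.DataFrame({'keyword': [756, 756, 13586, 1694, 13586]})
list_a = [756, 13586, 1694]
list_b = [1.44, 4.55, 10]

# Pair each keyword with its value, then look every row up in the dict.
mapping = dict(zip(list_a, list_b))
df_sample['Standard_Deviation_keyword'] = df_sample['keyword'].map(mapping)
print(df_sample)
```

The dict is built once, so each row costs a single dictionary lookup, which should scale to millions of rows far better than a Python loop over the frame.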
CodePudding user response:
You can also use replace which, unlike map, does not give NaN when a keyword has no corresponding element in the lists (unmatched values are left unchanged):
df_sample['std'] = df_sample.keyword.replace(list_a, list_b)
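To see the difference on an unmatched value, here is a small sketch. The keyword 999 is a made-up value that does not appear in list_a, and the names s, mapped and replaced are only illustrative.

```python
import pandas as pd

list_a = [756, 13586, 1694]
list_b = [1.44, 4.55, 10]

# 999 is a hypothetical keyword with no entry in list_a.
s = pd.Series([756, 999])

# map: values missing from the dict become NaN.
mapped = s.map(dict(zip(list_a, list_b)))

# replace: values missing from list_a are left as they are.
replaced = s.replace(list_a, list_b)

print(mapped)
print(replaced)
```

Note that with replace the unmatched keyword itself ends up in the result, so it can be mistaken for a standard deviation. If missing keywords are possible, map followed by an explicit fillna may be easier to audit.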
CodePudding user response:
You can use map with a Series:
s = pd.Series(list_b, index=list_a)
df_sample['std'] = df_sample['keyword'].map(s)
output:
keyword std
0 756 1.44
1 756 1.44
2 13586 4.55
3 1694 10.00
4 13586 4.55