I have two dataframes with labels.
The first is the datetime occurance of the labels.
The 2nd is the frequeny at which these labels occure.
df_labels
DateTime | Labels |
---|---|
01/01/21 | 4 |
02/02/21 | 7 |
03/02/21 | 9 |
04/02/21 | 9 |
06/02/21 | 7 |
05/02/21 | 4 |
df_label_frequency
Label | Label Frequency |
---|---|
4 | 40 |
7 | 66 |
9 | 12 |
These tabels are 10,000 rows in reality FYI. I want to add the Label Frequency row value to each row in the df_labels table like so:
DateTime | Labels | Label Frequency |
---|---|---|
01/01/21 | 4 | 40 |
02/02/21 | 7 | 66 |
03/02/21 | 9 | 12 |
04/02/21 | 9 | 12 |
06/02/21 | 7 | 66 |
05/02/21 | 4 | 40 |
I have been scratching my head and staring at this problem for hours, I have tried lamba functions, merging, if statements, for loops and just can't figure it out. I know it will be so obvious but I just dont get it. Does anyone have any suggestions?
CodePudding user response:
Try this map with a dictionary and zipping df columns together:
df_labels["Label Frequency"] = df_labels["Labels"].map(
dict(zip(df_label_frequency["Label"], df_label_frequency["Label Frequency"]))
)
df_labels
Output:
DateTime Labels Label Frequency
0 01/01/21 4 40
1 02/02/21 7 66
2 03/02/21 9 12
3 04/02/21 9 12
4 06/02/21 7 66
5 05/02/21 4 40
CodePudding user response:
You can do:
df_labels['Label Frequency'] = df_labels['Labels'].map(df_label_frequency.set_index('Label')['Label Frequency'].to_dict())
CodePudding user response:
If your "Label" in the frequency table is an index just use "apply":
df_labels["Label Frequency"] = df_labels['Labels'].apply(lambda x: df_label_frequency[x]).values
I think it's index because you can get this table like
df_labels['Label'].value_counts()