I am working with the R programming language. I have the following example - there are two data frames (height_quantiles and test):
> height_quantiles
  salary_type quant_80
1           A 3.752192
2           B 3.713571
3           C 4.117180
> str(height_quantiles)
'data.frame': 3 obs. of 2 variables:
$ salary_type: Factor w/ 3 levels "A","B","C": 1 2 3
$ quant_80 : Named num 3.75 3.71 4.12
..- attr(*, "names")= chr [1:3] "80%" "80%" "80%"
and
> head(test)
       salary     height salary_type
701  1.358904  1.6148796           A
702 -2.702212  1.0604070           A
703  1.534527 -4.0957218           A
704  5.594247  5.7373110           B
705 -1.823547  5.5808484           A
706  7.949913 -0.2021635           C
> str(test)
'data.frame': 300 obs. of 3 variables:
$ salary : num 1.36 -2.7 1.53 5.59 -1.82 ...
$ height : num 1.61 1.06 -4.1 5.74 5.58 ...
$ salary_type: Factor w/ 3 levels "A","B","C": 1 1 1 2 1 3 2 1 2 3 ...
I am trying to run the following code:
test$height_pred = as.numeric(ifelse(test$salary_type == "A", height_quantiles[1,1], ifelse(test$salary_type == "B", height_quantiles[2,1], height_quantiles[3,1])))
But this returns the values 1, 2, 3 in test$height_pred. I would like it to return the corresponding values from the height_quantiles data frame instead, i.e. 3.75, 3.71, 4.12.
Can someone please show me how to do this?
Thanks
CodePudding user response:
You need to extract the values from the second column, i.e. height_quantiles[1,2], height_quantiles[2,2], etc. Right now you are taking them from the first column, which is the factor salary_type, so ifelse returns its integer codes 1, 2, 3.
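For illustration, here is a minimal sketch of the nested ifelse with the column index changed to 2. The four-row test frame is made up for the example; only the quantile values come from the question.

```r
# Lookup table with the values from the question.
height_quantiles <- data.frame(
  salary_type = factor(c("A", "B", "C")),
  quant_80 = c(3.752192, 3.713571, 4.117180)
)

# Hypothetical small test frame (the real one has 300 rows).
test <- data.frame(
  salary_type = factor(c("A", "A", "B", "C"), levels = c("A", "B", "C"))
)

# Column 1 is the factor salary_type; column 2 holds the numeric quantiles.
test$height_pred <- ifelse(
  test$salary_type == "A", height_quantiles[1, 2],
  ifelse(test$salary_type == "B", height_quantiles[2, 2], height_quantiles[3, 2])
)

print(test)
```

Since the second column is already numeric, the as.numeric() wrapper is no longer needed.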
Also, a better approach would be to use a join or match.
test$height_pred <- height_quantiles$quant_80[match(test$salary_type, height_quantiles$salary_type)]
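To see why this works, it may help to look at the intermediate index vector. The sketch below uses a made-up four-row test frame; idx is a name introduced here for illustration.

```r
# Lookup table with the values from the question.
height_quantiles <- data.frame(
  salary_type = factor(c("A", "B", "C")),
  quant_80 = c(3.752192, 3.713571, 4.117180)
)

# Hypothetical small test frame.
test <- data.frame(
  salary_type = factor(c("A", "A", "B", "C"), levels = c("A", "B", "C"))
)

# For each row of test, match() gives the row position of its salary_type
# in the lookup table (NA if a type is not found there).
idx <- match(test$salary_type, height_quantiles$salary_type)

# Indexing quant_80 by those positions picks the right quantile per row.
test$height_pred <- height_quantiles$quant_80[idx]

print(idx)
print(test)
```

Unlike the nested ifelse, this does not need to change if more salary types are added to height_quantiles.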
Or
merge(test, height_quantiles)
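Note that merge returns a new data frame rather than modifying test, joins on the shared column name (salary_type here), and by default sorts the result by that column. A small sketch, using the six rows shown by head(test) in the question; the name merged and the renaming step are additions for illustration.

```r
# Lookup table with the values from the question.
height_quantiles <- data.frame(
  salary_type = factor(c("A", "B", "C")),
  quant_80 = c(3.752192, 3.713571, 4.117180)
)

# The six rows printed by head(test) in the question.
test <- data.frame(
  salary = c(1.358904, -2.702212, 1.534527, 5.594247, -1.823547, 7.949913),
  height = c(1.6148796, 1.0604070, -4.0957218, 5.7373110, 5.5808484, -0.2021635),
  salary_type = factor(c("A", "A", "A", "B", "A", "C"))
)

# Join on the common column; every test row gets its quant_80 value.
merged <- merge(test, height_quantiles, by = "salary_type")

# Optionally rename the joined column to the name used in the question.
names(merged)[names(merged) == "quant_80"] <- "height_pred"

print(merged)
```

If the original row order matters, the match approach may be the simpler choice, since it adds the column to test in place.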