How can I create a column for a dataframe where values are dependent on the values of another column

Time:11-09

I have been trying to create a column for a dataframe where each new observation depends on the value in another column. I tried to come up with code that would do that, but I haven't had any luck.

I also checked these answers but couldn't make them work. If anyone has an idea on how to do it, I would be truly grateful.

    for (dff$prcganancia in dff){ 
     if(dff$grupo == 1){
      prcganancia == 0.3681
     }elseif(dff$grupo == 2){
      prcganancia == 0.2320
     }elseif(dff$grupo == 3){
      prcganancia == 0.2851
     }elseif(dff$grupo == 4){
      prcganancia == 0.4443
     }elseif(dff$grupo == 5){
      prcganancia == 0.3353
     }elseif(dff$grupo == 6){
      prcganancia == 0.2656
     }elseif(dff$grupo == 7){
      prcganancia == 0.2597
     }elseif(dff$grupo == 8){
      prcganancia == 0.2211
     }elseif(dff$grupo == 9){
      prcganancia == 0.3782
     }elseif(dff$grupo == 10){
      prcganancia == 0.3752}
     }

What my lousy code is trying to do is set prcganancia to 0.3681 for every observation where grupo equals 1, to 0.2320 where grupo equals 2, and so on.

CodePudding user response:

You could simply index with [:

#Sample data
set.seed(1)
dff <- data.frame(grupo = sample(10, size = 10, replace = TRUE))

#Create vector of values per group: position i holds the value for grupo == i
#(values from the question, truncated to two decimals)
var_group <- c(.36, .23, .28, .44, .33, .26, .25, .22, .37, .37)

#Create the variable
dff$prcganancia <- var_group[dff$grupo]

   grupo prcganancia
1      9        0.37
2      4        0.44
3      7        0.25
4      1        0.36
5      2        0.23
6      7        0.25
7      2        0.23
8      3        0.28
9      1        0.36
10     5        0.33
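
Positional indexing works here because grupo runs from 1 to 10 with no gaps. If the group codes were not consecutive integers, a named lookup vector might be safer. Here is a minimal sketch of that variant using the full values from the question; the name `rates` and the small example data frame are made up for illustration:

```r
# Lookup vector: names are the grupo codes, values are the rates from the question
rates <- c(
  "1" = 0.3681, "2" = 0.2320, "3" = 0.2851, "4" = 0.4443, "5" = 0.3353,
  "6" = 0.2656, "7" = 0.2597, "8" = 0.2211, "9" = 0.3782, "10" = 0.3752
)

# Small example data frame (hypothetical, not the asker's data)
dff <- data.frame(grupo = c(3, 10, 1, 7, 3))

# Index by name rather than by position; unname() drops the names from the result
dff$prcganancia <- unname(rates[as.character(dff$grupo)])
```

Any grupo value that has no entry in `rates` ends up as NA, which makes unmapped groups easy to spot. As for the loop in the question: it cannot work as written because `==` compares rather than assigns, `elseif` is not R syntax (it is `else if`), and `if` expects a single TRUE or FALSE rather than a whole column.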

CodePudding user response:

With dplyr it would be:

library(dplyr)

dff <- tibble(
  grupo = sample(1:10, 20, replace = TRUE)
) 

dff |> 
  mutate(prcganancia = case_when(
    grupo == 1 ~ 0.3681, 
    grupo == 2 ~ 0.2320, 
    grupo == 3 ~ 0.2851, 
    grupo == 4 ~ 0.4443, 
    grupo == 5 ~ 0.3353, 
    grupo == 6 ~ 0.2656, 
    grupo == 7 ~ 0.2597, 
    grupo == 8 ~ 0.2211, 
    grupo == 9 ~ 0.3782, 
    grupo == 10 ~ 0.3752
  ))

Output is:

# A tibble: 20 × 2
   grupo prcganancia
   <int>       <dbl>
 1     9       0.378
 2     3       0.285
 3     2       0.232
 4     8       0.221
 5     4       0.444
 6     2       0.232
 7     1       0.368
 8     2       0.232
 9     6       0.266
10     1       0.368
11     9       0.378
12     6       0.266
13     6       0.266
14     9       0.378
15     2       0.232
16     5       0.335
17     8       0.221
18     5       0.335
19     2       0.232
20     1       0.368
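
If the list of groups grows, a long case_when() can get hard to maintain. One alternative, sketched here and not part of the answer above, is to keep the mapping in its own table and join it on. The table name `lookup` and the five-row example are assumptions for illustration:

```r
library(dplyr)

# Mapping table: one row per group, values taken from the question
lookup <- tibble(
  grupo = 1:10,
  prcganancia = c(0.3681, 0.2320, 0.2851, 0.4443, 0.3353,
                  0.2656, 0.2597, 0.2211, 0.3782, 0.3752)
)

# Small example data (hypothetical); grupo is integer so it matches lookup$grupo
dff <- tibble(grupo = c(9L, 3L, 2L, 8L, 4L))

# left_join() keeps every row of dff, in order, and adds the matching prcganancia
dff <- left_join(dff, lookup, by = "grupo")
```

Rows whose grupo is missing from `lookup` would get NA in prcganancia, the same as a case_when() with no matching condition.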

CodePudding user response:

Try dplyr::recode(). You can pass either "old value = new value" pairs:

set.seed(13)
library(dplyr)

dff <- data.frame(grupo = sample(10, 10))

dff$prcganancia <- recode(
  dff$grupo,
  `1` = 0.3681, 
  `2` = 0.2320, 
  `3` = 0.2851, 
  `4` = 0.4443, 
  `5` = 0.3353, 
  `6` = 0.2656, 
  `7` = 0.2597, 
  `8` = 0.2211, 
  `9` = 0.3782, 
  `10` = 0.3752
)

Or unnamed values, which will be matched by index:

dff$prcganancia <- recode(
  dff$grupo,
  0.3681, 
  0.2320, 
  0.2851, 
  0.4443, 
  0.3353, 
  0.2656, 
  0.2597, 
  0.2211, 
  0.3782, 
  0.3752
)

Result:

   grupo prcganancia
1      8      0.2211
2      3      0.2851
3     10      0.3752
4      5      0.3353
5      2      0.2320
6      7      0.2597
7      6      0.2656
8      4      0.4443
9      9      0.3782
10     1      0.3681