Assign a name based on the value in a data frame in R?

Time:10-28

I have a data frame with a column of color codes from 1 to 5:

| color code |
| 1          |
| 3          |
| 3          |
| 5          |
| 2          |

Each color code maps to a color: 1=Yellow, 2=White, 3=Black, 4=Blue, 5=Brown.

How do I create a new column that assigns each color code its color name, like below?

| color code |color name|
| 1          | Yellow   |
| 3          | Black    |
| 3          | Black    |
| 5          | Brown    |
| 2          | White    |

CodePudding user response:

You can use factor():

dat$color_name <- factor(dat$color_code,
                         levels = 1:5,
                         labels = c("Yellow", "White", "Black", "Blue", "Brown"))
dat
  color_code color_name
1          1     Yellow
2          3      Black
3          3      Black
4          5      Brown
5          2      White
6          4       Blue

I used @SamR's data.
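
Because the data comes from another answer, here is a self-contained sketch of the same factor() approach. The sample codes are assumed from the printed output above, and the names color_chr and unknown are only illustrative. It also shows two things worth knowing: the new column is a factor rather than a character vector, and a code that is not listed in levels becomes NA instead of raising an error.

```r
# Sample data assumed for illustration (same codes as the printed output)
dat <- data.frame(color_code = c(1, 3, 3, 5, 2, 4))

color_labels <- c("Yellow", "White", "Black", "Blue", "Brown")

# Map each code (the levels) to its color name (the labels)
dat$color_name <- factor(dat$color_code, levels = 1:5, labels = color_labels)

# The result is a factor; convert it if plain strings are needed
color_chr <- as.character(dat$color_name)

# A code outside `levels` (here the hypothetical code 7) becomes NA
unknown <- factor(c(2, 7), levels = 1:5, labels = color_labels)
```

If downstream code expects strings, as.character() is probably the simplest conversion. Keeping the factor is useful when the fixed set of colors should be preserved, for example in tables or plots.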

CodePudding user response:

One possible way to solve your problem:

dat$color_name = c("Yellow", "White", "Black", "Blue", "Brown")[dat$color_code]

Another solution:

colors = c(`1`="Yellow", `2`="White", `3`="Black", `4`="Blue", `5`="Brown")

dat$color_name = unname(colors[as.character(dat$color_code)])
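
The two lookups above work differently. The first indexes by position, so it relies on the codes being the integers 1 to 5 in order. The second matches each code, as a string, against the names of the vector, so it should also work for codes that are not consecutive. A minimal sketch comparing them on assumed sample data (the names by_position and by_name are only illustrative):

```r
# Sample data assumed for illustration
dat <- data.frame(color_code = c(1, 3, 3, 5, 2, 4))

# Positional lookup: element i of the vector is the name for code i
by_position <- c("Yellow", "White", "Black", "Blue", "Brown")[dat$color_code]

# Named lookup: codes are converted to strings and matched against the names
colors <- c(`1` = "Yellow", `2` = "White", `3` = "Black", `4` = "Blue", `5` = "Brown")
by_name <- unname(colors[as.character(dat$color_code)])

# Both approaches give the same character vector for codes 1 to 5
same_result <- identical(by_position, by_name)
```

Both results are plain character vectors, unlike the factor() approach.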

CodePudding user response:

You could define a factor variable to do this.

From R for Data Science:

In R, factors are used to work with categorical variables, variables that have a fixed and known set of possible values. They are also useful when you want to display character vectors in a non-alphabetical order.

It is worth reading that chapter - and really the entire book - as factors are a fundamental building block of data frames in R.
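
To illustrate the point about non-alphabetical order, here is a small sketch using the color names from the question. The three-element sample vector is hypothetical. A character vector sorts alphabetically, while a factor sorts in the order of its levels.

```r
# Hypothetical sample of color names
x <- c("Brown", "Black", "Yellow")

# Character vectors sort alphabetically: Black, Brown, Yellow
alphabetical <- sort(x)

# A factor sorts by the order of its levels (here, the color code order)
x_f <- factor(x, levels = c("Yellow", "White", "Black", "Blue", "Brown"))
by_level <- as.character(sort(x_f))
```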

In this particular case, you need to set the labels of the factor to your color names, and the levels to your numeric values:

dat  <- data.frame(
    color_code = c(1,3,3,5,2,4)
)

color_codes = c(
    `1` = "Yellow", `2` ="White", `3` = "Black", `4` = "Blue", `5` = "Brown"
)

dat$color_name  <- factor(
    dat$color_code,
    levels = as.integer(names(color_codes)),
    labels = color_codes
)

dat

#   color_code color_name
# 1          1     Yellow
# 2          3      Black
# 3          3      Black
# 4          5      Brown
# 5          2      White
# 6          4       Blue