How do I multiply columns of a dataset by a constant?

Time:10-14

I am quite new to R, so excuse my lack of ability. I have tried and failed a fair bit, and would appreciate any input.

I am asked to get rid of the inconsistent use of "." and "," to indicate decimals by multiplying every number in certain columns by some multiple of 10. I have tried to simply multiply using the binary operator *, but that obviously doesn't work because some columns are factors, which is required in this case.

I have also tried the code below, but I get the error: subscript "Var" can't be "NA"

data %>% mutate_if(is.numeric, ~ . * 1000)

Below is the code I have for my dataset

datat <- c("Starting_year" , "Rank" , "Team" , "Home_total_Games", "Home_Total_Attendance" , "Home_Avg_Attendance" , "Home_capacity" , "Away_Total_Attendance" , "Away_Avg_Attendance" , "Away_Capacity")
names(data) <- datat 

Factors assigned

data$Rank <- as.factor(data$Rank)
data$Starting_year <- as.factor(data$Starting_year)

Thanks in advance

I can't embed it, but there is a picture of the data below. I am asked to use a function in dplyr to multiply the columns by 1000 to remove all the . and ,

dataset

CodePudding user response:

What is the format of numbers?

If the format is: 1.000.000,5, where . is a thousand separator, while , is a decimal separator, just use gsub:

foo = "1.000.000,5"
bar = gsub("\\.", "", foo) # "1000000,5"
baz = gsub(",", "\\.", bar) # "1000000.5"
as.numeric(baz)

In this case, a factor is not a problem because gsub coerces its input to character, which de-factors the vector.

If you need to multiply the numbers after that, it is not a problem. Transform this into a function (such as convert_decimal) and apply it to columns you want:

data$column = convert_decimal(data$column)

For multiple selected columns (let's call the vector of names selection):

data[selection] = lapply(data[selection], convert_decimal)
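
One possible shape for convert_decimal is sketched below. It assumes every value follows the 1.000.000,5 convention; if "." and "," are used inconsistently within a column, the values need to be inspected first. The function body, the multiplier argument, and the toy data frame are illustrative assumptions, not taken from the question.

```r
# Hypothetical helper: the name convert_decimal comes from the answer above,
# but the body and the `multiplier` argument are assumptions.
convert_decimal <- function(x, multiplier = 1) {
  x <- as.character(x)                  # explicit de-factoring
  x <- gsub(".", "", x, fixed = TRUE)   # drop thousands separators
  x <- sub(",", ".", x, fixed = TRUE)   # decimal comma -> decimal point
  as.numeric(x) * multiplier
}

# Toy data standing in for the real dataset (made-up values)
data <- data.frame(
  Team = c("A", "B"),
  Home_Total_Attendance = factor(c("1.000.000,5", "250.000")),
  Home_Avg_Attendance = c("52.631,6", "13.157,9"),
  stringsAsFactors = FALSE
)

# Convert only the selected columns; Team is left untouched
selection <- c("Home_Total_Attendance", "Home_Avg_Attendance")
data[selection] <- lapply(data[selection], convert_decimal)
str(data)
```

If the numbers also need scaling, pass the factor through lapply, for example lapply(data[selection], convert_decimal, multiplier = 1000).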

CodePudding user response:

Using @Colombo's example, another option is to use readr::parse_number and define a locale.

foo <- "1.000.000,5"
x <- readr::parse_number(
    foo, locale = readr::locale(decimal_mark = ",", grouping_mark = "."))
x
#[1] 1e+06

You could also define a global locale for your particular analysis that ensures that all numbers are parsed consistently. Obviously this assumes that number formatting is consistent.

BTW, you can verify that x indeed includes the fractional .5 if you do sprintf("%.1f", x).
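
To apply the same parsing to several columns with dplyr, one approach is to build the locale once and reuse it. This is a minimal sketch: the object name eu_locale and the toy values are assumptions, the column names are borrowed from the question, and it only works if every value really uses "." for grouping and "," for decimals.

```r
library(dplyr)
library(readr)

# Assumed convention: "." groups thousands, "," marks decimals
eu_locale <- locale(decimal_mark = ",", grouping_mark = ".")

# Toy data with made-up values; column names borrowed from the question
data <- tibble(
  Team = c("A", "B"),
  Home_Avg_Attendance = factor(c("52.631,6", "13.157,9")),
  Away_Avg_Attendance = c("48.000,0", "9.500,5")
)

cols <- c("Home_Avg_Attendance", "Away_Avg_Attendance")

# parse_number expects character input, so factors are converted first
parsed <- data %>%
  mutate(across(all_of(cols),
                ~ parse_number(as.character(.x), locale = eu_locale)))

parsed
```

Once the columns are numeric, a call such as mutate(across(all_of(cols), ~ .x * 1000)) multiplies them without touching the factor columns.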
