I am working with a data set that contains numbers stored as characters. Below is a sample of the data, extracted with the dput function:
structure(list(Y_O_13_Males = c("10", "0", "0", "0", "0", "0",
"0", "0", "0", "0"), Y_O_13_Females = c("1", "0", "0", "0", "0",
"0", "0", "0", "0", "0"), Y_O_13_Unknown = c("7", "0", "0", "0",
"0", "0", "0", "0", "0", "0"), Y14_17_Males = c("2", "0", "0",
"0", "0", "0", "0", "0", "0", "0"), Y14_17_Females = c("0", "0",
"0", "0", "0", "0", "0", "0", "0", "0"), Y14_17_Unknown = c("0",
"0", "0", "0", "0", "0", "0", "0", "0", "0"), Y18_34_Males = c("0",
"0", "0", "0", "0", "0", "0", "0", "0", "0"), Y18_34_Females = c("0",
"0", "0", "0", "0", "0", "0", "0", "0", "0"), Y18_34_Unknown = c("0",
"0", "0", "0", "0", "0", "0", "0", "0", "0"), Y35_64_Males = c("0",
"0", "0", "0", "0", "0", "0", "0", "0", "0"), Y35_64_Females = c("0",
"0", "0", "0", "0", "0", "0", "0", "0", "0"), Y35_64_Unknown = c("0",
"0", "0", "0", "0", "0", "0", "0", "0", "0"), Y65_And_Over_Males = c("0",
"0", "0", "0", "0", "0", "0", "0", "0", "0"), Y65_And_Over_Females = c("0",
"0", "0", "0", "0", "0", "0", "0", "0", "0"), Y65_And_Over_Unknown = c("0",
"0", "0", "0", "0", "0", "0", "0", "0", "0"), Unknown_And_Over_Males = c("0",
"0", "0", "0", "0", "0", "0", "0", "0", "0"), Unknown_And_Over_Females = c("0",
"0", "0", "0", "0", "0", "0", "0", "0", "0"), Unknown_And_Over_Unknown = c("0",
"0", "0", "0", "0", "0", "0", "0", "0", "0")), row.names = c(NA,
-10L), class = c("tbl_df", "tbl", "data.frame"))
Now I want to convert all of this data from character to numeric. I tried this line of code, but it does not work:
df[,1:18]<-as.numeric(df[,1:18])
Can anybody help me solve this problem?
CodePudding user response:
Using dplyr, I would use:
library(dplyr)
df %>%
  mutate(
    across(
      .cols = where(is.character),
      .fns = as.numeric
    )
  )
This assumes that you want to convert ALL character variables to numeric.
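As a small sketch of how this might be used in practice: mutate() returns a new data frame rather than modifying df in place, so the result has to be assigned back if you want to keep it. The toy data frame below is a made-up stand-in for the question's data, not the real thing.

```r
library(dplyr)

# Hypothetical two-column stand-in for the question's data
df <- data.frame(
  Y_O_13_Males   = c("10", "0", "0"),
  Y_O_13_Females = c("1", "0", "0"),
  stringsAsFactors = FALSE
)

# Assign the result back, otherwise df itself stays character
df <- df %>%
  mutate(across(where(is.character), as.numeric))

# Inspect the column classes after conversion
sapply(df, class)
```

If only some columns should be converted, across() also accepts other selections in place of where(is.character), for example a vector of column names.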
CodePudding user response:
as.numeric works on vectors, not on a whole data frame, so you need to apply it to each column, i.e.
df[1:18]<-lapply(df[1:18], as.numeric)
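To illustrate why the original attempt fails, here is a minimal sketch on a toy data frame (the column names a and b are made up for the example). A data frame is a list of columns, and as.numeric cannot coerce that list as a whole, whereas lapply hands it one column at a time.

```r
# Hypothetical toy data, standing in for the 18 columns in the question
df <- data.frame(a = c("10", "0"), b = c("1", "0"), stringsAsFactors = FALSE)

# Calling as.numeric() on several columns at once raises an error;
# tryCatch() captures the message instead of stopping the script
res <- tryCatch(
  as.numeric(df[, 1:2]),
  error = function(e) conditionMessage(e)
)
print(res)

# Converting column by column works
df[1:2] <- lapply(df[1:2], as.numeric)
str(df)
```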
CodePudding user response:
Simple base R assuming you only have those 18 columns:
df[] <- lapply(df, as.numeric)
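The empty brackets matter here. As a rough sketch on made-up data: lapply on its own returns a plain list, while assigning into df[] replaces the columns and keeps the data frame structure.

```r
# Hypothetical toy data for illustration
df <- data.frame(a = c("10", "0"), b = c("1", "0"), stringsAsFactors = FALSE)

# Without df[], the result is just a list of numeric vectors
as_list <- lapply(df, as.numeric)
class(as_list)

# With df[], the columns are replaced and df stays a data frame
df[] <- lapply(df, as.numeric)
class(df)
```

Keep in mind that as.numeric turns any string it cannot parse into NA (with a warning), so this is only safe when every column really holds numbers.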
CodePudding user response:
The class can be changed from character to numeric with:
for(i in which(sapply(df, class) == "character")) class(df[[i]]) <- "numeric"
str(df)
#Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 10 obs. of 18 variables:
# $ Y_O_13_Males : num 10 0 0 0 0 0 0 0 0 0
# $ Y_O_13_Females : num 1 0 0 0 0 0 0 0 0 0
# $ Y_O_13_Unknown : num 7 0 0 0 0 0 0 0 0 0
# $ Y14_17_Males : num 2 0 0 0 0 0 0 0 0 0
# $ Y14_17_Females : num 0 0 0 0 0 0 0 0 0 0
# $ Y14_17_Unknown : num 0 0 0 0 0 0 0 0 0 0
# $ Y18_34_Males : num 0 0 0 0 0 0 0 0 0 0
# $ Y18_34_Females : num 0 0 0 0 0 0 0 0 0 0
# $ Y18_34_Unknown : num 0 0 0 0 0 0 0 0 0 0
# $ Y35_64_Males : num 0 0 0 0 0 0 0 0 0 0
# $ Y35_64_Females : num 0 0 0 0 0 0 0 0 0 0
# $ Y35_64_Unknown : num 0 0 0 0 0 0 0 0 0 0
# $ Y65_And_Over_Males : num 0 0 0 0 0 0 0 0 0 0
# $ Y65_And_Over_Females : num 0 0 0 0 0 0 0 0 0 0
# $ Y65_And_Over_Unknown : num 0 0 0 0 0 0 0 0 0 0
# $ Unknown_And_Over_Males : num 0 0 0 0 0 0 0 0 0 0
# $ Unknown_And_Over_Females: num 0 0 0 0 0 0 0 0 0 0
# $ Unknown_And_Over_Unknown: num 0 0 0 0 0 0 0 0 0 0
In this case integer might fit better than numeric.
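If integer is what you want, one possible variant (a sketch on made-up data, using as.integer rather than the class assignment above) would be:

```r
# Hypothetical toy data for illustration
df <- data.frame(a = c("10", "0"), b = c("1", "0"), stringsAsFactors = FALSE)

# Find the character columns, then convert only those to integer
chr_cols <- vapply(df, is.character, logical(1))
df[chr_cols] <- lapply(df[chr_cols], as.integer)

str(df)
```

This assumes the values are whole numbers; as.integer truncates anything with a fractional part.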
CodePudding user response:
You could try type.convert, as shown below:
> type.convert(df, as.is = TRUE)
# A tibble: 10 × 18
   Y_O_13_Males Y_O_13_Females Y_O_13_Unknown Y14_17_Males Y14_17_Females
          <int>          <int>          <int>        <int>          <int>
 1           10              1              7            2              0
 2            0              0              0            0              0
 3            0              0              0            0              0
 4            0              0              0            0              0
 5            0              0              0            0              0
 6            0              0              0            0              0
 7            0              0              0            0              0
 8            0              0              0            0              0
 9            0              0              0            0              0
10            0              0              0            0              0
# … with 13 more variables: Y14_17_Unknown <int>, Y18_34_Males <int>,
# Y18_34_Females <int>, Y18_34_Unknown <int>, Y35_64_Males <int>,
# Y35_64_Females <int>, Y35_64_Unknown <int>, Y65_And_Over_Males <int>,
# Y65_And_Over_Females <int>, Y65_And_Over_Unknown <int>,
# Unknown_And_Over_Males <int>, Unknown_And_Over_Females <int>,
# Unknown_And_Over_Unknown <int>
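type.convert picks a type per column, so it does not force everything to one class. A small sketch with made-up columns (count, ratio and label are not from the question) shows how mixed data might come out:

```r
# Hypothetical toy data with three kinds of character columns
df <- data.frame(
  count = c("10", "0"),    # whole numbers
  ratio = c("1.5", "2"),   # decimals
  label = c("x", "y"),     # not numeric at all
  stringsAsFactors = FALSE
)

# as.is = TRUE keeps non-numeric strings as character instead of factor
converted <- type.convert(df, as.is = TRUE)

sapply(converted, class)
```

Here count should become integer, ratio numeric, and label should stay character, which can be handy when not every column holds numbers.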