I have a character column called ipaq_2 in r that is like "02:00" (hour:minute). Also the column has some NA values. I need to convert this column into a numeric column that shows the minutes (in the above example: 120), replacing at the same time NA values with the mean.
CodePudding user response:
You can convert the character column to hours and minutes with lubridate::hm
, then convert the value to minutes. Then, we can replace any NA
values with the mean of the minutes column.
library(lubridate)
df <- data.frame(ipaq_2 = c("1:00", "0:45", "0:30", "1:30", NA))
df$ipaq_2 <- hm(df$ipaq_2, quiet = TRUE) # format to 'hours:minutes:seconds'
df$minutes <- hour(df$ipaq_2)*60 minute(df$ipaq_2)
df$minutes[is.na(df$minutes)] <- mean(df$minutes, na.rm = TRUE)
Or another possibility (thanks to @Ben):
df$minutes <- as.numeric(hm(df$ipaq_2, quiet = T))/60
df$minutes[is.na(df$minutes)] <- mean(df$minutes, na.rm = TRUE)
Or with tidyverse
:
library(tidyverse)
library(lubridate)
df %>%
mutate(ipaq_2 = hm(ipaq_2, quiet = TRUE),
minutes = hour(ipaq_2)*60 minute(ipaq_2),
minutes = ifelse(is.na(minutes), mean(minutes, na.rm = TRUE), minutes))
# Or using the alternative above:
# df %>%
# mutate(minutes = as.numeric(hm(ipaq_2, quiet = T))/60,
# minutes = ifelse(is.na(minutes), mean(minutes, na.rm = TRUE), minutes))
Output
ipaq_2 minutes
1 1H 0M 0S 60.00
2 45M 0S 45.00
3 30M 0S 30.00
4 1H 30M 0S 90.00
5 <NA> 56.25