I'm trying to build a column name from rows that contain a date. Take the following dataset, for instance:
# create data frame
df <- data.frame(student = c('A', 'B', 'C', 'D', 'E'),
                 scores = c('May, 30', 2022, 31, 39, 35))
# glimpse data
df
  student  scores
1       A May, 30
2       B    2022
3       C      31
4       D      39
5       E      35
I want to combine rows 1 and 2 of the scores column into a column name in month_year format, and then remove both rows. I'm trying the following script to set the column names, but I'm getting bizarre results:
colnames(df) <- df[2,]
df <- df[-2,]
Desired output:
  student may_2022
1       C       31
2       D       39
3       E       35
What would be the ideal way of getting the desired output? Any suggestions would be appreciated. Thanks!
CodePudding user response:
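If this is truly how your data are imported, a generalizable approach is to extract the month from the first row with gsub and then paste it together with the year from the second row. The pattern "[^[:alpha:]]" matches every character that is not a letter, so "May, 30" is reduced to "May". Wrap the month in tolower() if you need the lowercase may_2022 from your desired output.
# rename the second column to month_year
names(df)[2] <- paste0(gsub("[^[:alpha:]]", "", df$scores[1]), "_", df$scores[2])
# drop the two rows that held the date
df <- df[-c(1:2), ]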
Output:
#   student May_2022
# 3       C       31
# 4       D       39
# 5       E       35
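If you also want the lowercase name, row numbers starting at 1, and numeric scores, here is a minimal sketch of the same idea from start to finish. It assumes the date always sits in the first two rows of the second column; the variable names month and year are only for illustration.

```r
# Rebuild the example data; mixing text and numbers makes `scores` a character column
df <- data.frame(student = c("A", "B", "C", "D", "E"),
                 scores = c("May, 30", 2022, 31, 39, 35),
                 stringsAsFactors = FALSE)

# Keep only the letters of the first cell ("May, 30" -> "May"), then lowercase it
month <- tolower(gsub("[^[:alpha:]]", "", df$scores[1]))
# The second cell holds the year (assumption: always in row 2)
year <- df$scores[2]

# Rename the second column, then drop the two rows that held the date
names(df)[2] <- paste0(month, "_", year)
df <- df[-(1:2), ]

# Optional tidy-up: renumber the rows and convert the scores back to numbers
rownames(df) <- NULL
df[[2]] <- as.numeric(df[[2]])
df
```

The as.numeric step is there because the column started out as character, so the remaining scores would otherwise still be strings like "31".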