I did an online experiment with 700 participants and got the data for each participant in a seperate csv file.
My first step was to import just one file and try to wrangle it the best way for further analysis. The next step would to apply this to all 700 csv files and then merge everything together. Is this more or less the right way to do it?
I am new to R and stuck on the wrangling part. The first picture is what I got so far (current). the second picture is were I want to go (goal).
- Is it possible, to move all the data to the top of each column, that no empty cells/NA is above the data?
- in the column RT_first_letter: is it possible to get only the first entry of the row 6 (in picture current). In this case 2.949...?
Thanks for the help in advance!
CodePudding user response:
possibly not the fastes method, but (imho) easy to follow
#install.packages("qpcR")
library(qpcR)
# Sample data
mydata <- data.frame(col1 = c(NA, 1, 2, NA, 3),
col2 = c(NA, NA, NA, NA, 6),
col3 = c(1, 2, NA, NA, NA))
#Split to individual columns, drop NA values
L <- lapply(
split.default(mydata, f = names(mydata)),
function(x) x[!is.na(x)]
)
# $col1
# [1] 1 2 3
#
# $col2
# [1] 6
#
# $col3
# [1] 1 2
final <- as.data.frame(Reduce(qpcR:::cbind.na, L)) #note the three dots!!
names(final) <- names(L)
# col1 col2 col3
# 1 1 6 1
# 2 2 NA 2
# 3 3 NA NA