I'm pretty new to R and I would like to know why this loop doesn't keep iterating through a data frame.
I have a dataframe filled with the following data (take note that it only has one column):
1,230.1,37.8,69.2,22.1 <-- this is treated as a column like "1,230.1,37.8,69.2,22.1"
2,44.5,39.3,45.1,10.4
3,17.2,45.9,69.3,9.3
4,151.5,41.3,58.5,18.5
5,180.8,10.8,58.4,12.9
For that, I'm creating an empty data frame as the target:
my.df <- data.frame(matrix(ncol = 5, nrow = 200))
colnames(my.df) <- c('i', 'a', 'b', 'c', 'label')
And I'm trying to parse every string into the target data frame. But it doesn't fill it completely, just the first row. I believe the problem is with this line: temp <- strsplit(a, ",")[[1]]
It should only split one value and keep going, but it stops after the first iteration.
helper <- 1
for(a in DF){
temp <- strsplit(a, ",")[[1]]
print(temp)
my.df[helper, 1] <- temp[1]
my.df[helper, 2] <- temp[2]
my.df[helper, 3] <- temp[3]
my.df[helper, 4] <- temp[4]
my.df[helper, 5] <- temp[5]
helper <- helper + 1
}
At the end I only have this output:
i a b c label
1 1 230.1 37.8 69.2 22.1
2 <NA> <NA> <NA> <NA> <NA>
3 <NA> <NA> <NA> <NA> <NA>
4 <NA> <NA> <NA> <NA> <NA>
5 <NA> <NA> <NA> <NA> <NA>
6 <NA> <NA> <NA> <NA> <NA>
Any ideas on how to accomplish this task, or can you explain why the loop dies in the first iteration? Thank you.
CodePudding user response:
Instead of splitting the string, we could directly parse it with read.csv
read.csv(text = DF, header = FALSE,
col.names = c('i', 'a', 'b', 'c', 'label'), fill = TRUE)
-output
i a b c label
1 1 230.1 37.8 69.2 22.1
2 2 44.5 39.3 45.1 10.4
3 3 17.2 45.9 69.3 9.3
4 4 151.5 41.3 58.5 18.5
5 5 180.8 10.8 58.4 12.9
If it is a column in the dataset, extract the column and use read.csv
read.csv(text = DF1$col1, header = FALSE,
col.names = c('i', 'a', 'b', 'c', 'label'), fill = TRUE)
i a b c label
1 1 230.1 37.8 69.2 22.1
2 2 44.5 39.3 45.1 10.4
3 3 17.2 45.9 69.3 9.3
4 4 151.5 41.3 58.5 18.5
5 5 180.8 10.8 58.4 12.9
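One practical difference worth noting: read.csv infers column types, so the parsed columns hold numbers rather than strings. A minimal sketch, assuming a small data frame named demo (a hypothetical stand-in for DF1, not part of the original post):

```r
# Hypothetical stand-in for DF1: one character column, one CSV record per row.
demo <- data.frame(
  col1 = c("1,230.1,37.8,69.2,22.1", "2,44.5,39.3,45.1,10.4",
           "3,17.2,45.9,69.3,9.3", "4,151.5,41.3,58.5,18.5",
           "5,180.8,10.8,58.4,12.9"),
  stringsAsFactors = FALSE
)

# Each element of the character vector is read as one line of CSV text.
parsed <- read.csv(text = demo$col1, header = FALSE,
                   col.names = c("i", "a", "b", "c", "label"))

# Inspect the inferred class of every column.
sapply(parsed, class)

# Because the columns are numeric, arithmetic works without conversion.
mean(parsed$label)
```

With the strsplit approach, by contrast, the pieces stay character strings until they are converted explicitly.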
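Regarding the for loop issue: for(a in DF) iterates over the columns of the data frame, not its rows. With a single column the body runs exactly once, a holds the entire column, and strsplit(a, ",")[[1]] keeps only the pieces of the first string, so helper <- helper + 1 doesn't have any effect on the iteration. Instead, loop over the elements of the column: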
for(i in seq_along(DF1[[1]])) my.df[i,] <- strsplit(DF1[[1]][i], ",")[[1]]
> head(my.df)
i a b c label
1 1 230.1 37.8 69.2 22.1
2 2 44.5 39.3 45.1 10.4
3 3 17.2 45.9 69.3 9.3
4 4 151.5 41.3 58.5 18.5
5 5 180.8 10.8 58.4 12.9
6 <NA> <NA> <NA> <NA> <NA>
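To see why the original loop ran only once, it can help to count iterations directly. The sketch below assumes a hypothetical data frame named demo with the same shape as the one in the question; the counter names are illustrative only.

```r
# Hypothetical one-column data frame shaped like the question's input.
demo <- data.frame(
  col1 = c("1,230.1,37.8,69.2,22.1", "2,44.5,39.3,45.1,10.4",
           "3,17.2,45.9,69.3,9.3", "4,151.5,41.3,58.5,18.5",
           "5,180.8,10.8,58.4,12.9"),
  stringsAsFactors = FALSE
)

# Looping over a data frame visits its columns, one per iteration.
column_iterations <- 0
for (a in demo) {
  column_iterations <- column_iterations + 1
  # `a` is the whole column here; [[1]] keeps only the first string's pieces.
  first_only <- strsplit(a, ",")[[1]]
}

# Looping over row indices visits every row instead.
row_iterations <- 0
for (i in seq_len(nrow(demo))) {
  row_iterations <- row_iterations + 1
}

column_iterations  # one pass per column
length(a)          # `a` still holds the full column after the loop
row_iterations     # one pass per row
```

Iterating over seq_len(nrow(demo)) or seq_along(demo[[1]]) gives one pass per row, which is what the corrected loop above relies on.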
data
DF <- "1,230.1,37.8,69.2,22.1
2,44.5,39.3,45.1,10.4
3,17.2,45.9,69.3,9.3
4,151.5,41.3,58.5,18.5
5,180.8,10.8,58.4,12.9"
DF1 <- structure(list(col1 = c("1,230.1,37.8,69.2,22.1 ", "2,44.5,39.3,45.1,10.4",
"3,17.2,45.9,69.3,9.3", "4,151.5,41.3,58.5,18.5", "5,180.8,10.8,58.4,12.9"
)), class = "data.frame", row.names = c(NA, -5L))