Splitting column with numeric value into separate rows in R data frame

Time:09-29

I have an R dataframe that contains a rather large number of length measurements.

It is structured as follows:

> head(rb.len)
# A tibble: 6 × 5
  Length  Year Value Total Perc_Distr
   <dbl> <dbl> <dbl> <dbl>      <dbl>
1    9.5  1981     1 16641          0
2   10.5  1981     3 16641          0
3   12.5  1981     3 16641          0
4   13.5  1981     4 16641          0
5   14.5  1981    17 16641          0
6   15.5  1981    31 16641          0

Individuals of the same length are grouped together, and the number of individuals (n) in each length class is listed in the column "Value" (e.g. 17 individuals of 14.5 cm were measured). For my further analysis I need each measurement in a separate row (so for that class I need 17 rows with a length of 14.5 cm). Unfortunately, all I have learned so far is how to split columns whose cells hold multiple delimited values. Since I have a single numeric count instead, I am unsure how to proceed.

Hope you can help. Thanks in advance!

CodePudding user response:

1) tidyr. Using the reproducible rb.len defined in the Note at the end, uncount repeats each row as many times as its Value and drops the Value column by default. Add the argument .remove = FALSE to uncount if you prefer to retain the Value column.

library(dplyr)
library(tidyr)

rb.len %>% uncount(Value)
##   Length other
## 1    9.5     a
## 2    9.5     a
## 3   10.5     b
## 4   10.5     b
## 5   10.5     b
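The .remove = FALSE variant mentioned above might look like the following minimal sketch. It assumes the toy rb.len from the Note rather than the asker's real tibble, and the name expanded and the final row-count check are illustrative additions, not part of the original answer.

```r
library(tidyr)

# Toy data from the Note (not the asker's real tibble)
rb.len <- data.frame(Length = c(9.5, 10.5), Value = 2:3, other = letters[1:2])

# Repeat each row Value times, keeping the Value column in the result
expanded <- uncount(rb.len, Value, .remove = FALSE)

# Sanity check: there should be one output row per counted individual
nrow(expanded) == sum(rb.len$Value)
```

Comparing the row count with sum(Value) is a cheap way to confirm the expansion before running it on a large table.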

2) Base R. Index the rows with each row number repeated as many times as its Value. Replace , ] with , -2] if you prefer to omit Value from the output (in the rb.len shown in the Note, Value is the second column; in the question's tibble it is the third, so use , -3] there).

rb.len[rep(1:nrow(rb.len), rb.len$Value), ]
##     Length Value other
## 1      9.5     2     a
## 1.1    9.5     2     a
## 2     10.5     3     b
## 2.1   10.5     3     b
## 2.2   10.5     3     b
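A slightly more defensive take on the same base R idea, as a sketch only: it drops Value by name instead of by position and resets the 1, 1.1, 2, ... row names. The names idx and expanded are illustrative, and the data is the toy rb.len from the Note.

```r
# Toy data from the Note (not the asker's real tibble)
rb.len <- data.frame(Length = c(9.5, 10.5), Value = 2:3, other = letters[1:2])

# seq_len() is safe even when the data frame has zero rows
idx <- rep(seq_len(nrow(rb.len)), times = rb.len$Value)

# Drop Value by name so the code does not depend on column order
expanded <- rb.len[idx, names(rb.len) != "Value"]

# Replace the 1, 1.1, 2, 2.1, ... row names with plain 1..n
rownames(expanded) <- NULL
```

Selecting columns by name means the same line should work on the question's data, where Value is the third column.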

Note

rb.len <- data.frame(Length = c(9.5, 10.5), Value = 2:3, other = letters[1:2])

rb.len
##   Length Value other
## 1    9.5     2     a
## 2   10.5     3     b

CodePudding user response:

I hope I've understood you correctly. There is a nice solution that repeats each whole row n times, where n is that row's Value:

rb.len <- as.data.frame(lapply(rb.len, rep, rb.len$Value))

If you wish to repeat only some columns (here the first two):

rb.len <- as.data.frame(lapply(rb.len[1:2], rep, rb.len$Value))
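This approach works because lapply passes the counts to rep as its second argument, which rep treats as times; when times has one entry per element, each element is repeated by its own count. A minimal illustration on a single column, assuming the toy rb.len from the Note (the name lengths_long is illustrative):

```r
# Toy data from the Note (not the asker's real tibble)
rb.len <- data.frame(Length = c(9.5, 10.5), Value = 2:3, other = letters[1:2])

# Each Length is repeated by the matching entry of Value
lengths_long <- rep(rb.len$Length, times = rb.len$Value)
lengths_long
## [1]  9.5  9.5 10.5 10.5 10.5
```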