Home > Blockchain >  Add column with the cumulative number of elements found in a specific row
Add column with the cumulative number of elements found in a specific row

Time:01-09

This table is based on a species sampling procedure that comprises starting at a certain location in a forest and recording the number of species that occur in that exact spot. Then, the surveyor walks and records the distance he traveled until he found a new species. This is the distance between the place where he found the new species and the initial point.

enter image description here

I would like to create a new column the includes the cumulative number of species based on the traveled distance. Here's what the new column should look like.

enter image description here

Example data:

data<-structure(list(id = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1), binomial = c("Dromicodryas bernieri", 
"Dromicodryas quadrilineatus", "Erymnochelys madagascariensis", 
"Furcifer lateralis", "Furcifer oustaleti", "Hemidactylus mercatorius", 
"Langaha pseudoalluaudi", "Leioheterodon madagascariensis", "Lycodryas pseudogranuliceps", 
"Liophidium torquatum", "Liopholidophis sexlineatus", "Madagascarophis colubrinus", 
"Madatyphlops decorsei", "Madascincus polleni", "Mimophis mahfalensis", 
"Pelusios castanoides", "Phelsuma madagascariensis", "Thamnosophis lateralis", 
"Trachylepis elegans", "Trachylepis gravenhorstii", "Zonosaurus madagascariensis", 
"Hemidactylus frenatus", "Calumma nasutum", "Trachylepis madagascariensis", 
"Amphiglossus macrocercus", "Zonosaurus aeneus", "Phelsuma lineata", 
"Pelomedusa subrufa", "Calumma crypticum", "Furcifer viridis", 
"Lygodactylus blancae", "Calumma gastrotaenia", "Trachylepis boettgeri", 
"Zonosaurus ornatus", "Sanzinia madagascariensis", "Oplurus cyclurus", 
"Leioheterodon modestus", "Oplurus cuvieri", "Madascincus igneocaudatus", 
"Acrantophis dumerili", "Furcifer campani", "Pseudoxyrhopus imerinae", 
"Lygodactylus mirabilis", "Phelsuma barbouri", "Furcifer minor", 
"Compsophis infralineatus", "Pseudoxyrhopus quinquelineatus", 
"Calumma hilleniusi", "Paroedura bastardi", "Brookesia brygooi"
), distance = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 3714.77402549982, 6249.49093233716, 7067.80424387549, 
7715.0303317613, 13769.1057463018, 17206.1480236598, 18733.5237644898, 
21923.789153995, 27314.2085865309, 31154.1890492383, 35460.0864839256, 
35822.0263564291, 36933.3736660544, 39735.6007540156, 40983.6673876956, 
43032.8409122139, 43793.3004333338, 44063.3992480126, 44657.9183000201, 
44723.8214805486, 45184.0884859559, 46785.9008560645, 48994.7048866502, 
55332.621992021, 57746.4142325833, 58866.2845249788, 60839.811988087, 
65560.1987963227)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, 
-50L))

CodePudding user response:

One option is to get the cumsum on a logical vector i.e. where distance not equal to 0 and then add the count of 0 distance to it (sum(distance == 0))

library(dplyr)
data <- data %>%
  mutate(new = cumsum(distance!=0)   sum(distance == 0)) 

Or for this, we can use base R

data$new <- with(data, cumsum(distance!=0)   sum(distance == 0))
  • Related