Data manipulation in R: Starting a new row if i > i-1

Time:07-13

I have a long (one-row) data file with many values that needs to be broken up into multiple rows. The specifics of why I need to do this aren't important; the logic is that column i should always be bigger than column i+1, i.e. the values along a row should be decreasing.

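The best way I can think of to do this is to break the data frame up into multiple rows with an 'if then' style of function: if the value in column i is greater than or equal to the value in column i-1, start a new row; if it is smaller, keep it in the current row. (Ties start a new row, as the two consecutive 1s in the example below show.)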

#Example data but with similar format to my real data

df <- data.frame(matrix(ncol = 9, nrow = 1))
df[1,] <- c(3, 2, 1, 2, 1, 1, 3, 2, 1) 

I would like it to end up looking like this.

3 2 1
2 1 
1
3 2 1

I'm not very proficient with functions that refer to position i in a data frame, or with the kind of data manipulation this needs. Any advice would be appreciated.

CodePudding user response:

Here is a tidy solution. Please let me know if this solves your question:

library(tidyverse)

df <- data.frame(matrix(ncol = 9, nrow = 1))
df[1,] <- c(3, 2, 1, 2, 1, 1, 3, 2, 1) 

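# Note: var1..var3 are hardcoded for this example, where the longest run has 3 values
df %>%
  # Reshape the single row to long format: one row per value
  pivot_longer(cols = everything(), names_to = "vars") %>%
  # Flag values that continue a decreasing run, then number the runs
  mutate(smaller_than_prev = value < lag(value) | is.na(lag(value)),
         num_falses = cumsum(smaller_than_prev == FALSE)) %>%
  group_by(num_falses) %>%
  # Position of each value within its run
  mutate(row_num = row_number()) %>%
  # Spread the values back out to wide format (var1, var2, var3)
  pivot_wider(names_from = row_num, values_from = value, values_fill = NA, names_prefix = "var") %>%
  # Collapse each run to a single row
  fill(c(`var1`, `var2`, `var3`), .direction = "downup") %>%
  slice_head(n = 1) %>%
  ungroup() %>%
  select(`var1`, `var2`, `var3`)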

CodePudding user response:

Splitting the vector into groups is simple, but how the data are finally stored depends on what you are trying to do with the result. Here is a simple way to split the data:

vect <- unname(unlist(df))    # Convert the data to a simple vector
cut <- which(diff(vect) >= 0) # Find the points for splitting the vector
# Define the groups created; there is one more group than there are cut points
grps <- rep(seq_len(length(cut) + 1), diff(c(0, cut, length(vect))))
groups <- split(vect, grps)   # Create a list containing the groups
groups
# $`1`
# [1] 3 2 1
# 
# $`2`
# [1] 2 1
# 
# $`3`
# [1] 1
# 
# $`4`
# [1] 3 2 1
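
As a possible shortcut, the cut points and group labels can be computed in one step with cumsum(). This is only a sketch on the same example data; the names starts, run_id, and runs are chosen here for illustration and are not part of the code above.

```r
# Same example values as in the question
vect <- c(3, 2, 1, 2, 1, 1, 3, 2, 1)

# TRUE marks the first element of each run: position 1, plus every value
# that is not smaller than the value before it
starts <- c(TRUE, diff(vect) >= 0)

# The running count of run starts gives each element its group number
run_id <- cumsum(starts)

# Split the vector into a list with one element per run
runs <- split(vect, run_id)
runs
```

Because the group count comes from the data itself, this form does not need to know in advance how many runs there are.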

A data frame and a matrix both require all of their columns to be the same length, so neither can store the ragged result directly. To convert to a matrix we need to pad the shorter groups with missing values:

maxno <- max(sapply(groups, length))  # How long is the longest run?
t(sapply(groups, function(x) c(x, rep(NA, maxno - length(x)))))
#   [,1] [,2] [,3]
# 1    3    2    1
# 2    2    1   NA
# 3    1   NA   NA
# 4    3    2    1
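
If a data frame is wanted rather than a matrix, the padded result can be converted afterwards. The following is a minimal sketch, assuming the groups list has the contents printed above; it pads with the base R replacement function `length<-`, and the names padded and result are illustrative.

```r
# The list of runs, written out so the example is self-contained
groups <- list(`1` = c(3, 2, 1), `2` = c(2, 1), `3` = 1, `4` = c(3, 2, 1))

# Length of the longest run
maxno <- max(lengths(groups))

# Extending a vector with `length<-` pads it with NA
padded <- t(sapply(groups, `length<-`, maxno))

# Convert the padded matrix to a data frame (columns are named V1, V2, V3)
result <- as.data.frame(padded)
result
```

The NA cells are placeholders only, so downstream code should either expect them or drop them with something like na.omit() on each row.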