Subtract 1 row from each row in a tibble (R, dplyr)

Time:05-22

This should be simple ... Rowwise ops in dplyr

library(dplyr)

#tibble
a=tibble(a=1:4,b=1:4,c=1:4)

#one row tibble to be subtracted from first one
b=tibble(a=5,b=5,c=5)

#well, this won't work
a-b

Error in Ops.data.frame(a, b) : 
  ‘-’ only defined for equally-sized data frames

Of course, the workaround is to replicate the tibble's row, but that is not elegant:

#replicating
c=tibble(a=rep(5,4),b=rep(5,4),c=rep(5,4))

#works
a-c

But shouldn't this work with some rowwise operation?

a %>% rowwise %>% mutate(across(everything(), ~.-b))

It doesn't; every value comes back as 0:

# A tibble: 4 × 3
# Rowwise: 
      a     b     c
  <int> <int> <int>
1     0     0     0
2     0     0     0
3     0     0     0
4     0     0     0

CodePudding user response:

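If you want to stay in the tidyverse, you can use pmap_dfr, which applies a function to each row across multiple columns. Inside the lambda, c(...) collects every value of the current row; a bare . would only be the value from the first column.

b=tibble(a=5,b=2,c=1)
purrr::pmap_dfr(a, ~c(...) - b)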

#   a  b c
#1 -4 -1 0
#2 -3  0 1
#3 -2  1 2
#4 -1  2 3

In base R, you can do

t(t(a) - unlist(b))

#      a  b c
#[1,] -4 -1 0
#[2,] -3  0 1
#[3,] -2  1 2
#[4,] -1  2 3

Note that I have changed the values of b for clarity.
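The transpose trick works because R recycles a vector down the columns of a matrix. As a rough sketch of the same idea without the double transpose, base R's sweep() can subtract one value per column directly. The plain data frames and the name m are only assumptions for illustration, and the result is still a matrix, as above.

```r
# Illustrative data; b holds one value per column of a
a <- data.frame(a = 1:4, b = 1:4, c = 1:4)
b <- data.frame(a = 5, b = 2, c = 1)

# MARGIN = 2 sweeps across columns; the default FUN is "-"
m <- sweep(as.matrix(a), 2, unlist(b))
m
```

Wrapping the result in as.data.frame() or tibble::as_tibble() turns it back into a data frame if that is what the rest of the pipeline expects.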

CodePudding user response:

Generally, using long-formatted data frames makes these types of operations consistent. The trick here is to create a row ID, pivot longer, join, and then pivot wider.

library(tibble)

#tibble
a <- tibble(a=1:4,b=1:4,c=1:4)

#one row tibble to be subtracted from first one
b <- tibble(a=5,b=5,c=5)

library(tidyr)
library(dplyr)

a_pivot <- a |> 
  mutate(id = row_number()) |> # create a row ID so we can pivot_wider
  pivot_longer(cols = c(everything(), -id), values_to = "values_a")

b_pivot <- b |> 
  pivot_longer(cols = everything(), values_to = "values_b")

ab_pivot <- left_join(a_pivot, b_pivot, by = c("name")) |> 
  mutate(values = values_a - values_b) |> 
  select(id, name, values) |> # remove other columns for the pivot_wider
  pivot_wider(names_from = "name", values_from = values) |> 
  select(-id)

CodePudding user response:

A dplyr only solution:

library(dplyr)

x = nrow(a)
# repeat b's single row once for every row of a
b <- b %>% slice(rep(1:n(), each = x))

a - b
   a  b  c
1 -4 -4 -4
2 -3 -3 -3
3 -2 -2 -2
4 -1 -1 -1
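It may also be possible to repair the across() attempt from the question. The zeros there appear to come from data masking: inside mutate(), the bare name b is looked up as column b of a before the tibble b is ever considered, so each column has column b subtracted from it. A minimal sketch, assuming dplyr 1.0 or later, uses the .env pronoun to reach the outer tibble and cur_column() to pick the matching entry. The name res and the values of b are chosen for illustration.

```r
library(dplyr)

a <- tibble(a = 1:4, b = 1:4, c = 1:4)
b <- tibble(a = 5, b = 2, c = 1)  # illustrative values, one per column of a

# .env$b refers to the tibble b in the calling environment, not column b;
# cur_column() gives the name of the column across() is currently processing
res <- a %>%
  mutate(across(everything(), ~ .x - .env$b[[cur_column()]]))
res
```

No rowwise() is needed here, because subtracting a single number from a column is already vectorised.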

CodePudding user response:

Edited to reflect OP's edits...

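I think this is what you want:

b=as.numeric(c(5,5,5))

#well, this works
a-b

   a  b  c
1 -4 -4 -4
2 -3 -3 -3
3 -2 -2 -2
4 -1 -1 -1

This relies on every value in b being the same, because a plain vector is recycled down the columns of a rather than matched to them. With values such as 5, 2, 1 the result would not line up by column.

Of course

b=as.numeric(tibble(a=5,b=5,c=5))

works. Or if you actually do want to save b as a tibble for other use...

b=tibble(a=5,b=5,c=5)
a-as.numeric(b)

works too. Note the text of the error message: "Error in Ops.data.frame(a, b) : ‘-’ only defined for equally-sized data frames". The size restriction applies when both operands are data frames; subtracting a plain numeric vector from a data frame is allowed.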
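One caveat with subtracting a plain vector is the direction of recycling. The sketch below, using plain data frames and illustrative values, contrasts the vector form with a base R way of repeating b's row so that each value stays with its column. The names recycled and aligned are hypothetical.

```r
a <- data.frame(a = 1:4, b = 1:4, c = 1:4)
b <- data.frame(a = 5, b = 2, c = 1)  # unequal values make the difference visible

# The vector 5, 2, 1 is recycled down column a, then b, then c,
# so column a has 5, 2, 1, 5 subtracted from it
recycled <- a - unlist(b)

# Repeating b's single row gives two equally sized data frames,
# so the subtraction is matched column by column
aligned <- a - b[rep(1, nrow(a)), ]

recycled
aligned
```

When all the values in b are equal the two forms agree, which is why the shortcut looks correct on the original example.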
