Let's say I have the following dataset:
Industry Country Year AUS AUS AUT AUT ...
A AUS 1 0.5 0.2 0.1 0.01
B AUS 2 0.3 0.5 2 0.1
A AUT 3 1 1.2 1.3 0.3
B AUT 4 0.5 0 0.8 2
... ... ... ... ... ... ....
VA 11 10 47 55
tot 24 23 50 70
How can I subtract ONLY the last two rows(tot= tot-VA) to get:
Industry Country Year AUS AUS AUT AUT ...
A AUS 1 0.5 0.2 0.1 0.01
B AUS 2 0.3 0.5 2 0.1
A AUT 3 1 1.2 1.3 0.3
B AUT 4 0.5 0 0.8 2
... ... ... ... ... ... ....
VA 11 10 47 55
FI 13 13 3 15
FI/VA 1.2 1.3 0.06 0.27
Where FI is simply tot-VA
CodePudding user response:
You could try this:
check which columns are numeric
use sapply to calculate the new row FI
bind them together
num <- sapply(df, class) == "numeric"
df_tot<- data.frame(as.list(sapply(df[, num], function(x) x[length(x)]-x[length(x)-1])))
df_tot$Industry <- "FI"
df <- data.table::rbindlist(list(df, df_tot), fill = TRUE)
EDIT:
If you just want to sum up all rows but the last one, then you could try this:
num <- sapply(df, class) == "numeric"
df_tot <- data.frame(as.list(sapply(df[1:(nrow(df)-1), num], sum)))
df_tot$Industry <- "FI"
df <- data.table::rbindlist(list(df, df_tot), fill = TRUE)
CodePudding user response:
Here's a tidyverse
approach to the issue of subtracting selected rows across columns:
library(tidyverse)
df %>%
# subtract across the relevant columns:
summarise(across(matches("^AU"), ~(.x[Industry == "tot"] - .x[Industry == "VA"]))) %>%
# add the 'new' column `Industry`:
mutate(Industry = "FI") %>%
# bind result back into `df`:
bind_rows(df,.)
Industry AU1 AU2 AU3
1 A 0.1 0.4 7.0
2 B 0.7 3.0 1.0
3 A 3.0 2.5 0.1
4 VA 11.0 10.0 47.0
5 tot 24.0 23.0 50.0
6 FI 13.0 13.0 3.0
If you no longer need rows #4 and #5, add this to the pipe:
filter(!Industry %in% c("VA", "tot"))
Data:
df <- data.frame(
Industry = c("A","B","A","VA","tot"),
AU1 = c(0.1,0.7,3,11,24),
AU2 = c(0.4,3,2.5,10,23),
AU3 = c(7, 1, 0.1,47,50)
)