Why is my sum() summing to zero, having no NAs and only numerical data?

Time:07-08

Once again with a very simple question.

I am trying to add all emissions together, basically summing 5 variables per row.

However, it keeps summing to zero, even when I have no NAs and only numeric values.

This is the data I am working with:

df_structure <-
  structure(
    list(
      `Particeles_PM10_[kg]_WTW_whole transport chain` = c(
        0.000440486,
        0.010753239,
        0.0005393157,
        0.0107265319,
        0.200272577,
        0.169998242
      ),
      `SO2_[kg]_WTW_whole transport chain` = c(
        0.0034873728,
        0.1861534833,
        0.01613152798,
        0.185923214,
        3.715316736,
        3.155906431
      ),
      `NOX_[kg]_WTW_whole transport chain` = c(
        0.024214311,
        0.618727269,
        0.053631226,
        0.617528662,
        12.271221,
        10.3988076
      ),
      `NMHC_[kg]_WTW_whole transport chain` = c(
        0.0043159575,
        0.0385331658,
        0.0033238124,
        0.038634107,
        0.7067915367,
        0.59608807
      )
    ),
    row.names = c(NA,-6L),
    class = c("tbl_df", "tbl", "data.frame")
  )

And here's my code:

df_structure %>%
  rowwise() %>% 
  mutate(sum_emissions = sum(as.numeric("Particeles_PM10_[kg]_WTW_whole transport chain",
                         "SO2_[kg]_WTW_whole transport chain",
                         "NOX_[kg]_WTW_whole transport chain",
                         "NMHC_[kg]_WTW_whole transport chain"), na.rm = TRUE)) 
summary(df_structure$sum_emissions)

What am I doing wrong? I can open my data.frame and every column has 5 rows of filled-in data, yet the sum keeps being 0...

Thanks in advance!

CodePudding user response:

You could use across() to select which columns you want to sum and pass the result of across() to rowSums(). Here everything() selects all four columns.

library(dplyr)

df_structure %>%
  mutate(sum_emissions = rowSums(across(everything())))

# # A tibble: 6 × 5
#   `Particeles_PM10_[kg]…` `SO2_[kg]_WTW_…` `NOX_[kg]_WTW_…` `NMHC_[kg]_WTW…` sum_emissions
#                     <dbl>            <dbl>            <dbl>            <dbl>         <dbl>
# 1                0.000440          0.00349           0.0242          0.00432        0.0325
# 2                0.0108            0.186             0.619           0.0385         0.854 
# 3                0.000539          0.0161            0.0536          0.00332        0.0736
# 4                0.0107            0.186             0.618           0.0386         0.853 
# 5                0.200             3.72             12.3             0.707         16.9   
# 6                0.170             3.16             10.4             0.596         14.3   
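If the real data also contains columns that should not be part of the sum, everything() can be swapped for a narrower tidyselect helper. The following is a minimal sketch on a made-up data frame; the names toy, distance_km and the shortened column names are assumptions for illustration only, not part of the original data.

```r
library(dplyr)

# Hypothetical toy data: two emission columns plus one unrelated column.
toy <- data.frame(
  `PM10_[kg]` = c(0.1, 0.2),
  `SO2_[kg]`  = c(1.0, 2.0),
  distance_km = c(100, 250),
  check.names = FALSE   # keep the non-syntactic column names as written
)

# ends_with() matches the literal suffix "_[kg]", so distance_km is left out.
toy_summed <- toy %>%
  mutate(sum_emissions = rowSums(across(ends_with("_[kg]")), na.rm = TRUE))

toy_summed$sum_emissions
```

Selecting by a name pattern keeps the sum correct if columns are later reordered or new ones are added.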

CodePudding user response:

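You need to refer to the columns with backticks (`) rather than quotes, and combine them into one vector with c(). In quotes, the names are plain character strings, so as.numeric() cannot parse them and returns NA, and sum(..., na.rm = TRUE) over nothing but NA is 0. As your columns are already numeric, you don't need as.numeric() at all.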

df_structure %>%
  rowwise() %>% 
  mutate(sum_emissions = sum(c(`Particeles_PM10_[kg]_WTW_whole transport chain`,
                                 `SO2_[kg]_WTW_whole transport chain`,
                                 `NOX_[kg]_WTW_whole transport chain`,
                                 `NMHC_[kg]_WTW_whole transport chain`), na.rm = TRUE)) %>%
  ungroup()
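To see why the quoted version returned 0 on every row, the behaviour can be reproduced in base R without any data frame. This is only a small sketch; the variable name x is introduced here for illustration.

```r
# A quoted column name is just text, not a reference to the column.
# as.numeric() cannot parse it, so the result is NA (normally with a warning).
x <- suppressWarnings(as.numeric("SO2_[kg]_WTW_whole transport chain"))

is.na(x)               # TRUE

# With na.rm = TRUE the NA is dropped, and the sum of nothing is 0.
sum(x, na.rm = TRUE)   # 0
```

Without na.rm = TRUE the result would have been NA, which would have made the problem easier to spot.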

A simpler way might be to use c_across:

df_structure %>%
  rowwise() %>% 
  mutate(sum_emissions = sum(c_across(1:4), na.rm = TRUE)) %>%
  ungroup()
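Note also that a pipeline only returns a new data frame; it does not change df_structure itself. In the question the result was never assigned, so df_structure$sum_emissions would not exist when summary() is called. A minimal sketch with a made-up data frame (toy, a, b and total are hypothetical names):

```r
library(dplyr)

# Hypothetical toy data.
toy <- data.frame(a = c(1, 2), b = c(10, 20))

# Without assignment the new column is computed and then thrown away.
toy %>% mutate(total = a + b)
has_total_before <- "total" %in% names(toy)   # FALSE

# Assign the result back to keep the new column.
toy <- toy %>% mutate(total = a + b)
has_total_after <- "total" %in% names(toy)    # TRUE

summary(toy$total)
```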

A base solution is to use rowSums directly (or through mutate):

df_structure$sum_emissions <- rowSums(df_structure)
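rowSums(df_structure) adds up every column, so it only works as intended while the data frame holds nothing but the numeric columns to be summed. Running it a second time would also include sum_emissions in the total, and a character column would make rowSums() fail. One way around both, sketched here on a made-up data frame (toy, id, pm, so2 and emission_cols are hypothetical names), is to name the columns explicitly:

```r
# Hypothetical toy data with a non-numeric id column.
toy <- data.frame(
  id  = c("a", "b"),
  pm  = c(0.1, 0.2),
  so2 = c(1.0, 2.0)
)

# Name the columns to sum, so id and any earlier total are never included.
emission_cols <- c("pm", "so2")
toy$sum_emissions <- rowSums(toy[emission_cols], na.rm = TRUE)

toy$sum_emissions
```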