Error in rowSums(., na.rm = TRUE) : 'x' must be numeric - despite verifying variables are

Time:04-14

When I tried to sum 24 specific columns row by row in my data frame, R returned

Error in rowSums(., na.rm = TRUE) : 'x' must be numeric 

I tried various methods to determine whether the columns of interest were numeric.

x_isnum <- select_if(x2009, is.numeric)
names(x_isnum)
# Check data type of every variable in data frame
str(x2009)

All columns of interest were listed as numeric. Then I even opened the data frame and hovered over each column to verify they were numeric; they were. I acknowledge that since the df is so large, it's possible I overlooked something. So I subset the data to learn about just the columns in question.

p = x2009[,c(48,49, 70:91)]
is.numeric(p)

FALSE

Since it returned false, I ran

str(p)

'data.frame':   17090 obs. of  24 variables:
 $ poss_cannabis_female_over_64 : num  0 0 0 0 0 0 0 0 0 0 ...
 $ poss_cannabis_female_under_10: num  0 0 0 0 0 0 0 0 0 0 ...
 $ poss_cannabis_male_over_64   : num  0 0 0 0 0 0 0 0 0 0 ...
 $ poss_cannabis_male_under_10  : num  0 0 0 0 0 0 0 0 0 0 ...
 $ poss_cannabis_tot_10_12      : num  0 0 0 0 0 0 0 0 0 0 ...
 $ poss_cannabis_tot_13_14      : num  0 1 0 0 0 0 1 0 0 0 ...
 $ poss_cannabis_tot_15         : num  0 1 0 3 0 0 0 1 0 0 ...
 $ poss_cannabis_tot_16         : num  1 0 3 2 1 0 2 2 2 1 ...
 $ poss_cannabis_tot_17         : num  1 0 1 3 1 2 0 3 2 1 ...
 $ poss_cannabis_tot_18         : num  0 0 1 2 2 1 1 1 0 0 ...
 $ poss_cannabis_tot_19         : num  0 2 0 4 1 0 3 0 0 0 ...
 $ poss_cannabis_tot_20         : num  0 1 0 2 0 0 2 1 1 3 ...
 $ poss_cannabis_tot_21         : num  0 0 0 1 1 0 0 0 1 0 ...
 $ poss_cannabis_tot_22         : num  0 2 0 1 0 0 2 0 1 0 ...
 $ poss_cannabis_tot_23         : num  1 0 0 3 2 0 1 1 0 0 ...
 $ poss_cannabis_tot_24         : num  1 0 0 0 1 0 0 0 0 0 ...
 $ poss_cannabis_tot_25_29      : num  0 0 2 3 2 1 0 0 1 2 ...
 $ poss_cannabis_tot_30_34      : num  0 0 0 1 0 1 0 1 0 0 ...
 $ poss_cannabis_tot_35_39      : num  1 0 0 1 1 0 0 1 0 0 ...
 $ poss_cannabis_tot_40_44      : num  0 1 0 0 0 0 0 1 0 0 ...
 $ poss_cannabis_tot_45_49      : num  0 0 0 0 0 0 0 0 0 0 ...
 $ poss_cannabis_tot_50_54      : num  0 0 0 0 0 0 0 0 0 0 ...
 $ poss_cannabis_tot_55_59      : num  0 0 0 0 0 0 0 0 0 0 ...
 $ poss_cannabis_tot_60_64      : num  0 0 0 0 1 0 0 0 0 0 ...

I also ran

sapply(p, is.numeric)

poss_cannabis_female_over_64 
                         TRUE 
poss_cannabis_female_under_10 
                         TRUE 
   poss_cannabis_male_over_64 
                         TRUE 
  poss_cannabis_male_under_10 
                         TRUE 
      poss_cannabis_tot_10_12 
                         TRUE 
      poss_cannabis_tot_13_14 
                         TRUE 
         poss_cannabis_tot_15 
                         TRUE 
         poss_cannabis_tot_16 
                         TRUE 
         poss_cannabis_tot_17 
                         TRUE 
         poss_cannabis_tot_18 
                         TRUE 
         poss_cannabis_tot_19 
                         TRUE 
         poss_cannabis_tot_20 
                         TRUE 
         poss_cannabis_tot_21 
                         TRUE 
         poss_cannabis_tot_22 
                         TRUE 
         poss_cannabis_tot_23 
                         TRUE 
         poss_cannabis_tot_24 
                         TRUE 
      poss_cannabis_tot_25_29 
                         TRUE 
      poss_cannabis_tot_30_34 
                         TRUE 
      poss_cannabis_tot_35_39 
                         TRUE 
      poss_cannabis_tot_40_44 
                         TRUE 
      poss_cannabis_tot_45_49 
                         TRUE 
      poss_cannabis_tot_50_54 
                         TRUE 
      poss_cannabis_tot_55_59 
                         TRUE 
      poss_cannabis_tot_60_64 
                         TRUE 

Finally, I ran sapply(p, class), which again displayed numeric for each variable. I again hovered over each column in the subset data frame, and again each column was listed as numeric.

There must be something I am missing if R is telling me it's not numeric. I doubt the code is the problem because I ran it on a smaller, made-up df with no issues, but just in case, here is what I ran to sum the rows of specific columns.

x2009 = x2009 %>%
  mutate(poss_cannabis_juv_tot = select(., c(49,71:76))) %>% 
  rowSums(na.rm = TRUE) %>% 
  mutate(poss_cannabis_adult_tot = select(., c(48,70,77:91))) %>%
  rowSums(na.rm = TRUE) %>% 
  relocate(poss_cannabis_juv_tot, .after = poss_cannabis_male_17) %>% 
  relocate(poss_cannabis_adult_tot, .after = poss_cannabis_male_over_64) 

What is going on??

CodePudding user response:

The issue is in creating a column from select and then piping the whole data frame into rowSums. Instead, select the columns within across and call rowSums inside mutate.

library(dplyr)
x2009 %>%
    mutate(poss_cannabis_juv_tot = rowSums(across(where(is.numeric)), 
        na.rm = TRUE))

Or, if the columns should be selected by index:

x2009 %>%
    mutate(poss_cannabis_juv_tot = rowSums(across(c(49,71:76)), na.rm = TRUE),
     poss_cannabis_adult_tot = rowSums(across(c(48,70,77:91)), na.rm = TRUE)) %>%
    relocate(poss_cannabis_juv_tot, .after = poss_cannabis_male_17) %>% 
    relocate(poss_cannabis_adult_tot, .after = poss_cannabis_male_over_64) 
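
Column positions are easy to get wrong in a wide data frame, so the same pattern can also be written with column names. The following is only a sketch on a made-up data frame: df, juv_cols, adult_cols and all column names and values are hypothetical stand-ins, not the OP's data.

```r
library(dplyr)

# Hypothetical stand-in for x2009; names and values are invented for illustration.
df <- tibble(
  id      = c("a", "b", "c"),
  juv_1   = c(0, 1, 2),
  juv_2   = c(1, NA, 0),
  adult_1 = c(3, 0, 1),
  adult_2 = c(0, 2, NA)
)

# Name the columns to be summed instead of relying on their positions.
juv_cols   <- c("juv_1", "juv_2")
adult_cols <- c("adult_1", "adult_2")

out <- df %>%
  mutate(
    # across() hands rowSums() only the named columns, evaluated inside mutate()
    juv_tot   = rowSums(across(all_of(juv_cols)), na.rm = TRUE),
    adult_tot = rowSums(across(all_of(adult_cols)), na.rm = TRUE)
  )

print(out)
```

Because rowSums is called inside mutate, it only ever sees the columns picked by across, so the non-numeric id column never reaches it.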

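In the OP's code, rowSums is applied to the entire data frame coming out of mutate, not just to the selected columns. That data frame contains the other non-numeric columns as well as the new column created with select, which is itself a data.frame, so rowSums fails.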

> head(iris) %>%
    mutate(new = select(., 2:4)) %>%
    str
'data.frame':   6 obs. of  6 variables:
 $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4
 $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9
 $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7
 $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4
 $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1
 $ new         :'data.frame':   6 obs. of  3 variables:
  ..$ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9
  ..$ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7
  ..$ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4

head(iris) %>% 
   mutate(new = select(., 2:4)) %>%
  rowSums(na.rm = TRUE)
Error in rowSums(., na.rm = TRUE) : 'x' must be numeric

Instead, with across

head(iris) %>%
    mutate(new = rowSums(across(2:4), na.rm = TRUE))
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species new
1          5.1         3.5          1.4         0.2  setosa 5.1
2          4.9         3.0          1.4         0.2  setosa 4.6
3          4.7         3.2          1.3         0.2  setosa 4.7
4          4.6         3.1          1.5         0.2  setosa 4.8
5          5.0         3.6          1.4         0.2  setosa 5.2
6          5.4         3.9          1.7         0.4  setosa 6.0
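
On the is.numeric(p) check in the question: it returns FALSE for any data frame, because a data frame is a list of columns, so that result on its own does not point to a bad column. A minimal base R sketch, using a made-up data frame p (the column names a, b and label are hypothetical):

```r
# Toy data frame standing in for the subset `p` from the question (values are made up).
p <- data.frame(a = c(0, 1, 2), b = c(1, NA, 3), label = c("x", "y", "z"))

# is.numeric() on the whole data frame tests the list itself, not its columns.
whole_is_numeric <- is.numeric(p)

# Checking column by column is the meaningful test.
col_is_numeric <- sapply(p, is.numeric)

# rowSums() accepts a data frame only when every column is numeric,
# so keep just the numeric columns before summing.
totals <- rowSums(p[col_is_numeric], na.rm = TRUE)

print(whole_is_numeric)
print(col_is_numeric)
print(totals)
```

The same idea applies outside dplyr: pass rowSums only the columns that should be summed, rather than the whole data frame.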