Error with summing columns in R (invalid 'type' (character) of argument)?

Time:09-20

I have the following dataset:

df <- structure(list(Patient_ID = c("1234", "1234", "1234", "1234", 
"1234", "1234", "1234", "1234", "1234"), Unit_Type = c("ABC", 
"ABC", "ABC", "ABC", "ABC", "DEF", "DEF", "DEF", "GHI"), Status = c("Returned", 
"Returned", "Returned", "Returned", "Transfused", "Transfused", 
"Transfused", "Transfused", "Transfused")), class = "data.frame", row.names = c(NA, 
-9L))

and have used the following calculation on it:

library(dplyr)
library(tidyr)

df <- df %>%
  count(Patient_ID, Unit_Type, Status) %>%
  pivot_wider(names_from = c(Unit_Type, Status), values_from = n)

I want to sum 'ABC_Returned' and 'ABC_Transfused' by Patient_ID (I know the example dataset only has one unique patient ID, but my real dataset has many more), but I keep getting the following error message:

> aggregate(df, by=list(df$ABC_Transfused, df$ABC_Returned), FUN=sum, na.rm = TRUE)
Error in FUN(X[[i]], ...) : invalid 'type' (character) of argument

Does anyone know how to get around this? Ideally, I would like to create a new column called ABC_Ordered that is the sum of 'ABC_Returned' and 'ABC_Transfused', grouped by Patient_ID.

CodePudding user response:

I think this is what you are looking for:

The error occurs because your code also tries to sum the first column, Patient_ID, which is a character column. Dropping it with df[,-1] should work:

aggregate(df[,-1], by=list(df$ABC_Transfused, df$ABC_Returned), FUN=sum, na.rm = TRUE)
  Group.1 Group.2 ABC_Returned ABC_Transfused DEF_Transfused GHI_Transfused
1       1       4            4              1              3              1
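
To make the cause of the error concrete, here is a minimal base R sketch. The data frame `wide` is a hypothetical stand-in for the pivoted table, and grouping by Patient_ID (rather than by the count columns, as in the call above) is my assumption about what the question is after. It shows that sum() refuses character input, and that selecting only the numeric columns avoids the problem without relying on column position.

```r
# sum() is defined for numeric, complex, or logical input, not character
err <- tryCatch(sum(c("1234", "1234")),
                error = function(e) conditionMessage(e))
print(err)

# Hypothetical stand-in for the pivoted table from the question
wide <- data.frame(
  Patient_ID = "1234",
  ABC_Returned = 4L,
  ABC_Transfused = 1L,
  stringsAsFactors = FALSE
)

# Keep only the numeric columns instead of dropping by position
numeric_cols <- vapply(wide, is.numeric, logical(1))

# Sum the numeric columns per patient
totals <- aggregate(wide[, numeric_cols, drop = FALSE],
                    by = list(Patient_ID = wide$Patient_ID),
                    FUN = sum, na.rm = TRUE)
print(totals)
```

Selecting columns with is.numeric keeps working if Patient_ID is not the first column, which df[,-1] silently assumes.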

CodePudding user response:

We could use

library(dplyr)
df %>% 
    mutate(ABC_ordered = ABC_Returned + ABC_Transfused)

-output

# A tibble: 1 × 6
  Patient_ID ABC_Returned ABC_Transfused DEF_Transfused GHI_Transfused ABC_ordered
  <chr>             <int>          <int>          <int>          <int>       <int>
1 1234                  4              1              3              1           5

CodePudding user response:

Maybe this is what you're looking for?

I added a second data frame with another ID, hopefully to clarify, but I think you want a row-wise operation.

Since the data is already aggregated in the count step, I don't think you need any additional grouping: after pivot_wider, each row is one patient.

library(dplyr)
library(tidyr)

data <- structure(list(Patient_ID = c("1234", "1234", "1234", "1234", 
"1234", "1234", "1234", "1234", "1234"), Unit_Type = c("ABC", 
"ABC", "ABC", "ABC", "ABC", "DEF", "DEF", "DEF", "GHI"), Status = c("Returned", 
"Returned", "Returned", "Returned", "Transfused", "Transfused", 
"Transfused", "Transfused", "Transfused")), class = "data.frame", row.names = c(NA, 
-9L))

data2 <- structure(list(Patient_ID = c("1235", "1235", "1235", "1235", 
"1235", "1235", "1235", "1235", "1235"), Unit_Type = c("ABC", 
"ABC", "ABC", "ABC", "ABC", "DEF", "DEF", "DEF", "GHI"), Status = c("Returned", 
"Returned", "Returned", "Returned", "Transfused", "Transfused", 
"Transfused", "Transfused", "Transfused")), class = "data.frame", row.names = c(NA, 
-9L))

data%>%
  rbind(data2) -> data_full


data_full2 <- data_full %>%
  count(Patient_ID, Unit_Type, Status) %>%
  pivot_wider(names_from = c(Unit_Type, Status), values_from = n)


data_full2%>%
  rowwise()%>%
  mutate(ABC_Ordered = sum(c(ABC_Returned,
                             ABC_Transfused),
                           na.rm = TRUE))%>%
  ungroup() -> data_full3
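
As a possible alternative to rowwise(), the same row-wise sum can be sketched with base R's vectorised rowSums(). The data frame `wide` below is hypothetical: patient "1236" and its counts are invented for illustration, with an NA where that patient has no transfused ABC units, which is what pivot_wider leaves by default for a missing combination. The sketch also checks the assumption that each patient appears on exactly one row.

```r
# Hypothetical pivoted counts; patient "1236" is invented for illustration
wide <- data.frame(
  Patient_ID = c("1234", "1235", "1236"),
  ABC_Returned = c(4L, 4L, 2L),
  ABC_Transfused = c(1L, 1L, NA)
)

# One row per patient means no further grouping is needed
stopifnot(anyDuplicated(wide$Patient_ID) == 0)

# Plain `+` propagates the NA for patient "1236"
wide$ABC_plus <- wide$ABC_Returned + wide$ABC_Transfused

# rowSums() with na.rm = TRUE ignores the NA, so it counts as zero
wide$ABC_Ordered <- rowSums(wide[, c("ABC_Returned", "ABC_Transfused")],
                            na.rm = TRUE)
print(wide)
```

If the missing combinations should be real zeros rather than NA, pivot_wider's values_fill argument (for example values_fill = 0) is another option, after which plain `+` would be enough.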