Why am I having errors with order of functions using %>% in R?

This is the code I am trying to run:

  data_table<-data_table%>%
      merge(new_table, by = 'Sample ID')%>%
      mutate(Normalized_value = ((1.8^(data_table$Ubb - data_table$Ct_adj))*10000))

I want to first add the new column ("Ubb") from "new_table" and then add a calculated column that uses that new column. However, I get an error saying that the Ubb column does not exist. So is merge not being performed before mutate runs? When I separate the functions, everything works fine:

  data_table<-data_table%>%
      merge(new_table, by = 'Sample ID')

  data_table<-data_table%>%
      mutate(Normalized_value = ((1.8^(data_table$Ubb - data_table$Ct_adj))*10000))

I would like to keep everything together just for style, but I'm also just curious, shouldn't R perform merge first and then mutate? How does order of operation during piping work?

Thank you!

CodePudding user response：

You don't need to refer to the columns with the $ sign. Use Normalized_value = ((1.8^(Ubb - Ct_adj))*10000) instead, because by the time mutate runs the column has already been merged in. With the $ sign, data_table$Ubb refers to the original data_table: even though the merge has run, the name data_table still points to the old object, because the assignment has not happened yet. The assignment only takes place after the whole pipeline has been evaluated.
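
To see the timing for yourself, here is a minimal sketch on made-up toy data (the sample IDs and values are invented for illustration, and the result is stored under a hypothetical name, merged, so both objects can be inspected). It assumes plain data.frames and that dplyr is installed:

```r
library(dplyr)

# Toy stand-ins for the asker's tables; all values are made up.
data_table <- data.frame(`Sample ID` = c("s1", "s2", "s3"),
                         Ct_adj = c(20, 21, 22),
                         check.names = FALSE)
new_table <- data.frame(`Sample ID` = c("s1", "s2", "s3"),
                        Ubb = c(24, 25, 26),
                        check.names = FALSE)

# Run only the merge step and keep the result under a different name.
merged <- data_table %>%
  merge(new_table, by = "Sample ID")

"Ubb" %in% names(merged)      # TRUE: the pipeline result has the column
"Ubb" %in% names(data_table)  # FALSE: the original is untouched until reassigned
is.null(data_table$Ubb)       # TRUE: so this is what mutate() was handed

# The expression from the question, evaluated against the original table:
bad <- 1.8^(data_table$Ubb - data_table$Ct_adj) * 10000
length(bad)                   # 0: an empty vector cannot fill a 3-row column
```

The same lookup happens inside the one-piece pipeline, which is why it only works once data_table has been reassigned in a separate step.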

CodePudding user response：

Try running the code like this:

  data_table<-data_table%>%
      merge(new_table, by = 'Sample ID')%>%
      mutate(Normalized_value = ((1.8^(Ubb - Ct_adj))*10000))

Notice that I'm not using the table name with the $ inside the pipe. Writing data_table$Ubb tells mutate to look at a vector taken from the original data_table rather than at a column of the merged table being piped in, and the original has no Ubb column yet. Just use the bare column name within the pipe. It's easiest to read %>% as 'and then': take data_table, and then merge(), and then mutate(). You might also want to consider left_join instead of merge.
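
As a rough illustration of the 'and then' reading, the sketch below writes the same steps once as a pipe and once as ordinary nested calls, using left_join as suggested. The toy values and the names piped and nested are made up for the example; it assumes dplyr is installed:

```r
library(dplyr)

# Toy stand-ins for the asker's tables; all values are made up.
data_table <- data.frame(`Sample ID` = c("s1", "s2"),
                         Ct_adj = c(20, 22),
                         check.names = FALSE)
new_table <- data.frame(`Sample ID` = c("s1", "s2"),
                        Ubb = c(24, 25),
                        check.names = FALSE)

# Piped form: each step receives the previous result as its first argument.
piped <- data_table %>%
  left_join(new_table, by = "Sample ID") %>%
  mutate(Normalized_value = 1.8^(Ubb - Ct_adj) * 10000)

# The same thing written inside-out, without the pipe.
nested <- mutate(
  left_join(data_table, new_table, by = "Sample ID"),
  Normalized_value = 1.8^(Ubb - Ct_adj) * 10000
)

# Both forms should give the same table.
all.equal(piped, nested)
```

In both forms the bare names Ubb and Ct_adj are looked up in the table that mutate receives, which is the joined one, so the order of the steps is respected.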