Why does R treat columns as an object and not a string when passed to functions?

Time:10-09

I would like to first preface my question by stating that I am just getting started with learning R and thank you in advance!

My question is: why does R treat columns as objects and not strings when they are passed to functions, specifically the qplot function that is part of ggplot2?

For instance, running qplot(x = "X_COLUMN", geom = "histogram", data = dataset, binwidth = 10), where X_COLUMN is a column in the data.frame object dataset, results in the error: StatBin requires a continuous x variable: the x variable is discrete. Perhaps you want stat="count".

However, by simply making "X_COLUMN" an object (qplot(x = X_COLUMN, geom = "histogram", data = dataset, binwidth = 10)), the function works as intended.

But when I attempt to print out the object X_COLUMN, I get Error: object 'X_COLUMN' not found.

What is the reason for this or am I thinking about this wrong?

CodePudding user response:

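This comes from how R's data.frame (and the tidyverse's tibble) is designed, together with how qplot evaluates its arguments. Imagine that you are collecting data about some people. The people are your rows, and their attributes (height, weight, body mass index, etc.) are your columns. So column names are not just strings; they name the variables in your data. qplot does not evaluate x right away: it captures the expression you typed and evaluates it inside dataset, which is where the bare name X_COLUMN is found. When you write "X_COLUMN" in quotation marks, ggplot takes it literally as a single character value, not as the column of that name. A character value is discrete, which is why the histogram stat complains about a discrete x variable. Outside such a function, X_COLUMN is not an object in your workspace; it exists only inside dataset, so printing it fails with "object 'X_COLUMN' not found". To reach the column yourself, for example with the base R function plot, write dataset$X_COLUMN (or dataset[["X_COLUMN"]]) instead of just X_COLUMN.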
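To make the difference concrete, here is a minimal sketch. The values in dataset are made up for illustration (the question does not show the real data), and the ggplot2 part only runs if that package happens to be installed. It is meant as an illustration of the lookup rules, not as the exact code qplot runs internally.

```r
# Hypothetical stand-in for the asker's data frame; the values are invented.
dataset <- data.frame(X_COLUMN = c(12, 25, 31, 47, 58, 63))

# A quoted name is only a character constant, i.e. one discrete value.
quoted_class <- class("X_COLUMN")

# The column lives inside the data frame, not in the workspace,
# which is why print(X_COLUMN) reports "object 'X_COLUMN' not found".
found_in_workspace <- exists("X_COLUMN")

# Evaluating the bare name with the data frame as the lookup scope finds it.
# Functions such as qplot do something along these lines with their arguments.
col_via_eval <- eval(quote(X_COLUMN), dataset)
col_via_with <- with(dataset, X_COLUMN)

# Ordinary ways to reach the column yourself.
col_via_dollar <- dataset$X_COLUMN
col_via_string <- dataset[["X_COLUMN"]]  # useful when the name is held in a string

# Optional: the same idea with ggplot2, if it is installed (an assumption).
if (requireNamespace("ggplot2", quietly = TRUE)) {
  # Bare column name, looked up inside dataset when the plot is drawn.
  p_bare <- ggplot2::ggplot(dataset, ggplot2::aes(x = X_COLUMN)) +
    ggplot2::geom_histogram(binwidth = 10)

  # Column name supplied as a string, via the .data pronoun.
  p_string <- ggplot2::ggplot(dataset, ggplot2::aes(x = .data[["X_COLUMN"]])) +
    ggplot2::geom_histogram(binwidth = 10)
}
```

All four col_via_* objects hold the same numeric vector, while "X_COLUMN" on its own stays a plain character value. Printing p_bare or p_string would draw the histogram.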

Tags: r