How R type identification works

Time:12-13

I have learned that R uses dynamic typing, but I would like to know more about how it works. How does the interpreter know that an object like the following one is numeric?

var <- 5
str(var) 
OUTPUT: num 5

CodePudding user response:

R, like other dynamically typed languages, stores values in special data structures that contain not just the value itself but also meta-information about its type (often called the type “tag”).

This is described in detail in the R internals.

In a nutshell, each expression in R is stored internally as an S-expression, which is represented in C code as a struct called SEXPREC (or some small variation thereof). The actual definition of the SEXPREC struct is complicated for technical reasons, but it basically boils down to a header (metadata), a pointer to attributes, some other pointers and, lastly, the value itself.

Inside the header, the first five bits store a number that specifies what type the expression has. Integer vectors have the number 13, non-integer numeric (double) vectors have the number 14, character vectors have the number 16 (the individual strings stored inside them have the number 9), and so on.
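The tag can be observed from within R. The following is a minimal sketch: typeof() reports the internal type as a string, and .Internal(inspect()) is an undocumented debugging aid that prints the raw header fields, including the numeric type code. Its output format is not guaranteed and may differ between R versions. The variable names are chosen only for this illustration.

```r
# typeof() reports the internal type of a value as a string.
x_int <- 5L    # the L suffix makes an integer literal
x_dbl <- 5     # a bare number is a double
x_chr <- "5"   # quotes make a character string

typeof(x_int)  # "integer"
typeof(x_dbl)  # "double"
typeof(x_chr)  # "character"

# str() abbreviates doubles as "num", which is what the question observed.
str(x_dbl)

# Undocumented debugging aid: dumps the object header, including the
# numeric type code and its name. The address in the output varies per run.
.Internal(inspect(x_dbl))
```

Note that str() and typeof() use different labels for the same tag: "num" and "double" both refer to a double-precision numeric vector.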

Every piece of code that uses an expression in R needs to, at some point, inspect this meta-information to determine what the expression type is, and how to perform computations on it. This is part of the reason why dynamic languages are generally slower than statically typed languages: they need to perform additional book-keeping for (almost) every single operation they perform.
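The run-time check can be imitated at the R level. In the sketch below, describe() is a hypothetical helper written for this illustration; it is not how R's C code is organised, but it branches on the type tag in the same spirit. The remaining lines show that the type belongs to the value rather than to the variable name, and that arithmetic picks its result type at run time.

```r
# Hypothetical helper: branches on the type tag, as R's internal C code
# must do before it can operate on a value.
describe <- function(x) {
  switch(typeof(x),
    integer   = "integer arithmetic",
    double    = "floating-point arithmetic",
    character = "no arithmetic defined",
    "some other type"  # fallback for every other type
  )
}

describe(5L)   # "integer arithmetic"
describe(5)    # "floating-point arithmetic"
describe("5")  # "no arithmetic defined"

# The same name can be rebound to values of different types.
var <- 5
typeof(var)    # "double"
var <- "five"
typeof(var)    # "character"

# `+` inspects both operands at run time and chooses the result type.
typeof(5L + 5L)  # "integer"
typeof(5L + 5)   # "double"
```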

CodePudding user response:

@Stefano, I'm not really sure I understand your question, but I'm guessing that R's rules for tokenization/syntax allow the interpreter to deduce that the bare 5 is a numeric literal, and that this type information is saved along with the value in the symbol table (environment) under the name var.
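This guess can be checked with a small sketch, assuming only base R: parse() and quote() return the unevaluated code, so the literal can be inspected before anything is assigned. The name e is chosen only for this illustration.

```r
# parse() turns source text into unevaluated expressions; a literal is
# already stored as a typed constant at this stage.
typeof(parse(text = "5")[[1]])    # "double"
typeof(parse(text = "5L")[[1]])   # "integer"
typeof(parse(text = '"5"')[[1]])  # "character"

# The whole assignment is a call object; its third element is the constant.
e <- quote(var <- 5)
typeof(e)       # "language"
typeof(e[[3]])  # "double"

# Evaluating the call binds the already-typed value to the name `var`.
eval(e)
typeof(var)     # "double"
```

So the parser decides the type of the literal, and the environment only maps the name var to a value that already carries its own tag.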
