I've been curious about how the Ruby language defines base methods such as `if` and `unless`. I've also cloned the Ruby source code repository and tried to search around, but didn't really understand how they wrote it.
CodePudding user response:
TL;DR
`if` and `unless` are Ruby keywords, not objects. Below, I provide links to Ruby's keyword and syntax documentation, as well as links to the actual keyword token definitions within the Ruby source code.
Ruby Keywords
While most things in Ruby are objects or expressions, there are some things that are simply language keywords. The `if` and `unless` keywords are two of them.
While you can certainly find them in the source, for more practical purposes you can find them in the Ruby keywords documentation or syntax documentation generated for each released version.
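One quick way to convince yourself of this from within Ruby is to ask the standard library's Ripper lexer how it classifies the words. This is only a minimal sketch; the sample source string and the variable names are made up for illustration:

```ruby
require "ripper"

# Hypothetical sample program, used only to see how its words are classified.
source = "if x then 1 else 2 end"

# Ripper.lex returns entries whose second element is the token type
# and whose third element is the token text.
tokens = Ripper.lex(source)
keywords = tokens.select { |token| token[1] == :on_kw }.map { |token| token[2] }
identifiers = tokens.select { |token| token[1] == :on_ident }.map { |token| token[2] }

# There is also no method named "if" to look up on an ordinary object.
responds = Object.new.respond_to?(:if, true)

puts keywords.inspect
puts identifiers.inspect
puts responds
```

The words `if`, `then`, `else`, and `end` come back as keyword tokens (`:on_kw`), while `x` is an ordinary identifier, and no object responds to a method called `if`.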
Links Related to Ruby's Interpreter and Defined Tokens
You may also want to look at Ruby's Ripper module for additional references to Ruby's custom tokenizer, parser, and lexer if you're looking for more information about how the interpreter works. You can also look at parse.y for the actual list of tokens used by the parser. Specifically, the tokens for the conditionals you're asking about are currently located at:
CodePudding user response:
I've been curious about how can the Ruby language define their base method such as If / Unless.
Unlike many other programming languages, Ruby does not have a single formal specification that defines what certain language constructs mean.
There are several resources, the sum of which can be considered kind of a specification for the Ruby programming language.
Some of these resources are:
- The ISO/IEC 30170:2012 Information technology — Programming languages — Ruby specification – Note that the ISO Ruby Specification was written around 2009–2010 with the specific goal that all existing Ruby implementations at the time would easily be compliant. Since YARV only implements Ruby 1.9 and later, and MRI only implements Ruby 1.8 and lower, this means that the ISO Ruby Specification only contains features that are common to both Ruby 1.8 and Ruby 1.9. Also, the ISO Ruby Specification was specifically intended to be minimal and only contain the features that are absolutely required for writing Ruby programs. Because of that, it, for example, specifies `String`s only very broadly (since they have changed significantly between Ruby 1.8 and Ruby 1.9). It obviously also does not specify features which were added after the ISO Ruby Specification was written, such as Ractors.
- The Ruby Spec Suite aka `ruby/spec` – Note that the `ruby/spec` is unfortunately far from complete. However, I quite like it because it is written in Ruby instead of "ISO-standardese", which is much easier to read for a Rubyist, and it doubles as an executable conformance test suite.
- The Ruby Programming Language by David Flanagan and Yukihiro 'matz' Matsumoto – This book was written by David Flanagan together with Ruby's creator matz to serve as a Language Reference for Ruby.
- Programming Ruby by Dave Thomas, Andy Hunt, and Chad Fowler – This book was the first English book about Ruby and served as the standard introduction and description of Ruby for a long time. This book also first documented the Ruby core library and standard library, and the authors donated that documentation back to the community.
- The Ruby Issue Tracking System, specifically, the Feature sub-tracker – However, please note that unfortunately, the community is really, really bad at distinguishing between Tickets about the Ruby Programming Language and Tickets about the YARV Ruby Implementation: they both get intermingled in the tracker.
- The Meeting Logs of the Ruby Developer Meetings.
- New features are often discussed on the mailing lists, in particular the ruby-core (English) and ruby-dev (Japanese) mailing lists.
- The Ruby documentation – Again, be aware that this documentation is generated from the source code of YARV and does not distinguish between features of Ruby and features of YARV.
- In the past, there were a couple of attempts at formalizing changes to the Ruby Specification, such as the Ruby Change Request (RCR) and Ruby Enhancement Proposal (REP) processes, both of which were unsuccessful.
- If all else fails, you need to check the source code of the popular Ruby implementations to see what they actually do.
The `if` conditional expression is specified in section 11.5.2.2.2 The `if` expression of the ISO/IEC 30170:2012 Information technology — Programming languages — Ruby specification like this:
11.5.2.2.2 The `if` expression
Syntax
- if-expression ::
  `if` expression then-clause elsif-clause* else-clause? `end`
- then-clause ::
  separator compound-statement
  | separator? `then` compound-statement
- else-clause ::
  `else` compound-statement
- elsif-clause ::
  `elsif` expression then-clause
Semantics
The if-expression is evaluated as follows:
- Evaluate expression. Let V be the resulting value.
- If V is a trueish object, evaluate the compound-statement of the then-clause. The value of the if-expression is the resulting value. In this case, elsif-clauses and the else-clause, if any, are not evaluated.
- If V is a falseish object, and if there is no elsif-clause and no else-clause, then the value of the if-expression is `nil`.
- If V is a falseish object, and if there is no elsif-clause but there is an else-clause, then evaluate the compound-statement of the else-clause. The value of the if-expression is the resulting value.
- If V is a falseish object, and if there are one or more elsif-clauses, evaluate the sequence of elsif-clauses as follows:
  - Evaluate the expression of each elsif-clause in the order they appear in the program text, until there is an elsif-clause for which expression evaluates to a trueish object. Let T be this elsif-clause.
  - If T exists, evaluate the compound-statement of its then-clause. The value of the if-expression is the resulting value. Other elsif-clauses and an else-clause following T, if any, are not evaluated.
  - If T does not exist, and if there is an else-clause, then evaluate the compound-statement of the else-clause. The value of the if-expression is the resulting value.
  - If T does not exist, and if there is no else-clause, then the value of the if-expression is `nil`.
This is of course pretty much how you expect a conditional expression to work. `unless` is defined in the next section, but there are no surprises there.
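The evaluation rules quoted from the specification can be observed directly in any Ruby implementation. Here is a small sketch; the variable and method names are mine and purely illustrative:

```ruby
# Each `if` below is used as an expression, so its value can be assigned.

# Only nil and false are falseish; every other object, including 0, is trueish.
zero = 0
zero_branch = if zero then :then_branch else :else_branch end

# A falseish condition with no elsif-clause and no else-clause yields nil.
missing = nil
no_branch = if missing then :unreachable end

# elsif-clauses are tried in order; the first trueish one decides the value.
def classify(n)
  if n < 0
    :negative
  elsif n == 0
    :zero
  else
    :positive
  end
end

# `unless` evaluates its body when the condition is falseish.
unless_value = unless missing then :ran end

p zero_branch, no_branch, classify(-5), classify(0), classify(7), unless_value
```

Note that `0` selects the then-clause: in Ruby, only `nil` and `false` count as falseish.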
A less formal specification is given in `language/if_spec.rb` of the `ruby/spec`.
If you would rather know how the conditional expression is implemented than how it is defined, the first question you have to ask yourself is: implemented where? That is, which Ruby implementation are you interested in?
Personally, I always prefer to look at Rubinius, even though it is unfortunately no longer actively maintained. You can find the implementation of the `if` conditional expression in the `rubinius-code` subproject in `lib/rubinius/code/ast/control_flow.rb`, and it looks something like this (slightly abridged for clarity):
done = g.new_label
else_label = g.new_label
@condition.bytecode(g)
g.goto_if_false else_label
@body.bytecode(g)
g.goto done
else_label.set!
@else.bytecode(g)
done.set!
Again, it looks pretty much like you would expect from a Smalltalk-80 style stack-based byte code VM: we define two labels, one (`done`) at the end of the block, the other (`else_label`) in the middle. We splice the byte code of the condition into the generated byte code and execute a `GOTO` to the `else_label` if the top of the stack is falsey, thus skipping over the byte code of the body, which we splice in after the `GOTO` and before the `else_label`. After the byte code of the body, we execute a `GOTO` to the `done` label and splice in the byte code of the `else` branch after the `else_label` label and before the `done` label.
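To make the label-and-jump scheme concrete, here is a toy generator and stack machine. This is a sketch under my own assumptions, not Rubinius code: the `Label`, `Generator`, `compile_if`, and `run` names are hypothetical, the condition and both branches are reduced to pushing a single literal, and I assume the conditional jump pops the value it tests.

```ruby
# Hypothetical label: remembers the instruction index it was set at.
class Label
  attr_reader :position

  def initialize(generator)
    @generator = generator
  end

  def set!
    @position = @generator.stream.size
  end
end

# Hypothetical generator: collects instructions as [opcode, argument] pairs.
class Generator
  attr_reader :stream

  def initialize
    @stream = []
  end

  def new_label
    Label.new(self)
  end

  def push(value)
    @stream << [:push, value]
  end

  def goto(label)
    @stream << [:goto, label]
  end

  def goto_if_false(label)
    @stream << [:goto_if_false, label]
  end
end

# Same shape as the excerpt above, with literals in place of sub-expressions.
def compile_if(g, condition, body, else_body)
  done = g.new_label
  else_label = g.new_label
  g.push condition
  g.goto_if_false else_label
  g.push body
  g.goto done
  else_label.set!
  g.push else_body
  done.set!
end

# Minimal interpreter loop: returns whatever is left on top of the stack.
def run(stream)
  stack = []
  pc = 0
  while pc < stream.size
    op, arg = stream[pc]
    pc += 1
    case op
    when :push then stack.push(arg)
    when :goto then pc = arg.position
    when :goto_if_false then pc = arg.position unless stack.pop
    end
  end
  stack.pop
end

taken = Generator.new
compile_if(taken, true, :then_value, :else_value)
skipped = Generator.new
compile_if(skipped, nil, :then_value, :else_value)
p run(taken.stream), run(skipped.stream)
```

Labels are resolved lazily here: a jump stores the label object, and the position is only read when the jump executes, which is why `goto done` can be emitted before `done.set!` is called.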
You might notice that there is no mention of `elsif` here, nor of the possibility that the `else` might be missing. This is handled by a transformation pass which transforms an `if` without `else` into an `if` with an empty `else`, and an `if` with `elsif` into a series of `if` with `else` and nested `if`.
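A rough sketch of what such a transformation could look like is shown below. The `IfNode` struct, the `EMPTY_ELSE` placeholder, and the `desugar_if` method are hypothetical names of my own; the actual Rubinius pass works on its own AST classes and will differ in detail.

```ruby
# Hypothetical node type for illustration; not Rubinius's actual AST class.
IfNode = Struct.new(:condition, :body, :else_body)

# Placeholder standing in for "an empty else that evaluates to nil".
EMPTY_ELSE = :nil_literal

# elsifs is a list of [condition, body] pairs in source order.
def desugar_if(condition, body, elsifs = [], else_body = nil)
  innermost = else_body.nil? ? EMPTY_ELSE : else_body
  # Fold from the last elsif outwards so each one becomes the else of the previous.
  nested_else = elsifs.reverse.reduce(innermost) do |rest, (elsif_condition, elsif_body)|
    IfNode.new(elsif_condition, elsif_body, rest)
  end
  IfNode.new(condition, body, nested_else)
end

plain = desugar_if(:c1, :b1)
chained = desugar_if(:c1, :b1, [[:c2, :b2], [:c3, :b3]], :e)
p plain
p chained
```

After this pass, every node has exactly one condition, one body, and one else, which is the only shape the byte code generator has to handle.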
As an interesting counterpoint to Rubinius, I also like to check out TruffleRuby. Here is what the implementation looks like in `src/main/java/org/truffleruby/language/control/IfElseNode.java`:
public IfElseNode(RubyNode condition, RubyNode thenBody, RubyNode elseBody) {
    this.condition = BooleanCastNodeGen.create(condition);
    this.thenBody = thenBody;
    this.elseBody = elseBody;
}

@Override
public Object execute(VirtualFrame frame) {
    if (conditionProfile.profile(condition.executeBoolean(frame))) {
        return thenBody.execute(frame);
    } else {
        return elseBody.execute(frame);
    }
}
Again, this is pretty much what you would expect from a simple AST-walking interpreter. Now, of course, Truffle is anything but a "simple AST-walking interpreter", but the amazing thing about the Truffle Language Implementation Framework is that you write your language implementation as if you were writing a simple AST-walking interpreter and Truffle will automatically generate a high-performance compiler for you.
(If you are familiar with the RPython Language Implementation Framework, then you will note some similarities: in RPython, you also write a simple interpreter, in this case, a simple byte code interpreter loop, and RPython automatically generates a JIT compiler for you. There used to be a Ruby implementation on RPython called Topaz, but it is no longer under active development.)
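For comparison with the byte code approach, the AST-walking style can be sketched in a few lines of Ruby. The node classes below are hypothetical toys of my own and have nothing to do with TruffleRuby's real classes; the frame is simply a Hash of local variables, and Ruby's own truthiness stands in for the boolean cast.

```ruby
# Hypothetical leaf node: evaluates to a fixed value.
class LiteralNode
  def initialize(value)
    @value = value
  end

  def execute(_frame)
    @value
  end
end

# Hypothetical leaf node: reads a local variable from the frame.
class ReadLocalNode
  def initialize(name)
    @name = name
  end

  def execute(frame)
    frame.fetch(@name)
  end
end

# Each node evaluates itself by recursively executing its children.
class IfElseNode
  def initialize(condition, then_body, else_body)
    @condition = condition
    @then_body = then_body
    @else_body = else_body
  end

  def execute(frame)
    if @condition.execute(frame)
      @then_body.execute(frame)
    else
      @else_body.execute(frame)
    end
  end
end

tree = IfElseNode.new(ReadLocalNode.new(:flag), LiteralNode.new("then"), LiteralNode.new("else"))
taken = tree.execute({ flag: true })
skipped = tree.execute({ flag: nil })
p taken, skipped
```

Unlike the byte code version, there are no labels or jumps: control flow in the interpreted program is expressed with control flow in the host language.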
The last Ruby implementation I want to look at is Opal. Its implementation of the `if` conditional expression is in `lib/opal/nodes/if.rb`. There are some nifty optimizations that it tries to do, but if you remove all of the optimizations, then you can essentially see that in the most general case, the Ruby snippet
if condition
  then_body
else
  else_body
end
is compiled to the ECMAScript snippet
(
  await(
    async function () {
      if (/* compiled version of `condition` */) {
        return /* compiled version of `then_body` */;
      } else {
        return /* compiled version of `else_body` */;
      }
      return nil;
    }()
  )
)
There are some other Ruby implementations you could look at as well, but I find Rubinius, Opal, and TruffleRuby the most readable (in that order).