check out my new blog at gmarik.info/blog

Thursday, December 27, 2007

Ruby's here document mini tutorial.

Last update: 2007.12.30

Basics

A here document (or HereDoc) is a way to write a String literal in the Ruby programming language:

some_text = <<END_OF_STRING
You can write any text here for your document that's why such
statement is called - HereDoc
END_OF_STRING

That's it! Now the some_text variable points to a String object containing the text between the <<END_OF_STRING line and the closing END_OF_STRING terminator, including the trailing newline.
As you may know, HereDoc isn't unique to Ruby; it's a common construct (with some distinctions) in scripting languages "brewed" in the Unix environment.
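
To double-check that a HereDoc is just another way of writing a String, here is a minimal sketch. The helper names heredoc_text and quoted_text are made up for this illustration:

```ruby
# Hypothetical helper: returns a string built with a heredoc.
def heredoc_text
  <<END_OF_STRING
first line
second line
END_OF_STRING
end

# Hypothetical helper: the same string as an ordinary double-quoted literal.
def quoted_text
  "first line\nsecond line\n"
end

puts heredoc_text.class          # String
puts heredoc_text == quoted_text # true
```

Note that every line of the HereDoc, including the last one, ends with a newline in the resulting string.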

The terminator

By Ruby convention, an identifier starting with a capital letter is a constant. But that's not the case for END_OF_STRING in the previous example, because the terminator is just a string that the parser treats as the end of the HereDoc.
Well, if the terminator is a string, can I use an arbitrary string (say, one with spaces in it) as the terminator, like end of string? The answer is yes, you can, as long as you put it in quotes!
<<heredoc is interpreted the same as <<"heredoc" (please note the double quotes around the latter heredoc).
But the explicit notation (with quotes) is a bit more powerful.
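
A quick way to convince yourself that the terminator never becomes a constant is to ask Ruby with defined?. This is only a sketch, and terminator_demo is a hypothetical helper name:

```ruby
# Hypothetical helper used only for this illustration.
def terminator_demo
  text = <<END_OF_STRING
some text
END_OF_STRING
  # defined? returns nil for an unknown constant,
  # so using END_OF_STRING as a terminator did not create one.
  [text, defined?(END_OF_STRING)]
end

text, constant_status = terminator_demo
puts text.inspect            # "some text\n"
puts constant_status.inspect # nil
```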

String interpolation

By explicitly enclosing the terminator in quotes you may write:
a_text = <<"Ruby, please end my HereDoc once you find this terminator string"
Hello world!
Ruby, please end my HereDoc once you find this terminator string
or
fuzzy_names = <<">>"
foo, baz, bar
>>
Wow, if I can use a double-quoted string, then I can probably use a single-quoted string as well:
puts <<'end of string'
1+1=#{1+1}
end of string
prints
1+1=#{1+1}
as single-quoted strings aren't interpolated, unlike double-quoted ones:
puts <<"end of string"
1+1=#{1+1}
end of string
prints
1+1=2
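
The same rules apply when you interpolate a variable instead of an expression. Here is a small sketch comparing the three terminator styles side by side; the helper interpolation_demo and its name parameter are invented for the example:

```ruby
# Hypothetical helper: builds the same body with three terminator styles.
def interpolation_demo(name)
  bare = <<EOS
Hello, #{name}!
EOS
  double = <<"EOS"
Hello, #{name}!
EOS
  single = <<'EOS'
Hello, #{name}!
EOS
  [bare, double, single]
end

bare, double, single = interpolation_demo("Ruby")
puts bare           # Hello, Ruby!
puts bare == double # true
puts single         # Hello, #{name}!
```

So the bare and double-quoted forms behave the same, and only the single-quoted form leaves the text untouched.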

Indent modifier

By default the HereDoc terminator is expected to start at the very beginning of a separate line.
By using - in the HereDoc declaration, you may indent the end terminator arbitrarily:
greeting = <<-"here document ends"
    Hello world
    here document ends

Keep in mind that the leading spaces of the text lines are kept in the resulting string.
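
Here is a sketch of what "kept" means in practice. For comparison it also uses the ~ modifier, which strips the common indentation from the text; that part assumes Ruby 2.3 or newer, so treat it as an aside rather than something this post relies on. Both helper names are hypothetical:

```ruby
# Hypothetical helper: the - modifier lets the terminator be indented,
# but the text keeps its leading spaces.
def indented_greeting
  <<-EOS
    Hello world
  EOS
end

# Hypothetical helper: the ~ modifier (assumes Ruby 2.3+) also removes
# the indentation of the least-indented text line.
def stripped_greeting
  <<~EOS
    Hello world
  EOS
end

puts indented_greeting.inspect # "    Hello world\n"
puts stripped_greeting.inspect # "Hello world\n"
```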

Advanced usages

a, b = <<'EOA', <<-EOB
string_a
EOA
string_b
EOB
is equal to
a, b = "string_a\n", "string_b\n"
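
The same trick works for method arguments: each HereDoc body follows the line with the call, in the order the HereDocs were declared. A minimal sketch, where wrap and wrapped_document are hypothetical methods defined only for this example:

```ruby
# Hypothetical method, defined here only to receive two string arguments.
def wrap(header, body)
  header + "---\n" + body
end

# Both arguments are heredocs; their bodies come one after another.
def wrapped_document
  wrap(<<HEADER, <<BODY)
Title
HEADER
Some text
BODY
end

puts wrapped_document.inspect # "Title\n---\nSome text\n"
```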

At this point I think of a HereDoc as a "placeholder" that gets substituted with the string itself. Why is that important? Because you may then treat the HereDoc declaration as the actual string and send it messages (call methods):
<<'heredoc'.reverse == "\n!dlrow olleH"
Hello world!
heredoc
is a true statement!
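
Any other String method can be called the same way. A short sketch with upcase and split; the helper names shouted and words are made up for the illustration:

```ruby
# Hypothetical helper: the method call sits right after the terminator name.
def shouted
  <<EOS.upcase
Hello world!
EOS
end

# Hypothetical helper: split without arguments breaks the text on whitespace.
def words
  <<EOS.split
foo baz
bar
EOS
end

puts shouted.inspect # "HELLO WORLD!\n"
puts words.inspect   # ["foo", "baz", "bar"]
```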


1 Comments:

At 6:56 PM, Blogger Todd Werth said...

Good post, I've used HereDoc many times, but I was never aware of some of the advanced features such as using - for indentation.

It's purely for aesthetic reasons, but sometimes I want to indent the block of text to match surrounding code.

 
