Installing Ruby

Let’s Get It Started: Installing Ruby

Ruby is a popular programming language, but not many computers come with it properly installed by default. As an open source language, Ruby has been adapted to run on many different computer platforms and architectures. This means that if you develop a Ruby program on one machine, it’s likely you’ll be able to run it without any changes on a different machine. You can use Ruby, in one form or another, on all the following operating systems and platforms:

  • Microsoft Windows 7, 8, and 10
  • Mac OS X
  • Linux (most distributions)
  • BSDs (including FreeBSD and OpenBSD)
  • Any platform for which a full Java Virtual Machine exists (using JRuby)

Before you can start playing with Ruby, you need to get your computer to understand the Ruby language by installing an implementation of Ruby on your system, which I’ll cover first. In some cases, Ruby may already be present on your computer, and we will cover these situations also since you may not need to do anything to get started.

 

Installing Ruby

 

Typically, when you install Ruby onto your computer, you’ll get the Ruby interpreter, the program that understands other programs written in the Ruby language, along with a collection of extensions and libraries to make your Ruby more fully featured.

 

To satisfy the majority of readers without referring to external documentation, I’m providing full instructions for using Ruby on Windows, Mac OS X, and Linux, along with links to Ruby implementations for other platforms.

 

Note Ruby comes in multiple versions, such as 1.9.3 and 2.2.2. The code in this blog is primarily aimed at versions 2.0.0 and above, but nearly all of it will work in 1.9.3, too.

 

There are major differences between the 1.8.*, 1.9.*, 2.0.* and 2.2.* versions that can become important when you reach more advanced topics, but at this stage you can choose whichever is easiest to install on your platform. Or, if Ruby is already installed on your machine, simply use that as-is.

 

Installing Ruby on Windows

Ruby was initially designed for use under UNIX and UNIX-related operating systems such as Linux, but Windows users have access to an excellent Windows-specific installer that installs Ruby, a horde of extensions, a source code editor, and various documentation. Ruby on Windows is as reliable and useful as it is on other operating systems, and Windows is a reasonable environment for developing Ruby programs.

To get up and running as quickly as possible, follow these steps:

 

Open a web browser and go to RubyInstaller for Windows.

 

Click the big Download button, then choose the latest version to download. Version 2.2.3 is fine, but if 2.4.x or 3.x are available, choose those. You will have the option to download a 64-bit (x64) version. At this stage, I would recommend staying with the normal, plain (32-bit) version, as not all Ruby libraries are compatible with the 64-bit version yet. There are further details about this on the sidebar of the download page, if you’re interested.

The file is approximately 16MB in size, so it may take a while to download. Once the download finishes, run the file to launch the installer.

 

If Windows gives you a Security Error box, click the Run button to give your approval.

A typical installation program appears with some instructions. On the initial screen, click to accept the license and then click Next.

Work your way through the installation screens. Unless you have a specific reason not to, let the installation program install Ruby in its default location of C:\ruby (or possibly C:\Ruby22) and its default program group. Check the box for “Add Ruby Executables to Your Path” if possible, as well as the “Associate .rb and .rbw Files with this Ruby Installation” option.

 

Installation is complete when the installation program gives you a Finish button to exit it.

If Ruby installed correctly, congratulations! Go to the Start menu and then the Programs or All Programs menu. There should be a program group starting with “Ruby” that contains icons for Interactive Ruby (also called “irb”), an uninstaller, and other bits and pieces. To test that your Ruby installation works correctly, you need to load Interactive Ruby.

 

You should find this in your Start menu, under the Ruby group, called Start Command Prompt with Ruby or similar. If the program loads successfully, a window with a prompt appears.

 

If irb started properly, Ruby is installed correctly. Congratulations! Lastly, you need to be familiar with running Ruby and its associated utilities from the command prompt, so go back to the Ruby program group menu in the Start menu and run the option that lets you run Ruby from the command prompt.

 

Throughout this blog, commands that can be used at the command prompt will be given. This is because using a command prompt such as this is a standard technique in operating systems such as Linux and OS X. For example, in the next blog, we’ll look at installing extra features (libraries) for Ruby, and the command prompt will be used for this. Therefore, it’s necessary for you to know how to access it and run programs.

 

If you type irb at this prompt and press Enter, you should see something like the following:

irb(main):001:0>

If you see the preceding line, everything is set up correctly, and you can type exit and press Enter to be returned to the command prompt.

 

Installing Ruby on Mac OS X / macOS

Unlike Windows, most modern Apple machines running OS X come with a version of Ruby already installed. OS X Yosemite (10.10) comes with Ruby 2.0.0 out of the box, which is fine for the purposes of learning Ruby.

 

Testing for a Preinstalled Version of Ruby

If you’re using OS X, you can check whether Ruby is installed by using the Terminal application. Go to Finder, then the Go menu, and then the Applications folder. Once in Applications, go to the Utilities folder, where you’ll find an application called Terminal. Double-click its icon to start it. If Terminal starts correctly, a window appears with a prompt waiting for you to type a command.

 

Tip You can also start Terminal by using Spotlight. Press Cmd+Space and type “terminal” and press Enter.

 

Once you’re in Terminal, you’re at what’s called the command prompt or shell. You’re talking directly with your computer, in a technical sense, when you type. The computer will execute the commands that you type immediately once you press Enter.

To see if Ruby is installed, type the following at the command prompt from within Terminal (be sure to press Enter afterward): ruby -v

 

If it’s successful, you should see a result that says what version of Ruby you’re running (which should be 1.9.3 or greater, but ideally 2.0.0 or later). If this works, try to run the Ruby interactive interpreter called “irb” by typing the following at the command prompt: irb

If you need to install a newer version of Ruby on OS X, continue to the next section.

 

Installing Ruby on OS X

If you choose not to use the system-provided version of Ruby, there are two primary ways to install your own version of Ruby on OS X, but both should only be used by advanced users. The first is to use a packaging system such as Homebrew (http://brew.sh/)—if you do, you can install ruby with the brew install ruby command.

 

The second option is to compile and install Ruby directly from its source code.

 

Installing Ruby from Source on Mac OS X

Installing Ruby directly from source code on OS X is similar to Linux, so continue on to the later Linux section entitled “Installing Ruby from Source Code.” Note that as opposed to the pre-installed Ruby distribution that includes many Ruby libraries (gems), including Rails, when you install Ruby by source, all you get is Ruby. You need to install components such as Rails separately later.

 

Note To compile the Ruby sources on OS X, you need to install the Xcode developer tools that Apple makes available for free at the Mac App Store.

 

Installing Ruby on Linux or Ubuntu

As an open source programming language, Ruby is already installed with many Linux distributions. This isn’t universal, but you can check if Ruby is installed by following the instructions in the next section. If this fails, there are further instructions to help you install it.

Checking If Ruby Is Installed on Linux

Try to run the Ruby interpreter from the command prompt (or terminal window), as follows: ruby -v

 

If Ruby is installed, it will give an output such as the following:

ruby 2.2.2p95 (2015-04-13 revision 50295) [i686-linux]

 

This means that Ruby 2.2.2 is installed on the machine. This blog requires 1.9.3 as a bare minimum (with 2.0+ being preferred), so if the version is earlier than 1.9.3, you’ll need to continue onward in this blog and install a more recent version of Ruby. However, if Ruby appears to be installed and up to date, try to run the irb interactive Ruby interpreter, as follows: irb

 

Tip On some systems, irb might have a slightly different name. For example, on Ubuntu it can be named irb1.9 or irb2.0, and you’ll need to run it as such. The Ruby interpreter’s executable may also be named ruby1.9 (or ruby2.0). Make sure to try all of these possibilities before continuing. You might also try using your shell’s tab completion feature by typing ruby and pressing Tab to see if there are any alternatives available.

 

Once you’ve run irb, you should get the following output: irb(main):001:0>

If running irb results in similar output, Ruby is installed correctly. (You might wish to type exit and press Enter to get back to the command line!) Otherwise, read on to install a fresh version of Ruby.

 

Installing Ruby with a Package Manager

The installation procedure for Ruby on Linux varies between different Linux distributions. Some distributions, such as Debian, Arch Linux, and Red Hat, provide “package managers” to make installation of programs easy. Others require that you install directly from source or install a package manager beforehand.

If you’re comfortable with using a package manager such as yum, pacman, or apt-get, you can install Ruby quickly with the following methods:

  • Yum (on Red Hat, CentOS, and Fedora): Install as follows: sudo yum install -y ruby
  • Pacman (on Arch Linux): Install as follows: sudo pacman -S ruby
  • Debian and Ubuntu-based distributions: Use apt-get, as follows: sudo apt-get install ruby-full

 

If one of these methods works for you, try to run Ruby and irb as shown in the preceding section if you’re ready. Alternatively, you can search your distribution’s package repository for Ruby, as the name of the Ruby package in your distribution might be nonstandard or changing over time. However, if all else fails, you can install Ruby directly from its source code in the next section.

 

Installing Ruby from Source Code

Installing Ruby from its source code is a great option if you don’t mind getting your hands dirty. The process is similar on all forms of UNIX (not just Linux—this will work on OS X too). Here are the basic steps:

 

  • Make sure that your Linux distribution can build applications from source by checking for the “make” and “gcc” tools (on OS X, Xcode allows you to install these). From the terminal, you can use which gcc and which make to see if the development tools are installed. If not, you need to install them (on Ubuntu, try sudo apt-get install build-essential; on Red Hat or CentOS, try sudo yum groupinstall "Development Tools").
  • On the Ruby download page, scroll down to Compiling Ruby – Source Code and download the archive file containing the latest version. At the time of writing, this is Ruby 2.3.1. This downloads a tar.gz file containing the source code for the latest stable version of Ruby.
  • Uncompress the tar.gz file. If you’re at a command prompt or terminal window, go to the same directory as the ruby-2.x.x.tar.gz file and run tar xzvf ruby-2.x.x.tar.gz (where ruby-2.x.x.tar.gz is the name of the file you just downloaded).
  • Go into the Ruby folder that was created during decompression. If you’re not using a command prompt at this stage, open a terminal window and go to the directory.
  • Run ./configure to generate the makefile and config.h files. If you receive numerous errors, particularly about no C compiler being available, you have not installed the development tools properly on your operating system and should search for further help online on how to achieve this.
  • Run make to compile Ruby from source. This might take a while.
  • Run make install to install Ruby to its correct location on the system. You need to do this as a superuser (such as root), so you might need to run it as sudo make install and type in your password if you are not logged in as a superuser already.

 

If there are errors by this stage, read the README file that accompanies the source code files for pointers. Otherwise, try to see what version of Ruby is now installed with ruby -v.

 

If the expected version of Ruby appears at this point, you’re ready to move to blog 2 and begin programming. If you get an error complaining that Ruby can’t be found, or the wrong version of Ruby is installed, the place where Ruby was installed might not be in your path (the place your operating system looks for files to run).

 

To fix this, scroll up and find out exactly where Ruby was installed (often in /usr/local/bin or /usr/bin) and add the relevant directory to your path. The process to do this varies by distribution and shell type, so refer to your Linux documentation on changing your path.
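If some version of Ruby does run but it isn't the one you expected, Ruby itself can help you see which directories on your path contain a ruby executable. The following is only a minimal sketch, not part of the installation procedure: the helper name ruby_locations is made up for illustration, and it looks for a file named exactly "ruby", so it won't find ruby.exe on Windows.

```ruby
# List every directory on the PATH that contains an executable named "ruby".
# ruby_locations is a hypothetical helper name chosen for this sketch.
def ruby_locations(path = ENV["PATH"].to_s)
  path.split(File::PATH_SEPARATOR).select do |dir|
    candidate = File.join(dir, "ruby")
    File.file?(candidate) && File.executable?(candidate)
  end
end

puts "Directories on your PATH that contain ruby:"
puts ruby_locations
```

The directories are listed in the order your shell searches them, so the first one is the Ruby you get when you type ruby.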

 

Once you can check which version of Ruby is running and it’s 2.0 or over (although 1.9.3 will be usable, with complications), and you can run irb and get a Ruby interpreter prompt, your Ruby installation is complete.
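You can also make the same version check from inside Ruby. This is a small sketch rather than an official tool: the constant names MINIMUM_VERSION and PREFERRED_VERSION are invented here, and it assumes RubyGems is loaded (the default in Ruby 1.9 and later) so that Gem::Version is available.

```ruby
# Compare the running interpreter's version against the versions this blog
# mentions. MINIMUM_VERSION and PREFERRED_VERSION are names made up for
# this sketch.
MINIMUM_VERSION   = Gem::Version.new("1.9.3")
PREFERRED_VERSION = Gem::Version.new("2.0.0")

current = Gem::Version.new(RUBY_VERSION)

if current >= PREFERRED_VERSION
  puts "Ruby #{RUBY_VERSION} is ready to use."
elsif current >= MINIMUM_VERSION
  puts "Ruby #{RUBY_VERSION} will work, but 2.0 or later is preferred."
else
  puts "Ruby #{RUBY_VERSION} is too old; install a newer version."
end
```

Gem::Version compares version numbers part by part, which avoids the mistake of comparing them as plain text (where "1.10" would sort before "1.9").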

 

Other Platforms

If you’re not using Windows, OS X, or Linux, it is possible you may be able to use a variant or port of Ruby. Up until version 2.0, the official Ruby interpreter supported a variety of other platforms (including BeOS, MS-DOS, and even the Atari ST), but it is now primarily focused on mainstream operating systems, so in this edition we will not be providing any pointers, as they are now out of date.

 

In many cases, the versions of Ruby for some operating systems might be out of date or unsupported. If this is the case, and you’re confident about being able to compile your own version of Ruby directly from the Ruby source code, the source code is available to download from the Ruby Programming Language website.

 

To test that Ruby is installed sufficiently to continue with this blog, you want to check which version of Ruby is installed by asking Ruby for its version, as follows:

ruby -v

You also need access to Ruby’s interactive prompt, irb. You access this simply by running irb (if it’s in your path) as follows:

irb

If Ruby and irb do not work without complaint, you need to seek assistance for your specific platform.

 

Summary

Although Ruby is an easy language to learn and develop with, it’s easy to become overwhelmed with the administration of Ruby itself, its installation, and its upgrades. As Ruby is a language constantly in development, it’s possible that points covered in this blog will go out of date and other ways to install Ruby will come along.

 

Tour of Ruby and Object Orientation

Ruby is one of the easiest programming languages to learn, so that leaves motivation and commitment. In this blog, you learned about several important concepts not only for programming in Ruby, but for programming in general. If these concepts seem logical to you already, you’re well on the way to being a solid Ruby developer. Let’s recap the main concepts before moving on:

 

Class:

A class is a definition of a concept in an object-oriented language such as Ruby. We created classes called Pet, Dog, Cat, Snake, and Person. Classes can inherit features from other classes, but still have unique features of their own.

 

Object:

An object is a single instance of a class (or, as can be the case, an instance of a class itself). An object of class Person is a single person. An object of class Dog is a single dog. Think of objects as real-life objects. A class is the classification, whereas an object is the actual object or “thing” itself.

 

Object orientation:

Object orientation is the approach of using classes and objects to model real-world concepts in a programming language, such as Ruby.

 

Variable:

In Ruby, a variable is a placeholder for a single object, which may be a number, string, list (of other objects), or instance of a class that you defined, such as, in this blog, a Pet.

 

Method:

A method represents a set of code (containing multiple commands and statements) within a class and/or an object. For example, our Dog class objects had a bark method that printed “Woof!” to the screen.

 

Methods can also be directly linked to classes, as with fred = Person.new, where new is a method that creates a new object based on the Person class. Methods can also accept data—known as arguments or parameters—included in parentheses after the method name, as with puts("Test").
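To tie the class, object, variable, and method ideas together, here is a minimal sketch. The Pet and Dog definitions are reconstructions for illustration only (the attribute names are assumptions), not necessarily the exact classes used elsewhere in this blog.

```ruby
# A base class: the general concept of a pet.
class Pet
  attr_accessor :name, :age
end

# Dog inherits name and age from Pet and adds a feature of its own.
class Dog < Pet
  def bark
    puts "Woof!"
  end
end

class Person
  attr_accessor :name
end

rex = Dog.new        # rex is a variable holding a single Dog object
rex.name = "Rex"
rex.bark             # prints "Woof!"

fred = Person.new    # new is a method called on the class itself
fred.name = "Fred"
puts fred.name
```

Dog gets name and age for free by inheriting from Pet, while bark belongs to Dog alone.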

 

Arguments/parameters:

These are the data passed to methods in parentheses (or, as in some cases, following the method name without parentheses, as in puts "Test"). Technically, you pass arguments to methods, and methods receive parameters, but for pragmatic purposes, the terms are interchangeable.
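As a quick sketch of the two calling styles, consider the following. The greet method is hypothetical and exists only for this example.

```ruby
# greet is a made-up method; name is its parameter.
def greet(name)
  "Hello, #{name}!"
end

with_parentheses = greet("Fred")   # "Fred" is the argument
message = greet "Fred"             # same call, parentheses omitted

puts with_parentheses
puts message
```

Both calls do exactly the same thing; leaving the parentheses off is purely a matter of style.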

 

Kernel:

Some methods don’t require a class or module name to be usable, such as puts. These are usually built-in, common methods that don’t have an obvious connection to any classes or modules. Many of these methods are included in Ruby’s Kernel module, a module that provides functions that work from anywhere within Ruby code without being explicitly referred to (a global “grab bag” of useful methods, if you will).
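You can ask Ruby where puts comes from. This short sketch uses only built-in methods:

```ruby
# puts can be called without naming a class or module...
puts "Test"

# ...because it is defined in the Kernel module, which every object includes.
puts method(:puts).owner        # prints "Kernel"
puts Object.include?(Kernel)    # prints "true"

# Kernel's methods can also be called explicitly.
Kernel.puts "Test"
```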

 

Experimentation:

One of the most fulfilling things about programming is that you can turn your dreams into reality. The amount of skill you need varies with your dreams, but generally if you want to develop a certain type of application or service, you can give it a try.

 

Most software comes from necessity or a dream, so keeping your eyes and ears open for things you might want to develop is important. It’s even more important when you first get practical knowledge of a new language, as you are while reading this blog.

If an idea crosses your mind, break it down into the smallest components that you can represent as Ruby classes and see if you can put together the building blocks with the Ruby you’ve learned so far. Your programming skills can only improve with practice.

 

Baby Steps

You focused on installing Ruby so that your computer can understand the language. At the end of the blog, you loaded a program called irb.

irb: Interactive Ruby

irb stands for “Interactive Ruby.” “Interactive” means that as soon as you type something and press Enter, your computer will immediately attempt to process it. Sometimes this sort of environment is called an immediate or interactive environment.

 

Start irb and make sure a prompt appears, like so:

irb(main):001:0>

This prompt is not as complex as it looks. All it means is that you’re in the irb program, you’re typing your first line (001), and you’re at a depth of 0. You don’t need to place any importance on the depth element at this time.

 

Type this after the preceding prompt and press Enter:

1 + 1

The result should come back quickly: 2. The entire process looks like this:

irb(main):001:0> 1 + 1
=> 2
irb(main):002:0>


Ruby is now ready to accept another command or expression from you.

As a new Ruby programmer, you’ll spend a lot of time in irb testing concepts and building up insights into Ruby. It provides the perfect environment for tweaking and testing the language.

 

irb’s interactive environment gives you the benefit of immediate feedback—an essential tool when learning. Rather than writing a program in a text editor, saving it, getting the computer to run it, and then looking through the errors to see where you went wrong, you can just type in small snippets of code, press Enter, and immediately see what happens.

 

If you want to experiment further, try other arithmetic such as 100 * 5, 57 + 99, 10 - 50, or 100 / 10 (if the last one seems alien to you, in Ruby, the forward slash character, /, is the operator for division).
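If you would rather run these from a file than type them into irb, a sketch like the following works. Outside irb, results aren't echoed back automatically, so each line uses puts to display its answer.

```ruby
puts 100 * 5    # 500
puts 57 + 99    # 156
puts 10 - 50    # -40
puts 100 / 10   # 10
```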

 

Ruby Is “English for Computers”

At the lowest level, computer processors are made out of transistors that respond to and act on electronic signals, but thinking about performing operations at this level is time consuming and complicated, so we tend to use higher-level “languages” to communicate our intentions, much as we do with natural languages like English. Computers can understand languages, though in a rather different fashion than how most people do.

 

Being logical devices that cannot understand subtlety or ambiguity, languages such as English and French aren’t appealing to computers. Computers require languages with logical structures and a well-defined syntax so that there’s a logical clarity in what you’re telling the computer to do.

 

Clarity is required because almost everything you relay to the computer while programming is an instruction (or command). Instructions are the basic building blocks of all programs, and for the computer to perform (or execute) them properly, the programmer’s intentions must be clear and precise. Many hundreds of these instructions are tied together into programs that perform certain tasks, which means there’s little room for error.

 

You also need to consider that other programmers might need to maintain computer programs you’ve written. This won’t be the case if you’re just programming for fun, but it’s important that your programs are easy to understand, so you can understand them when you come back to them later on.

 

Why Ruby Makes a Great Programming Language

Although English would make a bad programming language, due to its ambiguity and complexity, Ruby can feel surprisingly English-like at times. Ruby is just one of hundreds of programming languages, but it’s special because it feels a lot like a natural language to many programmers, while having the clarity required by computers. Consider this example code:

10.times do print "Hello, world!" end

 

Read through this code aloud (it helps, really!). It doesn’t flow quite as well as English, but the meaning should be immediately clear. It asks the computer to “10 times” “print” “Hello, world!” to the screen. It works. If you’ve got irb running, type in the preceding code and press Enter to see the results:

Hello, world!Hello, world!Hello, world!Hello, world!Hello, world!Hello, world!Hello, world!Hello, world!Hello, world!Hello, world! => 10

 

If you read the code aloud, the resulting output (“Hello, world!” printed ten times) should not be a surprise.

The => 10 on the end might seem more confusing, but we’ll be covering the meaning of that later.
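The same loop is often written across several lines, which is how you would usually see it in a program file. This is just a sketch of the layout; the second loop, with its block variable i, goes slightly beyond what has been covered so far.

```ruby
# The one-line form from above, spread over several lines.
10.times do
  print "Hello, world!"
end
puts   # finish the line, since print does not add a newline

# times can also hand the current count (starting at 0) to the block.
# The variable name i is arbitrary.
3.times do |i|
  puts "Hello number #{i}"
end
```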

 

Note Experienced programmers might wonder why there’s no semicolon at the end of the previous code example. Unlike many other languages, such as Perl, PHP, C, and C++, a semicolon is not needed at the end of lines in Ruby (although it won’t hurt if you do use one). This can take a little while to get used to at first, but for new programmers it makes Ruby even easier to learn.
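To see the semicolon rule in action, here is a tiny sketch. The variable names total and doubled are arbitrary.

```ruby
# No semicolon is needed at the end of a line.
puts "one"

# A trailing semicolon is allowed but unnecessary.
puts "two";

# Semicolons are mainly useful for putting several statements on one line.
total = 1 + 2; doubled = total * 2
puts doubled
```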

 

Here’s a much more complex example that might occur in a real-world web app:

user = User.find_by_email('me@privacy.net')

user.country = 'Belgium'

Note Don’t copy and paste this code. It won’t work outside the context of a particular application.

 

This code is nowhere near as obvious as the “Hello, world!” example, but you should still be able to take a good guess at what it does. First, it tells the computer you want to work with a concept called User. Next, it tries to find a user with a specified e-mail address. Last, it changes the user’s country data to Belgium. Don’t worry about how the data is stored for users at this point; that comes later.
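If you would like something you can actually run, here is a toy stand-in. Everything in it is invented for illustration: in a real web app a framework would usually supply find_by_email and keep the users in a database, whereas this sketch keeps a single made-up user in memory. It also uses features that are explained later, so don't worry about the details yet.

```ruby
# A toy stand-in for the User concept. The stored user and the starting
# country are made up for this sketch.
class User
  attr_accessor :email, :country

  def initialize(email, country)
    @email = email
    @country = country
  end

  # Class-level list of known users, standing in for a database.
  def self.all
    @all ||= [new("me@privacy.net", "France")]
  end

  # Return the first user with a matching e-mail address, or nil.
  def self.find_by_email(email)
    all.find { |user| user.email == email }
  end
end

user = User.find_by_email('me@privacy.net')
user.country = 'Belgium'
puts user.country
```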

 

This is a reasonably advanced and abstract example, but demonstrates a single concept from a potentially complex application where you can deal with different concepts such as “users.” By the end of this blog, you’ll see how you can create your own real-life concepts in Ruby and operate on them in a similar way to this example. Your code can be almost as easy to read as English too.

 

Trails for the Mind

Learning can be a fun activity in its own right, but merely reading about something won’t make you an expert at it. I’ve read a few cookbooks, but this doesn’t seem to improve my cooking when I attempt it from time to time. The missing ingredient is experimentation and testing, as without these your efforts are academic, at best.

 

With this in mind, it’s essential to get into the mood of experimenting and testing from day one of using Ruby. Throughout the blog, I’ll ask you to try out different blocks of code and to play with them to see if you get the results you want. You’ll occasionally surprise yourself and sometimes chase your code into dead ends; this is all part of the fun.

 

Whatever happens, all good programmers learn from experimentation, and you can only master a language and programming concepts by experimenting as you go along.

 

This blog will lead you through a forest of code and concepts, but without testing and proving the code is correct to yourself, you can quickly become lost. Use irb and the other tools I’ll cover frequently, and experiment with the code as much as possible so that the knowledge will stick.

 

Type in the following code at your irb prompt and press Enter:

print "test"

The result is, simply:

test => nil

 

Logically, print "test" results in test being printed to the screen. However, the => nil suffix is the result of your code as an expression. This appears because all lines of code in Ruby are made up of expressions that return values. However, print displays data to the screen rather than return any value as an expression, so you get nil. More about this in blog 3. It is perfectly okay to be semi-confused about this at this stage.
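You can see the returned value for yourself by storing it in a variable. A minimal sketch (the variable names result and sum are arbitrary):

```ruby
# Capture the value that print returns as an expression.
result = print("test")
puts                   # end the line that print left open
puts result.inspect    # prints "nil"

# By contrast, an arithmetic expression returns a useful value.
sum = 2 + 3
puts sum.inspect       # prints "5"
```

inspect gives a readable version of any value, which is what irb shows you after the => marker.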

Let’s try something else:

print "2+3 is equal to " + 2 + 3

 

This command seems logical on the surface. If 2 + 3 is equal to 5 and you’re adding that to the end of "2+3 is equal to ", you should get "2+3 is equal to 5", right? Unfortunately, you get this error instead:

TypeError: no implicit conversion of Fixnum into String
from (irb):45:in `+'
from (irb):45
from :0

 

Ruby complains when you make an error, and here it’s complaining that it can’t convert a number into a string (where a “string” is a collection of text, such as this very sentence). Numbers and strings can’t be mixed in this way. (On Ruby 2.4 and later, the message says Integer rather than Fixnum.) Deciphering the reason isn’t important yet, but experiments such as this along the way will help you learn and remember more about Ruby than reading this blog alone.

 

When an error like this occurs, you can use the error message as a clue to the solution, whether you find it in this blog, on the Internet, or by asking another developer.

 

As a quick side activity, copy and paste the “no implicit conversion of Fixnum into String” error into Google and see what comes up. If you are like most programmers, you will do this a lot over your programming career. Not every article you find will be useful, but sometimes you can get out of tricky situations by seeing what other people suggest online.

 

An interim solution to the preceding problem would be to do this:

print "2+3 is equal to "
print 2 + 3

Or this:

print "2+3 is equal to " + (2 + 3).to_s

Try them both.
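There is a third option you will see a lot in Ruby code, string interpolation, which goes a little beyond what has been covered so far. The sketch below shows it next to the to_s version, and also catches the original error so you can see that it really is a TypeError.

```ruby
# Interpolation: Ruby evaluates the code inside #{...} and converts the
# result to a string for you.
puts "2+3 is equal to #{2 + 3}"

# The explicit conversion from above builds the same string.
puts "2+3 is equal to " + (2 + 3).to_s

# Adding a number directly to a string still fails.
begin
  "2+3 is equal to " + 2
rescue TypeError => e
  puts "TypeError: #{e.message}"
end
```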
Let’s try one more example. What about 10 divided by 3?
irb(main):002:0> 10 / 3
=> 3

Computers are supposed to be precise, but anyone with basic arithmetic skills will know that 10 divided by 3 is 3.33 recurring, rather than 3!

 

The reason for the curious result is that, by default, Ruby assumes a number such as 10 or 3 to be an integer—a whole number. Arithmetic with integers in Ruby gives integer results, so it’s necessary to provide Ruby with a floating point number (a number with decimal places) to get a floating point answer such as 3.33. Here’s an example of how to do that:

irb(main):001:0> 10.0 / 3
=> 3.3333333333333335
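Here are a few variations worth trying, as a sketch. The % operator and the to_f and round methods are built into Ruby but haven't been introduced yet.

```ruby
puts 10 / 3        # 3 (integer division drops the remainder)
puts 10 % 3        # 1 (the remainder itself)
puts 10.0 / 3      # 3.3333333333333335
puts 10 / 3.0      # 3.3333333333333335
puts 10.to_f / 3   # 3.3333333333333335

# Round to two decimal places for display.
puts((10.0 / 3).round(2))   # 3.33
```

It only takes one floating point number on either side of the division to get a floating point answer.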

Unobvious outcomes such as these make testing and experimentation not only a good learning tool, but essential tactics when developing larger programs. That’s enough of the errors for now. Let’s make something useful!

 

Turning Ideas into Ruby Code

 

Part of the artistry of programming is in being able to turn your ideas into computer programs. Once you become proficient with a programming language, you can turn your ideas directly into code. However, before you can do this, you need to see how Ruby understands real-world concepts, and how you can relay your ideas into a form that Ruby appreciates.

 

How Ruby Understands Concepts with Objects and Classes

Ruby is an object-oriented programming language. In the simplest sense, this means that your Ruby programs can define and operate on concepts in a fashion that mimics how we might deal with concepts in the real world. Your program can contain concepts such as “people,” “boxes,” “tickets,” “maps,” or any other concept you want to work with.

 

Object-oriented languages make it easy to implement these concepts in a way that you can create objects based on them. As an object-oriented language, Ruby can then act on and understand the relationships between these concepts in any way you can define.

 

For example, you might want to create an application that can manage the booking of tickets for sports events. The concepts involved include “events,” “people,” “tickets,” “venues,” and so forth. Ruby lets you put these concepts directly into your programs, create object instances of them (instances of an “event” might be the Super Bowl or the final of the 2018 World Cup), and perform operations on and define relationships between them.

 

With all these concepts in your program, you can quickly relate “events” to “venues” and “tickets” to “people,” meaning that your code forms a logical system from the outset.
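To give a feel for what that might look like, here is a rough sketch. The class and attribute names are assumptions made for this example, and the syntax is explained in the next section, so treat it as a preview rather than something to memorize.

```ruby
# Hypothetical classes for the ticket example; the attribute names are
# invented for this sketch.
class Venue
  attr_accessor :name
end

class Event
  attr_accessor :name, :venue
end

class Ticket
  attr_accessor :event, :holder
end

stadium = Venue.new
stadium.name = "National Stadium"

final = Event.new
final.name = "World Cup final"
final.venue = stadium        # relate an event to a venue

ticket = Ticket.new
ticket.event = final
ticket.holder = "Chris"      # relate a ticket to a person

puts "#{ticket.holder} holds a ticket for the #{ticket.event.name} at #{ticket.event.venue.name}"
```

Starting from a single ticket, you can follow the relationships to reach the event and then the venue.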

 

With non–object-oriented languages, the programmer has to take a more manual approach to handling concepts and the relationships between them, and while this adds more control, it also introduces extra complexity.

 

The Making of a Person

Let’s jump directly into some source code demonstrating a simple concept, a person:

class Person
  attr_accessor :name, :age, :gender
end

Ruby seemed a lot like English before, but it doesn’t seem much like English when defining concepts.

Let’s go through it step by step:

class Person

This line is where you start to define the concept of a “person.” When we define concepts in Ruby (or in most other object-oriented languages, for that matter) we call them classes. A class is the definition of a single type of object. Class names in Ruby always start with a capital letter, so your programs will end up with classes with names like User, Person, Place, Topic, Message, and so forth.

attr_accessor :name, :age, :gender

 

The preceding line provides three attributes for the Person class. An individual person has a name, an age, and a gender, and this line creates those attributes. attr stands for “attribute,” and accessor roughly means “make these attributes accessible to be set and changed at will.” This means that when you’re working with a Person object in your code, you can change that person’s name, age, and gender (or, more accurately, the object’s name, age, and gender attributes).

end

The end line should be of obvious utility. It matches up with the class definition on the first line and tells Ruby that you’re no longer defining the Person class.
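If you are curious about what attr_accessor saves you from typing, the sketch below writes out by hand roughly what it provides for a single attribute. The class name PersonByHand is made up so that it doesn't clash with Person, and the @name syntax is covered later.

```ruby
# Roughly what attr_accessor :name provides, written out by hand.
# PersonByHand is a hypothetical name used only in this sketch.
class PersonByHand
  # Reader: returns the current name.
  def name
    @name
  end

  # Writer: called when you write someone.name = "..."
  def name=(new_name)
    @name = new_name
  end
end

someone = PersonByHand.new
someone.name = "Chris"
puts someone.name
```

With three attributes you would need six methods like these, which is why the one-line attr_accessor form is preferred.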

 

To recap, a class defines a concept (such as a Person), and an object is a single thing based on class (such as a “Chris” or a “Mrs. Smith”). So let’s experiment with our Person class. Go to your irb prompt and type in the Person class found earlier. Your efforts should look like this:

irb(main):001:0> class Person
irb(main):002:?> attr_accessor :name, :age, :gender
irb(main):003:?> end
=> nil
irb(main):004:0>

 

You’ll notice that irb recognizes when you’re “inside” a class definition because it automatically indents your code.

Once you’ve finished your class definition and Ruby has processed it, irb shows nil, which is Ruby’s way of representing “nothing.” On the Ruby versions this blog targets, the class definition has no meaningful return value. (On Ruby 3.0 and later, irb instead shows a list of the method names that attr_accessor created.) As there were no errors, your Person class now exists within Ruby, so let’s do something with it:

person_instance = Person.new

#<Person:0x007fbb0c625f88>

 

What the first line does is create a “new” instance of the Person class, so you’re creating a “new person” and assigning it to person_instance, a placeholder representing the new person, known as a variable. The second line is Ruby’s response to creating a new person and isn’t important at this stage. The 0x007fbb0c625f88 bit will be different from computer to computer, and only represents an internal reference that Ruby has assigned to the new person. You don’t have to take it into account at all.

 

Let’s immediately do something with person_instance:

person_instance.name = "Christine"

 

In this basic example, you refer to person_instance’s name attribute and give it a value of "Christine". You’ve just given your person a name. The Person class has two other attributes: age and gender. Let’s set those:

person_instance.age = 52

person_instance.gender = "female"

Simple. You’ve given person_instance a basic identity. What about printing the person’s name back to the screen?

puts person_instance.name

Christine appears when you press Enter. Try the same with the age and the gender.
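To see the whole sequence in one place, here is a minimal sketch that collects the steps above into a single script. It uses only the Person class and the values already shown; nothing new is introduced.

```ruby
# A Person with three readable and writable attributes.
class Person
  attr_accessor :name, :age, :gender
end

person_instance = Person.new
person_instance.name = "Christine"
person_instance.age = 52
person_instance.gender = "female"

puts person_instance.name    # prints Christine
puts person_instance.age     # prints 52
puts person_instance.gender  # prints female
```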

 

Note In previous examples, you’ve used print to put things on the screen. In the preceding example, you used puts. The difference between print and puts is that puts automatically moves the output cursor to the next line (that is, it adds a newline character to start a new line), whereas print continues printing text onto the same line as the previous time. Generally, you’ll want to use puts, but I used print to make the earlier examples more intuitive when reading them out loud.
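The difference is easiest to see side by side. This short sketch prints the same two words with print and then with puts; the words themselves are arbitrary.

```ruby
print "Hello, "
print "world!"   # continues on the same line as the previous print
puts             # puts with no argument just ends the current line

puts "Hello,"    # each puts finishes with a newline
puts "world!"
```

The first three lines produce a single line of output, while the last two produce one line each.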

 

Basic Variables


In the previous section, you created a person and assigned that person to a variable (computer terminology for a “placeholder”) called person_instance.

 

Variables are an important part of programming, and they’re easy to understand, especially if you have even the barest knowledge of algebra. Consider this:

x = 10

This code assigns the value 10 to the variable x. Since x now equals 10, you can do things like this:

x * 2

20

Note Some new programmers can be confused by the definition of = as an assignor of value, rather than an indicator of equality. When we say x = 10, we do not mean that x and 10 are equal, but that x should now be considered to refer to the value 10.

 

Variables in Ruby can refer to any value-related concept that Ruby understands, such as numbers, text, and other data structures I’ll cover throughout this blog. In the previous section, person_instance was a variable that referred to an object instance of the Person class, much like x is a variable containing the number 10. More simply, consider person_instance as a name that refers to a particular, unique Person object.

 

When you want to store something and use it over multiple lines within a program, you’ll use variables as temporary storage places for the data you’re working with.
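As a small illustration of variables acting as temporary storage, the sketch below assigns, reuses, and reassigns a value. The variable y is an extra name introduced only for this example.

```ruby
x = 10         # x now refers to the value 10
y = x * 2      # use x in a calculation; y refers to 20
x = x + 5      # reassign x; it now refers to 15

puts x         # prints 15
puts y         # prints 20 (y isn't affected by the later change to x)
puts x == 15   # prints true (== tests equality, whereas = assigns)
```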

 

From People to Pets


Previously, you created a simple class (Person), created an object of that class, assigned it as the person_instance variable, and gave it an identity (we called it Christine) that you queried. If these concepts seem simple to you, well done—you understand the bare basics of object orientation! If not, reread the previous section and make sure you follow along on your computer, but also read this section, as I’m going to go into a little more depth.

 

You started out with a Person class, but now you need something a bit more complex, so let’s create some “pets” to live inside Ruby. You’ll create some cats, dogs, and snakes. The first step is to define the classes. You could do something like this:

class Cat
  attr_accessor :name, :age, :gender, :color
end

class Dog
  attr_accessor :name, :age, :gender, :color
end

class Snake
  attr_accessor :name, :age, :gender, :color
end

It’s just like creating the Person class, but multiplied for the three different animals. You could continue by creating animals with code such as lassie = Dog.new or sammy = Snake.new and setting the attributes for the pets with code such as lassie.age = 12 or sammy.color = "Green". Type in the preceding code and give it a try if you like. However, creating the classes in this way would miss out on one of the more interesting features of object-oriented programming: inheritance.

 

Inheritance allows different classes to relate to one another and group concepts by their similarities. In this case, cats, dogs, and snakes are all pets. Inheritance allows you to create a “parent” Pet class, and then let your Cat, Dog, and Snake classes inherit (“is-a”) the features that all pets have.

 

Almost everything in real life exists in a similar structure to your classes. Cats can be pets, which are, in turn, animals; which are, in turn, living things; which are, in turn, objects that exist in the universe. A hierarchy of classes exists everywhere, and object-oriented languages let you define those relationships in code.

 

Structuring Your Pets Logically


Now that we’ve come up with some ideas to improve our code, let’s retype it from scratch. To totally cleanse out and reset what you’re working on, you can restart irb. irb doesn’t remember information between the different times you use it. So restart irb (to exit irb, type exit and press Enter) and rewrite the class definitions like so:

class Pet
  attr_accessor :name, :age, :gender, :color
end

class Cat < Pet
end

class Dog < Pet
end

class Snake < Pet
end

Note In the code listings in this blog, any code that’s within classes is indented, as with the attr_accessor line in the preceding Pet class. This is only a matter of style, and it makes the code easier to read.

 

When you type it into irb it’s not necessary to replicate the effect, as it will do some indentation for you. You can simply type what you see. Once you start using a text editor to write longer programs, you’ll want to indent your code to make it easier to read, too, but it’s not important yet.

 

First, you create the Pet class and define the name, age, gender, and color attributes available to Pet objects. Next, you define the Cat, Dog, and Snake classes that inherit from the Pet class (the < operator, in this case, denotes which class is inherited from).

 

This means that cat, dog, and snake objects will all have the name, age, gender, and color attributes, but because the functionality of these attributes is inherited from the Pet class, the functionality doesn’t have to be created specifically in each class. This makes the code easier to maintain and update if you wanted to store more information about the pets, or if you wanted to add another type of animal.
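You can ask Ruby to confirm the relationship. The sketch below is only an illustration: it repeats the Pet and Cat definitions so that it runs on its own, and the variable name felix is my own. superclass, is_a?, and respond_to? are standard Ruby methods.

```ruby
class Pet
  attr_accessor :name, :age, :gender, :color
end

class Cat < Pet
end

felix = Cat.new
felix.name = "Felix"               # name= is inherited from Pet

puts Cat.superclass                # prints Pet
puts felix.is_a?(Pet)              # prints true
puts felix.respond_to?(:color=)    # prints true
```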

 

What about attributes that aren’t relevant to every animal? What if you wanted to store the length of snakes, but didn’t want to store the length of dogs or cats? Luckily, inheritance gives you lots of benefits with no downside. You can still add class-specific code wherever you want. Reenter the Snake class like so:

class Snake < Pet

attr_accessor :length

end

 

The Snake class now has a length attribute. However, this is added to the attributes Snake has inherited from Pet, so Snake has name, age, gender, color, and length attributes, whereas Cat and Dog only have the first four attributes. You can test this like so (some output lines have been removed for clarity):

irb(main):001:0> snake = Snake.new
irb(main):002:0> snake.name = "Sammy"
irb(main):003:0> snake.length = 500
irb(main):004:0> lassie = Dog.new
irb(main):005:0> lassie.name = "Lassie"
irb(main):006:0> lassie.age = 20
irb(main):007:0> lassie.length = 10
NoMethodError: undefined method 'length=' for #<Dog:0x32fddc @age=20, @name="Lassie">

Here you created a dog and a snake. You gave the snake a length of 500, before trying to give the dog a length of 10 (the units aren’t important). Trying to give the dog a length results in an error of undefined method 'length=', because you only gave the Snake class the length attribute.
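If you’d rather check whether an attribute exists before using it, Ruby’s standard respond_to? method can tell you without raising an error. This is a minimal sketch that repeats the class definitions so it runs on its own.

```ruby
class Pet
  attr_accessor :name, :age, :gender, :color
end

class Dog < Pet
end

class Snake < Pet
  attr_accessor :length
end

snake = Snake.new
lassie = Dog.new

puts snake.respond_to?(:length=)    # prints true
puts lassie.respond_to?(:length=)   # prints false
puts lassie.respond_to?(:name=)     # prints true
```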

 

Try playing with the other attributes and creating other pets. Try using attributes that don’t exist and see what the error messages are.

Note You might be wondering why we’re using such artificial examples as cats, dogs, and snakes here. They have been chosen because they provide a model of how classes work that is simple to understand and easy to visualize.

 

In your eventual apps, you’ll work with things like different types of users, events, products, photos, and so forth, and they will work in a somewhat similar way. Feel free to create your own classes using concepts relevant to your planned programs and follow along using those instead, substituting the names of the classes where appropriate.

 

Controlling Your Pets

So far, you’ve been creating classes and objects with various changeable attributes. Attributes are data related to individual objects. A snake can have a length, a dog can have a name, and a cat can be of a certain color. What about the instructions I spoke of earlier? How do you give your objects instructions to perform? You define methods for each class.

 

Methods are important in Ruby. They enable you to tell objects to perform actions. For example, you might want to add a bark method to your Dog class, which, if called on a Dog object, prints Woof! to the screen. You could write it like so:

class Dog < Pet
  def bark
    puts "Woof!"
  end
end

After entering this code, any dogs you create can now bark. Let’s try it out:

irb(main):0> a_dog = Dog.new
irb(main):0> a_dog.bark
Woof!
Eureka! You’ll notice that the way you make the dog bark is simply by referring to the dog (a_dog, in this case) and including a period (.) followed by the bark method’s name, whereby your dog “barks.” Let’s dissect exactly what happened.

 

First, you added a bark method to your Dog class. The way you did this was by defining the method. To define a method, you use the word def followed by the name of the method you wish to define. This is what the def bark line means. It means, “I’m defining the bark method within this class until I say end.”

 

The following line then simply puts the word “Woof!” on the screen, and the last line of the method ends the definition of that method. The last end ends the class definition (this is why indentation is useful, so you can see which end lines up with which definition). The Dog class then contains a new method called bark, as you used earlier.

 

Think about how you would create methods for the other Pet classes or for the Pet class itself. Are there any methods that are generic to all pets? If so, they’d go in the Pet class. Are there methods specific to cats? They’d go in the Cat class.
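One possible answer, sketched under my own assumptions: the introduce and meow methods and the pet name Tiddles are hypothetical, made up purely for illustration. A method that makes sense for every pet goes in Pet, and the animal-specific ones go in the subclasses.

```ruby
class Pet
  attr_accessor :name

  # Generic to all pets, so it lives in Pet (hypothetical method).
  def introduce
    puts "This pet is called #{name}"
  end
end

class Cat < Pet
  # Specific to cats (hypothetical method).
  def meow
    puts "Meow!"
  end
end

class Dog < Pet
  def bark
    puts "Woof!"
  end
end

tiddles = Cat.new
tiddles.name = "Tiddles"
tiddles.introduce   # prints This pet is called Tiddles
tiddles.meow        # prints Meow!

lassie = Dog.new
lassie.name = "Lassie"
lassie.introduce    # prints This pet is called Lassie
lassie.bark         # prints Woof!
```

Cats can’t bark and dogs can’t meow here, but both can introduce themselves because that method is inherited from Pet.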

 

Everything Is an Object


In this blog, we’ve looked at how Ruby can understand concepts in the form of classes and objects. We created virtual cats and dogs, gave them names, and triggered their methods (the bark method, for example). These basic concepts form the core of object-oriented programming, and you’ll use them constantly throughout this blog.

 

Dogs and cats are merely an example of the flexibility object orientation offers, but the concepts we’ve used so far could apply to most concepts, whether we’re giving a “ticket” a command to change its price or a “user” a command to change his or her password. Begin to think of the programs you want to develop in terms of their general concepts and how you can turn them into classes you can manipulate with Ruby.

 

Among even object-oriented programming languages, Ruby is reasonably unique in that almost everything in the language is an object, even the concepts relating to the language itself. Consider the following line of code:

puts 1 + 10

 

If you typed this into irb and pressed Enter, you’d see the number 11 in response. You’ve asked Ruby to print the result of 1 + 10 to the screen. It seems simple enough, but believe it or not, this simple line uses two objects. 1 is an object, as is 10. They’re objects of class Fixnum (from Ruby 2.4 onward, Fixnum has been folded into the Integer class, so newer versions report Integer instead), and this built-in class has methods already defined to perform operations on numbers, such as addition and subtraction.

 

We’ve considered how concepts can be related to different classes. Our pets make a good example. However, even defining the concepts that programmers use to write computer programs as classes and objects makes sense. When you write a simple sum such as 2 + 2, you expect the computer to add two numbers together to make 4.

 

In its object-oriented way, Ruby considers the two numbers (2 and 2) to be number objects. 2 + 2 is then merely shorthand for asking the first number object to add the second number object to itself. In fact, the + sign is actually an addition method! (It’s true, 2.+(2) will work just fine!)

 

You can prove that everything in Ruby is an object by asking the things of which class they’re a member. In the pet example earlier, you could have made a_dog tell you what class it’s a member of with the following code:

puts a_dog.class

Dog

class isn’t a method you created yourself, such as the bark method, but one that Ruby supplies by default to all objects. This means that you can ask any object which class it’s a member of by using its class method. So a_dog.class equals Dog.

 

What about if you ask a number what its class is? Try it out:

puts 2.class

Fixnum

 

The number 2 is an object of the Fixnum class (or of the Integer class on Ruby 2.4 and later, where you’ll see Integer printed instead). This means that all Ruby has to do is implement the logic and code for adding numbers together in that class, much like you created the bark method for your Dog class, and then Ruby will know how to add any two numbers together! Better than that, though, is that you can then add your own methods to the class and process numbers in any way you see fit.
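Here is a minimal sketch of both ideas. The double method is a hypothetical name of my own. I’ve added it to Integer rather than Fixnum because Integer exists on both older and newer Ruby versions (on older versions, Fixnum inherits from it).

```ruby
puts 2.+(2)             # prints 4; + is an ordinary method call

# Reopen the built-in Integer class to add a method of our own.
class Integer
  def double
    self * 2
  end
end

puts 21.double          # prints 42
puts 2.respond_to?(:+)  # prints true
```

Adding methods to built-in classes is powerful, but it affects every number in your program, so it’s best used sparingly.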

 

Kernel Methods

 


 

Kernel is a special module whose methods are made available in every class and scope throughout Ruby. You’ve used a key method provided by Kernel already.

 

Consider the puts method. You’ve been using the puts method to print data to the screen, like so:

puts "Hello, world!"

 

However, unlike the methods on your own classes, puts isn’t prefixed by the name of a class or object on which to complete the method. It would seem logical that the full command should be something like Screen.puts or Display.puts, as puts places text on the screen. However, in reality, puts is a method made available from the Kernel module—a special type of class packed full of standard, commonly used methods, making your code easier to read and write.

 

Note The Kernel module in Ruby has no relationship to kernels in operating systems or the Linux kernel. As with a kernel and its operating system, the Kernel module is part of Ruby’s “core,” but there is no connection beyond that. The word “kernel” is used merely in a traditional sense.

 

When you type puts "Hello, world!", Ruby can tell that there’s no class or object involved, so it looks through its default, predefined classes and modules for a method called puts, finds it in the Kernel module, and does its thing. When you see lines of code where there’s no obvious class or object involved, take time to consider where the method call is going.

To guarantee that you’re using the Kernel puts method, you can refer to it explicitly, although this is rarely done with puts:

Kernel.puts "Hello, world!"
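If you’re curious, you can ask Ruby where puts comes from. This sketch uses only standard Ruby methods (class, include?, method, and owner) and is meant as a quick experiment rather than something you’d normally write.

```ruby
puts "Hello, world!"          # the usual form
Kernel.puts "Hello, world!"   # the explicit form, with the same result

puts Kernel.class             # prints Module
puts Object.include?(Kernel)  # prints true
puts method(:puts).owner      # prints Kernel
```

Because Object includes the Kernel module, and almost every class inherits from Object, puts is available nearly everywhere.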

 

Passing Data to Methods


Asking a dog to bark or asking an object its class is simple with Ruby. You simply refer to a class or object and follow it with a period (.) and the name of the method, such as a_dog.bark, 2.class, or Dog.new. However, there are situations where you don’t want to issue a simple command, but you want to associate some data with it, too.

 

Let’s create a very simple class that represents a dog:

class Dog
  def bark
    puts "Woof!"
  end
end

Now we can simply make a dog bark by calling the relevant method:

my_dog = Dog.new
my_dog.bark
Woof!

 

That’s simple, but what about if we have an action where some user input would be useful? We can write methods to accept data when they are called. For example:

class Dog
  def bark(i)
    i.times do
      puts "Woof!"
    end
  end
end

This time we can make the dog bark a certain number of times by passing a value to the bark method:

my_dog = Dog.new
my_dog.bark(3)
Woof!
Woof!
Woof!

When we specify the argument of 3 in my_dog.bark(3), it is passed to the bark method and is placed into the defined parameter i. We can then use i as a source value for running the puts command three times (or, more accurately, i times) using a times block.

 

There are a couple of other things to be aware of at this early stage. First, you can specify many different parameters that can be accepted by a method. For example:

class Dog
  def say(a, b, c)
    puts a
    puts b
    puts c
  end
end

Now we can pass three arguments:

my_dog = Dog.new
my_dog.say("Dogs", "can't", "talk!")
Dogs
can't
talk!

 

You should also be aware that parentheses around the arguments on the end of the method call are optional when there’s only a single argument and the method call is not joined to any others. For example, you’ve previously seen code like this:

puts "Hello"

But you could just as easily write:

puts("Hello")

 

You will continue to see many examples of calling methods and passing arguments to them throughout this blog. Keep your eyes peeled for the various ways this occurs, with and without arguments, and with and without parentheses.
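To make the parentheses point concrete, here is a small sketch. I’ve changed bark so that it returns the text instead of printing it (an assumption made only for this example), which makes the two calling styles easy to compare.

```ruby
class Dog
  # Returns the barks as a string rather than printing them.
  def bark(i)
    "Woof! " * i
  end
end

my_dog = Dog.new
with_parens    = my_dog.bark(2)
without_parens = my_dog.bark 2

puts with_parens == without_parens   # prints true

# When a call is chained to another, parentheses keep the argument
# attached to the right method.
puts my_dog.bark(2).length           # prints 12
```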

 

Using the Methods of the String Class

You’ve played with dogs and numbers, but lines of text (strings) can be interesting to play with, too:

puts "This is a test".length

14

 

You’ve asked the string "This is a test", which is an object of the String class (confirm this with "This is a test".class), to print its length onto the screen using the length method. The length method is available on all strings, so you can replace "This is a test" with any text you want and you’ll get a valid answer.

Asking a string for its length isn’t the only thing you can do. Consider this:

puts "This is a test".upcase

 

THIS IS A TEST

The String class has many methods, which I’ll cover in the next blog, but experiment with some of the following: capitalize, downcase, chop, next, reverse, sum, and swapcase. The following table demonstrates some of the methods available to strings.

 

The Results of Using Different Methods on the String “Test” or “test”

Expression                      Output
"Test" + "Test"                 TestTest
"test".capitalize               Test
"Test".downcase                 test
"Test".chop                     Tes
"Test".next                     Tesu
"Test".reverse                  tseT
"Test".sum                      416
"Test".swapcase                 tEST
"Test".upcase                   TEST
"Test".upcase.reverse           TSET
"Test".upcase.reverse.next      TSEU

Some of the examples in the table are obvious, such as changing the case of the text or reversing it, but the last two examples are of particular interest. Rather than processing one method against the text, you process two or three in succession.

 

The reason you can do this is that each of these methods returns a new String object containing the adjusted text (the original string is left unchanged), so you have a fresh String object upon which to process another method. "Test".upcase results in the string TEST being returned, upon which the reverse method is called, resulting in TSET, upon which the next method is called, which “increments” the last character, resulting in TSEU.
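Breaking the chain into separate steps shows what each method hands on to the next. The variable names here are my own, chosen for illustration.

```ruby
original = "Test"

step1 = original.upcase   # "TEST"
step2 = step1.reverse     # "TSET"
step3 = step2.next        # "TSEU"

puts step3                                  # prints TSEU
puts original.upcase.reverse.next == step3  # prints true
puts original                               # prints Test (unchanged)
puts step1.equal?(original)                 # prints false (a different object)
```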

 

Using Ruby in a Non–Object-Oriented Style


So far in this blog, we’ve looked at several reasonably complex concepts. With some programming languages, object orientation is almost an afterthought, and beginners’ blogs for these languages don’t cover object orientation until readers understand the basics of the language (particularly with Perl and PHP, other popular web development languages).

 

However, this doesn’t work for Ruby because Ruby is a pure object-oriented language, and you can gain significant advantages over users of other languages by understanding these concepts right away.

 

Ruby has its roots in other languages, though. Ruby has been heavily influenced by languages such as Perl and C, both usually considered procedural non–object-oriented languages (although Perl has some object-oriented features).

 

As such, even though almost everything in Ruby is an object, you can use Ruby in a similar way as a non–object-oriented language if you like, even if it’s less than ideal. Essentially you’d be “ignoring” Ruby’s object-oriented features, even though they’d still be in operation under the hood.

 

A common demonstration program for a language such as Perl or C involves creating a subroutine (essentially a sort of method that has no associated object or class) and calling it, much like you called the bark method on your Dog objects. Here’s a similar program, written in Ruby:

def dog_barking
  puts "Woof!"
end

dog_barking

 

This looks a lot different from your previous experiments. Rather than appearing to define a method within a class, it looks as if you’re defining it on its own, totally independently. The method is a general one and doesn’t appear to be tied to any particular class or object.

 

In a language such as Perl or C, this method would be called a procedure, function, or subfunction, as method is a word generally used to refer to an action that can take place on an object. In Ruby, this method is still being defined on a class (the Object class), but we can ignore that within this context.

 

After the method is defined—it’s still called a method, even though other languages would consider it to be a subroutine or function—it becomes available to use immediately without using a class or object name, like how puts is available without referring directly to the Kernel module.

You call the method simply by using its name on its own, as on the last line of the preceding example. Typing the preceding code into irb results in the dog_barking method being called, giving the following result:

Woof!

 

In Ruby, almost everything’s an object, and that includes the magical space where classless methods end up! Understanding exactly where isn’t important at this stage, but it’s always useful to bear Ruby’s object-oriented ways in mind even when you’re trying not to use object-oriented techniques!

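Note If you want to experiment in irb, you’ll find dog_barking at Object.dog_barking. In a source file run with the ruby command, methods defined this way are private, so you’d call Object.send(:dog_barking) instead.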
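As a rough demonstration that the method really does end up on Object, this sketch calls it through two unrelated objects. It uses send, a standard Ruby method that calls a method by name, because top-level methods are private when the code is run from a file.

```ruby
def dog_barking
  puts "Woof!"
end

dog_barking   # call it with no class or object name, as before

# The method was defined on Object, so every object can reach it.
"any string".send(:dog_barking)   # prints Woof!
42.send(:dog_barking)             # prints Woof!
```

You’d never write real code this way, but it shows that the “classless” method isn’t classless at all.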

 

Developing Your First Ruby Application


Once we’ve developed and tested the basic application, we’ll look at different ways to extend it to become more useful. On our way, we’ll cover some new facets of development that haven’t been mentioned so far.

First, let’s look at the basics of source code organization before moving on to actual programming.

 

Working with Source Code Files

So far in this blog, we’ve focused on using the irb immediate Ruby prompt to learn about the language. However, for developing anything you wish to reuse over and over, it’s essential to store the source code in a file (or often multiple files) that can be kept on your hard drive, sent over the Internet, and so forth.

 

The mechanism by which you create and manipulate source code files on your system varies by operating system and personal preference. On Windows, you might be familiar with the included Notepad software for creating and editing text files. At a Linux prompt, you might be using vi, Emacs, pico, or nano.

 

Mac users have TextEdit or Xcode at their disposal. Whatever you use, you need to be able to create new files and save them as plain text so that Ruby can use them properly. In the next few sections, you’re going to look at some available tools that tie in well with Ruby development.

 

Creating a Test File


The first step to developing a Ruby application is to get familiar with your text editor. If you’re already familiar with text editors and how they relate to writing and saving source code, skip down to the section titled “A Simple Source Code File.”

 

Visual Studio Code

In 2015, Microsoft released a free, cross-platform code editor called Visual Studio Code (not to be confused with their professional Visual Studio suite). You can download Visual Studio Code for Windows, Mac OS X, and Linux from its website, and quickly install and use it as an editor for your future Ruby code.

 

After installing and running Visual Studio Code, you can simply type or paste Ruby code, and use the File ➤ Save menu option to save your text to a location on your drive. It would probably be good to create a folder called ruby within your home or user folder and save your initial Ruby source code there (using a filename such as myapp.rb), as this is what the instructions assume in the next section.

 

If you would prefer a full IDE (integrated development environment) experience that goes beyond what even Visual Studio Code offers, you could use RubyMine by JetBrains, although it is a commercial product. You can find it at https://www.jetbrains.com/ruby/.

 

Alternatives for Linux

Visual Studio Code is available for Linux, but desktop Linux distributions typically already come with at least one text editor, which you may prefer to use. If you’re working entirely from the shell or terminal, you might be familiar with vi, Emacs, pico, or nano, and all of these are suitable for editing Ruby source code. Some editors (such as vi and Emacs) have extensions available that are specifically designed to make working with Ruby easier.

 

At this stage, it would be a good idea to create a folder in your home directory called “ruby”, or something similar, so that you can save your Ruby code there and have it in an easily remembered place.

 

A Simple Source Code File


Once you’ve got an environment where you can edit and save text files, enter the following code:

x = 2

print "This program is running okay if 2 + 2 = #{x + x}"
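The #{x + x} part of that string is worth a closer look. Here is a minimal sketch of what it does; the message and by_hand variables are my own additions for illustration.

```ruby
x = 2

# #{...} evaluates the Ruby expression inside it and inserts the result.
message = "This program is running okay if 2 + 2 = #{x + x}"
puts message   # prints This program is running okay if 2 + 2 = 4

# The same string built by hand, for comparison.
by_hand = "This program is running okay if 2 + 2 = " + (x + x).to_s
puts message == by_hand   # prints true
```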

 

Save the code with a filename of example1.rb in a folder or directory of your choice. It’s advisable that you create a folder called ruby located somewhere that’s easy to find. On Windows this might be directly off of your C drive, and on OS X or Linux this could be a folder located in your home directory.

Now you’re ready to run the code.

 

Running Your Source Code

Once you’ve created the basic Ruby source code file, example1.rb, you need to get Ruby to execute it.

As always, the process by which to do this varies by operating system. Read whichever of the following sections matches your operating system. If your operating system isn’t listed, the OS X and Linux instructions are most likely to match those for your platform.

 

Whenever this blog asks you to “run” your program, this is what you’ll be doing each time.

Use your judgment to jump between these two methods of development. irb is extremely useful for testing small concepts and short blocks of code without the overhead of jumping back and forth between a text editor and the Ruby interpreter.

 

Windows


Running Ruby from the command line provides the most flexibility and the most predictable behavior. To do this, load the command prompt using the item in the Start menu within the Ruby menu. This will ensure that the ruby command will work directly from the prompt. Once the command prompt is loaded, you’ll need to navigate to the folder containing example1.rb using the cd command, and then type ruby example1.rb.

 

Mac OS X / macOS

The simplest method to run Ruby applications on OS X is from Terminal, much in the same way as irb is run. If you followed the preceding instructions, continue like so:

 

  • Launch Terminal (found in Applications/Utilities, or use Spotlight to launch it).
  • Use cd to navigate to the folder where you placed example1.rb, like so: cd ~/ruby. This tells Terminal to take you to the ruby folder located in your home user folder.
  • Type ruby example1.rb and press Enter to execute the example1.rb Ruby script.

If you get an error such as ruby: No such file or directory -- example1.rb (LoadError), you aren’t in the same folder as the example1.rb source file, and you need to establish where you have saved it.
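If you’re not sure which folder you’re in, you can ask Ruby itself. This is only a troubleshooting sketch to type into irb: Dir.pwd and File.exist? are standard Ruby methods, and example1.rb is the filename used above.

```ruby
puts Dir.pwd                      # the folder you are currently in
puts File.exist?("example1.rb")   # true only if example1.rb is in that folder
```

If the second line prints false, use cd to move to the folder where you saved the file and try again.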

 

If you get a satisfactory response from example1.rb, you’re ready to move on to the “Our Application: A Text Analyzer” section.

Alternatively, if you’re using Visual Studio Code or Sublime Text, there are other ways you can run your code directly from the editor. However, it may not always be an option, so it’s essential to at least be familiar with how to run Ruby scripts from the terminal, too.

 

Our Application: A Text Analyzer

The application you’re going to develop in this blog will be a text analyzer. Your Ruby code will read in text supplied in a separate file, analyze it for various patterns and statistics, and print out the results for the user.

 

It’s not a 3D graphical adventure or a fancy web site, but text processing programs are the bread and butter of systems administration and most application development. They can be vital for parsing log files and user-submitted text on web sites, and manipulating other textual data. Ruby is well suited for text and document analysis with its regular expression features, along with the ease of use of scan and split, and you’ll be using these heavily in your application.

 

Required Basic Features


Your text analyzer will provide the following basic statistics:

  • Character count
  • Character count (excluding spaces)
  • Line count
  • Word count
  • Sentence count
  • Paragraph count
  • Average number of words per sentence
  • Average number of sentences per paragraph

In the last two cases, the statistics are easily calculated from the others. That is, once you have the total number of words and the total number of sentences, it becomes a matter of a simple division to work out the average number of words per sentence.
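As a quick sketch of that division, with made-up totals (the three counts below are hypothetical numbers, not results from any real text):

```ruby
word_count = 120        # hypothetical totals for illustration
sentence_count = 8
paragraph_count = 3

# to_f avoids integer division, which would drop the fractional part.
words_per_sentence = word_count.to_f / sentence_count
sentences_per_paragraph = sentence_count.to_f / paragraph_count

puts words_per_sentence                 # prints 15.0
puts sentences_per_paragraph.round(2)   # prints 2.67
```

Without to_f, Ruby would divide the two whole numbers and give a whole number back, so 8 / 3 would come out as 2.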

 

Building the Basic Application

When starting to develop a new program, it’s useful to think of the key steps involved. In the past it was common to draw flow charts to show how the operation of a computer program would flow, but it’s easy to experiment, change things about, and remain agile with modern tools such as Ruby. Let’s outline the basic steps as follows:

 

  • Load a file containing the text or document you want to analyze.
  • As you load the file line by line, keep a count of how many lines there were (one of your statistics taken care of).
  • Put the text into a string and measure its length to get your character count.
  • Temporarily remove all whitespace and measure the length of the resulting string to get the character count excluding spaces.
  • Split on whitespace to find out how many words there are.
  • Split on full stops to find out how many sentences there are.
  • Split on double newlines to find out how many paragraphs there are.
  • Perform calculations to work out the averages.

Create a new, blank Ruby source file and save it as analyzer.rb in your Ruby folder. As you work through the next few sections, you’ll be able to fill it out.
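Before building the real thing, it can help to see the steps outlined above applied to a tiny piece of text. This is a rough sketch under my own assumptions, not the finished analyzer: the sample text is invented, it lives in a string rather than a file, and the variable names are placeholders.

```ruby
# Hypothetical sample text standing in for a loaded file.
text = "The cat sat. The dog barked.\n\nIt was loud. Very loud."

line_count = text.lines.count
character_count = text.length
character_count_nospaces = text.gsub(/\s+/, "").length
word_count = text.split.length                # split on whitespace
sentence_count = text.split(".").length       # split on full stops
paragraph_count = text.split(/\n\n/).length   # split on double newlines

puts "#{line_count} lines"
puts "#{character_count} characters"
puts "#{character_count_nospaces} characters (excluding spaces)"
puts "#{word_count} words"            # prints 11 words
puts "#{sentence_count} sentences"    # prints 4 sentences
puts "#{paragraph_count} paragraphs"  # prints 2 paragraphs
puts "#{word_count.to_f / sentence_count} words per sentence (average)"
puts "#{sentence_count.to_f / paragraph_count} sentences per paragraph (average)"
```

Splitting on full stops is a simplification; sentences ending in question marks or exclamation marks would be missed.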

 

Obtaining Some Dummy Text

Before you start to code, the first step is to get some test data that your analyzer can process. The first chapter of Oliver Twist is an ideal piece of text to use, as it’s copyright free and easy to obtain. It’s also of a reasonable length. You can find the text at http://www.rubyinside.com/blog/oliver.txt and copy it into a local text file. Save the file in the same folder where you saved analyzer.rb, and call it text.txt. Your application will read from text.txt by default.

 

Loading Text Files and Counting Lines

Now it’s time to get coding! The first step is to load the file. Ruby provides a comprehensive set of file manipulation methods via the File class. Whereas other languages can make you jump through hoops to work with files, Ruby keeps the interface simple. Here’s some code that opens up your text.txt file:

 

File.open("text.txt").each { |line| puts line }

Type this into analyzer.rb and run the code. If text.txt is in the current directory, the result is that you’ll see the entire text file flying up the screen. You’re asking the File class to open up text.txt, and then, much like with an array, you can call the each method on the file directly, resulting in each line being passed to the inner code block one by one, where puts sends the line as output to the screen.

 

Edit the code to look like this instead:

line_count = 0

File.open("text.txt").each { |line| line_count += 1 }
puts line_count

 

You initialize line_count to store the line count, and then open the file and iterate over each line while incrementing line_count by 1 each time. When you’re done, you print the total to the screen. You have your first statistic!

 

You’ve counted the lines, but still don’t have access to the contents of the file to count the words, paragraphs, sentences, and so forth. This is easy to fix. Let’s change the code a little and add a variable, text, to collect the lines together as one as we go:

text = ''
line_count = 0
File.open("text.txt").each do |line|
  line_count += 1
  text << line
end
puts "#{line_count} lines"

 

Compared to your previous attempt, this code introduces the text variable and adds each line onto the end of it in turn. When the iteration over the file has finished—that is, when you run out of lines—text contains the entire file in a single string ready for you to use.

 

That’s a simple-looking way to get the file into a single string and count the lines, but File also has other methods that can be used to read files more quickly. For example, you can rewrite the preceding code like this:

lines = File.readlines("text.txt")
line_count = lines.size
text = lines.join
puts "#{line_count} lines"

Much simpler! File implements a readlines method that reads an entire file into an array, line by line. You can use this both to count the lines and join them all into a single string.
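As a hedged aside, File offers other ways to reach the same result. The sketch below assumes a small sample file written to a temporary location so that it runs on its own; the sample text and the variable names are illustrative and not part of analyzer.rb. It reads the whole file into one string with File.read and, separately, counts lines with File.foreach, which closes the file for you when it finishes.

```ruby
require "tempfile"

# Hypothetical sample data, written to a temporary file so the sketch is self-contained.
sample = Tempfile.new("sample")
sample.write("first line\nsecond line\nthird line\n")
sample.close

# File.read returns the entire file as a single string.
text = File.read(sample.path)
line_count = text.lines.size

# File.foreach yields one line at a time and closes the file afterward.
foreach_count = 0
File.foreach(sample.path) { |line| foreach_count += 1 }

puts "#{line_count} lines"
puts "#{text.length} characters"

# Remove the temporary file now that you are done with it.
sample.unlink
```

Which approach you pick is mostly a matter of taste for a file this small; readlines is convenient here because you need both the line array and the joined text.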

 

Counting Characters

The second easiest statistic to work out is the number of characters in the file. As you’ve collected the entire file into the text variable, and text is a string, you can use the length method that all strings supply to get the exact size of the file, and therefore the number of characters.

 

To the end of the previous code in analyzer.rb, add the following:

total_characters = text.length
puts "#{total_characters} characters"

If you ran analyzer.rb now with the Oliver Twist text, you’d get output like this:

127 lines
6376 characters

 

The second statistic you wanted to get relating to characters was a character total excluding whitespace. Strings have a gsub method that performs a global substitution (like a search and replace) upon the string. For example:

"this is a test".gsub(/t/, 'X')

Xhis is a XesX

You can use gsub to eradicate the spaces from your text string in the same way, and then use the length method to get the length of the newly “de-spacified” text. Add the following code to analyzer.rb:

total_characters_nospaces = text.gsub(/\s+/, '').length
puts "#{total_characters_nospaces} characters excluding spaces"

If you run analyzer.rb in its current state against the Oliver Twist text, the results should be similar to the following:

127 lines
6376 characters
5140 characters excluding spaces
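It is worth noticing that \s matches tabs and newlines as well as spaces, so this statistic really excludes all whitespace. Here is a small sketch to illustrate the difference, using a made-up sample string (the variable names are illustrative only):

```ruby
# A made-up sample containing spaces, a tab, and a newline.
sample = "one two\tthree\nfour  five"

all_characters = sample.length                  # every character, whitespace included
no_whitespace  = sample.gsub(/\s+/, '').length  # spaces, tabs, and newlines removed
no_spaces_only = sample.delete(' ').length      # only literal spaces removed

puts "#{all_characters} characters"
puts "#{no_whitespace} characters excluding whitespace"
puts "#{no_spaces_only} characters excluding only spaces"
```

If you only wanted to ignore literal spaces, something like delete(' ') would be one option, but for a text analyzer ignoring all whitespace is probably what you want.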

 

Counting Words

A common feature offered by word processing software is a “word counter.” All it does is count the number of complete words in your document or a selected area of text. This information is useful to work out how many pages the document will take up when printed. Many assignments also have requirements for a certain number of words, so knowing the number of words in a piece of text is certainly useful.

 

You can approach this feature in a couple of ways:

  • Count the number of groups of contiguous letters using scan to create an array of those groups and then use the length of the array.
  • Split the text on whitespace and count the resulting fragments using split and size.

Let’s look at each method in turn to see what’s best. For example:

puts "this is a test".scan(/\w/).join

thisisatest

 

In this example, scan looked through the string for anything matching \w, a special term representing all alphanumeric characters (and underscores), and placed them into an array that you’ve joined together into a string and printed to the screen.

You can do the same with groups of alphanumeric characters. To match multiple characters with a regular expression, you can follow the character with +. So let’s try again:

puts "this is a test".scan(/\w+/).join('-')

 

this-is-a-test

This time, scan has looked for all groups of alphanumeric characters and placed them into the array that you’ve then joined together into a string using - as the separation character.

 

To get the number of words in the string, you can use the length or size array methods to count the number of elements rather than join them together:

puts "this is a test".scan(/\w+/).length

4

Excellent! So what about the split approach?

The split approach demonstrates a core tenet of Ruby (as well as some other languages, particularly Perl): there’s always more than one way to do it! Analyzing different methods to solve the same problem is a crucial part of becoming a good programmer, as different methods can vary in their efficacy.

 

Let’s split the string by spaces and get the length of the resulting array, like so:

puts "this is a test".split.length

4

As it happens, by default, split will split by whitespace (single or multiple characters of spaces, tabs, newlines, and so on), and that makes this code shorter and easier to read than the scan alternative.

 

So what’s the difference between these two methods? Simply, one is looking for words and returning them to you for you to count, and the other is splitting the string by that which separates words—whitespace—and telling you how many parts the string was broken into. Interestingly, these two approaches can yield different results:

text = "First-class decisions require clear-headed thinking."
puts "Scan method: #{text.scan(/\w+/).length}"
puts "Split method: #{text.split.length}"

Scan method: 7
Split method: 5

 

Interesting! The scan method is looking through for all blocks of alphanumeric characters, and, sure enough, there are seven in the sentence. However, if you split by spaces, there are five words. The reason is the hyphenated words. Hyphens aren’t “alphanumeric,” so scan is seeing “first” and “class” as separate words.
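If you preferred the scan approach but wanted hyphenated words and contractions counted as single words, one possible tweak is to widen the pattern. The following is only a sketch: the character class [\w'-] is an assumption about what should count as part of a word, not a rule the analyzer relies on.

```ruby
text = "First-class decisions require clear-headed thinking."

# \w+ breaks words apart at hyphens and apostrophes.
basic_words = text.scan(/\w+/)

# Allowing ' and - inside a word keeps "First-class" and "don't" whole.
joined_words = text.scan(/[\w'-]+/)

puts "Basic pattern: #{basic_words.length}"
puts "Wider pattern: #{joined_words.length}"
puts "Some people don't like it".scan(/[\w'-]+/).length
```

The trade-off is that a dash standing on its own between two spaces would now count as a word, so neither pattern is perfect.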

 

Returning to analyzer.rb, let’s apply what we’ve learned here. Add the following:

word_count = text.split.length
puts "#{word_count} words"

Running the complete analyzer.rb gets these results:

127 lines
6376 characters
5140 characters excluding spaces
1111 words

 

Counting Sentences and Paragraphs


Once you understand the logic of counting words, counting the sentences and paragraphs becomes easy. Rather than splitting on whitespace, sentences and paragraphs have different splitting criteria.

 

Sentences end with full stops, question marks, and exclamation marks. They can also be separated with dashes and other punctuation, but we won’t worry about these rare cases here. The split is simple. Instead of asking Ruby to split the text on one type of character, you simply ask it to split on any of three types of characters, like so:

sentence_count = text.split(/\.|\?|!/).length

 

The regular expression looks odd here, but the full stop, question mark, and exclamation mark are clearly visible. Let’s look at the regular expression directly:

/\.|\?|!/

 

The forward slashes at the start and the end are the usual delimiters for a regular expression, so those can be ignored. The first section is \., and this represents a full stop. The reason why you can’t just use . without the backslash in front is because . represents “any character” in a regular expression, so it needs to be escaped with the backslash to identify itself as a literal full stop.

 

This also explains why the question mark is escaped with a backslash, as a question mark in a regular expression usually means “zero or one instances of the previous character”. The ! is not escaped, as it has no other meaning in terms of regular expressions.

 

The pipes (| characters) separate the three main characters, which means they’re treated separately so that split can match one or another of them. This is what allows the split to split on periods, question marks, and exclamation marks all at the same time. You can test it like so:

puts "Test code! It works. Does it? Yes.".split(/\.|\?|!/).length

4
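One quirk to be aware of: split only discards empty strings at the end of its result, so runs of punctuation such as an ellipsis can inflate the count. The sketch below uses a made-up sample and shows one possible adjustment, a character class followed by +, which treats a run of terminators as a single break. This is an optional refinement, not what analyzer.rb uses.

```ruby
sample = "Wait... what? It works!"

# Each full stop in the ellipsis is treated as its own separator.
naive = sample.split(/\.|\?|!/)

# [.?!]+ matches one or more terminators in a row as a single separator.
grouped = sample.split(/[.?!]+/)

puts "Naive count: #{naive.length}"
puts "Grouped count: #{grouped.length}"
```

Inside a character class the full stop and question mark do not need escaping, which also makes the pattern a little easier to read.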

Paragraphs can also be split apart with regular expressions. Whereas paragraphs in a printed blog, such as this one, tend not to have any spacing between them, paragraphs that are typed on a computer typically do, so you can split by a double newline (as represented by the special combination \n\n—simply, two newlines in succession) to get the paragraphs separated. For example:

text = %q{
This is a test of
paragraph one.

This is a test of
paragraph two.

This is a test of
paragraph three.
}
puts text.split(/\n\n/).length

3

Let’s add both these concepts to analyzer.rb:

paragraph_count = text.split(/\n\n/).length
puts "#{paragraph_count} paragraphs"
sentence_count = text.split(/\.|\?|!/).length
puts "#{sentence_count} sentences"
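Splitting on exactly two newlines assumes paragraphs are separated by a single blank line. If a file had several blank lines in a row, each extra pair of newlines would produce an empty "paragraph". The following sketch, using a made-up sample, shows one way you might loosen the pattern; the \n{2,} variant is a suggestion, not part of the analyzer as written.

```ruby
# A made-up sample with three blank lines after the first paragraph.
sample = "Paragraph one.\n\n\n\nParagraph two.\n\nParagraph three.\n"

# Exactly two newlines: the extra blank lines produce an empty element.
strict = sample.split(/\n\n/)

# Two or more newlines in a row count as one paragraph break.
lenient = sample.split(/\n{2,}/)

puts "Strict count: #{strict.length}"
puts "Lenient count: #{lenient.length}"
```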

 

Calculating Averages


The final statistics required for your basic application are the average number of words per sentence and the average number of sentences per paragraph. You already have the paragraph, sentence, and word counts available in the variables word_count, paragraph_count, and sentence_count, so only basic arithmetic is required, like so:

puts "#{sentence_count / paragraph_count} sentences per paragraph (average)"
puts "#{word_count / sentence_count} words per sentence (average)"

The calculations are so simple that they can be interpolated directly into the output commands rather than pre-calculated. When run now, we’d see this:

127 lines
6376 characters
5140 characters excluding spaces
1111 words
paragraphs
sentences
2 sentences per paragraph (average)
words per sentence (average)
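Because both counts are integers, Ruby performs integer division here and discards any remainder, which is why the averages appear as whole numbers. If you wanted a more precise figure, one option is to convert one operand to a float first. The counts in this sketch are hypothetical values chosen only to show the arithmetic:

```ruby
# Hypothetical counts, not taken from the Oliver Twist text.
word_count = 100
sentence_count = 8

# Integer divided by integer drops the fractional part.
integer_average = word_count / sentence_count

# Converting one side to a float keeps the fraction; round(1) keeps one decimal place.
float_average = (word_count.to_f / sentence_count).round(1)

puts "#{integer_average} words per sentence (integer division)"
puts "#{float_average} words per sentence (float division)"
```

Note that dividing an integer by zero raises a ZeroDivisionError, so an empty file would need special handling in either version.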

 

The Source Code So Far

You’ve been updating the source code as you’ve gone along, and in each case you’ve put the logic next to the puts statement that shows the result to the user. However, for the final version of your basic application, it would be tidier to separate the logic from the presentation a little and put the calculations in a separate block of code before everything is printed to the screen.

 

There are no logic changes, but the finished source for analyzer.rb looks a little cleaner this way:

lines = File.readlines("text.txt")
line_count = lines.size
text = lines.join
word_count = text.split.length
character_count = text.length
character_count_nospaces = text.gsub(/\s+/, '').length
paragraph_count = text.split(/\n\n/).length
sentence_count = text.split(/\.|\?|!/).length
puts "#{line_count} lines"
puts "#{character_count} characters"
puts "#{character_count_nospaces} characters excluding spaces"
puts "#{word_count} words"
puts "#{paragraph_count} paragraphs"
puts "#{sentence_count} sentences"
puts "#{sentence_count / paragraph_count} sentences per paragraph (average)"
puts "#{word_count / sentence_count} words per sentence (average)"

When run, the result will be somewhat like the following:

127 lines
6376 characters
5140 characters excluding spaces
1111 words
paragraphs
sentences
2 sentences per paragraph (average)
words per sentence (average)
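If you wanted to take the separation of logic and presentation a step further, one possibility is to gather the calculations into a method that returns the statistics, leaving the printing to the caller. This is only a sketch: the method name text_statistics, the hash keys, and the sample string are all hypothetical, and the calculations are the same ones used above applied to a string rather than a file.

```ruby
# Returns the basic statistics for a string as a hash (hypothetical helper).
def text_statistics(text)
  {
    lines: text.lines.size,
    characters: text.length,
    characters_nospaces: text.gsub(/\s+/, '').length,
    words: text.split.length,
    paragraphs: text.split(/\n\n/).length,
    sentences: text.split(/\.|\?|!/).length
  }
end

# A made-up sample: two paragraphs, four sentences.
sample = "One two three. Four five!\n\nSix seven? Eight."
stats = text_statistics(sample)

# Presentation is now a separate step from calculation.
stats.each { |name, value| puts "#{value} #{name}" }
```

A side benefit of this shape is that the calculations can be tried out on short strings without needing a file on disk.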

If you’ve made it this far and everything’s making sense, congratulations are due. Let’s look at how to extend our application a little further with some more interesting statistics.

 

Adding Extra Features


 

Your analyzer has a few basic functions, but it’s not particularly interesting. Line, paragraph, and word counts are useful statistics, but with the power of Ruby, you can extract significantly more interesting data from the text. The only limit is your imagination, but in this section, you’ll look at a couple of other features you can implement, and how to do so.

 

Percentage of “Useful” Words

Most written material, including this very blog, contains a large number of words that, although providing context and structure, are not directly useful or interesting. In the last sentence, the words “that,” “and,” “are,” and “or” are not of particular interest, even if the sentence would make less sense to a human without them.

 

These words are typically called stop words, and are often ignored by computer systems whose job is to analyze and search through text, because they aren’t words most people are likely to be searching for (as opposed to nouns, for example). Google is a perfect example of this, as it doesn’t want to have to store information that takes up space and that’s generally irrelevant to searches.

 

It can be argued that more “interesting” text should have a lower percentage of stop words and a higher percentage of useful or interesting words. You can easily extend your application to work out the percentage of non–stop words in the supplied text.

The first step is to build up a list of stop words. There are hundreds of possible stop words, but you’ll start with just a handful. Let’s create an array to hold them:

stopwords = %w{the a by on for of are with just but and to the my I has some in}

 

This code results in an array of stop words being assigned to the stopwords variable. For demonstration purposes, let’s write a small, separate program to test the concept:

text = %q{Los Angeles has some of the nicest weather in the country.}
stopwords = %w{the a by on for of are with just but and to the my in I has some}
words = text.scan(/\w+/)
keywords = words.select { |word| !stopwords.include?(word) }
puts keywords.join(' ')

When you run this code, you get the following result:

Los Angeles nicest weather country

Cool, right? First you put some text into the program, then the list of stop words. Next you get all the words from text into an array called words. Then you get to the magic:

keywords = words.select { |word| !stopwords.include?(word) }

This line first takes your array of words, words, and calls the select method with a block of code to process for each word. The select method is available on all arrays and hashes, and it returns the elements of that array or hash for which the expression in the code block is true.

 

In this case, the code in the code block takes each word via the variable word, and asks the stopwords array whether it includes any elements equal to word. This is what stopwords.include?(word) does.

 

The exclamation mark (!) before the expression negates the expression (an exclamation mark negates any Ruby expression). The reason for this is you don’t want to select words that are in the stopwords array. You want to select words that aren’t. In closing, then, you select all elements of words that are not included in the stopwords array and assign them to keywords. Don’t read on until that makes sense, as this type of single-line construction is common in Ruby programming.
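One detail to keep in mind is that include? compares strings exactly, so a capitalized “The” at the start of a sentence is not treated as the stop word “the”. The sketch below shows one possible way around that by comparing lowercase versions. The sample sentence and the normalized_stopwords name are made up for illustration, and this refinement is not part of the program as written.

```ruby
stopwords = %w{the a by on for of are with just but and to the my I has some in}
text = %q{The nicest weather is in Los Angeles.}
words = text.scan(/\w+/)

# Exact comparison: "The" slips through because it is not equal to "the".
case_sensitive = words.select { |word| !stopwords.include?(word) }

# Compare lowercase forms on both sides so capitalization no longer matters.
normalized_stopwords = stopwords.map { |word| word.downcase }
case_insensitive = words.reject { |word| normalized_stopwords.include?(word.downcase) }

puts case_sensitive.join(' ')
puts case_insensitive.join(' ')
```

Here reject is simply the opposite of select: it keeps the elements for which the block is false, which saves you the exclamation mark.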

 

After that, working out the percentage of non–stop words to all words uses some basic arithmetic:

((keywords.length.to_f / words.length.to_f) * 100).to_i

The reason for the .to_f’s is so that the lengths are treated as floating-point numbers, and the percentage is worked out more accurately. Once you have scaled the result up to a real percentage (out of 100), you can convert it back to an integer.
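To see why the conversion matters, here is the same arithmetic worked through with the counts from the Los Angeles example (5 keywords out of 11 words). The variable names are illustrative. The sketch also shows that to_i truncates rather than rounds, which you may or may not want.

```ruby
keyword_total = 5   # non-stop words in the Los Angeles sentence
word_total = 11     # all words in the same sentence

# Integer division: 5 / 11 is 0, so the percentage would always be 0.
integer_ratio = keyword_total / word_total

# Float division keeps the fraction, which is then scaled and truncated.
percentage = ((keyword_total.to_f / word_total.to_f) * 100).to_i

# to_i drops the fraction, while round goes to the nearest integer.
truncated = ((2.to_f / 3) * 100).to_i
rounded = ((2.to_f / 3) * 100).round

puts "#{integer_ratio} (integer division)"
puts "#{percentage}% (float division, truncated)"
puts "#{truncated}% truncated versus #{rounded}% rounded"
```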

 

Here’s a look at how we can bring these concepts together with our other program fragments so far:

stopwords = %w{the a by on for of are with just but and to the my I has some in}
lines = File.readlines("text.txt")
line_count = lines.size
text = lines.join

# Count the words, characters, paragraphs and sentences
word_count = text.split.length
character_count = text.length
character_count_nospaces = text.gsub(/\s+/, '').length
paragraph_count = text.split(/\n\n/).length
sentence_count = text.split(/\.|\?|!/).length

# Make a list of words in the text that aren't stop words,
# count them, and work out the percentage of non-stop words
# against all words
all_words = text.scan(/\w+/)
good_words = all_words.reject { |word| stopwords.include?(word) }
good_percentage = ((good_words.length.to_f / all_words.length.to_f) * 100).to_i

# Give the analysis back to the user
puts "#{line_count} lines"
puts "#{character_count} characters"
puts "#{character_count_nospaces} characters (excluding spaces)"
puts "#{word_count} words"
puts "#{sentence_count} sentences"
puts "#{paragraph_count} paragraphs"
puts "#{sentence_count / paragraph_count} sentences per paragraph (average)"
puts "#{word_count / sentence_count} words per sentence (average)"
puts "#{good_percentage}% of words are non-fluff words"

With these results:

127 lines
6376 characters
5140 characters (excluding spaces)
1111 words
sentences
paragraphs
2 sentences per paragraph (average)
words per sentence (average)
76% of words are non-fluff words

 

Summarizing by Finding “Interesting” Sentences


Word processors such as Microsoft Word generally have summarization features that can take a long piece of text and seemingly pick out the best sentences to produce an “at-a-glance” summary. The mechanisms for producing summaries have become more complex over the years, but one of the simplest ways to develop a summarizer of your own is to scan for sentences with certain characteristics.

 

One technique is to look for sentences that are of about average length and that look like they contain nouns. Tiny sentences are unlikely to contain anything useful, and long sentences are likely to be simply too long for a summary. Finding nouns reliably would require systems that are far beyond the scope of this blog, so you could “cheat” by looking for words that indicate the presence of useful nouns in the same sentence, such as “is” and “are” (for example, “Noun is,” “Nouns are,” “There are x nouns”).

 

Let’s assume that you want to throw away two-thirds of the sentences (the shortest third and the longest third), leaving you with the middle third of the original sentences, which are ideally sized for your task. For ease of development, let’s create a new program from scratch and transfer your logic over to the main application later. Create a new program called summarize.rb and use this code:

text = %q{
Ruby is a great programming language. It is object oriented and has many groovy features. Some people don't like it, but that's not our problem! It's easy to learn. It's great. To learn more about Ruby, visit the official Ruby web site today.
}

sentences = text.gsub(/\s+/, ' ').strip.split(/\.|\?|!/)
sentences_sorted = sentences.sort_by { |sentence| sentence.length }
one_third = sentences_sorted.length / 3
ideal_sentences = sentences_sorted.slice(one_third, one_third + 1)
ideal_sentences = ideal_sentences.select { |sentence| sentence =~ /is|are/ }
puts ideal_sentences.join(". ")

 

And for good measure, run it to see what happens:

Ruby is a great programming language. It is object oriented and has many groovy features

Seems like a success! Let’s walk through the program. First you define the variable text to hold the long string of multiple sentences, much like in analyzer.rb.

 

Next you split text into an array of sentences like so:

sentences = text.gsub(/\s+/, ' ').strip.split(/\.|\?|!/)

This is slightly different from the method used in analyzer.rb. There is an extra gsub in the chain, as well as strip. The gsub gets rid of all large areas of whitespace and replaces them with a single space (\s+ meaning “one or more whitespace characters”). This is simply for cosmetic reasons. The strip removes all extra whitespace from the start and end of the string. The split is then the same as that used in the analyzer.

 

Next you sort the sentences by their lengths, as you want to ignore the shortest third and the longest third:

sentences_sorted = sentences.sort_by { |sentence| sentence.length }

 

Arrays and hashes have the sort_by method, which can rearrange them into almost any order you want. sort_by takes a code block as its argument, where the code block is an expression that defines what to sort by.

 

In this case, you’re sorting the sentences array. You pass each sentence in as the sentence variable, and choose to sort them by their length, using the length method on the sentence. After this line, sentences_sorted contains an array with the sentences in length order.
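To see sort_by in isolation, here is a tiny sketch using a made-up list of words with distinct lengths. The block returns the value to sort on, and negating that value is one simple way to reverse the order.

```ruby
fruits = %w{banana fig apple pear}

# Sort by the number returned from the block, shortest first.
by_length = fruits.sort_by { |fruit| fruit.length }

# Negating the length sorts longest first instead.
longest_first = fruits.sort_by { |fruit| -fruit.length }

puts by_length.join(', ')
puts longest_first.join(', ')
```

Note that sort_by returns a new array and leaves the original untouched, which is why the summarizer assigns the result to a new variable.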

 

Next you need to get the middle third of the length-sorted sentences in sentences_sorted, as these are the ones you’ve deemed to be probably the most interesting. To do this, you can divide the length of the array by 3 to get the number of elements in a third, and then grab that number of elements from one third into the array (note that you grab one extra element to compensate for rounding caused by integer division).

 

This is done like so:

one_third = sentences_sorted.length / 3

ideal_sentences = sentences_sorted.slice(one_third, one_third + 1)

 

The first line takes the length of the array and divides it by 3 to get the quantity that is equal to “a third of the array.” The second line uses the slice method to “cut out” a section of the array to assign to ideal_sentences. In this case, assume that the sentences_sorted is six elements long. 6 divided by 3 is 2, so a third of the array is 2 elements long.

 

The slice method then cuts from element 2 for 2 (plus 1) elements, so you effectively carve out elements 2, 3, and 4 (remember that array elements start counting from 0). This means you get the “inner third” of the ideal-length sentences you wanted.
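The six-element case described above can be checked directly. This sketch uses placeholder strings (s0 to s5) in place of real sentences, and adds a seven-element array to show that the same expression still picks the same three middle positions when the length does not divide evenly by 3.

```ruby
# Placeholder "sentences", named by their index for clarity.
six = %w{s0 s1 s2 s3 s4 s5}
one_third = six.length / 3                         # 6 / 3 is 2
middle_of_six = six.slice(one_third, one_third + 1)

seven = %w{s0 s1 s2 s3 s4 s5 s6}
one_third_of_seven = seven.length / 3              # 7 / 3 is 2 (integer division)
middle_of_seven = seven.slice(one_third_of_seven, one_third_of_seven + 1)

puts middle_of_six.join(' ')
puts middle_of_seven.join(' ')
```

The first argument to slice is the starting index and the second is how many elements to take, not an ending index.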

 

The penultimate line checks to see if the sentence includes the word is or are, and only accepts each sentence if so:

ideal_sentences = ideal_sentences.select { |sentence| sentence =~ /is|are/ }

It uses the select method, as the stop-word removal code in the previous section did. The expression in the code block matches a regular expression against sentence, and is only truthy if is or are appears somewhere within sentence. This means ideal_sentences now only contains sentences that are in the middle third lengthwise and contain either is or are.
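Be aware that /is|are/ matches those letters anywhere in the sentence, including inside longer words such as “This” or “care”. If you wanted to match only the whole words, one option is to add \b word-boundary anchors. The sketch below compares the two patterns on a few made-up sentences; the stricter pattern is a suggestion rather than what summarize.rb uses.

```ruby
sentences = ["This works", "Ruby is fun", "Take care", "There are many"]

# Matches "is" inside "This" and "are" inside "care" as well.
loose = sentences.select { |sentence| sentence =~ /is|are/ }

# \b marks a word boundary, so only the standalone words match.
strict = sentences.select { |sentence| sentence =~ /\b(?:is|are)\b/ }

puts "Loose: #{loose.length} sentences"
puts "Strict: #{strict.length} sentences"
```

For a quick summarizer the looser pattern may be good enough, but the difference is worth knowing about when the results look odd.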

 

The final line simply joins the ideal_sentences together with a full stop and space between them to make them readable:

puts ideal_sentences.join(". ")

Analyzing Files Other Than text.txt

 

So far your application has the filename text.txt hard-coded into it. This is acceptable, but it would be a lot nicer if you could specify, when you run the program, what file you want the analyzer to process.

 

Note This technique is only practical to demonstrate if you’re running analyzer.rb from a command prompt or shell, or if your IDE supports passing in command-line arguments.

Typically, if you’re starting a program from the command line, you can append parameters onto the end of the command, and the program will process them. You can do the same with your Ruby application.

 

Ruby automatically places any parameters that are appended to the command line when you launch your Ruby program into a special array called ARGV. To test it out, create a new script called argv.rb and use this code:

puts ARGV.join('-')

 

From the command prompt, run the script like so:

ruby argv.rb

The result will be blank, but then try to run it like so:

ruby argv.rb test 123

test-123

 

This time the parameters are taken from ARGV, joined together with a hyphen, and displayed on the screen. You can use this to replace the reference to text.txt in analyzer.rb by replacing "text.txt" with ARGV[0] or ARGV.first (which both mean exactly the same thing— the first element of the ARGV array). The line that reads the file becomes the following:

lines = File.readlines(ARGV[0])

To process text.txt now, you’d run it like so:

ruby analyzer.rb text.txt

Note If you ran the above but specified a file that did not exist, the program would start, but File.readlines would raise an error (Errno::ENOENT) and the program would stop. We look at ways to tackle this issue later.
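In the meantime, here is a minimal sketch of one way to make the filename handling a little friendlier. It assumes you want to fall back to text.txt when no argument is given and to print a message rather than crash when the file is missing. The filename_from helper and the wording of the message are hypothetical.

```ruby
# Returns the first command-line argument, or a default when none was given
# (hypothetical helper, not part of analyzer.rb).
def filename_from(args, default = "text.txt")
  args.first || default
end

filename = filename_from(ARGV)

# File.exist? lets you check before reading, avoiding the error.
if File.exist?(filename)
  lines = File.readlines(filename)
  puts "#{lines.size} lines"
else
  puts "Could not find #{filename}"
end
```

Checking with File.exist? is only one approach; rescuing the error that File.readlines raises would be another.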