Ruby File Open and Read

Ruby file Open and Ruby File Read

Ruby file Open and Ruby File Read

Interaction, in computer terms, relates to the input and output of data, or I/O for short. Most programming languages have built-in support for I/O, and Ruby’s is well designed and easy to use. 

 

I/O streams are the basis for all input and output in Ruby. An I/O stream is a conduit or channel for input and output operations between one resource and another. Usually, this will be between your Ruby program and the keyboard, or between your Ruby program and a file. Along this stream, input and output operations can take place.

 

In some cases, such as when using the keyboard, I/O only works in one direction, as you can’t send data to a keyboard, and data can only be sent to, and not from, a display. In this blog, we’re going to look at using the keyboard, using files, and other forms of I/O in Ruby and how they can be used. In this blog, we also discuss Ruby file open and Ruby File Read in more detail. 

 

Keyboard Input

Keyboard Input

The simplest way to get external data into a program is to use the keyboard. For example:

  • a = gets
  • puts a

gets accepts a single line of data from the standard input—the keyboard in this case—and assigns it to a. You then print it, using puts, to the standard output—the screen in this case.

 

STANDARD INPUT AND OUTPUT

The standard input is a default stream available in many operating systems that relate to the standard way to accept input from a user or external process.

 

In our case, the standard input is the keyboard, but if, for example, you were to redirect data to a Ruby program from a UNIX-like operating system, such as Linux or Mac OS X, the standard input would be the data being directed to it. For example, let’s assume we put the preceding code example into a file called test.rb and then ran it like so:

ruby test.rb < somedata.txt

The output provided this time would be the first line of somedata.txt, as gets would retrieve a single line from the standard input that, in this case, would be the contents of the file somedata.txt. Essentially, the file is now the input, not the keyboard.

 

Conversely, standard output is usually referring to the screen or display, but if the results of your Ruby script are being redirected to a file or another program, that destination file or program becomes the target for the standard output.

Alternatively, you can read multiple lines in one go by using readlines:

  • lines = readlines
  • puts lines.length

 

Readlines accepts line after line of input until a terminator, most commonly known as EOF (End Of File), is found. You can create EOF on most platforms by pressing Ctrl+D. When the terminating line is found, all the lines of input given are put into an array that’s assigned to lines. This is particularly ideal for programs that accept piped or redirected input on standard input.

 

Note that on the second line above we then look at the length of the lines array. So if the above code were in a file called line-count.rb and you passed in a text file containing ten lines: ruby line count.rb < textfile.txt

You’d get this result: 10

In reality, however, this mechanism is rarely used, unless writing shell scripts for use at a UNIX prompt. In most cases, you’ll be writing to and from files directly, and you’ll require only minimal keyboard input that you can get with gets.

 

File I/O

File I/O

You used the File class to open a text file so you could read in the contents for your program to process. The File class is used as an abstraction to access and handle file objects that can be accessed from a Ruby program. The File class lets you write to both plain text and binary files (there’s not really an inherent difference—they’re both just sets of data) and offers a collection of methods to make handling files easy.

 

Opening and Reading Files

Reading Files

The most common file-related procedure is reading a file’s data for use within a program. As you saw in

File.open("text.txt").each { |line| puts line }

The File class’s open method is used to open the text file, text.txt, and upon that File object, the each method returns each line one by one. You can also do it this way:

File.new("text.txt", "r").each { |line| puts line }

 

This method clarifies the process involved. By opening a file, you’re creating a new File object that you can then use. The second parameter, "r", defines that you’re opening the file for reading.

 

This is the default mode, but when using File.new, it can help to clarify what you want to do with the file (as ‘new’ might imply the creation of a file, which is not usually the case). This becomes important later when you write to files or create new ones from scratch.

 

For opening and reading files, File.new and File.open are identical, but File.open has one, extra feature. File.open can accept a code block, and once the block is finished, the file will be closed automatically.

 

However, File.new only returns a File object referring to the file. To close the file, you have to use its close method. Let’s compare the two methods. First, look at File.open:

  • File.open("text.txt") do |f|
  • puts f.gets
  • end

 

This code opens text.txt and then passes the file handle into the code block as f.

puts f.gets takes a line of data from the file and prints it to the screen. Now, have a look at the File.

  • new approach:
  • f = File.new("text.txt", "r")
  • puts f.gets
  • f.close

In this example, a file handle/object is assigned to f directly. You close the file handle manually with the close method at the end.

 

Both the code block and file handle techniques have their uses. Using a code block is a clean way to open a single file quickly and perform operations in a single location. However, assigning the File object with File.new (or File.open, if you choose) makes the file reference available throughout the entire current scope without needing to contain file-manipulation code within a single block.

 

Note You might need to specify the location of files directly, as text.txt might not appear to be in the current directory. Simply replace f = File.new("text.txt", "r") with f = File.new("c:\ full\ path\ here\text.txt", "r"), including the full path as necessary (this example demonstrates a Windows-style path). Alternatively, use the result of Dir::pwd to see what the current working directory is and put text.txt there.

 

You could also choose to assign the file handle to a class or instance variable:

 style="margin:0;width:967px;height:183px">class MyFile
attr_reader :handle
def initialize(filename)
@handle = File.new(filename, "r")
end
def finished
@handle.close
end
end
f = MyFile.new("text.txt")
puts f.handle.gets
f.finished

 

More File-Reading Techniques

File-Reading Techniques

In the previous section, you used a File object’s each method to read each line one by one within a code block. However, you can do a lot more than that. Let’s assume your text.txt file contains this dummy data:

  • Elon Joggers, Manager, Male,45
  • Alba Smith, Cook, Female,23
  • Debbie Watts, Professor, Female,38

Next, we’ll look at some of the different techniques you can use to read the file, along with their outputs.

 

First, you can read an I/O stream line by line using each:

 style="margin:0;width:961px;height:64px">File.open("text.txt").each { |line| puts line }
Elon Jogers,Manager,Male,45
Alba Smith,Cook,Female,23
Debbie Watts,Professor,Female,38

Note each technically reads from the file delimiter by a delimiter, where the standard delimiter is a “newline” character. You can change this delimiter. You can read an I/O stream with each using a custom delimiter of your choosing:

 style="margin:0;width:962px;height:196px">File.open("text.txt").each(',') { |line| puts line }
Elon Jogers,
Manager,
Male,
45
Alba Smith,
Cook,
Female,
23
Debbie Watts,
Professor,
Female,
38

In this case, you passed an optional argument to each that specified a different delimiter from the default “newline” delimiter. Commas delimit the input.

Tip You can override the default delimiter by setting the special variable $/ to any delimiter you choose. You can read an I/O stream byte by byte with each_byte:

 style="margin:0;width:969px;height:136px">File.open("text.txt").each_byte { |byte| puts byte }
70
114
101
100
...many lines skipped for brevity...
51
56
10

Note When reading byte by byte, you get the single byte values of each character rather than the characters themselves, much like when you do something like puts "test"[0]. To convert into text characters, you can use the chr method.

 

There’s also an alternative called each_char that lets you read character by character. In some character sets, characters may be represented by more than one byte, so this can be useful:

 style="margin:0;width:965px;height:119px">File.open("text.txt").each_char { |byte| puts byte }
...many lines skipped for brevity...
Here’s how to read an I/O stream line by line using gets:
File.open("text.txt") do |f|
2.times { puts f.gets }
end
Elon Jogers,Manager,Male,45
Alba Smith,Cook,Female,23

gets isn’t an iterator like each or each_byte. Therefore, you have to call it multiple times to get multiple lines. In this example, it was used twice, and pulled out the first two lines of the example file. Like each, however, gets can accept an optional delimiter:

 

 style="margin:0;width:975px;height:208px">File.open("text.txt") do |f|
2.times { puts f.gets(',') }
end
Elon Jogers,
Manager,
There’s also a noniterative version of each_byte called getc:
File.open("text.txt") do |f|
2.times { puts f.getc }
end
You can also read an entire file into an array, split by lines, using readlines:
puts File.open("text.txt").readlines.join("--")
Elon Jogers,Manager,Male,45
--Alba Smith,Cook,Female,23
--Debbie Watts,Professor,Female,38

 

Note The “newline” characters that are present at the end of each line of the file are not removed, meaning that a newline occurs before each instance of --.

Lastly, you can choose to read an arbitrary number of bytes from a file into a single variable using read:

 style="margin:0;width:961px;height:64px">File.open("text.txt") do |f|
puts f.read(6)
end
Elon B

Note You can use all these methods on any file, such as binary files (images, executables, and so on), not just text files. However, on Windows, you might need to open the file in binary mode. This is covered in the section “Writing to Files.”

 

The File class makes some convenient methods available so that you don’t need to do things like File.

 style="margin:0;height:93px;width:965px">open("text.txt").read to be able to read a file into a string. Instead, you can do this:
data = File.read("text.txt")
This acts as a shorthand for opening the file, using the standard read method, and then closing the file again.
You can also do this:
array_of_lines = File.readlines("text.txt")
Simple!

Generally, you should try to use these shortcut methods wherever possible, as they result in shorter, easier-to-read code, and you don’t have to worry about closing the files. Everything is taken care of for you in one step. Of course, if reading a file line by line is necessary (perhaps if you’re working with extremely large files), then you can use the techniques demonstrated earlier in this blog for reading line by line.

 

Your Position Within a File

 Position Within a File

When reading a file, it can be useful to know where you are within that file. The pos method gives you access to this information:

 

 style="margin:0;width:966px;height:109px">f = File.open("text.txt")
puts f.pos
puts f.gets
puts f.pos
0
Elon Jogers,Manager,Male,45
28

Before you begin to read any text from the file, the position is shown as 0. Once you’ve read a line of text, the position is shown as 28. This is because pos returns the position of the file pointer (that is, the current location within the file that you’re reading from) in the number of bytes from the start of the file.

 

However, pos can work both ways, as it has a sister method, pos=:

 style="margin:0;width:961px;height:90px">f = File.open("text.txt")
f.pos = 8
puts f.gets
puts f.pos
ggs,Manager,Male,45
28

 

In this instance, the file pointer was placed eight bytes into the file before reading anything. This meant that “Fred Blo” was skipped, and only the rest of the line was retrieved.

 

Writing to Files

Writing to Files

The ability to jump easily around files, read lines based on delimiters, and handle data byte by byte makes Ruby ideal for manipulating data, but I haven’t yet covered how to write new information to files or how to make changes to existing files.

 

Generally, you can mirror most of the techniques used to read files when writing to files. For example:

  • File.open("text.txt", "w") do |f|
  • f.puts "This is a test"
  • end

 

This code creates a new file (or overwrites an existing file) called text.txt and puts a single line of text within it. Previously, you’ve used puts on its own to output data to the screen. However, when used with a File object, puts writes the data to the file instead. Simple!

 

The "w" passed as the second argument to File.open tells Ruby to open the file for writing only and to create a new file or overwrite what is already in the file. This is in contrast with the "r" mode used earlier when opening a file for reading only.

 

File Mode Properties of the I/O Stream

I/O Stream

  • Read-only. The file pointer is placed at the start of the file.
  • r+ Both reading and writing are allowed. The file pointer is placed at the start of the file.

 

  • Write-only. A new file is created (or an old one overwritten as if new).
  • w+ Both reading and writing are allowed, but File.new creates a new file from scratch

 

  • Write (in append mode). The file pointer is placed at the end of the file and writes will make the file longer.
  • a+ Both reading and writing are allowed (in append mode). The file pointer is placed at the end of the file and writes will make the file longer.
  • Binary file mode. You can use it in conjunction with any of the other modes listed.

 

Using the append mode described in Table, it’s trivial to create a program that appends a line of text to a file each time it’s run:

  • f = File.new("logfile.txt", "a")
  • f.puts Time.now
  • f.close

 

If you run this code multiple times, logfile.txt will contain several dates and times, one after the other. Append mode is particularly ideal for log file situations where new information has to be added at different times.

 

The read and write modes work in a simple manner. If you want to open a file in a mode where it can be read from and written to at the same time, you can do just that:

 style="margin:0;width:957px;height:79px">f = File.open("text.txt", "r+")
puts f.gets
f.puts "This is a test"
puts f.gets
f.close

The second line of this code reads the first line of text from the file, meaning the file pointer is waiting at the start of the second line of data. However, the following f.puts statement then puts a new line of text into the file at that position. Unfortunately, this action will not push the previously existing second line to the third line of the file.

 

All it does is overwrite the equivalent number of bytes, so you end up with a broken third line! This behavior means you really need to think carefully before writing data into the middle of an existing file, as you may not get the outcome you thought you would!

 

Whereas puts outputs lines of text, you can perform the writing equivalents of getc and read with out and write:

 style="margin:0;width:962px;height:121px">f = File.open("text.txt", "r+")
f.putc "X"
f.close
This example opens text.txt for reading and writing, and changes the first character of the first line to X.
Similarly:
f = File.open("text.txt", "r+")
f.write "123456"
f.close

This example overwrites the first six characters of the first line with 123456. Note It’s worth noticing that putc and write overwrite existing content in the file rather than inserting it.

 

Character Sets and Encodings

Character Sets and Encodings

Ruby 1.9 and later come with built-in support for automatically handling alternative character encodings when reading files. 

 

Note If you are not already familiar with character encodings or are using Ruby 1.8, this section will be of no use to you, so you can skip forward to “Renaming and Deleting Files,” and use this section as a reference later, as needed.

 

Whereas strings have just “internal” encodings, I/O objects also have “external” encodings, since I/O objects deal with data coming from, or going to, somewhere else.

 

In all of the previous file reading examples in this blog, Ruby used the default encoding to represent data that is read in, even though this may be incorrect. Specifying an external encoding when opening a file requires that you append any supplied file mode with a colon and then specify the encoding’s name. For example, if you want to read a file that uses the UTF-8 encoding scheme:

 

File.new("text.txt", "r:utf-8").each { |line| puts line }

In this example, we’re reading a file (as specified by the "r" mode), but we’re also telling the File object to treat the data as if it’s in the UTF-8 encoding (whether it actually is or not). This encoding is then applied for all data read from (or written to, if you’re in the right mode) the file.

 

It is possible to determine the external encoding of an I/O object (such as those of the File class) using its external_encoding method:

 style="margin:0;width:966px;height:78px">p File.open("text.txt", "r:iso-8859-1").external_encoding p File.open("text.txt", "r").external_encoding

#<Encoding:ISO-8859-1>

#<Encoding:UTF-8>

 

Note If your default encoding is not UTF-8, the second line will return whatever your default encoding actually is, since no external encoding was specified when creating the File object.

 

Another function of Ruby I/O encoding support is in transcoding from one encoding to another. For example, you might be opening a file in the UTF-8 encoding system, but want Ruby to “translate” it to another encoding on the fly as the data is read. This is achieved by adding another colon and encoding the name to the file mode parameter:

 style="margin:0;width:957px;height:97px">File.open("text.txt", "r:utf-8:iso-8859-1") do |f|
f.external_encoding first_line = f.gets
first_line.encoding
end
#<Encoding:UTF-8>
#<Encoding:ISO-8859-1>

 

The transcoding feature will be useful if you want to represent all text within your application in a certain encoding but need to read files of varying encodings. In each case, use the relevant external coding, but get Ruby to convert everything into UTF-8!

 

Renaming and Deleting Files

Renaming and Deleting Files

If you want to change the name of a file, you could create a new file with the new name and read into that file all the data from the original file. However, this isn’t necessary, and you can simply use File. rename like so:

 style="margin:0;width:963px;height:81px">File.rename("file1.txt", "file2.txt")
Deleting a file is just as simple. You can delete either one file at a time or many at once:
File.delete("file1.txt")
File.delete("file2.txt", "file3.txt", "file4.txt")
File.unlink("file1.txt")

Note File.unlink does exactly the same thing as File.delete.

 

File Operations

The File class offers you more than just the ability to read and write files. You can also perform a number of checks and operations upon files.

 

Creating Filenames Platform-Independently

File Operations

Windows and UNIX-related operating systems have different ways of denoting filenames. Windows filenames look like c:\directory\filename.ext, whereas UNIX-style filenames look like /directory/ filename.ext. If your Ruby scripts work with filenames and need to operate under both systems, the File class provides the join method.

 

Under both systems, filenames (and complete paths) are built up from directory names and local filenames. For example, in the preceding examples, the directory is called directory, but on Windows, backslashes are used as opposed to forward slashes.

 

Note In modern versions (1.9+) of Ruby on Windows, it’s fine to use UNIX-style pathnames using forward slashes as directory separators, rather than having to format filenames in a Windows style with backslashes. However, this section is included for completeness, or for instances where you need to work with libraries that don’t respect UNIX-style pathnames on other operating systems.

 

On Windows, you can use File.join to put together a filename using directory names and a final filename:

  • File.join('full', 'path', 'here', 'filename.txt')
  • full\path\here\filename.txt

 

Note Depending on how your system is set up, you might even see a forward-slash version of the preceding code on Windows, although that is technically a UNIX-style path.

 

On UNIX-related operating systems, such as Linux, the code is the same:

  • File.join('full', 'path', 'here', 'filename.txt')
  • full/path/here/filename.txt

 

The File.join method is simple to use, and it allows you to write the same code to run on both systems rather than choosing between backslashes and forward slashes in your code.

 

The separator itself is stored in a constant called File:: SEPARATOR, so you can easily turn a filename into an absolute filename (with an absolute path) by appending the directory separator to the start, like so:

 style="margin:0;height:85px;width:954px">File.join(File::SEPARATOR , 'full', 'path', 'here', 'filename.txt')
/full/path/here/filename.txt
Similarly, you can use File.expand_path to turn basic filenames into complete paths. For example:
File.expand_path("text.txt")
/Users/peter/text.txt

 

Note The result of File.expand_path will vary according to the operating system the code is run under. As text.txt is a relative filename, it converts it to an absolute filename and references the current working directory.

 

Seeking

Seeking

In a previous example, you changed the position of the file pointer using pos. However, this only allows you to specify the exact position of the file pointer. If you want to move the pointer forward by a certain offset or move the pointer to a certain position backward from the end of the file, you need to use seek.

 

seek has three modes of operation:

IO::SEEK_CUR: Seeks a certain number of bytes ahead of the current position.

 

IO::SEEK_END: Seeks to a position based on the end of the file. This means that to seek to a certain position from the end of the file, you’ll probably need to use a negative value.

IO::SEEK_SET: Seeks to an absolute position in the file. This is identical to pos=.

 

Therefore, to position the file pointer five bytes from the end of the file and change the character to an X, you would use seek as follows:

 style="margin:0;width:973px;height:63px">f = File.open("text.txt", "r+")
f.seek(-5, IO::SEEK_END)
f.putc "X"
f.close

 

Note Notice that because you’re writing to the file, you use the r+ file mode to enable writing as well as reading.

Or you could do this to print every fifth character in a file:

 style="margin: 0px; height: 137px; width: 972px;">f = File.open("text.txt", "r")

while a = f.getc

puts a.chr

f.seek(5, IO::SEEK_CUR)

end

 

Finding Out When a File Was Last Modified

File Modified

To establish when a file was last modified, use File.mtime:

 style="margin:0;width:974px;height:153px">puts File.mtime("text.txt")
2015-11-22 00:05:02 +0000
The time is returned as a Time object, so you can get more information directly:
t = File.mtime("text.txt")
puts t.hour
puts t.min
puts t.sec
00
05
02

 

Checking Whether a File Exists

 File Exists

It’s useful to check whether a file actually exists, particularly if your program relies on that file or if a user supplied the filename. If the file doesn’t exist, you can raise a user-friendly error or exception. Invoke the File.exist? method to check for the existence of a file:

puts "It exists!" if File.exist?("file1.txt")

File.exist? returns true if the named file exists. You could edit the MyFile class created in a previous example to check for the existence of a file before opening it to avoid a potential exception being thrown, like so:

 style="margin:0;width:957px;height:153px">class MyFile
attr_reader :handle
def initialize(filename)
if File.exist?(filename)
@handle = File.new(filename, "r")
else
return false
end
end
end

 

Getting the Size of a File

File.size returns the size of a file in bytes. If the file doesn’t exist, an exception is thrown, so it would make sense to check its existence with File.exist? first.

puts File.size("text.txt")

 

How to Know When You’re at the End of a File

In previous examples, either you’ve used iterators to give you all the lines or bytes in a file, or you’ve pulled only a few lines from a file here and there. However, it would be useful to have a foolproof way to know when the file pointer is at or has gone past, the end of the file. The eof? method provides this feature:

 style="margin:0;width:971px;height:82px">f = File.new("test.txt", "r")
while !f.eof?
puts f.gets
end
f.close

 

This example uses an “infinite” loop that will only conclude once f.eof? is true. This specific example is not particularly useful, as f.each could have performed a similar task, but in situations where you might be moving the file pointer around manually, or making large jumps through a file, checking for an “end of file” situation is useful.

 

Directories

Directories

All files are contained within various directories, and Ruby has no problem handling these. Whereas the File class handles files, directories are handled with the Dir class.

 

Navigating Through Directories

To change directory within a Ruby program, use Dir.chdir:

 style="margin:0;width:953px;height:121px">Dir.chdir("/usr/bin")
This example changes the current directory to /usr/bin.
You can find out what the current directory is with Dir.pwd. For example, here’s the result on my installation:
puts Dir.pwd
/Users/peter
Dir.chdir("/usr/bin")
puts Dir.pwd
/usr/bin

 

You can get a list of the files and directories within a specific directory using Dir.entries:

puts Dir.entries("/usr/bin").join(' ')

Dir.entries returns an array with all the entries within the specified directory. Dir.foreach provides the same feature, but as an iterator:

 style="margin:0;width:970px;height:110px">Dir.foreach("/usr/bin") do |entry|
puts entry
end
An even more concise way of getting directory listings is by using Dir’s class array method:
Dir["/usr/bin/*"]
["/usr/bin/a2p", "/usr/bin/aclocal", "/usr/bin/aclocal-1.6",
"/usr/bin/addftinfo", "/usr/bin/afmtodit", "/usr/bin/alias", "/usr/bin/amlint", "/usr/bin/ ant", ...items removed for brevity... ]

 

In this case, each entry is returned as an absolute filename, making it easy to use the File class’s methods to perform checks on each entry if you wished.

 

Creating a Directory

Creating a Directory

You use Dir.mkdir to create directories, like so:

Dir.mkdir("mynewdir")

Once the directory has been created, you can navigate to it with Dir.chdir. You can also specify absolute paths to create directories under other specific directories:

  • Dir.mkdir("/mynewdir")
  • Dir.mkdir("c:\test")

 

However, you cannot create directories under directories that don’t yet exist themselves. If you want to create an entire structure of directories, you must create them one by one from the top down.

 

Note On UNIX-related operating systems, Dir.mkdir accepts a second optional argument: an integer specifying the permissions for the directory. You can specify this in octal, as with 0666 or 0777, representing modes 666 and 777, respectively.

 

Deleting a Directory

Deleting a directory is similar to deleting a file:

Dir.delete("mynewdir")

 

NoteDir.unlink and Dir.rmdir perform exactly the same function and are provided for convenience. As with Dir.mkdir, you can use absolute pathnames.

 

One thing you need to consider when deleting directories is whether they’re empty. If a directory isn’t empty, you cannot delete it with a single call to Dir.delete. You need to iterate through each of the subdirectories and files and remove them all first. You can do that iteration with Dir.foreach, looping recursively through the file tree by pushing new directories and files to remove onto an array.

 

Alternatively, you can use the rm_f method of the FileUtils library that comes with Ruby:

require 'fileutils'

FileUtils.rm_f(directory_name)

CautionIf you choose to use rm_f, tread carefully, as you might accidentally delete the wrong thing!

 

Creating Files in the Temporary Directory

Temporary Directory

Most operating systems have the concept of a “temporary” directory where temporary files can be stored. Temporary files are those that might be created briefly during a program’s execution but aren’t a permanent store of information. Dir.tmpdir provides the path to the temporary directory on the current system, although the method is not available by default. To make Dir.tmpdir available, it’s necessary to use require 'tmpdir':

  • require 'tmpdir'
  • puts Dir.tmpdir
  • /tmp

 

Note On Mac OS X, the result might be somewhat more esoteric.

For example, I was given the temporary directory of /var/folders/80/80DFegkBHLmcQjJHdZ5SCE+++TI/-Tmp-. On Windows, I got C:/Users/ username/AppData/Local/Temp.

 

You can use Dir.tmpdir with File.join to create a platform-independent way of creating a temporary file:

 style="margin:0;width:957px;height:94px">require 'tmpdir'

tempfilename = File.join(Dir.tmpdir, "myapp.dat") tempfile = File.new(tempfilename, "w") tempfile.puts "This is only temporary" tempfile.close

File.delete(tempfilename)

 

This code creates a temporary file, writes data to it, and deletes it. Ruby’s standard library also includes a library called temp file that can create temporary files for you:

 style="margin:0;width:957px;height:93px">require 'tempfile'
f = Tempfile.new('myapp')
f.puts "Hello"
puts f.path
f.close
/tmp/myfile1842.0

Unlike creating and managing your own temporary files, temp files automatically deletes the files it creates after they have been used. This is an important consideration when choosing between the two techniques.

 

Basic Databases

Databases

Many applications need to store, access, or manipulate data. In some cases, this is by loading files, making changes to them, and outputting data to the screen or back to a file. In many situations, however, a database is required.

 

A database is a system for organizing data on a computer in a systematic way. A database can be as simple as a text file containing data that can be manipulated by a computer program, or as complex as many gigabytes of data spread across hundreds of dedicated database servers. You can use Ruby in these scenarios and for those in between. First, we’re going to look at how to use simple text files as a form of organized data.

 

Text File Databases

Text File Databases

One simple type of database can be stored in a text file in a format commonly known as CSV. CSV stands for comma-separated values and means that for each item of data you’re storing, you can have multiple attributes separated with commas. The dummy data in your text.txt file in the previous section used CSV data. To recap, text.txt initially contained this code:

  • Elon Joggers, Manager, Male,45
  • Alba Smith, Cook, Female,23
  • Debbie Watts, Professor, Female,38

 

Each line represents a different person, and commas separate the attributes relating to each person.

The commas allow you to access (and change) each attribute separately. Ruby’s standard library includes a library called csv that allows you to use text files containing CSV data as simple databases that are easy to read, create, and manipulate.

 

Reading and Searching CSV DataThe CSV class provided by the csv standard library will manage the manipulation of CSV data for you:require 'csv'

 style="margin:0;width:959px;height:92px">CSV.open('text.txt').each do |person|
person
end
["Elon Jogers", "Manager", "Male", "45"]
["Alba Smith", "Cook", "Female", "23"]
["Debbie Watts", "Professor", "Female", "38"]

 

You open the text.txt file by using CSV.open, and each line (that is, each individual “person” in the file) is passed into the block one by one using each. The inspect method demonstrates that each entry is now represented in array form. This makes it easier to read the data than when it was in its plain text form.

 

You can also use CSV alongside the File class:

 style="margin:0;height:125px;width:957px">require 'csv'
people = CSV.parse(File.read('text.txt'))
puts people[0][0]
puts people[1][0]
puts people[2][0]
Elon Jogers
Alba Smith
Debbie Watts

 

This example uses the File class to open and read in the contents of a file, and CSV.parse immediately uses these to convert the data into an array of arrays. The elements in the main array represent each line in the file, and each element in those elements represents a different attribute (or field) of that line. Therefore, by printing out the first element of each entry, you get the people’s names only.

An even more succinct way of loading the data from a CSV-formatted file into an array is with CSV.read:

 

 style="margin:0;width:965px;height:48px">require 'csv'
p CSV.read('text.txt')
[["Fred Jogers", "Manager", "Male", "45"], ["Alba Smith", "Cook", "Female", "23"], ["Debbie Watts", "Professor", "Female", "38"]]

 

The find and find_all methods (also known as detect and select, respectively) provided by the Enumerable module to Array make it easy for you to perform searches on the data available in the array. For example, you’d use this code if you wanted to pick out the first person in the data called Alba:

 style="margin:0;width:970px;height:66px">require 'csv'
people = CSV.read('text.txt')
Alba = people.find { |person| person[0] =~ /Alba/ } p Alba
["Alba Smith", "Cook", "Female", "23"]

 

Using the find (or detect) method with a code block that looks for the first matching line where the name contains “Alba” gives you back the data you were looking for. Where find returns the first matching element of an array or hash, find_all (or select) returns all valid matches. Let’s say you want to find the people in your database whose ages are between 20 and 40:

 style="margin:0;width:965px;height:76px">young_people = people.find_all do |p|
p[3].to_i.between?(20, 40)
end
p young_people
[["Alba Smith", "Cook", "Female", "23"], ["Debbie Watts", "Professor", "Female", "38"]]

 

This operation provides you with the two matching people contained within an array that you can iterate through.

 

Saving Data Back to the CSV File

CSV File

Once you can read and query data, the next step is being able to change it, delete it, and rewrite your CSV file with a new version of the data for future use. Luckily, this is as simple as reopening the file with write access and “pushing” the data back to the file. The CSV module handles all of the conversion.

 style="margin:0;width:972px;height:124px">require 'csv'
people = CSV.read('text.txt')
Alba = people.find { |person| person[0] =~ /Alba/ } Alba[0] = "Lauren Smith"
CSV.open('text.txt', 'w') do |csv|
people.each do |person|
csv << person
end
end

 

You load in the data, find a person to change, change her name, and then open the CSV file and rewrite the data back to it. Notice, however, that you have to write the data person by person. Once complete, text. txt is updated with the name change. This is how to write back CSV data to file.

 

Storing Objects and Data Structures

Data Structures

Working with CSV is easy, but it doesn’t feel very smooth. You’re always dealing with arrays, so rather than getting nice names such as name, age, or job for the different attributes, you have to remember in which element and at which position each attribute is located.

 

You’re also forced to store simple arrays for each separate entry. There’s no nesting, no way to relate one thing to another, no relationship to object orientation, and the data is “flat.” This is sufficient for basic data, but what if you simply want to take data that already exists in structures like arrays and hashes and save that data to disk for later use?

 

PStore

PStore is a core Ruby library that allows you to use Ruby objects and data structures as you normally would, and then store them in a file. Later on, you can reload the objects back into memory from the disk file. This technique is known as object persistence and relies on a technique called marshaling, where standard data structures are turned into a form of flat data that can be stored to disk or transmitted over a network for later reconstruction.

 

Let’s create a class to represent the structure of the data you were using in the CSV examples:

 style="margin:0;width:962px;height:153px">class Person
attr_accessor :name, :job, :gender, :age
end
You can re-create your data like so:
Elon = Person.new
http://fred.name = "Elon Jogers"
fred.age = 45
Alba = Person.new
http://Alba.name = "Alba Smith"
Alba.age = 23

 

Rather than have your data in arrays, you now have your data available in a fully object-oriented fashion. You could create methods within the Person class to help you manipulate your objects and so forth.

 

This style of storing and manipulating data is true to the Ruby way of things and is entirely object-oriented. However, until now, your objects have only lasted until the end of a program, but with PStore it’s easy to write them to a file:

 style="margin:0;width:960px;height:112px">require 'pstore'
store = PStore.new("storagefile")
store.transaction do
store[:people] ||= Array.new
store[:people] << fred
store[:people] << Alba
end

 

In this example, you create a new PStore in a file called storage file. You then start a transaction (data within a PStore file can only be read or updated while inside a “transaction” to prevent data corruption), and within the transaction, you make sure the: people element of the store contains something or gets assigned to be an array.

 

Next, you push the Fred and Alba objects to the: people element of the store and then end the transaction.

 

The reason for the hash syntax is because a PStore is, effectively, a disk-based hash. You can then store whatever objects you like within that hash. In this example, you’ve created an array within store[: people] and pushed your two Person objects to it.

 

Later on, you can retrieve the data from the PStore database:

 style="margin:0;width:969px;height:210px">require 'pstore'
store = PStore.new("storagefile")
people = []
store.transaction do
people = store[:people]
end
At this point the Person objects inside people can be treated
as totally local objects.
people.each do |person|
puts http://person.name
end
Elon Jogers
Alba Smith

 

Note It’s necessary for the Person class to be defined and ready to use before loading the Person objects from the PStore file, so if you ran the previous example separately from the first, make sure you include the Person class definition again.

 

With only a simple storage and retrieval process, PStore makes it easy to add storage facilities to existing Ruby programs by allowing you to store existing objects into a PStore database. Object persistence is not ideal for many types of data storage, but if your program is heavily dependent on objects, and you want to store those objects to disk for later use, PStore provides a simple method to use.

 

YAML

YAML

YAML (standing for YAML Ain’t Markup Language) is a special text-based markup language that was designed as a data serialization format that’s readable by humans. You can use it in a similar way to PStore to serialize data structures, but unlike store's data, humans can easily read YAML data and even directly edit it with a text editor and a basic knowledge of YAML syntax.

 

The YAML library comes as part of Ruby’s standard library, so it’s easy to use. Unlike PStore, though, the YAML library converts data structures to and from YAML and doesn’t provide a hash to use, so the technique is a little different. This example writes an array of objects to disk:

 style="margin:0;width:963px;height:259px">require 'yaml'
class Person
attr_accessor :name, :age
end
Elon = Person.new
http://fred.name = "Elon Jogers"
fred.age = 45
Alba = Person.new
http://Alba.name = "Alba Smith"
Alba.age = 23
test_data = [ fred, Alba ]
puts test_data.to_yaml
---
!ruby/object:Person age: 45
name: Elon Jogers
!ruby/object:Person name: Alba Smith age: 23

 

You can use the to_yaml method to convert your Person object array into YAML data, which, as you might agree, is extremely readable! YAML.load performs the operation in the other direction, turning YAML code into working Ruby objects. For example, let’s modify the YAML data a little and see if it translates back into working objects:

 style="margin:0;width:960px;height:234px">require 'yaml'
class Person
attr_accessor :name, :age
end
yaml_string = <<END_OF_DATA
---
!ruby/object:Person age: 45
name: Jimmy
!ruby/object:Person
age: 23
name: Alba Smith
END_OF_DATA
test_data = YAML.load(yaml_string)
puts test_data[0].name
puts test_data[1].name
Jimmy
Alba Smith

 

Here YAML.load converts the YAML data back into the test_data array of Person objects successfully. You can use YAML to convert between most types of Ruby objects (including basic types such as Array and Hash) and YAML. This makes it an ideal intermediary format for storing data (such as configuration files) your applications need to access.

 

Note When dealing with serialized objects, you must still have the classes used by those objects defined within the program somewhere; otherwise, they won’t be usable. As plain text, you can safely transmit YAML via e-mail, store it in normal text files, and move it around more easily than the binary data created by libraries such as PStore.

 

To learn more about YAML formatting, read its Wikipedia entry at Wikipedia, the free encyclopedia YAML, or refer to the official YAML website at The Official YAML Web Site.

 

Relational Databases and SQL

SQL

In the previous section, you created some extremely simplistic “databases” using text files and object persistence. Text files, of course, have their limitations. They’re not reliable if many processes are using them at the same time, and they’re slow. Loading a CSV file into memory is fine when the dataset is small, but when it grows, the process of working directly with files can soon become sluggish.

 

When developing more robust systems, you pass database filing and management off to a separate application or system, and applications simply connect to a database system to pass data back and forth. In the previous section, you were working with database files and the data within them quite directly, and that’s unacceptable when performance and reliability are necessary.

 

Relational Database Concepts

Relational Database Concepts

One major benefit of using a dedicated database system is getting support for relational databases. A relational database is composed of data grouped into one or more tables that can be linked together.

 

A table stores information about one type of thing. For example, an address blog database might be made up of a people table, an addresses table, and a phone numbers table. Each table stores information about people, addresses, and phone numbers, respectively.

 

The people table would likely have a number of attributes (known as columns, in database land) such as name, age, and gender. Each row of the table—that is, an individual person—would then have information in each column.

 

 In relational databases, it’s standard procedure to have an id column on most tables to identify each row uniquely. Although you could look up and retrieve data based on other columns, such as name, numeric IDs are useful when you’re creating relationships between tables.

 

However, when dealing with relational databases at a lower level, it’s common to use all lowercase names for column and table names. This explains why the text and later code examples in this blog refer to table and column names in lowercase only.

 

One benefit of relational databases is the way rows in different tables can be related to one another. For example, your people table could have an address_id column that stores the ID of the address associated with this user. If you want to find out the address of a particular person, you can look up his or her address_ id, and then look up the relevant row of the addresses table.

 

The reason for this sort of relationship is that many people in your people database might share the same address, and rather than store the address separately for each person, it’s more efficient to store a reference instead. This also means that if you update the address in the future, it updates for all the relevant users at the same time.

 

The relationship functionality also supports the definition of many-to-many relationships. You could create a separate table called related_people that has two columns, first_person_id, and second_person_ id. This table could store pairs of ID numbers that signify two people are related to each other.

 

To work out to whom a person is related, you can simply look for any rows mentioning his or her ID number, and you’d get back the ID numbers of that person’s related people. This sort of relationship is used in most databases and is what makes relational databases so useful.

 

MySQL, PostgreSQL, and SQLite

MySQL

Three well-known relational database systems available today that work on both Windows and UNIX operating systems are MySQL, PostgreSQL, and SQLite. Each has significantly different features from the others, and therefore has different uses.

 

Most web developers will be familiar with MySQL, as it comes with most web hosting packages and servers, making it easily the most commonly used database engine on the Internet.

 

For our purposes in the next few sections of this blog, we’ll be using a system called SQLite. Unlike MySQL or PostgreSQL, SQLite doesn’t run as a “server,” so it doesn’t require any special resources. Whereas MySQL and PostgreSQL both run as permanent server applications, SQLite is “on-demand” and works entirely on your local machine.

 

Despite this, it’s still fast and reliable and is ideal for local database purposes. You can easily carry much of the knowledge you learn with SQLite across to other systems. SQLite is also the default database engine used with Ruby on Rails apps.

 

Nonetheless, toward the end of this blog, we’ll look at how you can connect to databases using these other architectures so that you can get direct access to any existing databases you might have from your Ruby applications.

 

Installing SQLite

Installing SQLite

The first step to getting a database system up and running quickly is to install SQLite3—the latest version of SQLite. Mac OS X comes with SQLite 3 by default, as do some Linux distributions. On Ubuntu or Debian Linux, you can run apt-get install sqlite3 libsqlite3-dev.

 

Once the SQLite3 libraries are installed at the operating system level, you can install the Ruby library that gives Ruby access to SQLite3 databases. It’s packaged as a gem called sqlite3-ruby and can be installed on all systems with gem install sqlite3 or sudo gem install sqlite3 on UNIX-related operating systems if you aren’t running as a superuser.

 

You can check that everything was installed okay with this code:

  • require 'sqlite3'
  • puts "It's all okay!" if defined?(SQLite3:: Database)
  • It's all okay!

 

If the installation didn’t progress smoothly, links to SQLite resources.

 

A Crash Course in Basic Database Operations and SQL

To manage databases with any of the various database systems at a basic level, knowledge of several SQL commands is required. In this section, we’re going to look at how to create tables, add data to them, retrieve data, delete data, and change data. Throughout this section, think entirely in terms of databases separately from Ruby. A demonstration of how Ruby can use SQL to manipulate a database is covered in detail in the later section “Using SQLite with Ruby.”

 

Note If you’re already familiar with SQL, you can skip the next few sections and jump straight to the section “Using SQLite with Ruby” to see SQL in action alongside Ruby.

 

What Is SQL?

What Is SQL

Structured Query Language (SQL) is a special language, often known as a query language, used to interact with database systems. You can use SQL to create, retrieve, update, and delete data, as well as create and manipulate structures that hold that data.

 

Its basic purpose is to support the interaction between a client and a database system. In this section, I’m going to give you a primer on SQL’s syntax and how you can use it from Ruby. Be aware that this section is only a very basic introduction to SQL, as a full and deep explanation of SQL is beyond the scope of this blog.

 

Note that the way different database systems use and implement SQL can vary wildly, which is why the following sections will only cover that which is reasonably standard and enables you to perform basic data operations.

 

If you want to play along at home, you can use the command-line sqlite3 client to create a database and perform SQL queries upon it without getting involved with Ruby at all. Just run the sqlite3 test.db, where test.db is your chosen database filename. You can then type SQL and press Enter to execute it. To leave the client, you can type .quit on a separate line and press Enter.

 

Note There are also libraries that remove the necessity of writing SQL in order to work with databases. We mention some of these at the end of the blog. Regardless, at least reading about how SQL works are going to be beneficial to you in the long term.

 

CREATE TABLE

Before you can add data into a database, it’s necessary to create one or many tables to hold it. To create a table, you need to know what you want to store in it, what you want to call it, and what attributes you want to store.

 

For your people table, you want to have a name, job, gender, and age columns, as well as a unique id column for possible relationships with other tables. To create a table, you use a syntax like so:

 style="margin:0;width:967px;height:96px">CREATE TABLE table_name (
column_name data_type options,
column_name data_type options,
...,
...
);

Note SQL commands are typically written in capital letters for clarity (and it’s somewhat traditional). However, you don’t have to do this. Table names and attributes, however, can be case-sensitive with some database systems, so stick to lowercase for those!

Therefore, for your people table, you’d use this syntax:

 style="margin:0;height:95px;width:964px">CREATE TABLE people (
id integer primary key,
name varchar(50),
job varchar(50),
gender varchar(6),
age integer);

This SQL command creates a people table and gives it five columns. The data types for the name, job, and gender columns are all VARCHARs, meaning they’re variable-length character fields. In basic terms, it means they can contain strings. The number in brackets refers to the maximum length of that string, so the name column can hold a maximum of 50 characters.

 

Note SQLite is a reasonably pragmatic database, and it ignores most conventions relating to data types in SQL. Almost any form of data will fit into any type of column. SQLite ignores the maximum lengths for these VARCHAR columns. This is one reason why SQLite is great for quick and easy development, but not so great for crucial systems!

 

The id column has the words primary key as its options. This means that the id column is the primary reference to each row and that the ID must be unique for each row. This means SQLite will automatically assign a unique ID to each row, so you don’t need to specify one yourself each time you add a new row.

 

INSERT INTO

INSERT command

You use the INSERT command to add rows to tables:

INSERT INTO people (name, age, gender, job) VALUES ("Chris Scott", 25, "Male", « "Technician");

First, you specify the table you want to add a row too, and then list the columns you wish to fill out, before passing in the values with which to fill the row.

 

You can omit the list of columns if the data passed after VALUES is in the correct order:

INSERT INTO people VALUES ("Chris Scott", 25, "Male", "Technician");

CautionThis particular INSERT would cause an error on your people table! It’s missing the id column. However, it’s safer and more convenient if you specify the columns beforehand, as in the first example. The second example clearly demonstrates why this is the case, as it’s hard to tell which item of data relates to which column.

 

Columns that don’t have any data specified for them will be filled in automatically with the defaults specified in the CREATE TABLE statement for that table. In the case of the people table, the id column will automatically receive a unique ID number for each row added.

 

SELECT

SELECT command

You use the SELECT command to retrieve data from tables. You specify which columns you want to retrieve (or use * as a wildcard to retrieve them all) and the table you want to retrieve data from, and optionally include a condition upon which to base the retrieval. For example, you might only want to choose a particular row or rows that match certain criteria.

 

This SQL statement retrieves the data from all columns for all rows in the people table:

 

SELECT * FROM people;

This SQL retrieves all the values from just the name column of rows in the people table (for example, “Fred Joggers,” “Chris Scott,” “Alba Smith”):

 

SELECT name FROM people;

This SQL retrieves rows with an id column equal to 2 from the people table (usually, because id is a column containing unique values, only one row would be returned for such a query):

SELECT * FROM people WHERE id = 2;

 

This SQL retrieves any rows that have a name column equal to “Chris Scott”:

SELECT * FROM people WHERE name = "Chris Scott";

 

This SQL retrieves all rows of people whose ages are between 20 and 40, inclusive:

SELECT * FROM people WHERE age >= 20 AND age <= 40;

 

The conditions used in SQL are somewhat similar to those used in Ruby and other programming languages, except that logical operator such as AND and OR, are written as plain English. Also, as in Ruby, you can use parentheses to group expressions and build up more complex requests.

 

It’s also possible to have the results returned in a certain order by appending an ORDER BY clause such as ORDER column_name to the SQL query. You can further append ASC to the column name to sort in an ascending fashion, or DESC to sort in a descending fashion.

 

For example, this SQL returns all rows from the people table ordered by the name column in descending order (so names starting with Z come before those beginning with A):

 

SELECT * FROM people ORDER BY name DESC;

 

This SQL returns all rows of those people between the ages of 20 and 40 in order of age, youngest first:

SELECT * FROM people WHERE age >= 20 AND age <= 40 ORDER BY age ASC;

 

Another useful addition to a SELECT command is LIMIT. LIMIT allows you to place a limit on the number of rows returned on a single query:

SELECT * FROM people ORDER BY name DESC LIMIT 5;

 

In conjunction with ORDER, you can use LIMIT to find extremes in the data. For example, finding the oldest person is easy:

SELECT * FROM people ORDER BY age DESC LIMIT 1;

 

This sorts the rows in descending order by age and returns the first result: the highest. To get the youngest person, you could use ASC instead of DESC on the ordering.

Note Database engines sort columns automatically by their data type. Strings of text are formatted alphanumerically, whereas integer and other number columns are sorted by their numeric value.

 

DELETE

DELETE

The DELETE SQL command deletes rows from tables. You can delete rows based on an SQL condition. For example:

 style="margin:0;width:963px;height:82px">DELETE FROM people WHERE name="Chris";
DELETE FROM people WHERE age > 100;
DELETE FROM people WHERE gender = "Male" AND age < 50;
As with SELECT, you can place limits on the number of deletions:
DELETE FROM people WHERE age > 100 LIMIT 10;

 

In this case, only ten rows with an age over 100 would be deleted.

Think of the DELETE command to be like SELECT, but instead of returning the rows, it erases them. The format is otherwise reasonably similar.

 

UPDATE

UPDATE

UPDATE provides the ability to update and amend information within the database. As with DELETE, the syntax for UPDATE is similar to that of SELECT. Consider this:

 

SELECT * FROM people WHERE name = "Chris";

UPDATE people SET name = "Christopher" WHERE name = "Chris";

UPDATE first accepts the name of a table whose row(s) might be updated, then accepts the column(s) to be changed along with the new data, and finally accepts an optional condition for the change. Some examples follow.

 

This SQL changes the name column to “Christopher” on all rows where the name column is currently equal to “Chris”:

UPDATE people SET name = "Christopher" WHERE name = "Chris";

 

This SQL changes the name column to “Christopher” and the age column to 44 where the name column is currently equal to “Chris”:

UPDATE people SET name = "Christopher", age = 44 WHERE name = "Chris";

 

This SQL changes the name column to “Christopher” where the name column is “Chris” and the age column equals 25. Therefore, a row where the name is Chris and the age is 21 will not be updated by this example query:

UPDATE people SET name = "Christopher" WHERE name = "Chris" AND age = 25;

This SQL changes the name column to “Christopher” on every row of the people table. This demonstrates why it pays to be careful when building SQL queries, as short statements can have big ramifications!

 

UPDATE people SET name = "Christopher";

 

Using SQLite with Ruby

Now that you’ve installed SQLite and we’ve covered the basics of how SQL works, let’s put together a basic demonstration of how it all works in conjunction with Ruby. To do this, you’re going to write a program that allows you to manipulate a database based on the people table that we’ve talked about so far in this blog.

 

The first step is to write the basic code that can load or create a database. The SQLiteRuby gem makes this simple with the SQLite3::Database.new method. For example:

  • require 'sqlite3'
  • $db = SQLite3::Database.new("dbfile")
  • $db.results_as_hash = true

From this point, you can use $db in a similar way to the file handles you used earlier in this blog. For example, $db.close will similarly close the database file, just as you closed regular files. The $db.results_as_hash = true line forces SQLite to return data in a hash format rather than as an array of attributes (as with CSV). This makes the results easier to access.

 

Note The database handle has been assigned to a global variable, $db, so that you can split your program into multiple methods without creating a class. You can, therefore, access the database handle, $db, from anywhere you wish. This isn’t what you’d do in a large program, but for learning to use SQLite3 here, it will suffice.

To cope with the closing situation, you’ll create a method specifically for disconnecting the database and ending the program:

 style="margin:0;width:958px;height:83px">def disconnect_and_quit
$db.close
puts "Bye!"
exit
end

 

Note Remember that you must define methods before you use them, so put these separate methods at the top of your source file.

Now let’s create a method that will use the CREATE TABLE SQL statement to create the table where you’ll store your data:

 style="margin:0;width:956px;height:167px">def create_table
puts "Creating people table"
$db.execute %q{
CREATE TABLE people (
id integer primary key,
name varchar(50),
job varchar(50),
gender varchar(6),
age integer)
}
end

 

A database handle will allow you to execute arbitrary SQL with the execute method. All you need to do is pass the SQL as an argument, and SQLite will execute the SQL upon the database.

 style="margin:0;width:951px;height:183px">Next, let’s create a method that asks for input from the user to add a new person to the database:
def add_person
puts "Enter name:"
name = gets.chomp
puts "Enter job:"
job = gets.chomp
puts "Enter gender:"
gender = gets.chomp
puts "Enter age:"
age = gets.chomp
$db.execute("INSERT INTO people (name, job, gender, age) VALUES (?, ?, ?, ?)", « name, job, gender, age)
end

Note The chomp method added to gets removes the newline characters that appear at the end of keyboard output retrieved with gets.

 

The start of the add_person method is mundane. You ask for each of the person’s attributes in turn and assign them to variables. However, $db.execute is more intriguing this time. In the previous section, the INSERT SQL was shown with the data in the main statement, but in this method, you’re using question marks (?) as placeholders for the data.

 

Ruby performs an automatic substitution from the other parameters passed to execute into the placeholders. This acts as a way of securing your database. The reason is that if you interpolated the user’s input directly into the SQL, the user might type some SQL that could break your query. However, when you use the placeholder method, the SQLite-Ruby library will clean up the supplied data for you and make sure it’s safe to put into the database.

 

Now you need a way to be able to access the data entered. Time for another method! This code example shows how to retrieve the associated data for a given name and ID:

 style="margin:0;width:958px;height:202px">def find_person
puts "Enter name or ID of person to find:"
id = gets.chomp
person = $db.execute("SELECT * FROM people WHERE name = ? OR id = ?", id, id.to_i).first
unless person
puts "No result found"
return
end
puts %Q{Name: #{person['name']}
Job: #{person['job']}
Gender: #{person['gender']}
Age: #{person['age']}}
end

The find_person method asks the user to enter either the name or the ID of the person he or she is looking for. The $db.execute line cleverly checks both the name and id columns at the same time. Therefore, a match on either the id or name will work. If no match is found, the user will be told, and the method will end early. If there’s a match, the information for that user will be extracted and printed on the screen.

 

You can tie it up with the main routine that acts as a menu system for the four methods described earlier.

You already have the database connection code in place, so creating a menu is simple:

 style="margin:0;width:958px;height:280px">loop do
puts %q{Please select an option:
Create people table
Add a person
Look for a person
Quit}
case gets.chomp
when '1'
create_table
when '2'
add_person
when '3'
find_person
when '4'
disconnect_and_quit
end
end
If the code is put together properly and then run, a typical first session could go like this:
Please select an option:
1. Create people table
2. Add a person
3. Look for a person
4. Quit
1
Creating people table
Please select an option:
1. Create people table
2. Add a person
3. Look for a person
4. Quit
2
Enter name:
Elon Jogers
Enter job:
Manager
Enter gender:
Male
Enter age:
48
Please select an option:
1. Create people table
2. Add a person
3. Look for a person
4. Quit
3
Enter name or ID of person to find:
1
Name: Elon Jogers
Job: Manager
Gender: Male
Age: 48
Please select an option:
1. Create people table
2. Add a person
3. Look for a person
4. Quit
3
Enter name or ID of person to find:
Jane Smith
No result

Your quick and basic application provides a way to add data and retrieve data from a remote data source in only a handful of lines!

 

Note You should note that we have broken some of the best practices highlighted through this blog in the previous program. We used global variables and applied almost no structure to the code at all. The goal here was solely to use SQLite3 quickly, but consider how you could dramatically improve the structure of the program now that it works.

 

Connecting to Other Database Systems

In the previous section, we looked at SQL and how to use it with the SQLite library, a library that provides a basic database system on the local machine. More commonly, however, you might want to use more elaborate databases or connect to databases located on other machines (and potentially not even run by you).

 

Sequel (The Database Toolkit for Ruby) is a “database toolkit” for Ruby that uses a DSL (Domain Specific Language) to abstract away some of the details of using a database and interfaces with the libraries used to talk to various database systems. If you write your code in a certain way, using Sequel, you can, as long as you do not use any database-specific features, switch that code between, say, MySQL and PostgreSQL and it would continue to work.

 

The sequel has “adapters” for a wide variety of database systems, the most popular including MySQL, IBM DB, Oracle, PostgreSQL, and SQLite3. It also supports a variety of common database features like prepared statements, stored procedures, and transactions, so if you’re already familiar with using databases, it’s a library well worth checking out.

 

Installing Sequel is easy:

gem install sequel

Once it’s installed, you’ll want to make sure you have the underlying driver library for your database of choice installed too. For example, for MySQL, you could install the mysql2 library. For PostgreSQL, install the pg library:

 

gem install pg

While this isn’t going to be a complete tour of Sequel, once you have things installed, you can begin to write code like this:

 style="margin:0;height:277px;width:961px">require 'sequel'
require 'pg'
DB = Sequel.connect('http://postgres://user:password@localhost/dbname')
DB.create_table :people do
primary_key :id
String :first_name
String :last_name
Integer :age
end
people = DB[:people]
people.insert( :first_name => "Fred", :last_name => "Jogers", :age => 32 )
puts "There are #{people.count} people in the database"
people.each do |person|
puts person[:first_name]
end
DB.fetch("SELECT * FROM people") do |row|
puts row[:first_name]
end

In a relatively short program, we’ve seen how we can create a table, populate that table with data, then query both the length of the table and look up rows within that table in two different ways. As you may notice, this is a lot more straightforward than working with a database driver library directly, as we did with the sqlite3 library earlier!

 

Note In the program above, you could require in sqlite3 and then change the first main line of code to DB = Sequel.sqlite to create a temporary, in-memory SQLite database. This will let you run the code if you don’t have access to a PostgreSQL server. Refer to The Database Toolkit for Ruby for more about using Sequel.

 

ActiveRecord: A Sneak Peek

So far in this blog, you’ve worked directly with databases and had to learn a whole new language: SQL. Working with a database with SQL in mind can make things more efficient and reliable than putting data into text files, say, as you did earlier, but ActiveRecord makes it easier still (and even easier than Sequel).

 

ActiveRecord abstracts away the details of SQL and makes it possible to relate to items within databases in an object-oriented fashion, as you did with PStore.

 

ActiveRecord gives you objects that correspond to rows and classes that correspond to tables, and you can work with the data using Ruby syntax, like so:

  • person = Person.where(name: "Chris").first
  • person.age = 50
  • person.save

 

This code looks through the people table for a row whose name column matches “Chris,” and puts an object relating to that row into a person. ActiveRecord makes attributes available for all that row’s columns, so changing the age column is as easy as assigning to the object’s attribute. However, once the object’s value has been changed, you issue the same method to save the changes back to the database.

 

Note The pluralization from a Person class to a people table is an automatic part of ActiveRecord’s functionality.

The previous code could replace SQL such as this:

SELECT * FROM people WHERE name = "Chris"; UPDATE people SET age = 50 WHERE name = "Chris";

 

Even SQL gurus familiar with Ruby tend to find Ruby’s syntax more natural, particularly in the scope of a Ruby program. There’s no need to mix two different languages in one program if both sets of features can be provided in Ruby alone.