flame.org

organized flames

Ruby Regular Expression Gotchas

Posted on February 26, 2009

I love Ruby. I love Ruby on Rails. Rarely have I found a language or a framework that just works.

However, you still have to know the finer details sometimes. I recently made a model for a DNS zone. The name in the model is the “front part” of a fully qualified domain name. For instance, if zone.name = “foo” then I would write the name into my name server’s configuration files as “foo.example.com.”

Knowing that people are evil, I saw that if a user entered a string like "example.com. NS hackerz-will-someday-rule-the-earth.ru.\nfoo", I would happily write out two lines to the configuration file, one of them rather bad.
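To make the risk concrete, here is a minimal sketch of how an unvalidated name turns into two configuration lines. The zone_line helper and its output format are assumptions for illustration; they are not the code my application actually uses.

```ruby
# Hypothetical helper: builds the record name written to the name server's
# configuration. The method name and the default domain are assumptions.
def zone_line(name, domain = "example.com.")
  "#{name}.#{domain}"
end

safe = zone_line("foo")
evil = zone_line("example.com. NS hackerz-will-someday-rule-the-earth.ru.\nfoo")

puts safe              # one harmless line
puts evil              # the embedded newline splits this into two lines
puts evil.lines.count  # number of lines the single "name" produced
```

The first line of the second output is a complete NS record that the user was never supposed to control.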

Knowing how easy this sort of data validation is in Rails, I made my model look like:

    class Zone < ActiveRecord::Base
      validates_presence_of :name
      validates_uniqueness_of :name
      validates_format_of :name,
        :with => /^[a-zA-Z0-9\-\_\.]+$/,
        :message => "contains invalid characters."
    end

Happy, I ran a few tests using my browser and found that I could not insert names with spaces, colons, tabs, etc. Then, several days later, I decided it was time to write tests for this.

    require 'test_helper'
    class ZoneTest < ActiveSupport::TestCase
      def test_name_with_newline_fails
        z = Zone.new(:name => "test\nzone")
        assert !z.valid?
        assert z.errors.on(:name)
      end

      def test_name_with_space_fails
        z = Zone.new(:name => "test zone")
        assert !z.valid?
        assert z.errors.on(:name)
      end
    end

Imagine my surprise when test_name_with_space_fails() passed, and the one I was most worried about, test_name_with_newline_fails(), did not!

Not all regular expressions are alike.

The problem was my assumption about what ^ and $ match. I thought they meant “match the beginning and end of the string.” In Ruby, however, they mean “match the beginning and end of any line contained in the string,” where lines are divided by newlines. So "test\nzone" passed validation because "test" is a complete, valid line on its own. Oops.
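The behavior is easy to reproduce outside Rails. This is a small sketch in plain Ruby using the same pattern as the model; the method name line_anchored_ok? is made up for the example.

```ruby
# Same character class as the model's validation, anchored with ^ and $.
# The method name is hypothetical; only the regex comes from the model.
def line_anchored_ok?(name)
  !!(name =~ /^[a-zA-Z0-9\-\_\.]+$/)
end

puts line_anchored_ok?("test zone")   # false: the space sits between the anchors
puts line_anchored_ok?("test\nzone")  # true: "test" is a complete line by itself

# Show which part of the string the pattern actually matched.
puts "test\nzone"[/^[a-zA-Z0-9\-\_\.]+$/].inspect  # "test"
```

The match only has to succeed on one line, so everything after the newline is never checked.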

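Changing ^ into \A and $ into \Z fixed this problem, since both anchor to the string rather than to a line. One caveat: \Z still matches just before a single trailing newline, so \z is the stricter choice when no newline should get through at all. Now I’m auditing all the code in this application to see if there are other problems like this.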
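For comparison, here is a sketch of the string-anchored versions in plain Ruby. The method names loose_ok? and strict_ok? are invented for the example. It also shows one difference worth knowing: \Z accepts a name with a single trailing newline, while \z does not.

```ruby
# \A anchors at the start of the string. \Z anchors at the end of the string
# or just before one trailing newline; \z anchors only at the very end.
# Both method names are hypothetical.
def loose_ok?(name)
  !!(name =~ /\A[a-zA-Z0-9\-\_\.]+\Z/)
end

def strict_ok?(name)
  !!(name =~ /\A[a-zA-Z0-9\-\_\.]+\z/)
end

["foo", "test\nzone", "foo\n"].each do |name|
  puts "#{name.inspect}: \\Z=#{loose_ok?(name)} \\z=#{strict_ok?(name)}"
end
```

Both versions reject "test\nzone", which is the case my test covers. Only the \z version also rejects "foo\n".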

This is just one thing to add to an ever-growing security checklist for my Rails work. It’s also a very typical security hole: programmer error.