The Race Condition

(Still not exactly sure what cupcakin’ means but super catchy nonetheless.)

From what I’ve come to understand, a race condition is what you get in computing when the result of a program depends on the timing or ordering of its operations, so that a small delay can lead to a different outcome. I was first introduced to this idea last Tuesday when I went to Bennington College to watch my friend Alexa give her final project presentation for the class Data Structures in C. Her project was an image-processing command line tool (for her astrophysics research) using threads and a queue data structure. To demonstrate its functionality, her program read in a large image and inverted the pixels. The threads performed the pixel inversion on multiple bands of the image at the same time, greatly decreasing the run time.

At the end of the presentation, Alexa cheekily announced, “Wait! I forgot to tell you about my race condition.” She had noticed that two threads could compete for the same band of the image in the queue. Between the moment thread 1 checks whether the queue is empty and the moment it dequeues a band, thread 2 could also check the queue and dequeue that band, and this would cause a crash if it was the last band. Part of the project was making the queue thread-safe with a mutex, but she realized that the check (queue empty?) and the dequeue were in different places in the code, so nothing stopped another thread from running between them. She jokingly said a solution to this might be to make the threads perform one at a time, but that would defeat the purpose of using threads in the first place.
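To make the check-then-dequeue problem concrete, here is a minimal sketch in Ruby. It is not Alexa's actual C code: `BandQueue`, `dequeue_or_nil`, and `invert_all` are names I made up for illustration. The idea is that the emptiness check and the dequeue happen under a single lock acquisition, so the threads still run in parallel but can never both grab the last band.

```ruby
require 'thread'

# Hypothetical stand-in for a thread-safe queue of image bands.
class BandQueue
  def initialize(bands)
    @bands = bands.dup
    @lock = Mutex.new
  end

  def empty?
    @lock.synchronize { @bands.empty? }
  end

  # Raises on an empty queue, standing in for the crash described above.
  # Calling empty? and then dequeue is racy: another thread can run between them.
  def dequeue
    @lock.synchronize do
      raise IndexError, "queue is empty" if @bands.empty?
      @bands.shift
    end
  end

  # Check and dequeue in one locked step; nil means "no work left".
  def dequeue_or_nil
    @lock.synchronize { @bands.shift }
  end
end

# Drains the queue with several worker threads and returns the bands handled.
def invert_all(queue, thread_count)
  processed = Queue.new # thread-safe collector from the standard library
  workers = Array.new(thread_count) do
    Thread.new do
      while (band = queue.dequeue_or_nil)
        processed << band # stand-in for inverting the band's pixels
      end
    end
  end
  workers.each(&:join)
  Array.new(processed.size) { processed.pop }
end

done = invert_all(BandQueue.new((1..8).to_a), 4)
p done.sort
```

Every band comes out exactly once, no matter how the threads interleave, because no thread ever acts on a stale "queue is not empty" answer.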

Coincidentally, the next day at work I started writing my first piece of code for Bandcamp and I discovered I had a race condition of my own.

Liquid is the name of a templating language Bandcamp uses. It can be used similarly to other web scripting languages that are integrated into the HTML of a page. Liquid lets us, for example, easily send email receipts with a fixed format but with different information in each email. Liquid is the code that I’m going to need to change if I want to add anything to the email receipts. But first I’m going to need a way to test the changes I’m making.

This is where the local version of the Bandcamp website that I installed on my first day will come in useful. Since my local site has no users, we use a part of the Bandcamp codebase called Faker.rb to quickly generate a test band account, fan account, both, or make anonymous sales. In order to test the receipt that would be sent out, first I would need to generate a fake band and then create anonymous sales for it. But, the code for making a fake band just creates digital albums for it, and I need to test shipping addresses, which means I’m going to need to make sales for merchandise. You know what that means: I get to write some code!

By looking at the Ruby code in the Faker class that makes albums (and with a little help from debugport, a sort of irb backdoor to a running Ruby program), I was able to create a similar function called make_package that makes a fake piece of merchandise for the first album it makes: a t-shirt!

In the test interface it worked! But the code wasn’t quite right yet.

Essentially all that this code needed to do was update the two database tables where the package information is recorded. By the way, something I just learned is that inserting strings taken directly from the user can be dangerous! Certain key characters in SQL, like quotation marks and =, can alter the query if they are inserted directly into a SQL statement, and that can be used for nefarious purposes. This is called a SQL injection attack. Scary, eh? I had never even considered this could be a fault, let alone a security problem. To remedy this, special characters can be escaped with the backslash character so that the database treats them as literal text rather than as part of the query. So in practice, all variables should be ‘escaped’ before being inserted into a SQL query.
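Here is a small sketch of what escaping buys you. The helper `quote_sql_string` is hypothetical and only handles single quotes and backslashes; it is not Bandcamp's SQL.escape, and a real project should rely on the escaping or parameter binding that its database driver provides.

```ruby
# Hypothetical helper, for illustration only: backslash-escapes single quotes
# and backslashes, then wraps the value in single quotes.
def quote_sql_string(value)
  escaped = value.to_s.gsub(/[\\']/) { |ch| "\\#{ch}" }
  "'#{escaped}'"
end

# Builds the query by pasting the input straight in (the dangerous way).
def unsafe_title_query(title)
  "SELECT * FROM packages WHERE title = '#{title}'"
end

# Builds the same query with the input escaped first.
def safe_title_query(title)
  "SELECT * FROM packages WHERE title = #{quote_sql_string(title)}"
end

user_input = "x' OR '1'='1"
puts unsafe_title_query(user_input)
puts safe_title_query(user_input)
```

In the first query the quote in the input closes the string early, so the rest of the input becomes part of the SQL and the WHERE clause matches every row. In the second, the quotes are escaped and the whole input stays inside one string literal.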

With the escape statements inserted, now my SQL queries are safe and I’m getting the output I want. Then I found my race condition. So, just for fun, here is my code. See if you can find the race condition — think about what could happen in the database if two processes call make_package at the same time.

def make_package(album_id, band_id)
    q_band_id, q_now = SQL.escape(band_id, Time.now.utc)
    title = "#{@key} fun t-shirt"
    q_title, q_album_id = SQL.escape(title, album_id)

    # create package
    SQL.update("
        INSERT INTO packages
        SET title = #{q_title}, type_id = 11,
        new_date = #{q_now}, mod_date = #{q_now},
        shipping_local = 10, shipping_regional = 10,
        shipping_intl = 10, fulfillment_days = 1, private = NULL,
        band_id = #{q_band_id}, is_set_price = 1, price = 10,
        quantity = 20, sku = 'PFB6-PFA1-PFTS',
        new_desc_format = 1, album_id = #{q_album_id}
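    ")

    # NOTE: q_package_id is used below but never assigned in this excerpt.
    # Assumption: the SQL helper reuses a single MySQL connection, so
    # LAST_INSERT_ID() returns the id of the package row inserted above.
    package_id = SQL.query("
        SELECT LAST_INSERT_ID() AS package_id
    ").collect.first.package_id
    q_package_id = SQL.escape(package_id)
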
    # generate new index for this package (where it is in the list)
    index = SQL.query("
        SELECT MAX(tp.index) AS max_index
        FROM tralbum_packages AS tp
        WHERE tp.tralbum_id = #{q_album_id}
    ").collect.first.max_index || 0
    index += 1
    q_index = SQL.escape(index)

    # create association between album and package
    SQL.update("
        INSERT INTO tralbum_packages 
        SET tralbum_type = 'a', tralbum_id = #{q_album_id}, 
        package_id = #{q_package_id}, `index` = #{q_index}
    ")
end

Find it? Look at where I first assign the variable index using a SQL query. If there is more than one item being sold that is associated with the same album, each one needs to have a different index. So, I grab the maximum index in one query, add one to it, then insert a row using the new value. The race condition is this: an index may be duplicated if a package is added to or deleted from the album in the time it takes this code to increment the index, escape it, and perform the SQL insert statement. Using this code you could either insert an index that is too big or one that already exists in the table. You have to consider that a huge number of changes can occur between every line of your code. In this particular case the race condition isn’t much of a problem because this is only test code that’s used in the development environment, where there isn’t exactly heavy traffic. But it is important (and pretty cool) to be able to recognize these types of bugs.
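To see the unlucky timing without needing a database, here is a sketch that replays it by hand against an in-memory stand-in for the tralbum_packages table. `FakePackageTable`, `insert_next`, and `replay_race` are hypothetical names, not part of the Bandcamp codebase. The fix it shows, doing the read and the insert under one lock, is only an analogy; in a real database the equivalent would be something like a transaction with appropriate locking or a uniqueness constraint on the index.

```ruby
# Hypothetical in-memory stand-in for the tralbum_packages table.
class FakePackageTable
  def initialize
    @rows = [] # each row is [tralbum_id, index]
    @lock = Mutex.new
  end

  # Like the SELECT MAX(...) query: highest index for the album, or 0.
  def max_index(tralbum_id)
    @lock.synchronize { indexes_unlocked(tralbum_id).max || 0 }
  end

  # Like the INSERT statement: adds one row.
  def insert(tralbum_id, index)
    @lock.synchronize { @rows << [tralbum_id, index] }
  end

  # Read and insert under one lock, so no other caller can slip in between.
  def insert_next(tralbum_id)
    @lock.synchronize do
      index = (indexes_unlocked(tralbum_id).max || 0) + 1
      @rows << [tralbum_id, index]
      index
    end
  end

  def indexes(tralbum_id)
    @lock.synchronize { indexes_unlocked(tralbum_id) }
  end

  private

  # Caller must already hold @lock (Ruby's Mutex is not reentrant).
  def indexes_unlocked(tralbum_id)
    @rows.select { |id, _| id == tralbum_id }.map { |_, index| index }
  end
end

# Replays the bad interleaving: both callers read before either one writes.
def replay_race(table, tralbum_id)
  a = table.max_index(tralbum_id) + 1 # caller A sees max 0
  b = table.max_index(tralbum_id) + 1 # caller B also sees max 0
  table.insert(tralbum_id, a)
  table.insert(tralbum_id, b)
  table.indexes(tralbum_id)
end

p replay_race(FakePackageTable.new, 1) # both rows end up with index 1

safe = FakePackageTable.new
20.times.map { Thread.new { safe.insert_next(1) } }.each(&:join)
p safe.indexes(1).sort == (1..20).to_a
```

Each individual operation in the racy version is thread-safe on its own, just like Alexa's queue; the bug lives in the gap between the read and the write.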

Terrific! Now that I’ve updated the testing functions I can test the code I will write to update the receipts. Wish me luck!
