Sunday, November 16, 2008

Sinatra speaks a different language, Perl

As may have become obvious to some, I have a thing for web frameworks. I like to understand how and why they work, and for me that usually involves more than just reading the source code. My first attempt was different variations of Puddy -- a Rails/Merb-like framework in Perl. The second iteration has led me down the path of the Ruby Sinatra framework.

Sinatra is a minimalist framework for creating web applications. Its scope really only extends into the realm of controllers. It does support views, but it is still far from the enormous range of options that Rails provides. The controller aspect allows you to define the route in the same method call as the action -- often resulting in your entire application being just one file. I have used it to experiment with building APIs for some of my projects.

In short, it's light, quick, and not at all a memory hog, so of course I wanted to dissect it. This led me to reading the source code, but I wanted to do more to understand how it worked. I should inform you that I do indeed know Ruby. I've used it on a daily basis for the past 2 years for my job(s) and have used it well beyond just the Rails environment. I have also been a big supporter of Perl, mainly because it was my first language beyond good old Q-BASIC.

With that in mind, I would like to introduce a project that I have been working on: Sinatra for Perl. Yes, it is a work in progress, but I have put a lot of effort into making sure that the code base was somewhat solid before I released it to the world. The repository includes a working example of the feature set that you can run on your local machine.

I like the ease of extending Sinatra, in Ruby and now in Perl. I have added simple extensions to support page caching, running tasks from the command line, and a simple background job server. Many of these lack optimizations, but I just wanted to show how easy it is to extend the framework.

The simplest example of a Sinatra app:


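#!/usr/bin/env perl
# Load the framework, which provides the get() routing function
require 'lib/sinatra.pl';

use strict;

# Root route: respond with a fixed string
get('', {}, sub {
    return 'Hello World';
});

# ":name" captures a path segment and exposes it through the request's params
get(':name', {}, sub {
    my $r = shift;
    return 'Hello, ' . $r->params->{name};
});
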
A simple DSL that defines what action to perform on a route. LOVE IT!
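
To make the idea concrete, here is a minimal sketch, in Ruby, of how a routing DSL like this could map a path to a handler. It is not the code of Sinatra or of the Perl port: the names ROUTES and dispatch are made up for illustration, and only GET routes with ':name' style segments are handled.

```ruby
# Illustrative only: a tiny Sinatra-style route table, not the real thing.
ROUTES = []

# Register a handler for a GET route. A segment such as ":name" becomes
# a capture group whose value is passed to the handler in a params hash.
def get(pattern, &handler)
  names = []
  source = pattern.split('/').map do |segment|
    if segment.start_with?(':')
      names << segment[1..-1]
      '([^/]+)'
    else
      Regexp.escape(segment)
    end
  end.join('/')
  ROUTES << { regex: /\A#{source}\z/, names: names, handler: handler }
end

# Call the handler of the first route that matches the path.
def dispatch(path)
  ROUTES.each do |route|
    match = route[:regex].match(path)
    next unless match
    params = route[:names].zip(match.captures).to_h
    return route[:handler].call(params)
  end
  nil # no route matched
end

get('') { |_params| 'Hello World' }
get(':name') { |params| "Hello, #{params['name']}" }

puts dispatch('')      # handled by the root route
puts dispatch('perl')  # handled by the :name route
```

The whole trick is that get() only records a pattern and a block; nothing runs until a request path is matched against the table.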

This has been an educational experience (albeit a nerdy one) and I hope to continue on this project. I think I will probably have to go through a name change in the near future so as not to confuse people with the Ruby version.

Monday, November 10, 2008

rails plugin to white label CDNs

This is a continuation of my previous post talking about using S3 as a CDN. This post covers some of the issues we faced in making Rails work nicely with a CDN.

Rails is made to do great things, but as many before me have said, handling concurrent connections is not one of them. Once a request comes into Rails, that process (Mongrel, FastCGI, etc.) is blocked till the request is done. Actions like sending emails, transferring files, and large calculations need to be pushed away from the user request into another process. There are many solutions, such as BackgrounDRb, Starling, etc., which allow you to offload long-running tasks so that they do not block Rails.

The task of handling files on a remote server is always a tricky one. Each CDN has its own interface for interacting with the files -- delete, update, move, etc. This proved to be a problem when trying to test which one would work cleanly with our setup -- widgets stored in the database, which are updated immediately for all users on the Chumby network.

I took a top-down approach to the problem. I designed how I wanted the widgets to move from our servers to another server by building the API of methods first. These methods were just a skeleton and did nothing. This allowed me to write the code I expected instead of working around another CDN's or module's API. The methods could then be filled in with the appropriate code to make it work with the CDN.

Originally, the CDN of choice was S3 -- not a true CDN, but it suited our purpose of offloading the serving of dynamic content from our servers. The necessary API calls were filled in to support the functionality of S3. It was then that I realized that, with the top-down approach, the work I had done could easily be modified to work with any CDN method. So far I have written extensions for just rsync, ssh, and S3, but that has supported our need to test multiple environments and services quickly.
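
A rough sketch of that top-down shape, in Ruby, might look like the following. Everything here is hypothetical: the class names (RemoteStore, MemoryStore), the method names (put, get, delete) and the publish_widget helper are invented for illustration and are not data_fu's actual API. The in-memory backend simply stands in for a real one such as S3, rsync, or ssh.

```ruby
# Hypothetical storage interface: the skeleton written first, doing nothing.
# These names are illustrative and are not taken from data_fu.
class RemoteStore
  def put(key, data)
    raise NotImplementedError, "#{self.class}#put is not implemented"
  end

  def get(key)
    raise NotImplementedError, "#{self.class}#get is not implemented"
  end

  def delete(key)
    raise NotImplementedError, "#{self.class}#delete is not implemented"
  end
end

# One backend filling in the skeleton. It keeps files in a hash and stands
# in for a real backend (S3, rsync, ssh) so the example runs anywhere.
class MemoryStore < RemoteStore
  def initialize
    @files = {}
  end

  def put(key, data)
    @files[key] = data
    key
  end

  def get(key)
    @files[key]
  end

  def delete(key)
    !@files.delete(key).nil?
  end
end

# Application code depends only on the interface, so swapping the backend
# does not require touching this method.
def publish_widget(store, name, swf_bytes)
  store.put("widgets/#{name}.swf", swf_bytes)
end

store = MemoryStore.new
puts publish_widget(store, 'clock', 'fake swf bytes')
```

Because the calling code only ever sees put, get, and delete, trying out a different service means writing one more subclass rather than reworking the application.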

This plugin, data_fu, is currently a work in progress, but I hope to finalize it and modify it for widespread use.

NOTE: At the time of writing this software, there were many CDN solutions that handle transparent proxying and caching of content. These solutions were not used because the content of widgets needs to be updated instantly. Instead, the solution was to use a vanilla CDN mechanism so that we could change CDN providers immediately if one went down and a switchover was needed.

Monday, September 15, 2008

project euler repo

I have recently gotten back into the Project Euler problems. The website provides a bunch of math and logic problems that can be solved any way you like -- paper and pen or programming. I like the challenge every once in a while, so I thought I might share some of the code I have written to help solve the problems. There are comments in most of these. They are just quick and dirty hacks to help me get the answer. Sometimes the output will not be the answer and you might need to go look for it, but the logic is there.

Saturday, September 13, 2008

new job and location

I have neglected my duties as the maintainer of this blog for a few months. It has been for the better though. About a month ago, I left my position at Chumby and moved to San Francisco to start working at [context]. I had a great time and experience working at Chumby, but I felt that San Francisco is where I needed to be for both work and experience. I grew up in one of the largest (and best) cities in the world, and I missed the lifestyle that came with it. San Diego has great weather, but I hated driving everywhere.

Now I am in San Francisco. Trying to sell my car. And enjoying my new job, people, and experience.

If you are in the area, please contact me; it would be nice to meet some readers. :)

Flash on S3

This is a continuation of my previous post talking about using S3 as a CDN. This post discusses some of the issues that occurred with hosting Flash content on S3, and the solutions for them.

Problems started to occur once Flash SWFs were loaded from the Chumby device and the Chumby website. SWFs have a built-in security mechanism known as cross-domain policies, which allow the owner of a domain to specify which domains have access to that domain. Think of it as a robots.txt for Flash SWFs.

With S3 there are two ways to access content from a bucket -- an AWS-based URL or a CNAME from your domain that points to AWS (Amazon Web Services). When the Flash content is on S3, the Flash player looks for the crossdomain file via the AWS URL path. We set up a CNAME 'swf.chumby.com' and placed a crossdomain.xml that could be accessed via http://swf.chumby.com/crossdomain.xml and also one on the top level at http://chumby.com/crossdomain.xml. This allowed us to control which SWF movies could load the widgets.
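
For readers who have not seen one, a cross-domain policy file is a small XML document listing the domains that are allowed access. The sketch below, in Ruby, generates one from a list of domains. The domain values are assumptions chosen for illustration (the actual entries in our files are not shown here), and crossdomain_policy is a made-up helper name.

```ruby
# Build the text of a Flash cross-domain policy file for the given domains.
# Helper name and example domains are illustrative assumptions.
def crossdomain_policy(domains)
  lines = ['<?xml version="1.0"?>', '<cross-domain-policy>']
  domains.each do |domain|
    lines << %(  <allow-access-from domain="#{domain}" />)
  end
  lines << '</cross-domain-policy>'
  lines.join("\n") + "\n"
end

# The result would be saved as crossdomain.xml at the root of each host.
puts crossdomain_policy(['chumby.com', '*.chumby.com'])
```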

Playing the SWF as a standalone Flash movie never showed any problems. When it was loaded via http://swf.chumby.com, the SWF would claim its domain to be swf.chumby.com instead of an AWS domain. On the Chumby website there is a way to preview the content that will appear on your Chumby -- the Virtual Chumby. The SWF for the Virtual Chumby lives on the main chumby.com website. With the crossdomain file in place, it was able to load the widgets from swf.chumby.com with no problem, but when it wanted to send parameters to the widget, a problem occurred.

Flash apparently has various sandbox models for SWF files. This is good because it allows a SWF to maintain a level of security and ensures your data is protected. This bit us in the ass though. Since a SWF can grant only certain (sub)domains the ability to send it parameters, we had thousands of widgets that we could play, but they didn't have access to any of the information that made them work well within the Virtual Chumby. There were two possible solutions. The first was to change every widget to include the allowDomain code, which would take weeks of contacting 3rd party developers, countless resources, etc. The second solution was even tougher: it would require moving the Virtual Chumby SWF over to the swf.chumby.com domain and updating the links to it on our website. :)

s3 as a CDN

I worked for a company that provides widgets as a primary resource for our product, the chumby. These widgets are purely static content in the form of Flash SWF files and an associated JPEG thumbnail. This content is provided from our servers from both dynamic (database) and static (file server) resources. These resources are ready to scale to a certain calculated amount before we have to worry about more servers, bandwidth, etc. We try to stay ahead of the curve with growth.

The scaling numbers show that we can do one of two things -- expand our servers and utilize more bandwidth, or use a CDN to provide our content utilizing caching. In short, the most cost-effective solution is S3. Our content, widgets that can change instantaneously when someone uploads a new one, needs to be provided to all users within a reasonable time. A normal CDN could take minutes to hours to propagate and would take time to integrate. Expanding our servers would mean more time and maintenance on our end.

The architecture we have decided on is a two-tier distribution, which will provide redundancy for widgets. The widgets will exist on our servers in the database and on S3. Our database server is used to hold the widgets because it's easy to back up, restore, and replicate. With our current system, when a user uploads/updates a widget, it is saved in the database directly, so the newest version can be pushed to users as soon as it gets approved.

Transferring files to S3 has proven to be quite simple to implement. The main problem has been adjusting our architecture to adapt to external URLs. On the frontend (website) side, changing URLs is pretty trivial, and all browsers support cross-domain loading of content.

Pushing widgets to the database is easy. A simple create/update with ActiveRecord and you're done. When a user uploads a widget, the file is saved to the database in the same POST request, so there is no delay, and any problems or errors with the file are reported in real time. This is a blocking operation for Rails, but with size limits imposed on the database, model, and web server it shouldn't be too slow.

Transferring the widgets to S3 from our database in 'real time' is the tricky part. This is a blocking operation that depends on factors beyond our control. The S3 servers could be down, our bandwidth pipe could be saturated with web hits so that uploads to an outside server are slow, etc. This is a blocking operation no matter what, but one we don't want the user to have to wait for when they upload a new widget. The solution was to push the transfer of a widget to S3 to a job server, whose main purpose is to queue long-running tasks. The job server was built using BackgrounDRb, which integrates well with Ruby on Rails.
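
The pattern can be sketched with nothing but Ruby's built-in Queue and a worker thread. This is a simplification and not BackgrounDRb's API: the real job server was a separate process, whereas here a thread in the same process plays that role. The BackgroundUploader class and its method names are invented for this example, and the upload itself is faked with a short sleep.

```ruby
# Illustrative stand-in for a job server: requests enqueue work and return
# immediately, while a single worker performs the slow uploads in order.
class BackgroundUploader
  attr_reader :uploaded

  # The block is the slow operation (for example, a transfer to S3).
  def initialize(&upload)
    @upload = upload
    @jobs = Queue.new
    @uploaded = []
    @worker = Thread.new { work }
  end

  # Called from the request: cheap, never waits on the remote server.
  def enqueue(name)
    @jobs << name
    :queued
  end

  # Finish the queued jobs, then stop the worker.
  def shutdown
    @jobs << :stop
    @worker.join
  end

  private

  # Worker loop: block until a job arrives, run it, record it.
  def work
    while (name = @jobs.pop) != :stop
      @upload.call(name)
      @uploaded << name
    end
  end
end

uploader = BackgroundUploader.new { |_name| sleep 0.05 } # pretend upload
puts uploader.enqueue('clock.swf') # returns right away
uploader.shutdown
puts uploader.uploaded.inspect
```

The user-facing request only pays for the enqueue call; a slow or unreachable remote server delays the worker, not the person uploading.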

This post is to be continued in follow-up posts. There is still so much more to cover about the problems we had with Flash and the framework built to white label CDNs.

Thursday, August 28, 2008

very accurate description

It's not too often that I read something that describes a certain type of person so well. This blog posting describes a particular type of programmer personality. Programmers come in many shapes, sizes, and mentalities. I am not saying this is a perfectly accurate description of who I am, but there is one paragraph that I read and was like whoa! I am not going to quote the paragraph because I really think this is a blog posting that all should read to learn about me. :)