If Twitter Goes The Advertising Model Way, They Might Hit The Jackpot

There are wild guesses all over the blogosphere as to where Twitter is headed when it comes to its business model. It has certainly become an obsession to talk about it in some circles, and the air is filled with speculations — from Calacanis’s advertising/subscription model, through TechCrunch’s pro/business or sponsored suggested accounts model, to the hilarious downtime advertising model.

Wherever it may eventually go — and my bet is it will be a mixture of pro accounts and advertising — the advertising model in Twitter might be something very innovative, both in targeting and delivery.

Targeting – Easy Semantics and The Realtime Factor

Targeting a Twitter user is very convenient. Twitter doesn’t have to assume or speculate anything about a user, doesn’t have to track cookies and collect behavioral targeting data to determine whether a user is action prone, and doesn’t need the other good old targeting tricks, such as geo-targeting or estimating household income. The simple nature of Twitter messages, which are very short and informative, makes them very easy to process semantically: aside from lolspeak and l33t, sentences are simple, usually noun/pronoun-verb-adverb-adjective. And the best part is, the user volunteers information about himself: location, likes and dislikes, actions performed, and so on.

Take me for example, in this imaginary (but could-be-true) scenario:

  • In SFO, waiting to board flight for JFK. Coffee anybody?
  • Just saw the new Pearl Jam album, love them!
  • @johndoe Let’s meet later this evening for dinner

Do you realize how much Twitter knows about me from the five minutes in which I tweeted these three? They can advertise to me flight tickets, coffee at SFO, Pearl Jam and similar music, and places to eat dinner in New York.

Now, you might say that Facebook or other social networks can provide similar targeting, but the realtime nature of Twitter makes it even more powerful. I might ditch Pearl Jam in a month in favor of Soundgarden, and Twitter will know that immediately. I might be at a conference and suddenly crave a steak. The relevance of the advertising is much better when my realtime wish or craving is in the equation.

Delivery – Unobtrusiveness and Flow

The way Twitter delivers the ads will have a very high impact on users’ responsiveness to the advertising. And the fact that Twitter has an open API, with many users accessing it through 3rd-party clients, will force them to embed the ads in the Twitter stream itself. They could be “full tweet ads” or “tweet-appended ads”.

For example:

  • In SFO, waiting to board flight for JFK. Coffee anybody?
    • From twitter: Drink Coffee at Starbucks@SFO!
    • From twitter: Next time, save on airfare with MyImaginaryTravelAgent!
  • Just saw the new Pearl Jam album, love them!
  • @johndoe Let’s meet later this evening for dinner

Or:

  • In SFO, waiting to board flight for JFK. Coffee anybody?
  • Just saw the new Pearl Jam album, love them!
  • @johndoe Let’s meet later this evening for dinner
    • From johndoe: sure let’s eat a steak! (From twitter: check out the SteakHouse on 5th and 34)

I’m For It, The World Is Ready

I would not mind receiving either of these forms of advertising, if they are well integrated into the flow of my Twitter stream, and especially if they are so well targeted. I believe today’s semantic tools are good enough to extract meaning from a user’s short and simple under-140-character sentences, since there is no hard contextual analysis to perform. And the realtime factor makes the targeting temporally aware: they know what I need when I need it. Regardless of which payment model they choose (CPC, CPA, CPM), the targeting and delivery methods are the winners here.

There, I’ve contributed my part to the Twitter business model obsession.

Connecting Several PHP Processes To The Same MySQL Transaction

The following concept is still under examination, but my initial tests proved successful, so I thought it was time to share.

Here is the problem: a batch job reads remote data through XML-RPC and updates a local MySQL database according to the XML-RPC responses. The database is InnoDB, and the entire batch job should be transacted. That is, on any failure there should be a rollback, and on success a commit.

So, the simple way is of course a single process script that uses a linear workflow:

  1. START TRANSACTION locally.
  2. Make XML-RPC request, fetch data.
  3. Insert data into db as needed.
  4. Repeat 2-3 until Error or Finish.
  5. If Error ROLLBACK, if Finish COMMIT.
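This linear workflow is easy to sketch in code. In the runnable toy version below, fetchRemoteBatch() and the array-based “database” are placeholders I made up to stand in for the real XML-RPC client and MySQL connection, just to show the control flow:

```php
<?php
// Toy sketch of the single-process workflow. fetchRemoteBatch() and the
// in-memory "database" are placeholders for the real XML-RPC client and
// MySQL connection.

function fetchRemoteBatch($i)
{
	// Pretend this is an XML-RPC request. Returning false means "Finish";
	// returning null would mean "Error".
	$batches = array(array('a' => 1), array('b' => 2));
	return isset($batches[$i]) ? $batches[$i] : false;
}

function runBatchJob(array &$db)
{
	$backup = $db;                      // 1. START TRANSACTION (snapshot)
	for ($i = 0; ; $i++) {
		$data = fetchRemoteBatch($i);   // 2. make request, fetch data
		if ($data === false) {
			break;                      // Finish
		}
		if ($data === null) {
			$db = $backup;              // 5. Error: ROLLBACK
			return false;
		}
		$db[] = $data;                  // 3. insert data into db
	}
	return true;                        // 5. Finish: COMMIT
}

$db = array();
var_dump(runBatchJob($db)); // bool(true)
var_dump(count($db));       // int(2)
```

In the real script, the snapshot/restore pair is replaced by the actual START TRANSACTION / ROLLBACK statements against MySQL.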

This works, but you may notice a bottleneck: the XML-RPC request. It uses HTTP, and it connects to a remote server. Sometimes the XML-RPC server also takes time to perform the work that generates the response. Add the network latency, and you get a single process that sits idle most of the time, waiting for a response.

So if we have a process that just sits and waits most of the time, let’s spread its work over several processes, and assume that while most of the processes will be waiting, at least one can be free to deal with the local database. This way we will get maximum utilization of our resources.

So the multi-process workflow:

  1. START TRANSACTION locally.
  2. Fork children as necessary.
  3. From child, make XML-RPC request, fetch data.
  4. From child, acquire database access through semaphore.
  5. From child, insert data into db as needed.
  6. From child, release database access through semaphore.
  7. From child, repeat 3-6 until Error or Finish.
  8. From parent, monitor children until Error or Finish.
  9. From parent, if Error ROLLBACK, if FINISH COMMIT.

Now, the workflow seems all well and good in theory, but can it work in practice? Can we connect to the same transaction from several different PHP processes?

I was surprised to find out that the answer is positive. As long as all processes share the same connection resource, they all use the same connection. And in MySQL, the same connection means the same transaction, given that a transaction was started and not yet committed or rolled back (either explicitly or implicitly).

The secret is to create the connection resource in the parent; when children are forked, they hold a reference to the same connection. The caveat is that they must access the resource atomically, otherwise unexpected behavior occurs (usually the connection hangs; my guess is that one child tries to read() from the socket while another tries to write() to it). So, in order to serialize access to the db connection, we use a semaphore. Each child can access the connection only when it’s available, and blocks when it’s not.

At the end of the workflow, our parent process acts much like a Transaction Manager in an XA transaction: according to what the children report, it decides whether to commit or roll back.

Here is proof-of-concept code (not tested in this exact version, but similar code was tested successfully):

The DBHandler Class

class DBHandler
{
	private $link;
	private $result;
	private $sem;

	const SEMKEY = 123456; // sem_get() expects an integer key

	public function __construct($host, $dbname, $user, $pass, $new_link = false, $client_flags = 0)
	{
		$this->link = mysql_connect($host, $user, $pass, $new_link, $client_flags);
		if (!$this->link)
			throw new Exception ('Could not connect to db. MySQL error was: '. mysql_error());
		$isDb = mysql_select_db($dbname,$this->link);
		if (!$isDb)
			throw new Exception ('Could not select db. MySQL error was: '. mysql_error());
	}

	private function enterSemaphore()
	{
		$this->sem = sem_get(self::SEMKEY,1);
		sem_acquire($this->sem);
	}

	private function exitSemaphore()
	{
		sem_release($this->sem);
	}


	public function query($sql)
	{
		$this->enterSemaphore();

		$this->result = mysql_unbuffered_query($sql, $this->link);
		if (!$this->result)
		{
			// release the semaphore before throwing, or the other
			// children will block on it forever
			$this->exitSemaphore();
			throw new Exception('Could not query: {' . $sql . '}. MySQL error was: ' . mysql_error());
		}
		if ($this->result === true)
		{
			// INSERT, UPDATE, etc..., no result set
			$ret = true;
		}
		else
		{
			// SELECT etc..., we have a result set
			$retArray = array();
			while ($row = mysql_fetch_assoc($this->result))
				$retArray[] = $row;
			mysql_free_result($this->result);
			$ret = $retArray;
		}

		$this->exitSemaphore();

		return $ret;
	}

	public function beginTransaction()
	{
		$this->query('SET AUTOCOMMIT = 0');
		$this->query('SET NAMES utf8');
		$this->query('START TRANSACTION');
	}

	public function rollback()
	{
		$this->query('ROLLBACK');
	}

	public function commit()
	{
		$this->query('COMMIT');
	}
}

The Forking Process

$pid = 'initial';
$maxProcs = isset($argv[1]) ? (int)$argv[1] : 3;
$runningProcs = array(); // will be $runningProcs[pid] = status;
define('PRIORITY_SUCCESS', -20);
define('PRIORITY_FAILURE', -19);

try
{
	$dbh = new DBHandler(DBHOST,DBNAME,DBUSER,DBPASS);

	$dbh->beginTransaction();

		// fork all needed children
		$currentProcs = 0;
		while ($pid && ($currentProcs < $maxProcs))
		{
			$pid = pcntl_fork();
			if ($pid == -1)
			{
				throw new Exception("fork failed");
			}
			$currentProcs++;
			$runningProcs[$pid] = 0;
		}

		if ($pid)
		{
			// parent
			echo "+++ in parent +++\n";
			echo "+++ children are: " . implode(",", array_keys($runningProcs)) . "\n";

			// wait for children
			// NOTE -- here we do it with priority signaling
			// @TBD -- posix signaling or IPC signaling.
			while (in_array(0, $runningProcs))
			{
				if (in_array(PRIORITY_FAILURE, $runningProcs))
				{
					echo "+++ some child failed, finish waiting for children +++\n";
					break;
				}
				foreach ($runningProcs as $child_pid => $status)
				{
					$runningProcs[$child_pid] = pcntl_getpriority($child_pid);
					echo "+++ children status: $child_pid, $status +++\n";
				}
				echo "\n";
				sleep(1);
			}

			echo "+++ checking if should commit or rollback +++\n";
			if (in_array(PRIORITY_FAILURE, $runningProcs) || in_array(0, $runningProcs))
			{
				echo "+++ some child had a problem! rollback! +++\n";
				$dbh->rollback();
			}
			else
			{
				echo "+++ all my sons successful! committing! +++\n";
				$dbh->commit();
			}

			// signal all children to exit
			foreach ($runningProcs as $child_pid => $status)
			{
				echo "+++ killing child $child_pid +++\n";
				posix_kill($child_pid, SIGTERM);
			}
		}
		else
		{
			// child
			$mypid = getmypid();
			echo "--- in child $mypid ---\n";
			echo "--- child $mypid current priority is " . pcntl_getpriority() . " ---\n";

			// NOTE -- this query is a placeholder, for example only
			$dbh->query("select ...");

			echo "--- child $mypid finished, setting priority to success and halting ---\n";
			pcntl_setpriority(PRIORITY_SUCCESS);
			while (true)
			{
				echo "--- child $mypid waiting to be killed ---\n";
				sleep(1);
			}
		}

}
catch (Exception $e)
{
	// output error
	print "Error!: " . $e->getMessage() . "\n";

	// if parent -- rollback, signal children to exit
	// if child  -- make priority failure to signal
	if ($pid)
	{
		// rollback
		$dbh->rollBack();
		foreach ($runningProcs as $child_pid => $status)
			posix_kill($child_pid,SIGTERM);
	}
	else
	{
		pcntl_setpriority(PRIORITY_FAILURE);
		$mypid = getmypid();
		while (true)
		{
			echo "--- child $mypid waiting to be killed ---\n";
			sleep(1);
		}
	}

}

Well, all of this sounds good, and it also worked well in a development environment. But it should be taken out of the lab and tested in a production environment. Once I give it a shot, I will update with benchmarks.

Zend Framework Database Admin

If you’re looking for a simple tool that uses Zend Framework’s robust database classes (such as Zend_Db and Zend_Db_Table), you can check out zdbform. It’s a short yet effective library that lets you perform simple administration tasks on your database with minimal coding.

It’s not a full-blown phpMyAdmin, but it’s a simple way to view, edit and add rows to your tables in a web interface. Also, don’t expect it to scale: I am sure this library was written to serve some quick table-administration needs, and it is not ready to handle large datasets. But it is very convenient if you have a small database to administer.

I implemented it with the Zend MVC components, and a brief overview follows.

In the front controller, or front plugin, or any class that your controller subclasses:

$this->db = Zend_Db::factory('Pdo_Mysql', array(
				'host'     => DB_HOST,
				'username' => DB_USER,
				'password' => DB_PASS,
				'dbname'   => DB_NAME
		));
Zend_Db_Table_Abstract::setDefaultAdapter($this->db);

Then set your controller and view scripts as necessary. Let’s say you have two tables to admin, “clients” and “history”. First make sure they are declared as subclasses of Zend_Db_Table:

require_once "Zend/Db/Table/Abstract.php";

class Clients extends Zend_Db_Table_Abstract
{
    protected $_name = 'clients';
}

class History extends Zend_Db_Table_Abstract
{
    protected $_name = 'history';
}

Your controller would look like:

require_once "Zend/Controller/Action.php";

class AdminController extends Zend_Controller_Action
{

	public function init()
	{
		require_once 'zdbform/zdbform.class.php';
		require_once 'zdbform/zdbform_widgets.class.php';
		require_once 'zdbform/zdbform_validations.php';

		parent::init();

		$this->view->headLink()->appendStylesheet('/zdbform/zdbform.css');
		$this->_helper->viewRenderer('index');
	}

	public function indexAction()
	{
	}

	public function clientsAction()
	{
		$this->view->dbform = new Zdbform('Clients');
		$this->view->dbform->setWidget('description', 'textarea');
		$this->view->dbform->processForms();
	}

	public function historyAction()
	{
		$this->view->dbform = new Zdbform('History');
		$this->view->dbform->processForms();
	}

}

And the single view script you need is admin/index.phtml:

<?php
echo $this->headLink();

include_once "Zend/Filter/Word/CamelCaseToDash.php";
include_once "Zend/Filter/Word/CamelCaseToUnderscore.php";
$cctd = new Zend_Filter_Word_CamelCaseToDash();
$cctu = new Zend_Filter_Word_CamelCaseToUnderscore();
$classes = get_declared_classes();

foreach ($classes as $class)
{
	if (is_subclass_of($class,'Zend_Db_Table_Abstract'))
	{
	?>
		<a href="/admin/<?= strtolower($cctd->filter($class)) ?>"><?= strtolower($cctu->filter($class)) ?></a>&nbsp;&nbsp;
	<?php
	}
}

if ($this->dbform)
{
	?>
	<h1>Table: <?= $this->dbform->tableName ?></h1>
	<?php
	$this->dbform->showForms();
	$this->dbform->showTable();
}
?>

There were also a couple things that needed changing in the zdbform class itself:

  • Replace all PHP_SELF with REQUEST_URI. In the MVC case, PHP_SELF is empty or index.php, and we don’t want all the forms posted there; we want them to go back to /admin/clients or /admin/history.
  • After this line
    $this->pk = $tableInfo['primary'];

    I had to add this:

    if (is_array($this->pk))
    	$this->pk = $this->pk[1];
  • zdbform->orderBy is treated as a single column; if you want multi-column sorting you have to hack a bit with getAllRows().

That’s it, point your browser to /admin and you’re good to go. In a very short time and with a little bit of code, you can get something similar to a stripped down version of phpMyAdmin, using the power of Zend Framework.

Flex on Ubuntu: The Complete How To Guide

I usually live with the axiom that whatever you can find in the realm of Windows and proprietary software, you can easily find in the realm of Linux (any flavor) and open source. While this is indeed usually the case, when it comes to a Flex IDE for Ubuntu there’s a real gap. Adobe has its Flash IDE for willing and paying Windows users, and I am happy to say that I was one of those happy customers a while ago. But since then Ubuntu has taken over my life, and when I set out to make a small Flex app a couple of days ago, I ran into some hurdles. Not impossible to overcome, but not trivial either.

The Options

There is no single complete solution for developing Flex on Linux. Many folks are looking for one, but there is still none to be found. There are many open source tools that cover vast areas of the ActionScript and SWF world, most of them listed on the wonderful osflash.org. Some of them are just right if you have a very specific task (like converting between formats, compiling a bit of code, etc.), but none provide a complete IDE that lets you both drop in WYSIWYG elements and manually code, while easily maintaining complete control over which libraries are used.

The Choice

The choice, then, is to combine as few tools as possible. I managed to get along with two tools: Flex Builder Linux Alpha and Open Dialect.

Flex Builder Linux Alpha is a free Adobe product: a Flex build environment packaged as an Eclipse plugin. Don’t worry about the Alpha part; it seems like a very stable product, and besides eating up some memory, I had no problems with it. It is actually an exact replica of Flex Builder for Windows, minus the Design View, some wizards and the profiler.

Open Dialect is the most comprehensive attempt I’ve seen at creating a graphic WYSIWYG IDE for developing Flash. It has some basic characteristics of such an IDE, like a timeline with frames and key frames, and a graphic interface for creating shapes and editing their properties.

The Method

The development cycle is quite simple once you have these two tools running. Use Open Dialect for whatever graphics or animation you need, then grab the code from the “Document Script” tab, paste it into Flex Builder in Eclipse, and start tweaking whatever is needed, adding the MXML code, etc. Open Dialect would have great potential if it enabled manual script editing, but currently it doesn’t.

Getting Things To Work

Requirements and installation of Flex Builder Linux Alpha are covered in Adobe’s release notes. In short, you need Eclipse 3.3.x, Sun JRE 1.5.x and Firefox; just follow the installation instructions there. Be sure to set Eclipse’s browser to Firefox, as mentioned in the release notes, and there are also a couple of guides to walk you through. Oh, and it’s very important to follow these instructions for your existing Eclipse plugins to survive the install. I installed the builder, lost my Subclipse and CollabNet merge plugins, and had to reinstall them.

Open Dialect uses the .NET framework, so you need Mono to run it. According to the installation instructions, you can either download pre-compiled binaries of Open Dialect, or download the source and compile it, in which case you need MonoDevelop as well. In my case, apt-get install mono was enough, and Open Dialect ran like a charm.

Tweaking and Real Life Example

Let’s go through an example of how to make a rounded rectangle that gets its color through a FlashVar.

In Open Dialect:

  1. Fire up Open Dialect.
  2. Choose Rounded Rectangle from the Shapes list on the Items pane.
  3. Draw a rounded rectangle on the canvas.
  4. Set its X and Y to 0.
  5. Go to the Document Script tab, select all the code and copy it.

In Eclipse:

  1. Fire up Eclipse, create a new Flex Project, name it “Test”, and go through the creation wizard (next, next, finish).
  2. You are now editing Test.mxml which is a skeleton Flex app file. Paste everything you copied from Open Dialect into Test.mxml, instead of its current content.
  3. Save.
  4. Hey, we’ve got errors! That’s right, the BListBox type is not defined. That’s because the script uses the “ActionScriptComponents” library that comes with Open Dialect, in which BListBox is declared, but we haven’t imported it yet, so let’s do that.
  5. Copy the directory /path/to/opendialect/ActionScriptComponents to /path/to/workspace/Test/src/
  6. Run again.
  7. Voila! The rounded rectangle shows up!

Now that we’ve covered the basics, let’s see how to pass FlashVars to our app. To do so, we need to understand what the Flex Builder environment does at build time: besides compiling the SWF, it also takes the file called index.template.html in the html-template directory, compiles it into a file called Test.html in the bin-debug directory, and then opens that file in Firefox. So, to pass FlashVars and process them in the script:

  1. Open index.template.html
  2. Scroll down to the JavaScript part under “hasRequestedVersion”; this is the part that runs the SWF on our page (assuming JavaScript is enabled and the correct version of the Flash player is installed).
  3. Under
    "src", "${swf}",

    add

    "FlashVars", "color=0x000000",
  4. Run once to see that everything is working and that we did not break the HTML template.
  5. Add the following variable declaration where all other variable declarations are:
    private var rectColor:Number;
  6. In init(), add the following line at the top of the function:
    rectColor = Application.application.parameters.color;
  7. In the next line, where the shape properties are defined, change the last two colors (which in my case were 0x000000,0xFF0000) to rectColor,rectColor:
    RRect1A.Properties[0] = new ShapeProperties(0,0,120,172,"BRoundedRectangle",1,rectColor,rectColor);
  8. Run again and see that the rectangle is now black!

Conclusion

The field of Flex development on Linux is of course bound to change as time goes by, but for now it is still in its early, unstable days. This was a brief demonstration of how to harness the IDE power of Open Dialect and the development and build power of Adobe’s Flex Builder Linux Alpha to create a working environment for developing Flex apps on Linux. It is by no means the simplest solution (my guess is that running Flex Builder in a Windows VM could be easier, albeit costly), but the solution I presented is free and conforms to the open source spirit. And, most importantly, it can get you developing Flex apps quickly and easily.

Have fun flexing!

A Couple of Sidenotes

  • You don’t have to use the ActionScriptComponents library that comes with Open Dialect if you don’t intend to use tweening or frame events. You can use the regular mx.controls classes for just plain drawing.
  • Flex applications tend to be big because the whole framework is included in the build. You can use RSL loading to cache the framework on the client side, or you can create a plain ActionScript project instead of a Flex project, still use MXML and import only the needed flash.* libraries. Both of these subjects are outside the scope of this post.

Startups Are The Marines Of The Business World

I recently finished watching Generation Kill, a seven-episode HBO mini-series depicting the advance of the Marine 1st Reconnaissance Battalion during the war in Iraq in 2003.

Without going into the many political, social and human aspects of the series, what struck me most was the remarkable resemblance between the Marines’ activity in their world and startup activity in the business world. As I watched the series, I felt there were three main themes I could relate to, each analogous to the unique day-to-day life of a startup company.

Observe Everything, Admire Nothing

While advancing through enemy territory, the Marines are ordered to pay attention to all details, but not to rest their eyes on any one point for too long, because that can lead to distraction or numbness.

The same goes for a startup advancing through a competitive market: it should be aware of all aspects of its activity, and yet has no time for research and development in fields outside its core business (as a corporation might).

Marines Make Do

While the other branches of the military (army, navy, air force) have excellent supplies, Marines have to settle for the equipment they have, and manage to pull through using nothing more than what is readily available.

The same goes for startups, which, as opposed to corporations, operate most of the time on a low budget, try to keep a low burn rate, and have to get by with what’s at hand and no more.

The Vision Is Clear, The Mission Is Constantly Changing

While the vision of the war is clear to the Marines, the mission is constantly changing. One day you storm an airfield; the next you police civilians in a city. Moreover, it seems that in military terms there is nothing more agile than a bunch of troops mounted on light Humvees.

The same goes for startups, which have a very clear business vision but usually work in a constantly changing market that dictates a constantly changing mission. One day you develop a feature for your product; the next, priorities dictate it’s business development time. And in the business world, no company is more agile than a startup made up of a bunch of entrepreneurs with a clear vision.

Well, stay frosty!

Widgetbox Missing Out: Blidgets Should Be Aimed At Developers, Not Users

Widgetbox is an amazing service with a great promise: you might have content that can be widgetized and distributed, but there are numerous formats your widget can be in, and tons of services it can be distributed on. Widgetbox is a centralized place to handle all that is widget, and it connects your content with several distribution methods (all major blogging platforms; social networks like MySpace, Facebook and Bebo; personalized homepages like iGoogle and Netvibes; and the list goes on). On top of that, it also helps your content get viral distribution.

Sounds like widget heaven; however, in reality it is far from perfect. From toying around with Widgetbox, it seems their delivery falls below their promise: sometimes because of limitations imposed on them by 3rd parties, but sometimes because of what seems like a misunderstanding of developers’ needs (or at least my needs 😉).

I set out to take some RSS feeds, or other API-available content, widgetize them, and distribute them on social networks. I encountered two major issues that prevented me from succeeding.

The first issue is their support for distribution over MySpace and Facebook, or lack thereof. Widgets can be distributed on MySpace or Facebook only if they are Flash widgets. If distributed on MySpace, outgoing links don’t work, and Widgetbox uses some sort of weird workaround that asks the user to copy and paste the link into a different browser tab (would you ever do that if a widget asked you to?). Moreover, it seems there was an option to turn your widget into a Facebook app, but it has been down for the last 6 months.

The second issue is more important in my opinion, because it is entirely up to them, not imposed by 3rd-party restrictions, and it reflects a flaw in their business perception. They have a great tool called a Blidget: a widget that takes an RSS feed and turns it into a slick Flash widget showing the recent items in the feed. The main problem with it is that it is aimed at end users, not at developers. Say I have a web service with hundreds of thousands of users, each with his own RSS feed, and I want to let them get my branded widget with their feed in it at the click of a button. I can’t make use of Blidgets. Blidgets require the user to prepare and brand them; there is no API for developers to use, nor any way for me to prepare a branded Blidget and pass the feed URL as a parameter, because a Blidget is made on a per-feed basis.

I consulted their support, and this is what they had to say:

It seems all you would have to do is get the RSS feed from them and make the widget yourself. You can put any brand or logo on the widget.

And when I said that I was looking for a scalable solution, not something I have to do manually myself:

One link is not possible. You can have a link to Widgetbox.com on your site. Then you can list there RSS feed for them to copy and paste into widgetbox to create there blidget.

Which is unacceptable as well, since if they create the Blidget, it’s not branded the way I want.

I think Widgetbox is missing out big time here, at least in the Blidget case. Instead of leveraging the communities of existing web services and the innovation of developers (they try to do this with their other widget formats; I don’t understand why that’s not the case with Blidgets), they are turning to end users in search of virality. I believe that turning to developers and enabling them to use Blidgets would multiply the use cases and the virality of Blidgets. After all, getting a million users is harder than getting 10 developers, each with a tenth of a million users on his web service.

Hardware Failure Apocalypse

I might know a thing or two about handling servers, configs, deployments and cloud architecture. But when it comes to hardware failure on my own workstation, I become a complete layman.

This is the first time my Lenovo R61 has failed me. It’s running a mighty Ubuntu 8.04 with all the components a hacker needs (from a complete LAMP stack, through PDT and a customized version of svn 1.5.1, to Inkscape and xvidcap), and it’s the first time that, after the system froze and I rebooted, I just gazed at the terminal at startup and shrieked:

Kernel panic – not syncing: Attempted to kill init!

And a whole bunch of other error messages, every time at a different stage in the boot sequence. This behavior, combined with the fact that the system simply froze without my making any dramatic changes, makes me think it’s bad RAM or some other hardware component (like here, and the disk is of course a candidate), though sometimes it seems people get over it by re-installing the kernel.

I don’t know which I prefer, hardware or software failure. I guess RAM failure is the best: just swap in new RAM. Disk failure might mean data loss, which I am sure I don’t want to handle, and recompiling the kernel can be a tedious task, but preferable to losing data and having to re-install the whole system.

And what I asked myself as I rode my bike home today is: “Why can’t I just instantiate a new instance in the cloud with the newest working snapshot of my system? Why is hardware failure in the cloud so easy to deal with, while hardware failure in the office isn’t?” And I had a vision of everybody working on machines similar to mainframe terminals, running only the basics, with the OS and all the data sitting in the cloud.

This day isn’t far. But tomorrow it’s back to the lab to (hopefully) have my RAM replaced.

TechCrunch Cashing In on CrunchBase

TechCrunch just announced a premium report analyzing 2008, based on data gathered in CrunchBase. The report is available for $149, and includes data about startup financing, products, trends and exits.

Two things strike me here.

First, CrunchBase is finally proving to be an asset, with TechCrunch cashing in on its non-stop data gathering. Whether or not this is moral is debatable: after all, it is a wiki, and most of the data is community driven. But in any case, the TechCrunch empire has another income stream, unrelated to the traditional media advertising revenue.

Second, we all know the downturn is here, but it is put visually in the graph of the number of startups founded per month:

From 170 to 20 in one year. Phew. But then again, this can also be seen from an optimistic perspective: the market is now less competitive for startups, and if you’ve got the funding and the business model, you might come through this survival-of-the-fittest battle a winner.

Make Sure Those 404 and 500 Responses Are Long Enough

Internet Explorer is a term you can’t stay indifferent to. Whenever you hear somebody say it, they either say it with respectful awe or utter it with a developer’s pain. Today I found myself on the latter side, fighting a WordPress bug.

Internet Explorer expects a minimum amount of response content when receiving HTTP errors in the 400 and 500 ranges. If the response content is smaller than this minimum, a so-called “Friendly HTTP Error Message” takes over and is displayed instead of the content.

This (rather obtrusive) behavior doesn’t play well with WordPress’s wp_die() function, which is invoked on many occasions throughout the WordPress code. This function sets the HTTP response status to 500, and then sends a very short piece of HTML containing whatever error message you’d like. These error messages are usually no more than a brief sentence, which results in a very small response body.
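A simple defensive measure in your own code is to pad short error bodies past the threshold before sending them. The commonly cited default threshold is 512 bytes for most status codes (check the per-status table for exact values); here is a small, self-contained sketch of the idea, with a function name of my own invention:

```php
<?php
// Pad a short error page so IE shows it instead of its "friendly" error
// page. 512 bytes is the commonly reported default threshold; some status
// codes use a lower one, so consult the per-status table to be sure.

function padded_error_body($message, $minBytes = 512)
{
	$body = '<html><body><p>' . htmlspecialchars($message) . "</p></body></html>\n";
	while (strlen($body) < $minBytes) {
		// HTML comments are invisible to the user but count toward the size.
		$body .= "<!-- padding to defeat friendly HTTP error messages -->\n";
	}
	return $body;
}

// In a real handler you would also send the status line first:
// header('HTTP/1.1 500 Internal Server Error');
$body = padded_error_body('Error: please type a comment.');
echo strlen($body) >= 512 ? "body is long enough\n" : "body is too short\n";
```

Padding with an HTML comment keeps the visible message unchanged while pushing the byte count over the limit, which is essentially what the WordPress workaround does as well.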

So take an IE7 browser and try to post an empty comment on any WordPress installation that has not yet been upgraded to version 2.7. See that friendly browser message? The same action in Firefox produces the expected “Error: please type a comment.” message. WordPress fixed this in 2.7 (you can copy their workaround) after some bug reports, but any installation prior to 2.7 that is still not upgraded (and many MU installations, which always trail behind in version releases) has this problem.

For any application other than WordPress, be sure to go over this table of minimum content size for each HTTP status error code, and make sure your responses are long enough.

Apparently, sometimes being laconic is bad.

How to delete those old EC2 EBS snapshots

EBS snapshots are a very powerful feature of Amazon EC2. An EBS volume is a readily available, elastic block storage device that can be attached, detached and re-attached to any instance in its availability zone. There are numerous advantages to using EBS over the local block storage devices of an instance, and one of the most important is the ability to take a snapshot of the data on the volume.

Since snapshots are incremental by nature, after an initial snapshot of a volume the following snapshots are quick and easy. Moreover, snapshots are always processed by Amazon’s infrastructure, not by the CPU of your instance, and they are stored redundantly on S3. This is why using these snapshots in your backup methodology is a great idea (provided that you freeze/unfreeze your filesystem during the snapshot call, using LVM or XFS for example).

But, and this is a really annoying but: snapshots are “easy come, hard to go”. They are so convenient and so reliable that it’s natural to have a cronned script make a daily, or, hell, hourly! backup of your volume. But then those snapshots keep piling up, and the only way to delete a snapshot is a single API call per specific snapshot. If you have 5 volumes that you back up hourly, you reach the 500-snapshot limit in just over 4 days. Not very reliable now, huh?

I searched for a while for a way to bulk-delete snapshots. The EC2 API is missing this feature, and the excellent ElasticFox add-on doesn’t compensate. You just can’t bulk-delete snapshots.

That is, until now :). I asked in the AWS forum if anything could be done about this problem. They replied that it’s a good idea, but that if I really wanted it implemented quickly, I should build my own solution using the API. So I took up the offer and came up with a PHP command-line tool that tries to emulate an “ec2-delete-old-snapshots” command, until one is added to the API.

The tool is available on Google Code for checkout. It uses the PHP EC2 library, which I bundled in (I hope I didn’t violate any license; please alert me if I did).

Usage is easy:

php ec2-delete-old-snapshots.php -v vol-id [-v vol-id ...] -o days

To delete EC2 snapshots older than 7 days for two of your volumes, you would use:

php ec2-delete-old-snapshots.php -v vol-aabbccdd -v vol-bbccddee -o 7
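Under the hood, the interesting part of such a tool is just date arithmetic over the snapshot list. This stripped-down sketch shows the selection logic; the array shape here is a simplified stand-in of my own for what the real DescribeSnapshots response gives you through the PHP EC2 library:

```php
<?php
// Select snapshots older than N days. The snapshot arrays below are a
// simplified stand-in for the real DescribeSnapshots response; in the EC2
// API, startTime is an ISO 8601 timestamp.

function snapshotsOlderThan(array $snapshots, $days, $now = null)
{
	$now = ($now === null) ? time() : $now;
	$cutoff = $now - $days * 86400;

	$old = array();
	foreach ($snapshots as $snap) {
		if (strtotime($snap['startTime']) < $cutoff) {
			$old[] = $snap['snapshotId'];
		}
	}
	return $old; // each of these then gets its own DeleteSnapshot call
}

$now = strtotime('2009-01-10T00:00:00Z');
$snaps = array(
	array('snapshotId' => 'snap-1111', 'startTime' => '2009-01-01T00:00:00Z'),
	array('snapshotId' => 'snap-2222', 'startTime' => '2009-01-09T00:00:00Z'),
);
print_r(snapshotsOlderThan($snaps, 7, $now)); // only snap-1111 is older than 7 days
```

The real tool additionally filters by volume ID and issues one DeleteSnapshot call per selected snapshot, since that is all the API offers.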

Hope this helps those of you who need such a thing. I will be happy to receive feedback (and bug fixes) if you start using it.