Scaling RSS Mixer – Analyzing Google Crawl Stats

The public alpha for RSS Mixer has now been up for a week. The site started out with around 18,000 feeds in the directory, added over the past year since the initial prototype launched last summer. The count now stands at 24,000, a relatively large increase for our first week. Those feeds translate into nearly 3 million posts in the RSS Mixer directory.

A mention in Mashable! (RSS Mixer Makes Mashup Easier), a number of mentions in China (most notably Web Share 2.0), and a post on a Spanish-language blog (Geeks Room), among others, helped add nearly 3,000 feeds in just 24 hours. Things have slowed down a bit since, but we are still serving up a lot of pages and supporting an ever-increasing number of widgets and feeds.

When the prototype site went live last year, we were swamped and the site was crushed by spikes in traffic. This time around, the structure of the application and the database is much improved. Not only can we handle the load, but the time it takes to deliver pages has also dropped sharply. Take a look at the chart below from Google Crawl stats. It shows how Googlebot (which indexes pages) has spidered the RSS Mixer site over the last 90 days. You’ll notice a huge spike in activity when the alpha site replaced the prototype, as new pages were added to Google. Check out the bottom graph and you’ll see the download time fall off the chart!

This drop-off shows the performance improvement in RSS Mixer. Of course, if we continue to add 6,000 feeds and approximately 750,000 posts every week, we’ll have to revisit our site structure in the coming months.
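For anyone curious where the weekly post figure comes from, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted in this post, rounds "nearly 3 million" up to 3,000,000, and assumes growth stays at the first week's pace, which is purely an assumption for illustration. The names (`posts_per_feed`, `project`) are made up for this example and are not part of RSS Mixer.

```python
# Figures quoted in the post; "nearly 3 million" is rounded to 3,000,000.
FEEDS_NOW = 24_000
POSTS_NOW = 3_000_000
FEEDS_ADDED_PER_WEEK = 6_000  # 24,000 now minus 18,000 at launch


def posts_per_feed(posts: int, feeds: int) -> float:
    """Average number of posts stored per feed."""
    return posts / feeds


def project(weeks: int) -> tuple[int, int]:
    """Linear projection of (feeds, posts) after `weeks` more weeks.

    Assumes the weekly feed growth and the posts-per-feed average both
    stay constant, which is an illustrative assumption only.
    """
    feeds = FEEDS_NOW + FEEDS_ADDED_PER_WEEK * weeks
    posts = round(feeds * posts_per_feed(POSTS_NOW, FEEDS_NOW))
    return feeds, posts


# New posts implied by one more week of feed growth at the current average.
weekly_posts = FEEDS_ADDED_PER_WEEK * posts_per_feed(POSTS_NOW, FEEDS_NOW)

if __name__ == "__main__":
    print(f"Average posts per feed: {posts_per_feed(POSTS_NOW, FEEDS_NOW):.0f}")
    print(f"New posts per week: {weekly_posts:,.0f}")
    for weeks in (4, 12):
        feeds, posts = project(weeks)
        print(f"After {weeks} weeks: {feeds:,} feeds, {posts:,} posts")
```

At roughly 125 posts per feed, 6,000 new feeds a week works out to about 750,000 new posts, and a straight-line projection quadruples the directory in three months. Real growth will not be this tidy, but it gives a sense of why the site structure may need another look.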
