Twitter’s one-to-many scaling impossible?

Twitter has been having all kinds of scaling challenges. There have been hundreds, if not thousands, of posts on the subject. Dave Winer pushed an idea for a decentralized Twitter (and has since admitted the power of Twitter is in its centrality). There is a single, simple reason for Twitter’s challenges – math is against them.

The facility of communication on the Twitter service is absolutely outstanding. I’ve written extensively about using it to receive an amazing amount of quality information in my series on flow.

I originally questioned the scaling ability of the service prior to SXSW, but when the service held up I went back to the drawing board to make sure my numbers were correct.

Before continuing, let’s establish the basics about the service so the math will make sense…

  • Each Twitter account can follow any other Twitter account (bear with me and forget those accounts with private updates).
  • Messages travel in one direction, from the updater to the follower.
  • Each account has updates from other accounts it follows placed in its timeline.
  • A Twitter account can selectively receive pushed updates immediately via instant messenger and SMS in addition to having an update added to its timeline.
  • An update added to an account’s timeline may or may not be push based (let’s assume it’s demand driven, or pull based).
  • An update sent to a follower who has marked the sending account for SMS or IM announcements is push based (there is no other way to deliver such an update – it must be actively pushed from the server).
  • The mere possibility of an update needing to be pushed requires the system to check each follower’s settings, and so to examine every follower for every update.
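
The last point can be sketched in a few lines of Python. This is only an illustration of the model described in the list above, not Twitter's actual implementation; the `Follower` class, its `push_channel` field, and the `fan_out` function are hypothetical names introduced for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Follower:
    """Hypothetical follower record; not Twitter's real data model."""
    name: str
    # "sms", "im", or None when this follower only reads the timeline (pull).
    push_channel: Optional[str] = None
    timeline: list = field(default_factory=list)


def fan_out(update: str, followers: list) -> tuple:
    """Deliver one update and return (settings_checks, pushes).

    Every follower's settings are inspected, even if nobody wants a push.
    """
    checks = 0
    pushes = 0
    for follower in followers:
        checks += 1                       # one settings lookup per follower
        follower.timeline.append(update)  # timeline entry, read on demand
        if follower.push_channel is not None:
            pushes += 1                   # an SMS or IM would be sent here
    return checks, pushes


followers = [
    Follower("a", push_channel="im"),
    Follower("b"),
    Follower("c", push_channel="sms"),
]
print(fan_out("hello", followers))  # (3, 2)
```

Even though only two of the three followers want a push, the loop still touches all three, which is the per-follower cost the list describes.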

A warm-up equation

If there are one hundred (100) users and each user follows ten (10) fellow users, and each user sends ten (10) updates per day, assuming all updates are push-based, how many updates are sent?
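
One way to work the warm-up, as a sketch: if every account follows ten others, there are 100 × 10 follow relationships, and each relationship carries ten pushed updates per day. The function name and parameters below are illustrative, and the result reflects only this reading of the question, not a figure taken from the full article.

```python
def daily_deliveries(users: int, follows_per_user: int, updates_per_user: int) -> int:
    """Count pushed deliveries per day when every update goes to every follower.

    Assumes every account sends the same number of updates, so each follow
    relationship carries `updates_per_user` deliveries per day.
    """
    follow_relationships = users * follows_per_user
    return follow_relationships * updates_per_user


print(daily_deliveries(100, 10, 10))  # 10000
```

Under these assumptions the 1,000 updates written per day turn into 10,000 deliveries, and the total grows with the product of the three inputs.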

Continue Reading

Flow – Jabber/XMPP as an RSS over HTTP replacement

Twitter on XMPP is just the beginning…

Speed of Light

Courtesy NASA Glenn Research Center


I’ve been using Twitter as a main source of news and entertainment (it’s entertaining and informative to have commentary coming in with links, events, articles, and photos). Almost everything pertinent to my areas of interest is discussed, so the latest news is passed around as discussion.

As my series on flow describes, my Twitter stream is received through a GTalk client and I’m receiving about 30 to 40 tweets per minute.

This is a lot of incoming information. A lot more than one could read and keep up with all day. It’s valuable for periods of time… Jump into the river, jump out. This is sort of like news.

Now, I love RSS. I spend a good hour per day reading feeds. I believe it will be the standard in syndication for years to come. And maybe it will be the format passed over XMPP channels, too. In using Twitter for my flow of information I have discovered how amazing real-time updates of news can be, and how HTTP (the current method of pulling RSS feeds from various servers) isn’t powerful enough.

Imagine Google Reader being push based. Instead of periodically receiving items every five, ten, or fifteen minutes, you receive new blog entries, articles, etc., within milliseconds of their publication. This becomes amazingly powerful because you are no longer reading what happened, you are participating in what is happening.
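
To put rough numbers on the polling side, here is a small sketch. It assumes an item is equally likely to be published at any moment between two polls, in which case the average wait is half the polling interval; the function name is made up for the example.

```python
def mean_polling_delay_minutes(interval_minutes: float) -> float:
    """Average wait before a poller sees a new item.

    Assumes publication time is uniformly distributed between two polls.
    """
    return interval_minutes / 2


for interval in (5, 10, 15):
    print(interval, mean_polling_delay_minutes(interval))
# 5 2.5
# 10 5.0
# 15 7.5
```

With push delivery that average wait drops to whatever the network and server take to forward the item.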

Comment systems become conversation engines. Discussions and exchanges of information become natural, rather than one-way.

HTTP and web services, with their beautiful RESTfulness, won’t be going away. They have a very effective place for on-demand pulls of data. What I’m describing is a move away from the HTTP and web services that currently poll, toward enabling FriendFeed, Twitter, blogs, and news services to fire off announcements on a push basis…

Nobody wants to wait three minutes before receiving their next round of updates. We want it when it happens.

All incoming Twitters are saved and searchable in Gmail

I came by this as a side effect of switching to my flow method of using Twitter. It seems a lot of people want a quick and easy way to save their Twitter stream and be able to search it later…

To do this, you need to set up Twitter so you’re getting (or also getting) your updates via a GTalk/Gmail account. It’s very easy:

First – set up Chat in Gmail

1. If you don’t have a Gmail account, get one! After logging in, go to “settings” and hit the “Chat” tab.
2. Choose to “Save chat history in my Gmail account”.
3. Save this setting.

Second – set up Twitter to send notices to your Gmail account

1. In your Twitter account, go to “Settings” -> “Phone & IM”.
2. Enter details for your Gmail account.
3. Save the settings.
Note: Only updates from Twitterers you follow and have selected for IM updates will be sent to your Gmail account.

Last – Log in to Gmail and keep that browser open


1. Choose to sign in to chat. Your Twitter updates will start arriving in Gmail.
2. Keep a tab or window open. If you log out of Gmail, or close the browser or tab, the updates will stop arriving since Twitter only sends updates to users that are logged in. Simply keep a browser tab open (very easy to do if you’re already a Gmail aficionado).

Flow – Day 9 – I switched to iChat for Twitter XMPP

iChat Count 386 – 7 minutes


When following a lot of friends in a flow environment and using XMPP, one sees the above numbers in less than ten minutes. I’d been using Adium, but Adium doesn’t scroll smoothly between received tweets. It constantly jerks messages upwards, which made it virtually impossible to have a meaningful experience. There are often times when I want to read each incoming tweet. A good, smooth reading experience was needed.

iChat scrolls slightly more smoothly at each received message, and is therefore much more enjoyable to read. The interface is customizable enough, but it offers nothing quite as nice as some of Adium’s minimal themes.

I was mostly hesitant to switch since Adium has outstanding AppleScript support. I’ve been thinking of prototyping something (given a couple hours – someday). Apparently iChat has something even better which I should have known about… Callbacks! A script can fire for each received message.

This will make dynamic, real-time filtering a reality.


The start of something very cool…

Flow – Day 9 – Open it up

I’m used to the speed of the flow and it’s slow. It’s time to open it up and look for five figures…

Useful link: flow entries

Follow me on Twitter: sol

Open it up

I read the flow of XMPP Twitter traffic with breakfast and in the evenings. I then scan it when checking email or if I catch a lot of added traffic in the IM window. The part which most people don’t understand is how this translates and how it’s even imaginable to distinguish signal from noise here.

It’s easy. I’m now following over 4,000 fellow Twitterers (Twitterites? Twitterans?). The TPM (Tweets Per Minute) ranges between 20 and 35. This equates to each Twitterer I’m following announcing, on average, roughly once every two hours (obviously some are once a day and some are every 10 minutes).
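
As a rough check on that average, the per-account interval is simply the number of accounts followed divided by the tweets arriving per minute. The sketch below uses 4,000 as a round figure for "over 4,000", and the function name is made up for the example.

```python
def minutes_between_tweets(following: int, tweets_per_minute: float) -> float:
    """Average minutes between tweets from any single followed account."""
    return following / tweets_per_minute


for tpm in (20, 35):
    minutes = minutes_between_tweets(4000, tpm)
    print(f"{tpm} TPM: one tweet per account every {minutes / 60:.1f} hours")
# 20 TPM: one tweet per account every 3.3 hours
# 35 TPM: one tweet per account every 1.9 hours
```

So the observed range works out to roughly two to three hours per account, with the busier end of the range matching the two-hour estimate.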

Reading the flow at this rate is easy. You have tweets coming in 24 hours per day, but you absolutely can’t follow it the entire time. Feeling like you have to read every Twitter announcement your friends send is the first psychological obstacle to get over. Once you get beyond that feeling of needing to maintain control, you free yourself to dip into the news of the moment as reported by everybody.

To ensure I’m not missing any messages specifically to me, I keep a browser tab open (usually immediately to the right of my Gmail tab) to the Twitter Replies page.

The main trick to keeping a strong signal is being selective in who you follow. By tuning this early, you avoid needing as much filtration later. To date I have only filtered out a single spammer account.

One last point is that some feel this approach is a pull technique in which I’m getting, but not giving back. I disagree. I submit my status and the special news and information I come by. I encourage people to follow me so they’ll be able to have an insight into my thought processes and activities.

Given the present rate of flow, I see 10,000 as the next step. It’ll take a while to get there with a selective approach. In the meantime I’m interested in metrics and whether Twitter will continue to be one of the best sources of this data.

Any service could provide an XMPP flow… Imagine Facebook, MySpace, Pownce, etc., offering an XMPP feed of updates. FriendFeed with an XMPP flavor would be incredible.