Scraping the rust off – AppEngine and iPhone SDK


Photo by Peter Baer (CC)


As a tech manager I’ve got myself into that mode. You know the mode. The one where you’re so focused on building a great product that you’re not getting to code that often, if at all. This isn’t bad – you have to do whatever you can to get things done – but if you’re a developer manager, you need to live in this space. And I’ve felt the atrophy.

So over the weekend I scraped the rust off and tried some new stuff. I’ve never coded in Python, but I’ve had Google AppEngine sitting on my account for a while. And I’ve got a personal iPhone developer SDK and ADC membership. It was time to whip out the programmer-WD40.

What did I build? Pytchfork. What is Pytchfork? You’ll find out – but not in this post. It’s something I’ve had on my mind for a while. In about an hour I had AppEngine installed and Pytchfork configured. Less than two hours later I was done with a REST library and the framework for what Pytchfork will become.

A REST feature set for input. Basic XML, RSS, ATOM, and JSON as output. In a few hours. Not bad, and it felt gooooood.

From this I’ve learned Python is a friendly animal, and not just in theory. It’s too friendly. To my C/C++ brain, the lack of semicolons feels like walking up to a cliff without a railing at each line ending. But it’s something one gets used to.

Unless you’ve written a PHP or Ruby framework you’re married to, AppEngine and Python are about the best thing you could do for yourself as a way to publish a small, personal application.

Starting a Monday without rust feels great. Stay sharp!

How to build a really successful web 2.0 service on top of another service and screw it all up

Twicecream – a fake service to demonstrate a point about single sign-on…

In web 2.0 there is a determination to screw up potentially great services. It’s my number one pet peeve with software development these days. Here’s a fictitious example of a service you might create…

You’ve built a service that automatically Twitters your geo-position and the name of an ice cream parlor when you’re in front of it. Your phone buzzes when an ice cream parlor is detected and begins sending photos to SnapTweet and TwitPic, including Zagat ratings and commentary. Other patrons respond back and generate conversations. This is your social network: Twicecream – a social network for twittering ice cream enthusiasts.

In front of Ben & Jerry’s on the Wharf, Zagat 4 stars, pics:

Congratulations! You just failed.

You didn’t fail by creating a service few would use. You failed because you didn’t use the authentication mechanism your patrons preferred. You built a barrier to your garden by requiring an unnecessary account creation. Don’t do this; it’s arrogant and inefficient.

Your patrons have Twitter accounts. Twitter has an API. Your service should have asked the patron to log in with their Twitter credentials.

This isn’t just for social networking. This goes for all web services. SaaS solutions that require secondary account creations are a bad idea. Single sign-on, whenever possible, should be used.

The whole idea is to simplify access to what the customer needs. If you’re requiring unnecessary account creations, you’re screwing it all up.

Crossing the streams – large numbers of Twitter updates

Chris Bilson (@cbilson) had a good description regarding my post about Twitter’s scaling/architecture challenge.

“Kevin Rose and Leo Laporte tweet at the same time = crossing the streams”

I dunno if Proton Packs have exponential load challenges, but the end result for a server can feel similar. In my post I pointed out that Twitter has to determine delivery options and potentially deliver between 100 million and 1 billion updates per day.

But that’s in a day. 1 billion messages in a day are a piece of cake when spread over 24 hours. What if 1 billion messages have to be delivered in an hour? Or all at once?
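As a rough back-of-the-envelope sketch, here is the average delivery rate each window would imply. The function name is mine, and it assumes messages are spread perfectly evenly over the window, which real traffic never is.

```python
def deliveries_per_second(total_messages: int, window_seconds: int) -> float:
    """Average rate if the messages were spread evenly over the window."""
    return total_messages / window_seconds


ONE_BILLION = 1_000_000_000

# Same workload, two different windows (even spread is an assumption).
per_day = deliveries_per_second(ONE_BILLION, 24 * 60 * 60)
per_hour = deliveries_per_second(ONE_BILLION, 60 * 60)

print(f"spread over a day:   {per_day:,.0f} deliveries per second")
print(f"spread over an hour: {per_hour:,.0f} deliveries per second")
```

Squeezing the same billion messages into an hour multiplies the required rate by 24, and "all at once" has no comfortable average at all.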

Take my list of the top-10 Twitter accounts and imagine them all at TED, WWDC, Google I/O, or your local unconference. These ten users, if each sends an update around the same time, create 321,928 messages that need delivery (the total number of followers for the top-10 accounts). This is an awesome amount of message delivery. If those ten users live-blog or get conversational and send ten updates in an hour… 3,219,280 (again, that’s from only 10 users).

I don’t illustrate this to state it’s these power users’ fault. Absolutely the opposite. They’re generating amazing amounts of traffic, which is a wonderful thing, and the algorithms are the problem.
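The arithmetic behind those two figures can be sketched in a few lines. The only input taken from the post is the combined follower count; the function and constant names are my own, and the sketch assumes every follower gets exactly one delivery per update.

```python
# Combined follower count for the top-10 accounts, as quoted above.
TOP_10_TOTAL_FOLLOWERS = 321_928


def fan_out(total_followers: int, updates_per_account: int) -> int:
    """Deliveries needed when every account sends the same number of updates.

    Each update goes to each of the sender's followers once, so the total
    is the combined follower count times the updates per account.
    """
    return total_followers * updates_per_account


one_round = fan_out(TOP_10_TOTAL_FOLLOWERS, 1)
ten_rounds = fan_out(TOP_10_TOTAL_FOLLOWERS, 10)

print(f"one update each:  {one_round:,} deliveries")
print(f"ten updates each: {ten_rounds:,} deliveries")
```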

It’s possible to optimize algorithms and modify systems for maximum performance. I bring up Twitter’s challenges because I’m wondering if this is a challenge beyond present day computing.

To open some minds, here’s an impossibility often overlooked: Huge numbers in a deck of cards (just to show impossibilities can stem from small initial numbers).
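The linked piece isn't reproduced here, so this is only my guess at the kind of number it has in mind: assuming it refers to the count of possible orderings of a standard 52-card deck, a quick sketch shows how fast a small starting number blows up.

```python
import math

# Assumption: "huge numbers in a deck of cards" means the number of
# distinct orderings (permutations) of a standard 52-card deck.
CARDS_IN_DECK = 52

orderings = math.factorial(CARDS_IN_DECK)
digits = len(str(orderings))

print(f"{CARDS_IN_DECK} cards can be ordered in a {digits}-digit number of ways")
print(f"roughly {orderings:.3e}")
```

Fifty-two is a tiny input, yet the result is on the order of 8 x 10^67.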

Twitter’s one-to-many scaling impossible?

Twitter has been having all kinds of scaling challenges. There have been hundreds, if not thousands, of posts on the subject. Dave Winer pushed an idea for a decentralized Twitter (and has since admitted the power of Twitter is in its centrality). There is a single, simple reason for Twitter’s challenges – math is against them.

The facility of communication on the Twitter service is absolutely outstanding. I’ve written extensively about using it to receive an amazing amount of quality information in my series on flow.

I originally questioned the scaling ability of the service prior to SXSW, but when the service held up I went back to the drawing board to make sure my numbers were correct.

Before continuing, let’s establish the basics about the service so the math will make sense…

  • Each Twitter account can follow any other Twitter account (bear with me and forget those accounts with private updates).
  • Messages travel in one direction, from the updater to the follower.
  • Each account has updates from other accounts it follows placed in its timeline.
  • A Twitter account can selectively receive pushed updates immediately via instant messenger and SMS in addition to having an update added to its timeline.
  • An update added to an account’s timeline may or may not be push based (let’s assume it’s demand driven, or pull based).
  • An update sent to an account from an account denoted as SMS or IM announcement is push based (there is no other way to send an update – it must be actively pushed from the server).
  • The mere possibility of an update needing to be pushed requires the system to check with each follower’s settings, thus requiring analysis of each follower for each update.
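The rules above can be sketched as a toy routing function. This is only an illustration of the model as I've described it, not how Twitter is actually built; the `Follower` class, the `wants_push` flag, and `route_update` are all hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Follower:
    name: str
    wants_push: bool  # True if this follower asked for SMS/IM delivery


def route_update(followers):
    """Split followers into push and pull delivery for a single update.

    Every follower's settings are inspected once, so the number of checks
    equals the number of followers, even if nobody wants a push.
    """
    push, pull = [], []
    checks = 0
    for follower in followers:
        checks += 1  # the settings lookup that cannot be skipped
        if follower.wants_push:
            push.append(follower.name)  # must be actively sent by the server
        else:
            pull.append(follower.name)  # picked up when the timeline is read
    return push, pull, checks


followers = [
    Follower("alice", wants_push=True),
    Follower("bob", wants_push=False),
    Follower("carol", wants_push=True),
]
push, pull, checks = route_update(followers)
print(f"push to {push}, pull for {pull}, settings checks: {checks}")
```

The point of the `checks` counter is the last bullet: the per-follower work happens for every update, whether or not anything ends up being pushed.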

A warm-up equation

If there are one hundred (100) users and each user follows ten (10) fellow users, and each user sends ten (10) updates per day, assuming all updates are push-based, how many updates are sent?
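One way to work the warm-up, under my reading of the question: every "follows" relationship carries each of the followed user's updates to one recipient, so the number of deliveries is the number of follow relationships times the updates per user. The function name is hypothetical.

```python
def daily_deliveries(users: int, follows_per_user: int, updates_per_user: int) -> int:
    """Pushed deliveries per day when every update is pushed to every follower."""
    # Each user follows `follows_per_user` accounts, so this is the total
    # number of follower relationships in the system.
    follow_relationships = users * follows_per_user
    # Every relationship carries all of the followed user's daily updates.
    return follow_relationships * updates_per_user


updates_written = 100 * 10
deliveries = daily_deliveries(users=100, follows_per_user=10, updates_per_user=10)

print(f"updates written:   {updates_written:,}")
print(f"updates delivered: {deliveries:,}")
```

Under these assumptions, 1,000 updates written turn into 10,000 deliveries, because each one fans out to ten followers on average.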
