Saturday, November 9, 2013

The Hidden Technology That Makes Twitter Successful


Twitter started with a simple form and through tens of billions of repetitions became a network unto itself
Illustration by David Parkins
 

Consider the tweet. It’s short—140 characters and done—but hardly simple. If you open one up and look inside, you’ll see a remarkable clockwork, with 31 publicly documented data fields. Why do these tweets, typically born of a stray impulse, need to carry all this data with them?
While a tweet thrives in its timeline, among the other tweets, it’s also designed to stand on its own, forever. Any tweet might show up embedded inside a million different websites. It may be called up and re-displayed years after posting. For all their supposed ephemerality, tweets have real staying power.
Once born, they’re alone and must find their own way to the world, like a just-hatched sea turtle crawling to the surf. Luckily they have all of the information they need in order to make it: A tweet knows the identity of its creator, whether bot or human, as well as the location from which it originated, the date and time it went out, and dozens of other little things—so that wherever it finds itself, the tweet can be reconstituted. Millennia from now an intelligence coming across a single tweet could, like an archaeologist pondering a chunk of ancient skull, deduce an entire culture.

Twitter’s Nov. 7 initial public offering marks the San Francisco-based company’s coming-out party, the moment when it graduates from its South of Market beginnings and takes its place as one of the Internet’s most valuable properties, without ever turning a profit. What’s perhaps most remarkable about Twitter’s rise is how little the service has evolved from the original core concept of the 140-character tweet—which is to say, not at all. It’s tempting to view tweeting as silly and trivial, and Twitter itself as overhyped and overvalued. But there’s some sophisticated, supple, and even revolutionary technology at work. Appreciating Twitter’s machinery is key to understanding how an idea so simple changed the way millions of people advertise their existences to the world.
How do you look inside a tweet? It’s easy; the structure of a tweet is a matter of public record. Twitter, as a modern Web company, reveals to the world some of the technology it uses, in the form of an application programming interface—an API—which allows external software developers to build tools on top of the service, making it more widely used and thus more valuable for everyone.
All tweets share the same anatomy. To examine the guts of a tweet, you request an “API key” from Twitter, which is a fast, automated procedure. You then visit special Web addresses that, instead of nicely formatted Web pages for humans to read, return raw data for computers to read. That data is expressed in a machine-readable format that looks like a smushed-up nest of brackets and characters. It’s called JSON, which stands for JavaScript Object Notation, and it borrows a small subset of JavaScript’s syntax. In practice, an API like Twitter’s “speaks (and reads) JSON.” The data comes as a bundle of name/value fields, 31 of which make up a tweet. For example, if a tweet has been “favorited” 25 times, the corresponding name is “favorite_count” and the value is 25.
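To make the name/value idea concrete, here is a minimal Python sketch that reads a tweet-shaped JSON document and lists its fields. The sample is hand-written and heavily abbreviated, not real API output: only “favorite_count,” “coordinates,” and “created_at” are named in this article, while the “text” field name, all of the values, and the helper name list_fields are assumptions made for illustration.

```python
import json

# Hand-written sample, NOT real API output: an abbreviated tweet-like object
# holding a handful of fields. All values are invented for illustration.
SAMPLE_TWEET_JSON = """
{
  "created_at": "Thu Nov 07 14:30:00 +0000 2013",
  "text": "Consider the tweet.",
  "favorite_count": 25,
  "coordinates": {"type": "Point", "coordinates": [-122.4194, 37.7749]}
}
"""


def list_fields(raw):
    """Parse a JSON document and return its top-level (name, value) pairs,
    sorted by field name."""
    parsed = json.loads(raw)
    return sorted(parsed.items())


# json.loads turns the bracket-and-quote text into an ordinary dictionary.
tweet = json.loads(SAMPLE_TWEET_JSON)

# Each entry is one name/value field of the kind described above.
for name, value in list_fields(SAMPLE_TWEET_JSON):
    print(f"{name}: {value!r}")
```

Note that a value can itself be another bundle of name/value pairs, as “coordinates” is here.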
You know how the National Security Agency collects “metadata” about the phone calls Americans make? Well, that’s what these fields are, except instead of metadata about phone calls, this is metadata about tweets. In fact, those 140 characters are less than 10 percent of all the data you’ll find in a tweet object. Twitter’s metadata is publicly documented by the company, open for perusal by all and available to anyone who wants to sign up for an API key.
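The “less than 10 percent” figure is simple arithmetic: the size of the message text divided by the size of the whole tweet object. The sketch below shows one way to measure that ratio, assuming the object is a Python dictionary with a “text” field (an assumed field name). The sample object is invented and far smaller than a real tweet, so its ratio will come out much higher than the figure quoted above.

```python
import json


def text_share(tweet):
    """Return the fraction of the serialized tweet taken up by its text.

    Sizes are measured in UTF-8 bytes of the JSON serialization, which is
    one reasonable convention among several.
    """
    total_bytes = len(json.dumps(tweet, ensure_ascii=False).encode("utf-8"))
    text_bytes = len(tweet["text"].encode("utf-8"))
    return text_bytes / total_bytes


# Invented, heavily abbreviated sample. A real tweet object carries far more
# metadata, which pushes the text's share much lower.
sample_tweet = {
    "created_at": "Thu Nov 07 14:30:00 +0000 2013",
    "text": "x" * 140,  # a full-length 140-character message
    "favorite_count": 25,
    "coordinates": {"type": "Point", "coordinates": [-122.4194, 37.7749]},
}

share = text_share(sample_tweet)
print(f"Text is {share:.0%} of this sample object")
```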
This metadata contains not just tidy numerals like “25” but also whole new sets of name/value pairs: big weird trees of data. A good example is in the “coordinates” part of the tweet. This value contains geographical information (latitude and longitude) in a format called GeoJSON, a dialect of JSON that’s used to describe places. This can seem complicated at first, but it’s actually awesome, because it means that simple-to-understand formats such as JSON can express some pretty complex ideas about the world. GeoJSON isn’t controlled by Twitter; it’s a published, open standard. Twitter has added another field, called “place.” Places are not just dots on a map but “specific, named locations.” They include multiple coordinates; they actually define polygons over the surface of the earth. A tweet can thus contain a very rough outline of a given nation. A few tweets can, with some digital fiddling, serve as a primitive atlas. And through some slightly complex math, they can reveal how far one tweeter is from another. Tweets also have a “created_at” field, which indicates the exact time at which they were posted. (Source: Bloomberg)
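The “slightly complex math” for distance is not spelled out in the article. One common choice is the haversine formula, which gives the great-circle distance between two points on a sphere. The Python sketch below applies it to two invented tweet-like objects and also parses a “created_at” value. It is an illustration under stated assumptions, not Twitter’s own method: the timestamp layout is assumed, the helper names are made up, and the Earth is treated as a perfect sphere. GeoJSON points list longitude first, then latitude.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0  # mean radius; treats the Earth as a sphere

# Assumed layout of "created_at", e.g. "Thu Nov 07 14:30:00 +0000 2013".
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def distance_km(tweet_a, tweet_b):
    """Great-circle distance between two tweets via the haversine formula."""
    # A GeoJSON Point stores [longitude, latitude], in that order.
    lon1, lat1 = tweet_a["coordinates"]["coordinates"]
    lon2, lat2 = tweet_b["coordinates"]["coordinates"]
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def parse_created_at(value):
    """Turn a "created_at" string into a timezone-aware datetime."""
    return datetime.strptime(value, CREATED_AT_FORMAT)


# Invented samples: one tweet near San Francisco, one near New York.
tweet_sf = {
    "created_at": "Thu Nov 07 14:30:00 +0000 2013",
    "coordinates": {"type": "Point", "coordinates": [-122.4194, 37.7749]},
}
tweet_ny = {
    "created_at": "Thu Nov 07 14:31:00 +0000 2013",
    "coordinates": {"type": "Point", "coordinates": [-74.0060, 40.7128]},
}

apart = distance_km(tweet_sf, tweet_ny)
gap = (parse_created_at(tweet_ny["created_at"])
       - parse_created_at(tweet_sf["created_at"]))
print(f"About {apart:.0f} km apart, posted {gap.total_seconds():.0f} seconds apart")
```

The same coordinate handling would extend to the polygons in “place,” though working with outlines rather than single points takes more than this sketch covers.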

 
