Is there Life on Twitter?

Updated to include a legend, as per advice from Scott Weingart.

Twitter Network Legend

How tweets and users are connected and visualized.

Large components of a twitter network

A collection of 15 different microorganisms found swimming through Twitter recently.  Each component is a collection of tweets and users (though the small, round clusters have only one user) connected through @username tweets.  The tweets all came from various hashtags related to the recent violence in Norway and are color-coded by time and keywords: earlier tweets are lighter in shade and later tweets are darker, while tweets that mention “Muslim” are in green, tweets that mention “terror” are in fuchsia, and tweets that mention both are in teal.  You can clearly identify certain types of Twitter lifeforms, such as the loquacious-but-disconnected globes and the spammer spiders (green, bottom-left).  Interestingly, the purple spider on the left is also a spammer, but one who spams public service announcements rather than links to videos.
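For anyone who wants to try the connection and coloring rules described above, here is a minimal sketch in Python.  It is not the Gephi workflow behind these images; the tweet fields (`id`, `author`, `text`), the function names, and the "default" color for tweets matching neither keyword are all assumptions made for illustration.

```python
import re

# Assumption: a mention is "@" followed by word characters.
MENTION = re.compile(r"@(\w+)")

def keyword_color(text):
    """Return the keyword color described in the post for one tweet."""
    lowered = text.lower()
    has_muslim = "muslim" in lowered
    has_terror = "terror" in lowered
    if has_muslim and has_terror:
        return "teal"
    if has_muslim:
        return "green"
    if has_terror:
        return "fuchsia"
    return "default"  # assumption: the post names no color for other tweets

def time_shade(timestamp, earliest, latest):
    """Map a timestamp to 0.0 (earliest, lightest) .. 1.0 (latest, darkest)."""
    if latest == earliest:
        return 0.0
    return (timestamp - earliest) / (latest - earliest)

def build_edges(tweets):
    """Link each tweet node to its author and to every @username it mentions."""
    edges = []
    for tweet in tweets:
        tweet_node = ("tweet", tweet["id"])
        edges.append((("user", tweet["author"].lower()), tweet_node))
        for name in MENTION.findall(tweet["text"]):
            edges.append((tweet_node, ("user", name.lower())))
    return edges

# Hypothetical sample data, not taken from the collected tweets.
sample = [
    {"id": 1, "author": "alice", "text": "Thinking of #Oslo tonight @bob"},
    {"id": 2, "author": "bob", "text": "Stop assuming terror means Muslim #Norway"},
]
edges = build_edges(sample)
colors = {t["id"]: keyword_color(t["text"]) for t in sample}
```

Treating tweets and users as two kinds of node, rather than linking users directly, is what lets a single prolific account show up as a globe of its own tweets.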

A network of tweets and Twitter users consisting of ~4000 users and ~26000 tweets

The largest component of the network dwarfs these small organisms and is roughly 20 times the collective size of the last slide.  It consists of organic connections between many Twitter users along with a significant number of link-spamming users (whether malicious or not) that act as glue holding together much of the weakly-connected network.
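Separating a network into components, and finding the largest one, can be sketched with a plain breadth-first search.  This is a rough illustration under the assumption that the network is available as a list of undirected edges; the function name and the toy node labels are hypothetical.

```python
from collections import defaultdict, deque

def connected_components(edges):
    """Return the components of an undirected graph, largest first."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    components = []
    for start in list(adjacency):
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from `start`.
        seen.add(start)
        queue = deque([start])
        component = set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        components.append(component)
    return sorted(components, key=len, reverse=True)

# Toy network: users are "u:..." and tweets are "t:..." (hypothetical labels).
toy_edges = [
    ("u:a", "t:1"), ("t:1", "u:b"), ("u:b", "t:2"),  # one four-node organism
    ("u:c", "t:3"),                                  # one tiny cluster
]
components = connected_components(toy_edges)
largest = components[0]
```

Comparing `len(largest)` with the sizes of the remaining components gives the kind of ratio quoted above.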

Twitter spam spiders

If we pull out these Twitter spam spiders, and dye them green…

Dyed twitter spiders

Then they can be reinserted into the network to see their influence on the connectivity of the component.
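The same question can be asked numerically: how do the component sizes change when a spider is left out?  The sketch below is one way to check, assuming an edge list and a hand-picked set of nodes to remove (the account plus its own tweets).  All names and the toy data are hypothetical.

```python
from collections import defaultdict

def component_sizes(edges, removed=frozenset()):
    """Sizes of connected components, largest first, ignoring removed nodes.

    Nodes left with no edges after the removal are not counted at all.
    """
    adjacency = defaultdict(set)
    for a, b in edges:
        if a in removed or b in removed:
            continue  # drop every edge that touches a removed node
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen, sizes = set(), []
    for start in list(adjacency):
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:  # iterative depth-first search
            node = stack.pop()
            size += 1
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        sizes.append(size)
    return sorted(sizes, reverse=True)

# Toy spider: one account sends three tweets at three different users.
toy_edges = [
    ("u:spam", "t:1"), ("t:1", "u:a"),
    ("u:spam", "t:2"), ("t:2", "u:b"),
    ("u:spam", "t:3"), ("t:3", "u:c"),
    ("u:a", "t:4"), ("t:4", "u:b"),  # an organic exchange between a and b
]
spider = {"u:spam", "t:1", "t:2", "t:3"}

with_spider = component_sizes(toy_edges)
without_spider = component_sizes(toy_edges, removed=spider)
```

In this toy case the spider is the only thing attaching `u:c` to anyone, so removing it shrinks the single eight-node component to a three-node one.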

Spiders in the Twitter Network

While they do seriously improve the connectedness of this component, the spammers aren’t as important to connectivity as the activist tweeters whose posts resemble those of spammers (in yellow).  A third category of Twitter user also holds together these networks: the celebrity.  Even without tweeting, they act as a locus for tweets from various concerned fans.  Compare the topologies of a Twitter activist and celebrity:

Justin Bieber in the #Norway/#Oslo/#Otoya Twitter network

Keep in mind that a fundamental difference here is that the latter has sent no tweets using any of the studied hashtags and so is connected only by incoming tweets, some imploring, some informational, some obviously just name-dropping.  Stepping back from this single, albeit large, component, we can look at the entire network to try to get some understanding of this particular information ecosystem.

A Twitter Network

A Twitter Network core region

At this scale, the microorganism is no longer a useful analogy, and instead we have to look to entities of a vastly different scale: the galaxy or interstellar cloud.

Regions in a Twitter Cloud

At the center of the cloud is the hot, highly-connected region, made up of the single component above.  Orbiting around that are a few weakly-connected pieces of that component and a collection of very small user-and-tweet clusters that number no more than a dozen total tweets.  The edges drawn to connect each tweet-to-user give the impression of traditionally drawn constellations.

Twitter Cloud Regions Detail

Finally, in the outer periphery, we see no more of the structures that remind us of organisms or galaxies and instead only a dull background noise.  A few tweets from one user, occasionally one connecting two users, oftentimes where the other doesn’t make a single recordable note.

This is, remember, the structure of tweets related to a set of hashtags, considering connections based on a single rule.  It could be that many of those Twitter users out in the periphery according to this view are actually highly connected by tweets outside the hashtags collected here.  But with Twitter producing over 100 million tweets a day, capturing, visualizing, and studying a network at that scale would be such an endeavor as to be practically impossible.  Another possibility is to draw the network according to different rules, such as making connections based on retweets.  This is a traditional method of visualizing Twitter networks (if anything like this could be considered traditional) but it tends to overemphasize connectedness to the point that there are no interesting substructures, just pulses of important news traveling user-to-user.
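To make the alternative rule concrete, here is a small sketch that draws user-to-user edges from retweets instead of @username mentions.  It assumes retweets are written in the manual “RT @user” style and that each tweet carries `author` and `text` fields; retweets recorded only in metadata rather than in the text would not be caught by this pattern.  The names and sample tweets are hypothetical.

```python
import re

# Assumption: a manual retweet starts its credit with "RT @username".
RETWEET = re.compile(r"\bRT\s+@(\w+)", re.IGNORECASE)

def retweet_edges(tweets):
    """Return (retweeter, original_author) pairs, one per retweeting tweet."""
    edges = []
    for tweet in tweets:
        match = RETWEET.search(tweet["text"])
        if match:
            # Only the first credited account is used for the edge.
            edges.append((tweet["author"].lower(), match.group(1).lower()))
    return edges

# Hypothetical sample data.
sample = [
    {"author": "carol", "text": "RT @alice: Thoughts are with #Norway"},
    {"author": "dave", "text": "Watching the news about #Oslo"},
    {"author": "Erin", "text": "rt @Alice: Thoughts are with #Norway"},
]
pairs = retweet_edges(sample)
```

Because every retweet points back at an earlier author, edges built this way tend to chain users together, which fits the overly connected picture described above.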

The old dig against Twitter is that it is merely a place to broadcast.  “I’m eating a turkey sandwich.”  “I can’t believe there’s no parking at SFO.”  “I’m a winner, buy this hairspray.”  Frankly, I cannot comprehend what Twitter is for.  The sheer quantity of material, even for three days on one topic, is enormous.  The messages are dominated by recycling earlier messages and by a pervasive element of spamming, both malicious and well-intentioned.  Many users send off dozens of tweets with little response, whereas others are only recipients and seem not to engage at all.  There are arguments and conversations within this network, but they are extremely rare.  Again, it may be because of the method by which the network was created, given that responses and conversation may happen more naturally without hashtags, but I intuit that the texture of the network as represented here holds true for the service as a whole.

In examining this network I made a huge number of visualizations using Gephi.  If you’d like to see some of them, they’re here.
