The Digital Humanities at Stanford: Mixed Networks, Mixed Messages

Who's working on digital humanities at Stanford? As technologists, our impulse is to record and analyze this network, yet as humanists, we can see the problems with this project before we even start. Network analysis has not often been applied to humanistic topics because our data sets and data collection are willfully untidy: no two people, no two relationships, no two objects are alike, so how can we reduce them to a simplistic set of nodes and edges? Yet we're intrigued by the possibility of one more way to look at human relationships, so we created these mixed network graphs — graphs of unlike objects — and analyzed them by asking questions rather than computing answers. Below are three visualizations that reveal different aspects of the mixed network. We created these multiple perspectives in order to shed light on the decisions that go into the creation of a network visualization.

The data were gathered in late 2010 and early 2011 from academic websites by Elijah Meeks and Molly Wilson. Elijah's data-gathering ranged across the digital humanities landscape of Stanford, while Molly's focused on the Mapping the Republic of Letters project. Elijah and Molly could have chosen alternate ways to collect data — personal interviews, time on task, email networks — but chose to look at the way the digital humanities community at Stanford is representing itself. The visualizations are thus a representation of the digital humanities' own web presence.

I. Who's There?

First, let's pick out the people involved. Anything that does not represent an actual person here is greyed out, and the people nodes are colored according to their status within the university: faculty, graduate students, staff, and others (usually, people who are not a part of Stanford). Larger nodes simply mean more connections to other nodes.

Legend: node color shows Faculty, Graduate Students, Staff, or Other People; node size shows degree (undirected).
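
To make the node-size rule concrete, here is a minimal Python sketch of undirected degree: the number of distinct nodes each node is connected to, ignoring which way a link points. The names and edges are hypothetical placeholders, not the data behind these graphs, and the essay does not say which tool computed the values.

```python
from collections import defaultdict

# Hypothetical sample edges for illustration only (not the collected data).
edges = [
    ("Faculty A", "Project X"),
    ("Grad B", "Project X"),
    ("Staff C", "Project X"),
    ("Faculty A", "Project Y"),
]


def undirected_degree(edge_list):
    """Count the distinct neighbors of each node, ignoring edge direction."""
    neighbors = defaultdict(set)
    for source, target in edge_list:
        neighbors[source].add(target)
        neighbors[target].add(source)
    return {node: len(adjacent) for node, adjacent in neighbors.items()}


degree = undirected_degree(edges)
for node, value in sorted(degree.items(), key=lambda item: -item[1]):
    print(node, value)
```

In this toy example the project with three contributors gets the largest value, which mirrors the observation that projects, rather than people, tend to be the most connected nodes.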

Several of the largest projects seem to attract primarily one type of contributor. There does seem to be such a thing as a faculty project node, a graduate student project node, or a staff project node. Does this have an analogue in the real world, or is it an artifact of the way people are credited as contributors?
Faculty (red nodes) are connected to the greatest number of nodes. There are three large green nodes but no large blue nodes. Are faculty actually involved in the most projects, or does this show how connecting faculty names with a project gives the project greater visibility?
Many of the nodes in this graph do not represent people; more specifically, few of the highly connected nodes are people. This is partly explained by the way the data were gathered: we scraped websites and CVs rather than interviewing people about their personal relationships. Nevertheless, is there a real-world analogue to projects being central to people's working relationships?

II. Project Neighborhoods

It's easy to accept that a node represents a person or project, but the concept of a link is harder to swallow. The visual symbol of a connecting line can mean founding a project, funding a project, or even a single promise to drop by for a meeting. For our first foray into problematizing links, let's turn to four Stanford projects that we'd expect to see represented especially well: the Bill Lane Center for the American West, the Literature Lab, the Spatial History Project, and Mapping the Republic of Letters. In each graph, the color starts at the white node and spreads one degree in all directions, showing what graph theory calls an "ego network". Hover over the graphs to see the ego network expand to two degrees. Node size corresponds to betweenness centrality: how often do the shortest paths between other pairs of nodes pass through the given node?

Legend: node color marks the Bill Lane Center for the American West network, the Lit Lab network, the Spatial History Project network, and the Mapping the Republic of Letters network; a white node marks the center of each ego network; node size shows betweenness centrality.
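
Betweenness centrality is easier to grasp with a small worked case. The sketch below uses Brandes' algorithm, a standard way to count shortest paths through each node, on a hypothetical five-node graph. It assumes an unweighted, undirected graph and leaves the scores unnormalized; whether the graphs here were computed with the same settings is not stated, so treat this as an illustration of the measure rather than a reconstruction.

```python
from collections import deque


def betweenness_centrality(adjacency):
    """Brandes' algorithm for an unweighted, undirected graph.

    adjacency maps each node to the set of its neighbors. The score of a
    node is the number of shortest paths between other pairs of nodes that
    pass through it (fractional when a pair has several shortest paths).
    """
    centrality = {node: 0.0 for node in adjacency}
    for start in adjacency:
        # Breadth-first search from start, counting shortest paths.
        order = []
        predecessors = {node: [] for node in adjacency}
        path_count = {node: 0 for node in adjacency}
        distance = {node: -1 for node in adjacency}
        path_count[start] = 1
        distance[start] = 0
        queue = deque([start])
        while queue:
            node = queue.popleft()
            order.append(node)
            for neighbor in adjacency[node]:
                if distance[neighbor] < 0:
                    distance[neighbor] = distance[node] + 1
                    queue.append(neighbor)
                if distance[neighbor] == distance[node] + 1:
                    path_count[neighbor] += path_count[node]
                    predecessors[neighbor].append(node)
        # Walk back from the farthest nodes, accumulating path shares.
        dependency = {node: 0.0 for node in adjacency}
        for node in reversed(order):
            for pred in predecessors[node]:
                share = path_count[pred] / path_count[node]
                dependency[pred] += share * (1 + dependency[node])
            if node != start:
                centrality[node] += dependency[node]
    # Every undirected pair was counted once from each endpoint.
    return {node: value / 2 for node, value in centrality.items()}


# Hypothetical graph: three people on Project X, one of whom links to Project Y.
adjacency = {
    "Person A": {"Project X"},
    "Person B": {"Project X"},
    "Person C": {"Project X", "Project Y"},
    "Project X": {"Person A", "Person B", "Person C"},
    "Project Y": {"Person C"},
}
scores = betweenness_centrality(adjacency)
```

Here Project X scores highest because every path between its contributors runs through it, while Person C scores next as the only bridge to Project Y. A person with few connections can therefore still be drawn large if those connections join otherwise separate parts of the graph.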

In some cases, adding another degree to the ego network makes the ego network much larger (e.g. Mapping the Republic of Letters). In other cases, it doesn't make as much of a difference (e.g. Bill Lane Center). Becoming connected to a project node with many other connections — in network theory, a node with high degree — can make an ego network explode.
What is the relationship between the first-degree network and the second-degree network, and how does it play out in projects? A project with a small first-degree network but a large second-degree network may be more nimble and streamlined, with resources at hand but not in the way. On the other hand, it may be isolated, with valuable resources too far away to be useful. A project with a large first-degree network may be thriving, or it may be paralyzed with more hangers-on than it can handle.
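
The jump from a one-degree to a two-degree ego network can be sketched in a few lines of Python. The edges below are hypothetical: a small project (Project S) with a single contributor is linked to a highly connected project (Project H). The function name `ego_network` is our own for this illustration.

```python
from collections import defaultdict

# Hypothetical edges for illustration only (not the collected data).
edges = [
    ("Person A", "Project S"),
    ("Project S", "Project H"),
    ("Person B", "Project H"),
    ("Person C", "Project H"),
    ("Person D", "Project H"),
    ("Person E", "Project H"),
]

# Build an undirected adjacency map: each node points to its neighbors.
adjacency = defaultdict(set)
for first, second in edges:
    adjacency[first].add(second)
    adjacency[second].add(first)


def ego_network(adjacency, center, radius):
    """Return the set of nodes within `radius` steps of `center`."""
    reached = {center}
    frontier = {center}
    for _ in range(radius):
        # Step outward once, keeping only nodes not already reached.
        frontier = {
            neighbor
            for node in frontier
            for neighbor in adjacency[node]
        } - reached
        reached |= frontier
    return reached


one_degree = ego_network(adjacency, "Project S", 1)
two_degree = ego_network(adjacency, "Project S", 2)
print(len(one_degree), len(two_degree))
```

In this toy case the ego network grows from three nodes to seven with one extra step, entirely because of the single link to the high-degree project.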

III. Hubs and Authorities

Now that we're examining edges, we can think of networks as belonging to two broad categories. The first is an undirected network, where each link represents a reciprocal, symmetrical relationship between two nodes; most social networks are treated as undirected. The second is a directed network, where each link has a source and a target, such as the World Wide Web. Our digital humanities network is directed, and we chose to make larger entities the targets of smaller entities. For example, an edge between a person and a project treats the person as the source and the project as the target. To preserve the integrity of directed edges, we didn't record many links between two people or between two projects.
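
As a minimal illustration of this convention, the Python sketch below stores each link as a (source, target) pair with the smaller entity first, then counts links pointing out of and into each node. The names are hypothetical placeholders, not the collected data.

```python
from collections import Counter

# Hypothetical directed edges, written as (source, target):
# the smaller entity is the source, the larger entity is the target.
directed_edges = [
    ("Person A", "Project X"),
    ("Person B", "Project X"),
    ("Funder F", "Project X"),
    ("Person A", "Project Y"),
]

# Out-degree: how many links start at a node.
out_degree = Counter(source for source, _ in directed_edges)
# In-degree: how many links end at a node.
in_degree = Counter(target for _, target in directed_edges)

print(out_degree["Person A"], in_degree["Project X"])
```

Under this convention people end up with out-degree only and large projects accumulate in-degree, which is worth keeping in mind when reading any measure that depends on edge direction.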

Here, the nodes' size is based on an algorithm called "hubs and authorities" (also known as HITS), originally developed to analyze relationships between websites. A hub is a node with many links pointing outward from it, and an authority is a node with many links pointing to it. The algorithm scores the two roles together: a strong hub points to strong authorities, and a strong authority is pointed to by strong hubs. The graph on the left represents hub rankings, and the graph on the right shows authority rankings. The four projects from above are colored as before, but in addition, nodes with hub or authority values greater than the median value are highlighted; hover over the graphs to find out what those nodes represent.

Legend: node color marks the Bill Lane Center for the American West, the Lit Lab, the Spatial History Project, Mapping the Republic of Letters, other hubs, and other authorities; node size shows hub or authority value.
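
The hub and authority scores, and the above-the-median highlighting, can be sketched as follows. This is a generic power-iteration version of the hubs-and-authorities idea run on hypothetical edges; the iteration count, the normalization, and the function name `hits` are assumptions for the sake of illustration, and the software actually used for these graphs is not stated.

```python
import math
from statistics import median


def hits(directed_edges, iterations=50):
    """Power-iteration sketch of hub and authority scores."""
    nodes = {node for edge in directed_edges for node in edge}
    hub = {node: 1.0 for node in nodes}
    authority = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        # A node's authority is the sum of the hub scores pointing at it.
        authority = {node: 0.0 for node in nodes}
        for source, target in directed_edges:
            authority[target] += hub[source]
        # A node's hub score is the sum of the authority scores it points at.
        hub = {node: 0.0 for node in nodes}
        for source, target in directed_edges:
            hub[source] += authority[target]
        # Rescale both score sets so the values stay bounded.
        for scores in (hub, authority):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for node in scores:
                scores[node] /= norm
    return hub, authority


# Hypothetical directed edges (source -> target), not the collected data.
edges = [
    ("Person A", "Project X"),
    ("Person B", "Project X"),
    ("Person C", "Project X"),
    ("Person A", "Project Y"),
]
hub, authority = hits(edges)

# Highlight nodes whose hub score is above the median hub score.
hub_cutoff = median(hub.values())
strong_hubs = {node for node, value in hub.items() if value > hub_cutoff}
top_authority = max(authority, key=authority.get)
```

In this toy network Project X is the top authority because several sources point to it, and Person A is the only above-median hub because that person points to both projects. Any node that is only ever a target receives a hub score of zero, which is one mechanical reason a source-to-target recording convention can leave a graph short on hubs.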

A strong authority, for us, is a large entity that many smaller entities are associated with, and a strong hub is a smaller entity that's associated with many larger ones. Our network has many more strong authorities than strong hubs. Where are all our hubs? Given the cynical view that academia trades on big names, we might expect a few faculty or staff to have their names on many projects, especially because our data collection came from websites and publicity materials. However, the data collection doesn't reveal a cohort of individuals who are involved in massive numbers of large projects. This might allow us to call Stanford's digital humanities community fairly distributed, with nobody who holds a disproportionate amount of influence. On the other hand, the reason we don't see more hubs could also have to do with our data collection. We spent more time on project websites than on personal websites, and we made projects the center of our investigation.
Given that all of our major hubs and authorities are projects, not people, it's worth thinking about what a hub project looks like and what an authority project looks like. A hub project (e.g. Humanities Research Network) is probably involved with many other projects that we saw as larger or more comprehensive. It is a resource, and a well-used one. An authority project (e.g. Parker Library on the Web) is likely large and makes use of many resources, and most of the projects it's involved with are projects we saw as smaller. For us, a funding agency counted as smaller than the project it funded, so an authority project may also be one funded by many sources.
Two of Stanford's flagship digital humanities projects are the only two nodes that are both strong hubs and strong authorities: the Spatial History Project and Mapping the Republic of Letters. This means that they are linked to larger projects and smaller ones, and if we didn't know anything about the projects themselves, we might imagine that these were midsize projects with a relatively low profile. However, these are two of the best-known projects in the network, possibly because of their many edges directed both inward and outward. Should we be making projects strong hubs as well as strong authorities?