With Matt Jockers’ new book out, and the reviews already coming in, I’m starting to find the macroanalysis/microanalysis framework a little lacking. It’s not that I think it’s a poor approach, and it takes many forms in digital humanities scholarship. There are numerous examples of distant reading paired with close reading as per the classic Moretti definition in literature, or the HGIS flavor seen when Ruth Mostern contrasted state-level change with local administrative practices in Dividing the Realm in Order to Govern, or my own attempt to contrast strategic cartograms with an Oregon Trail-inspired “situated perspective” within the network, as I did with ORBIS|via. But while you can arbitrarily define macro and micro based on the research questions of a DH project, I think there needs to be a formal mesoanalytical layer defined somewhere between the two.

It’s not that I think an arbitrary middle layer is needed, or that we need an exhaustive formal hierarchy like that found in ecology. I think that the meso layer is a necessary complement to the existing approaches. I think of distant reading or macroanalysis as focused on patterns in the data, and close reading or microanalysis as focused on sophisticated interpretation of case studies and their context. Data visualization and analysis patterns are well-established for dealing with these two cases, regardless of whether your macro scale is 3000 novels or terabytes of Twitter data. But, as I was touching on in the Visualization Across Disciplines forum at HASTAC, there’s a fertile space for the representation and analysis of functions and processes.

Partly, I’m in favor of developing mesoanalysis because I find I’m more capable of engaging meaningfully in the representation and analysis of these functions than I am in the micro and macro patterns of the objects being studied. This is natural for most of alt-ac, I suspect, where we’re asked to contribute to research and technical development for projects that are outside our fields of study. But I think it also grows out of digital humanities scholars’ constant unease with computational methods. At first I thought it was enough to know that something worked well in approaching certain sets of data. Then I figured I needed to know how it was working so that I could explain the function to a scholar I was working with. Now I realize that there never is a best process, only a contingent, chosen process for showing or analyzing data. As such, it’s incumbent upon me to represent the variations in that process and expose it as a formal construct so that scholars can modify it and critique it.

This goes back to building models, since models are aggregates of processes paired with data and informed by close reading of the data. And it’s driven by the increasing availability of interactive data visualization that lets one perturb such functions to see how they may or may not inflect the results. Xueqiao Xu’s pathfinding piece makes me think of my own Network Analysis toy and how useful it would be to have something like that to explain topic modeling, and community detection, and tasseled cap transformations in GIS.

Anyone who codes, or plays video games, or has a particularly awesome refrigerator knows that functions come in many sizes. They can be rather complex, like the pathfinding functions Xueqiao Xu so deftly explores, or interesting more for their formal nature than their complexity, such as the choice of weights for full-text searches. For one thing, the sheer variety of functional approaches to data is overwhelming. There’s a hundred-page paper on different community detection algorithms for network analysis that I like to use to scare people who think the Modularity button in Gephi is a standard. Ben Schmidt’s recent call for more than one pattern detection method in computational approaches to text is in reaction to the assumption that topic modeling is the only approach to corpora. I don’t think this is a result of the naïveté of the DH community. These functions are usually so wedded to the data and the choice of representation that they are explained only in a pragmatic sense, given simply as the “best” or “most optimal” choice. But as I settle down to write about how we computed a Tragedy Index for an upcoming project, or how different centrality measures reveal not better or worse but different patterns within a network, I realize that our approach to functions needs to be a bit less pragmatic and a bit more post-modern.
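To make the point about centrality measures concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the nine-node network is a toy I made up (two tight clusters joined through a single go-between, "E"), not data from any project mentioned here, and the two functions are plain textbook versions of degree and shortest-path betweenness (the latter following Brandes' accumulation scheme) rather than what any particular tool runs behind its buttons.

```python
from collections import deque
from itertools import combinations


def build_graph(edges):
    """Turn a list of undirected edges into an adjacency dict of sets."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degree_centrality(graph):
    """Raw (unnormalized) degree: how many neighbors each node has."""
    return {node: len(neighbors) for node, neighbors in graph.items()}


def betweenness_centrality(graph):
    """Unnormalized shortest-path betweenness for an unweighted,
    undirected graph, using Brandes' accumulation scheme."""
    score = {node: 0.0 for node in graph}
    for source in graph:
        order = []                            # nodes in order of BFS discovery
        preds = {node: [] for node in graph}  # predecessors on shortest paths
        paths = {node: 0 for node in graph}   # number of shortest paths from source
        dist = {node: -1 for node in graph}
        paths[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:               # first time we reach w
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:    # v lies on a shortest path to w
                    paths[w] += paths[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, passing dependency to predecessors.
        delta = {node: 0.0 for node in graph}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += paths[v] / paths[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each unordered pair was counted once from each endpoint, so halve.
    return {node: s / 2 for node, s in score.items()}


# Toy network (hypothetical): two four-node cliques bridged by "E".
edges = (
    list(combinations("ABCD", 2))
    + list(combinations("FGHI", 2))
    + [("D", "E"), ("E", "F")]
)
graph = build_graph(edges)

degree = degree_centrality(graph)
betweenness = betweenness_centrality(graph)

for node in sorted(graph):
    print(node, degree[node], betweenness[node])
```

By degree, "E" is the least important node in the network (2 neighbors, against 4 for "D" and "F"); by betweenness it is the most important (16.0, against 15.0 for "D" and "F" and 0.0 for everyone else). Neither ranking is wrong. They are answers to different questions, which is the sense in which the choice of function is contingent rather than "best".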

This tripartite formalization of digital humanities research also highlights the need for more engagement with the adoption of tools developed for pragmatic industry goals in scholarly research. We’re using Google’s patented formula to determine the importance of literary works, information retrieval aids for the study of poetry, and GIS techniques created for planning highways to understand environmental refugees. There’s more and more publication of code, and detailed explanation of that code, such as Jockers’ explanation of how he optimizes data for topic modeling, but I’d like to see more functions expressed visually and interactively, to allow readers the same capacity to manipulate them as they have to work with published data and analyze the narrative. This kind of research cannot be adequately reviewed if we limit ourselves to the macro and micro scales in our reading, because that forces the reviewer to choose between accepting and dismissing the functional component, rather than tweaking it or engaging with it in a more rigorous critique.

The original version of this post spelled it “mezoanalysis”, as you can see in the URL.
