What does textual scholarship have in common with the semantic web?

A reading of James Smith’s “Working with the Semantic Web” from the newly published collection of essays, Doing Digital Humanities (2016).

Some context: James Smith is a Lead Software Engineer (Kit Check) who also teaches the RDF and Linked Open Data (LOD) course at the Digital Humanities Summer Institute in Victoria (which I had the pleasure of attending this past summer). I came across this chapter on the syllabus for an LOD directed reading group I’m involved in and wanted to share a few half-baked observations.

Smith begins his chapter by way of analogy:

The Semantic Web and Linked Data are computational applications of existing scholarly practices: linking to primary and secondary sources, signaling trusted vocabularies and authorities, and positioning a work in a larger conversation. (loc. 6650)1

For many textual scholars, this is a welcome sight: a warm invitation. We know analogy. We understand that analogy works as a powerful narrative tool. And we know when we’re about to be told a good story. From the outset, the text signals a comparative framework, a bond Smith returns to again and again as he guides readers through what is, for most textual scholars, the strange new world of working not just on, but with, the semantic web. For the purposes of this reading, rather than provide a comprehensive overview, I’d like to focus on two crucial moves Smith makes in this chapter.

First, Smith reviews the basic mechanics of how textual scholarship works. To do this, he uses the following example: “The new sovereign has achieved self-determination” (loc. 6667). With a little pressure, this sentence cracks under the ambiguity of “sovereign” (which sovereign?) and “self-determination” (what self-determination?), and we, as well-trained textual scholars, feel the lack of historical context, of reference. Interestingly, Smith takes the electronic text as his default, drawing on the function of hyperlinks in digital scholarship before turning to Franco Moretti’s printed chapter in Distant Reading as an example of “intra-textual referencing,” or what Smith would call “crude hyperlinking” (loc. 6667, 6679). I’ve reproduced Moretti’s excerpt here:

The new sovereign – ab-solutus, untied, freed from the ethico-political bonds of the feudal tradition – has achieved what Hegel will call ‘self-determination’: he can decide freely, and thus posit himself as the new source of historical movement: as in the Trauerspiel, and Gorboduc, and Lear, where everything indeed begins with his decision; as in Racine, or La Vida es Sueño. (qtd. in loc. 6679)

Next to the efficiency of hyperlinking, Moretti’s list of references, notes, and notes on references seems wild and dizzying. Necessarily restricted by the technology of print, Moretti “links” to the particular definitions of “sovereignty” he has in mind and inserts a brief description of his take on Hegel’s use of “self-determination.” But why the context overload? Surely there’s such a thing as providing too much context. As Smith is quick to point out, what Moretti is doing with this rudimentary “linking” is ensuring that the reader “doesn’t need to follow the ‘link’” (loc. 6667). With hyperlinks, there’s always a chance that readers will get lost as they go off and explore the contextual crumbs. But consider the print reader who has left a book to follow a tempting footnote and fetch a referenced text from the library. That reader is far less likely to return than an electronic reader is. Perhaps more crucially, the print reader is even less likely to set the book down and seek out the referential thread in the first place. Instead, as Smith points out, the kind of “linking” seen in Moretti’s chapter signals to the reader that Moretti “trusts” Hegel’s vocabulary (people who know something of Linked Open Data start grinning here) and conveys a sense of “alignment” between Moretti’s language and Hegel’s, indeed King Lear’s, as Moretti’s writing becomes, to draw on Smith’s language, “informed” by the literature he’s referencing (loc. 66698). Remember, Smith reminds us, “As we read a text, we bring all the material we have encountered before” (loc. 66698).
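For readers who want to see what this kind of “linking” might look like once it is made explicit, here is a minimal sketch in Python. It writes a few of the references in Moretti’s excerpt as subject-predicate-object triples, the basic shape of a linked data statement. This is only an illustration and not an example from Smith’s chapter: the `example.org` namespace, the predicate names, and the helper function are all hypothetical placeholders.

```python
# Hypothetical placeholder namespace; none of these URIs point to real records.
EX = "http://example.org/"

# Each entry is a (subject, predicate, object) triple.
# The predicate names are invented for this illustration.
triples = [
    (EX + "moretti-passage", EX + "usesTermFrom", EX + "hegel"),
    (EX + "moretti-passage", EX + "references", EX + "gorboduc"),
    (EX + "moretti-passage", EX + "references", EX + "king-lear"),
    (EX + "moretti-passage", EX + "references", EX + "la-vida-es-sueno"),
]


def objects_of(graph, subject, predicate):
    """Return every object linked from `subject` by `predicate`, in list order."""
    return [o for s, p, o in graph if s == subject and p == predicate]


references = objects_of(triples, EX + "moretti-passage", EX + "references")
print(references)
```

Where the printed page folds its context into the sentence itself, the triples keep each reference as a separate statement that a reader (or a machine) can choose to follow.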

Second, Smith introduces the concept of “at least one.” It goes as follows: a textual scholar (let’s use Moretti again) mentions a set of literature “with the hope that we will have read at least one” (Smith, loc. 66698). If the mission is to make a connection, what Moretti needs is for us, the readers, to have read one, just one. At first, the language here seems almost exasperated (“Have the decency to come to class having read at least one of your readings.” Silence. “O come now, at least one!”). In fact, Smith uses both “hope” and “at least one” twice in one paragraph when referring to this desire to connect over a shared reference. Like a computer, a human reader scans the information, eyes moving swiftly across familiar words, logs the connections away, and moves on. If nothing looks familiar, however, the reader stalls (perhaps over a wave of curiosity or, less preferably, renewed anxiety). Machines don’t waste their time feeling anxious: if the information doesn’t look familiar, they give up. This shared reference becomes central to Smith’s guide to working on the semantic web, building on his connection to scholarly reading: “It is critical that the scholar read far and wide in their career: the greater the shared background, the more efficient the communication” (loc. 66698).
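The “at least one” idea can be sketched in a few lines of Python as a simple overlap check between what a text cites and what a reader already knows. This is a loose illustration under my own assumptions, not Smith’s formulation: the function names and the sample bookshelves are invented for the example.

```python
# The works named in Moretti's excerpt, as plain strings.
MORETTI_REFERENCES = ["Trauerspiel", "Gorboduc", "King Lear", "Racine", "La Vida es Sueño"]


def shared_references(cited, already_read):
    """Return the works that appear both in the text and on the reader's shelf."""
    return set(cited) & set(already_read)


def connects(cited, already_read):
    """True when the reader recognizes at least one of the cited works."""
    return len(shared_references(cited, already_read)) >= 1


# A reader who knows only Lear still makes the connection.
print(connects(MORETTI_REFERENCES, {"King Lear", "Middlemarch"}))

# A reader with no overlap has nothing to hold on to.
print(connects(MORETTI_REFERENCES, {"Middlemarch"}))
```

The longer the list of cited works, the better the odds that the overlap is not empty, which is one way of reading Moretti’s generous string of references.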

Scaling back from the macroscopic fantasy of “wide” reading, Smith returns to the bread and butter of textual scholars: close reading. This return serves only to strengthen the natural tie he has been asserting throughout the chapter, one between textual scholars and computer scientists. “The act of making as many connections as possible between the text and what we know,” Smith writes, “is the essence of close reading” (loc. 66698). This essential connection between linking and close reading, Smith goes on to explain, is why textual scholars find themselves a part of “one of the defining fields in the digital humanities” (loc. 66698).

The rest of Smith’s chapter walks through the basics of structuring information, representing information, vocabularies, relationships, using linked data, and publishing linked data.2 The bulk of the heavy lifting, however, what I would underline as the driving force of this piece, has already been worked out in the first half-dozen pages. To avoid any ambiguity, and ever the responsible computer scientist, Smith articulates his argument in full near the end of his chapter, under the very appropriate SUMMARY heading:

It is by bringing to our computational work the practices of our scholarly work that we elevate the digital side of digital humanities to be equal with the traditional humanities scholarship practices. (loc. 6944)

Refreshingly, Smith departs from approaches that urge humanities scholars to take on the praxis and language of scientific methodology.3 Instead, Smith asks what textual scholarship can bring to this kind of work with the semantic web and gestures towards a model of scholarship that is strengthened by this process of coming together, one that is necessarily — and, as Smith would argue, inherently — open to collaborative and cross-disciplinary work.4


1 The “loc. xxxxx” identifiers work in lieu of page numbers and refer to places within the Kindle edition of this text.

2 To the curious and anxious students of linked data: keep reading. Smith gives an accessible and concise overview of how to transform textual information, what readers will soon call a “dataset,” into published, linked data. Though there are moments where readers who are eager to get their hands dirty are left waiting for further instruction, Smith is quick to provide an abundance of links to projects and resources, peppered throughout in the form of footnotes and hyperlinks and gathered in a Further Readings section.

3 See John Unsworth’s “The Importance of Failure” and Franco Moretti’s “Conjectures on World Literature.”

4 See Susan Brown and John Simpson, with the CWRC Project Team and the INKE Research Group, “An Entity By Any Other Name: Linked Open Data as a Basis for a Decentred, Dynamic Scholarly Publishing Ecology.”

Works Cited

Brown, Susan, and John Simpson. “An Entity By Any Other Name: Linked Open Data as a Basis for a Decentred, Dynamic Scholarly Publishing Ecology.” Scholarly and Research Communication 6.2 (2015).

Moretti, Franco. “Conjectures on World Literature.” New Left Review 1 (2000): 54–68.

Smith, James. “Working with the Semantic Web.” In Crompton, Lane, and Siemens (eds.), Doing Digital Humanities. Routledge, 2016.

Unsworth, John. “The Importance of Failure.” Journal of Electronic Publishing (1997).

Photo credit: michael podger via Unsplash

