Visualizing Modern Poetry: Thematic Tags in First Books by McKay, Pound, H.D., Plath...

(I'll be presenting a more formal version of this work at the Modernist Studies Association in Amsterdam this weekend -- if I ever get there! Flight cancellations, delays, etc.)

One thing I was happy to do this summer was finish the expansion of my Claude McKay project -- it now contains all of McKay's early Jamaican poetry. I also had a chance to look at the Daily Gleaner and Jamaica Times microfilms at the Library of Congress, so I could include most of McKay's early uncollected poetry as well.

Along the way I got interested in the force-directed graphs you can produce in Scalar, which has a customized form of the D3.js visualization library built into the platform. This makes it incredibly easy to generate network diagrams without having to know JavaScript or work with CSV files. The diagrams are interactive and clickable, so they can serve both as visual depictions of digital collections and as site maps through which we access the texts themselves. However, the price of ease-of-use is that the diagrams are constrained by what Scalar allows. (One of my goals for the summer was to learn enough JavaScript that I could start building versions of these diagrams outside of Scalar. It's mid-August; I'm not there yet.)

Below I'm going to paste screenshots of some of the diagrams I've been generating using "first book" collections by several authors: McKay, Georgia Douglas Johnson (Bronze), H.D. (Sea Garden), Ezra Pound (Personae), and Sylvia Plath (The Colossus and Other Poems). I had earlier planned to include W.H. Auden's first proper book ("Poems"), but found it difficult to apply the method I have been developing to that book (Auden's early poems were too long and narratively immersive to be readily reducible to "themes"). I have also been working on Tagore's Gitanjali, and might add it to the collection below soon.

* * *
Claude McKay, Harlem Shadows

This work starts with a basic act of close reading and interpretation: I (sometimes "we") read books of poems and assign thematic tags. With Claude McKay’s Harlem Shadows, the tags were developed collaboratively with the students in a graduate seminar at my institution. Each of seven graduate students read the poems and came up with their own tag set; the group then met and consolidated similar tags. As I’ve extended their work, I’ve reduced the tag set they generated to a set of broad, socially important themes in McKay’s work: Race, Labor, Class, Homoeroticism, Sexuality, Desire, Nature, Poetry, and Home.

Here is a graph of the thematic tags in Claude McKay's first American collection of poems that came out of that project:

Screen Shot 2017-08-07 at 5.19.43 PM.png

[Keep in mind that this is just a screenshot. To access the 'live' version of this archive on a Harlem Shadows mini-site, click here.]

It should be acknowledged that the tagging choices we made are arguable. The tags we focus on tend to be ones that bring together our interests as 21st-century readers -- the social justice themes are particularly important to me personally -- with the poems in their own textual and historical contexts. That said, in my own reading practice looking at other poets I've tried not to approach the texts with presumptions that certain themes will appear (e.g., looking for "gender" tags simply because the author is a woman).

My expectation is that other readers would likely select alternative tag sets depending on interest. The graphs below are NOT intended to be objective or empirical representations of the texts. That's by design: these are tools to enhance interpretation, not replace it.

More narrowly, with McKay, the question of where to posit homoeroticism is particularly complicated. McKay remained in the closet throughout his life, but we know from biographical details that he was a gay man with an active personal life during the years he was writing these poems, and many of his poems drip with allusions to forbidden relationships and dangerous desires. Is it right to say that a poem like “One Year After” is homoerotic? We decided to err on the side of inclusivity -- to apply the "homoeroticism" tag to poems dealing with "dangerous" desire even without direct evidence that the relationship being described was specifically between men. But there is a good case to be made for a more conservative approach -- which would largely lead to the elimination of the "homoeroticism" tag, though for the most part the "desire" and "sexuality" tags would remain the same.

* * * 

Georgia Douglas Johnson, Bronze. Introducing "Women of the Early Harlem Renaissance"

Partly inspired by one of the graduate students in that class, I've also started reading some of the women poets in the early Harlem Renaissance, with an idea (in the near future perhaps) to build a small archive of collections of poetry by black women writers published between 1900 and 1922. This constraint (which is partly a copyright constraint) means we lose some of the most famous women writers whose works we might want to include, but it is also an invitation of sorts to explore the works of some writers who have largely been forgotten by all but specialist readers: Georgia Douglas Johnson, Carrie Williams Clifford, Pauline Smith, Clara Ann Thompson, Carrie Law Morgan Figgs, and Mazie Earhart Clark. (Nearly all of these writers have materials available online, but very few have had their work converted to usable digital editions.) Jessie Fauset and Angelina Grimke are both better known, and we may end up including some out-of-copyright works by them in the archive as it develops. This project is at a very early stage, but a proposal of sorts is here.

Here is a visualization of the thematic tags in Georgia Douglas Johnson's Bronze:

Georgia Douglas Johnson   Bronze   1922 alternate .png

[For the live / clickable version (and a digital edition of the collection) click here.]

Again, the tags I have here are pretty readily connected to a few major themes, with racism being by far the most central. The “Motherhood” cluster is a close second -- and overlaps to a fair degree with the “racism” theme (at the heart of Bronze is a series of poems where the poet reflects on how American racism will affect the life of her newborn son).

* * *

H.D., Sea Garden; Ezra Pound, Personae 

I tried the same method with some other poets, including a few who don’t foreground social justice themes as much as Johnson and McKay do. With these more conventional, canonical modernists, the process of creating tags ends up being a little different.

As one reads, one looks for patterns and recurring themes. The themes that seem to be especially obvious rise up and become tags; other times one operates a bit more idiosyncratically based on what one is personally drawn to consider. I read H.D.’s Sea Garden and came up with a tag set that was a little longer: “Violence,” “Austerity,” “Flora,” “Sea,” “Gods,” “Abjection,” “Inscription,” and “Beauty.” Most of these tags are probably unobjectionable (though my decision to use “Inscription” might be debatable), and again, they are not intended to be "objective."

Visualizing  Sea Garden  Using Thematic Tags.png

[For a live version, click here]

In all of these network diagrams, it’s important to remember a couple of things. This is what Johanna Drucker refers to as “capta” rather than “data” -- the tags have been selected by a human reader, and are the product of close reading, not an objective or empirical analysis. The algorithm decides which nodes are closer together, and which are isolated, but one shouldn’t read too much into the location of “violence” in the above image. (As I’ve studied how the D3 force-directed graph algorithm works, I was surprised to discover that it starts with randomized locations for the nodes, and then repositions them step by step based on their connections.) Even a slight tweak can lead to a rather different configuration. Here is an alternate representation of the poems and thematic tags in Sea Garden:
Visualizing  Sea Garden  Using Thematic Tags alternate.png
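
To make the point about random starting positions concrete, here is a minimal sketch of a force-directed layout in plain Python. It is a toy written for illustration, not the D3 or Scalar implementation: the poem names, the tag assignments, the function names, and all of the parameter values (spring, repulsion, max_step) are assumptions of mine. The only thing it is meant to show is that the same set of poem-to-tag links, started from two different random seeds, settles into two different pictures.

```python
import math
import random

# Hypothetical poem-to-tag links (illustrative only, not the real Sea Garden tags).
EDGES = [
    ("Poem A", "Sea"), ("Poem A", "Flora"),
    ("Poem B", "Sea"), ("Poem B", "Gods"),
    ("Poem C", "Flora"), ("Poem C", "Beauty"),
    ("Poem D", "Violence"),
]


def layout(edges, seed, steps=200, spring=0.05, repulsion=0.01, max_step=0.1):
    """Toy force-directed layout: random start, then attraction and repulsion."""
    rng = random.Random(seed)
    nodes = sorted({name for edge in edges for name in edge})
    pos = {n: [rng.uniform(-1, 1), rng.uniform(-1, 1)] for n in nodes}
    for _ in range(steps):
        force = {n: [0.0, 0.0] for n in nodes}
        # Every pair of nodes pushes apart, more strongly when close.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                dist_sq = dx * dx + dy * dy + 1e-9
                force[a][0] += repulsion * dx / dist_sq
                force[a][1] += repulsion * dy / dist_sq
                force[b][0] -= repulsion * dx / dist_sq
                force[b][1] -= repulsion * dy / dist_sq
        # Linked nodes (a poem and one of its tags) pull together like springs.
        for a, b in edges:
            dx = pos[b][0] - pos[a][0]
            dy = pos[b][1] - pos[a][1]
            force[a][0] += spring * dx
            force[a][1] += spring * dy
            force[b][0] -= spring * dx
            force[b][1] -= spring * dy
        # Move each node a small, capped step along its net force.
        for n in nodes:
            for axis in (0, 1):
                step = max(-max_step, min(max_step, force[n][axis]))
                pos[n][axis] += step
    return pos


def distance(pos, a, b):
    """Straight-line distance between two nodes in a finished layout."""
    return math.dist(pos[a], pos[b])


# Same links, two different random starts: compare where "Violence" lands.
for seed in (1, 2):
    pos = layout(EDGES, seed)
    print(seed, round(distance(pos, "Sea", "Violence"), 2))
```

Running the loop at the bottom prints the distance between the "Sea" and "Violence" nodes for each seed. In this toy version the isolated "Violence" node is connected to nothing but its one poem, so where it ends up relative to the main cluster depends mostly on where it happened to start.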

What these diagrams let us say is probably a simple confirmation of what any attentive reader would also see: H.D.’s first collection of poems is tightly focused, with the poems revolving around a small set of themes. Each tag has many poems associated with it, and each poem tends to have multiple tags.
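
One way to put rough numbers on "tightly focused" is to count links directly rather than reading them off the picture. The sketch below is an assumption on my part, not something Scalar reports: the poem names and tags are made up, and tag_stats is a hypothetical helper. It computes the average number of tags per poem, the average number of poems per tag, and the share of all possible poem-tag pairings that are actually used.

```python
from collections import defaultdict

# Hypothetical tag assignments (illustrative only).
TAGS_BY_POEM = {
    "Poem A": {"Sea", "Flora"},
    "Poem B": {"Sea", "Gods"},
    "Poem C": {"Sea", "Flora", "Beauty"},
    "Poem D": {"Gods"},
}


def tag_stats(tags_by_poem):
    """Summarize how densely poems and tags are linked.

    Assumes at least one poem and at least one tag.
    """
    poems_by_tag = defaultdict(set)
    for poem, tags in tags_by_poem.items():
        for tag in tags:
            poems_by_tag[tag].add(poem)
    links = sum(len(tags) for tags in tags_by_poem.values())
    n_poems = len(tags_by_poem)
    n_tags = len(poems_by_tag)
    return {
        "tags_per_poem": links / n_poems,
        "poems_per_tag": links / n_tags,
        # Fraction of all possible poem-tag pairs that were actually tagged.
        "density": links / (n_poems * n_tags),
    }


print(tag_stats(TAGS_BY_POEM))
```

Numbers like these are only as good as the tagging behind them, so they are best treated as a companion to the diagram rather than a replacement for it.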

Compare this to a visualization I made upon reading Ezra Pound’s first major book of poems, Personae.

Visualizing Pound s Early Poetry.png

[For the live / clickable version, click here]

The clustering is not as tight, but it’s still pretty evident: the three themes at the middle -- “Medievalism,” “Intertextuality,” and “Poetry” (i.e., poems that self-consciously thematize poetry itself) -- are the core of this collection.

Overall, it might be fair to say that despite their many differences, these diagrams suggest that at the level of structure and thematic control, Ezra Pound’s Personae, H.D.’s Sea Garden, and Georgia Douglas Johnson’s Bronze are actually somewhat similar: they are tightly structured books of poetry that help establish each of these three poets as emerging authors with a distinctive voice and range of thematic interests. The outlier of these four is Claude McKay -- who has several different areas of concentration in his writing that don’t entirely intersect.

* * * 

Sylvia Plath, The Colossus and Other Poems

I tried the same experiment with the first published book by a much later modernist, Sylvia Plath. Again, I used her first full collection -- The Colossus and Other Poems -- not the posthumous Ariel. Choosing a first collection rather than a posthumous collected works allows us to focus more attention on a smaller range of poems. (Colossus doesn’t include some of Plath's most famous poems, though some of the same themes are nevertheless in evidence here if you’re looking carefully. The fact that we don’t have the bombshell of “Daddy” -- after one reads a poem like that, it’s hard to say that anything else is important -- also helps. The poems in Colossus, as I read them, are surprisingly subtle and often slippery.)

Colossus and Other Poems by Sylvia Plath  Tags only .png

[For the live version, click here. For obvious copyright reasons, this mini-site is simply a collection of tags and poem titles rather than a proper digital edition. It only exists to give us the graph above.]

While the diagram for Plath is somewhat larger than the ones we’ve been seeing with the other collections -- my list of tags kept growing as I read -- the overall picture is similar. This is a tightly focused first collection, and the overall shape of the diagram is markedly similar to that of Pound’s Personae. There is an especially tight cluster around “Death,” “The Sea,” and “Nature,” with the latter two themes overlapping so much that they are effectively redundant (nearly every “Nature” poem is also a poem I felt I should tag as “The Sea”).
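
The kind of redundancy I am describing between two tags can be checked with a simple overlap count. The sketch below uses made-up poem labels rather than the actual Colossus tags, and tag_overlap is a hypothetical helper of my own. It reports what share of the poems under one tag also carry a second tag, along with the Jaccard index (shared poems divided by all poems carrying either tag).

```python
# Hypothetical poems grouped by tag (illustrative only, not the real Plath tags).
POEMS_BY_TAG = {
    "Nature": {"P1", "P2", "P3", "P4"},
    "The Sea": {"P1", "P2", "P3", "P5", "P6"},
    "Death": {"P5", "P7"},
}


def tag_overlap(poems_by_tag, first, second):
    """Measure how much the poems under `first` also appear under `second`."""
    a = poems_by_tag[first]
    b = poems_by_tag[second]
    shared = a & b
    return {
        # Share of `first` poems that also carry `second`.
        "share_of_first": len(shared) / len(a),
        # Jaccard index: shared poems over all poems with either tag.
        "jaccard": len(shared) / len(a | b),
    }


print(tag_overlap(POEMS_BY_TAG, "Nature", "The Sea"))
```

A share close to 1.0 in one direction but not the other would suggest that one tag is effectively a subset of the other, which may be a reason to merge them or to keep only the broader one.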

* * *
Next Steps

Attentive readers will notice that the reading process that leads to the creation of all of these diagrams is close reading. We read the poems and generally define tags as we see patterns starting to emerge.

This puts us at odds with much digital humanities scholarship that is interested in scale -- the analysis of large corpora. Right now it's not clear to me how one would apply this method to corpus analysis (which might take us into topic modeling and other associated methods). I've been invested in human interpretive choice with respect to picking thematic tags -- “supervised” analysis. Rather than corpus analysis, why not crowd-source thematic tags for a larger array of authors and texts? We may not get 1000 authors, but we could very well get to 100. (I’m open to collaborating...)

At larger scales we'd also have to go to different graphing formats. Network diagrams at larger scales lose their ability to give us shapes; they tend to look like pure noise.

One thing is clear -- before moving to a larger scale it seems important to get feedback from specialist critics who know these poets' works and find out: 1) do other readers agree that these diagrams have some interpretive value (are they valuable as interpretations of the texts)? 2) is there value in comparing the shapes produced by graphing thematic tags in these collections of poems? and 3) are there ways to structure these graphs so as to make them more intrinsically valuable?

Link dump: