Friday, June 10, 2016

Group Project: Sentiment Analysis of Poetry in Python (DHSI 2016)

I took a one-week course on Coding Fundamentals at DHSI 2016 with Dennis Tenen (Columbia University) and John Simpson (University of Alberta). You can see the syllabus for the course here.

Let me start with a quick plug for Dennis Tenen's group at Columbia, the "Group for Experimental Methods in the Humanities." You can see some of the projects they are doing at their Github site; one in particular that seems really interesting is RikersBot, a Twitter bot that conveys a series of statements from inmates at Rikers Island Prison in New York. It was created as a joint project between Columbia University students and Rikers inmates interested in learning coding; part of the project involved teaching all of the young people in the class the coding they would need to build a Twitter bot. The bot is currently not active, but the stream it produced over several months is well worth a look.

*

Why coding? I wanted to get started with coding because it seems to be one of the major dividing lines between people who can chart their own independent course through the digital humanities and people who work with ideas and tools developed by others. It's not the be-all and end-all, of course (as I've said before, you can do so much now with off-the-shelf tools), but some experience with coding seems like it could be really helpful for projects that don't quite fit the mold of what's come before.

The class itself was intense, frustrating, and sometimes really fun. I'm not going to lie: learning how to code is hard. I can't say that I will readily be able to start spitting out Python scripts after four days of working with the language, but I might at least be able to figure out how to a) write some simple scripts to process batches of text files that would otherwise require repetitive, laborious work, and b) use libraries of code developed by others in Python to do more advanced things.
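To give a sense of what I mean by a simple batch script, here is a minimal sketch (not something from the course itself). It assumes a folder of plain-text files, one poem per file, saved as UTF-8; the function name `count_words_in_folder` is made up for illustration, and the word-splitting rule is a deliberately crude one.

```python
from collections import Counter
from pathlib import Path
import re


def count_words_in_folder(folder):
    """Return {filename: Counter of lowercase words} for each .txt file in folder."""
    results = {}
    # Only files ending in .txt are read; anything else in the folder is skipped.
    for path in sorted(Path(folder).glob("*.txt")):
        text = path.read_text(encoding="utf-8")
        # Crude tokenizer (an assumption): runs of letters and apostrophes count as words.
        words = re.findall(r"[a-z']+", text.lower())
        results[path.name] = Counter(words)
    return results
```

Calling `count_words_in_folder("poems")` on a hypothetical folder named `poems` would give back one word tally per file, which is the kind of repetitive counting that is tedious to do by hand.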

*

Monday, June 06, 2016

#MyDHis... (Text of my Presentation at DHSI)

(I'm doing a 5-7 minute presentation at DHSI in Victoria this afternoon. This is the text of what I'll be presenting.)

Admittedly, I'm not using the hashtag quite right -- it should be #MyDHis. But I like the flexibility (and brevity) of just making it "MyDH"... 

1. I feel presumptuous saying #MyDH; I have until recently been more a kibitzer than a doer. But ok, I’ll own it. #MyDH, here goes #dhsi2016

2. #MyDH explores social justice issues as a starting point and as fundamental to project architecture. Not as an afterthought. #dhsi2016

3. #MyDH allows that people who agree with #2 might not necessarily agree on what social justice looks like. #dhsi2016

4. #MyDH encourages projects that mitigate the uneven access to the internet outside of privileged, western academic centers. #dhsi2016

5. #MyDH: Just as women writers were once excluded from the Canon, contributions of women scholars have been marginalized in DH. #dhsi2016

6. #MyDH is oriented towards communicability and teachability. Don’t skimp on documentation, roadmaps, explainers, and How-Tos. #dhsi2016

7. #MyDH uses technology as a subset of humanities scholarship, and advocates for all humanities work, including non-digital work. #dhsi2016

8. #MyDH opposes technoutopianism and worries about depending on commercial cloudware. I prefer presentism, realism, autonomy. #dhsi2016

9. The focus on #MyDH needn’t diminish the DH ethos of collaboration (i.e., #OurDH). #MyDH is a way of recognizing differences. #dhsi2016


10. #MyDH doesn’t need six-figure grants. We can do a lot with off-the-shelf tools, patience & a willingness to learn/ screw up. #dhsi2016


I would be more than happy to talk more with you about any or all of those Tweets in the Q&A later. But in the time I have left, I’d like to just briefly expand on Tweets 2 and 3, related to social justice. In the fall of 2015 my colleague Ed Whitley and I co-taught our first-ever Introduction to Digital Humanities course at Lehigh. One of the prominent units we lined up related to digital archives; what I discovered was surprising and disconcerting. (Incidentally, I wrote about this in detail in a blog post called “The Archive Gap: Race, the Canon, and the Digital Humanities.”) The essential point is that there is a huge gap between the archive frameworks that exist for canonical writers and those that exist for minority writers and writers from the colonial world.

There’s no doubt that this problem has been recognized and that there’s been a growing effort to address the conservative and canonical legacy of some early digital archive projects. But in my view, simply aiming to match archives of canonical figures with works by writers from the emerging canon isn’t sufficient. Going forward, I would be interested in seeing if we can design digital archives differently. Established archives of canonical figures tend to emphasize the neutral and idealized presentation of the materials. Any references to politics and any specific points of editorial advocacy are carefully downplayed. What if we reconceived our role as archivists and editors? Perhaps our role in presenting materials should be as much to advocate for the authors themselves – and along the way, offer actual interpretations of their works – as it is to present their textual materials.

I’ve been aiming to do some of these things with a new digital project I’ve been developing in Scalar with a pair of graduate research assistants (the project is presently at a very embryonic phase). We aren’t exactly hiding from the canon – the project is called “The Kiplings and India.” But there are two ways in which our thematic collection might be different from earlier projects. One is that it emphasizes the extensive degree to which the famous Author, Rudyard Kipling, collaborated with his other family members, including especially his sister, Alice Kipling. (In my Tweet #5 I mentioned that women writers have been written out of the Canon; here we could say the women Rudyard Kipling collaborated with have been written out of the story of his emergence as a writer, and I would like to write them back in.) Second, we are designing the journalism component of the archive with an eye to social movements and conversations that were happening all around British India (including the voices of actual Indian people, especially Indian women), but with which the Kiplings themselves may not have had extensive direct engagement. The idea is that someone interested in issues related to, say, Indian women and divorce law (a topic which was being hotly debated by both British and Indian participants in the 1880s) could gain access to useful editorial insights and archival materials from our site without necessarily having to see that interest mediated through the Kipling family.

To go back to teaching. After we talked about the Archive Gap dynamic in the DH class I was co-teaching last fall, I designed a collaborative class project assignment around a groundbreaking 1922 book of poems by Claude McKay, Harlem Shadows (which includes the famous statement of rebellion, “If We Must Die…”). Admittedly, there is already a pretty nice presentation of those poems in a project by Chris Forster and Roopika Risam, but it’s very textually focused and offers minimal editorial commentary. I encouraged my graduate students at Lehigh to think about a project that might appeal to a broad constituency of readers, including undergraduates and high school students as well as non-specialists.

The students were given certain encouragements, but then we let them loose to make their own design and editorial decisions. What they came up with was surprising and deeply impressive. First, they retitled the project to differentiate it from a standard digital edition. Second, they created two presentations of the poems in Harlem Shadows, one version that corresponds to the poems in the order in which they were originally printed, and another version that presents the poems thematically. All of the poems are thematically tagged based on a set of tags agreed upon collaboratively by students in the class. The site includes a clickable Wordcloud of student-generated tags that leads users to lists of poems oriented around specific tags. They also generated a substantial number of contextual and biographical essays that help bring the poems in Harlem Shadows to life for today’s readers. And finally, students built the site themselves, including menus, graphics, and text. I directed them to use a public domain, “dirty OCR” version of Harlem Shadows derived from the Internet Archive. They proofread and corrected the OCR and produced unique pages for each poem in Harlem Shadows. (As a side note, if we did the project today, we would do it in Scalar -- but I hadn’t really gotten my head around Scalar last September.) 

Sunday, June 05, 2016

Digital Humanities Blogging: Retrospective on a Pretty Productive Year

I'm in Victoria this week for the DHSI. On Monday 6/6 I'll be presenting at a plenary session at the conference; I'll probably post the text of my brief presentation here sometime tomorrow.

Meanwhile, for new friends visiting this page, here are a few blog posts I have written related to Digital Humanities issues over the past year. (Quite a diverse range of stuff! Now that I've embarked on my own DH project in earnest, the range of topics we discuss might narrow.)

In Defense of Digital Tools (by a Non-Tool). My response to the critique of DH in the LARB that appeared last month. The critics of the Digital Humanities make many good points, but their critique is tendentious and aims to demolish the field rather than make it better. I think we can use critique to keep making it better.
http://www.electrostani.com/2016/05/in-defense-of-digital-tools-by-non-tool.html


The Archive Gap: Race, the Canon, and the Digital Humanities. I was proud of this essay, which evolved out of teaching notes in September 2015. If I have my act together I will turn this into a publishable article sometime:
http://www.electrostani.com/2015/09/the-archive-gap-race-canon-and-digital.html


Fall 2015: Digital Humanities. The syllabus to the course I co-taught with Ed Whitley in Fall 2015. We designed the course with a strong emphasis on social justice. 


Digital Teaching Notes: The 'Harlem Shadows' Collaborative Project
http://www.electrostani.com/2015/12/digital-teaching-notes-harlem-shadows.html


Syuzhet For Dummies. Where I learned enough R to be able to apply Matthew Jockers' Syuzhet package for sentiment analysis and assess some of the challenges people have made regarding the way the package visualizes data. I tried applying the package to a series of George Eliot novels.
http://www.electrostani.com/2015/10/syuzhet-sentiment-analysis-of-novels.html


An Account of David Hoover's DHSI 2015 Keynote: Performance, Deformance, Apology. I found this controversial keynote address alternately really interesting and deeply frustrating.
http://www.electrostani.com/2015/06/an-account-david-hoovers-dhsi-2015.html