
Gendered Pronouns in Early 20th Century Fiction: A Simple Quantitative Study

The following short essay is a work in progress -- I am exploring the uses of a corpus of early 20th century literature I have been developing for a few months. The study below represents an attempt to make use of that corpus to query a topic that has been of interest in quantitative DH in recent years. 


I have long been fascinated by a DH paper published in 2018, “The Transformation of Gender in English-Language Fiction” (link here; authors were Ted Underwood, David Bamman and Sabrina Lee), which presents strong statistical evidence that men increasingly dominated the world of fiction in the late 19th and early 20th centuries – that between 1850 and 1950 the percentage of published novels authored by women dropped dramatically (from near parity to more like a third or a quarter). Thus, at the exact period when we might have expected women to be gaining visibility and influence – associated with the early 20th-century suffrage movement and the appearance of important feminist voices like Virginia Woolf – they were actually losing position on the whole in the publishing world.


According to the authors, the pattern only started to reverse in the second half of the twentieth century (and today, the publishing industry would of course look very different). Also, within their fiction, “The Transformation of Gender” authors indicate that men writers tend to write more about men, while writers who are women might be closer to gender parity in the amount of time given men and women in the social world represented in the story. The authors suggest that particular tendency hasn’t improved or changed as much.


Source: Underwood, Bamman, and Lee (2018)

Incidentally, the concern with the growing marginalization of writers who were women alluded to above is not a new one. The authors of “The Transformation of Gender” cite a 1989 study, Edging Women Out: Victorian Novelists, Publishers, and Social Change (Gaye Tuchman and Nina Fortin), where the authors did quantitative (but not digital!) scholarship with similar findings. Tuchman and Fortin counted and classified entries in Leslie Stephen’s Dictionary of National Biography to compare how women writers were talked about versus men writers. They found that while books by men were reviewed more frequently on the whole, the gender disparity in the more recent authors (late 19th century) became especially sharp with respect to works of nonfiction.

The authors of “The Transformation of Gender” used a very large corpus of tens of thousands of novels from HathiTrust (and checked against the smaller University of Chicago novel corpus) as well as sophisticated modeling techniques built around Natural Language Processing (NLP) to infer gender within a text and derive percentages. Some years ago, I finally gained enough confidence in basic Python to explore some of these methods on my own, using David Bamman’s BookNLP software (sadly, that software does not appear to be working at present, so I will not be using it for the results below).


One other bit of background: in the revised version of the essay published in his book, Distant Horizons, Ted Underwood mentions the Gendered Language Visualizer, a simple but deceptively powerful tool that tracks the association between non-gendered words and gendered pronouns in works of fiction. The technique behind that tool led to the beautifully illustrative image below (from the jointly written 2018 essay).


Source: Underwood, Bamman, and Lee (2018)

What it shows: women in fiction tend to "smile" and "laugh"; men tend to "grin" and "chuckle." (Though note that the divergence diminishes over time -- so in contemporary fiction that 'gendering of mirth' would be much less pronounced than it was at the peak of the divergence, around 1950.) 


Earlier studies: I should say that this is a more complex version of a type of analysis scholars have been doing in stylistics for many years; there are studies that go back to the 1990s that aimed to predict the gender of a writer based on characteristics of function words and articles. Koppel et al. (2002) combined sophisticated statistical techniques with fairly straightforward counting to find that writers who are men tend to use a higher proportion of noun specifiers (a, the, that) and numbers in their fiction. They also claim women tend to use more pronouns (she, herself), negation (not), and certain prepositions (for, with) and conjunctions (and). By lining up counts of these various parts of speech, the authors claim to be able to predict the gender of an author of an anonymized text with 80% accuracy. (Note: for what it’s worth, I tried to replicate their results with my own small, early 20th-century corpus, and failed. The only place where I saw a clear correlation was with gendered pronouns -- which might explain how I got to the design of the present study below.)


Moving past binarized gender thinking: Admittedly, I am not so interested in this particular application for my own research – it’s almost never the case with 20th-century fiction that the gender identity of an author is unknown. I also tend to be interested in writers who pushed against conventional gender roles and expectations in any case, many of whom might be understood as LGBTQIA+ today – writers like Virginia Woolf, E.M. Forster, D.H. Lawrence, Radclyffe Hall, or Wallace Thurman. Today, most scholars would find the "predict the gender" type of analysis overly restrictive and essentially a way of reinforcing binarized gender thinking. If E.M. Forster, for example, breaks with the expected pattern in novels that feature women protagonists (spoiler: he does!), that would be a more interesting finding than simply reconfirming that 80% of men are from Mars, as it were.


A simplified method for the present study: What if we drastically simplified the query with a corpus of early 20th-century fiction? As a starting point for thinking about patterns with respect to gendered socialization, why not simply look at gendered pronouns: he/him/his and she/her/hers? If the conclusions by Underwood et al. are correct, we should expect to see a lopsided homosocial tendency in fiction by men (men mostly talking to and about other men, and only occasionally mentioning a woman), and maybe a more balanced gender representation in fiction by women. We might also see some interesting anomalies in the patterns that might be worth exploring.  


Before doing this at a mid-range scale, I was curious to see how authors I know would shake out. Over the past few months, I’ve been developing a custom corpus of early 20th-century texts. I have described the basic design of the corpus here; it contains about 1000 total texts, including about 100 texts that might be thought of as canonical high modernist texts, 130 texts by African American authors, and about 90 texts associated with colonial South Asia. It also contains a substantial amount of genre fiction. The results below only reference works of fiction, though there are works of poetry, drama, and nonfiction in the corpus. 


With a little help from generative AI coding assistants, I devised a simple bit of code to count the use of gendered pronouns (he, him, his vs. she, her, hers), first, in a single novel, then in a batch of files, and then derive a percentage from the total word length of the file. I then took those gendered pronoun percentages, and compared them to one another to get a ratio. Rather than overwhelm the reader with a vast array of raw data, I’ll start with some smaller findings, initially focused on gendered pronoun ratios in a small set of ‘high modernist’ works of fiction, mainly by white British and American authors. I’ll then expand the conversation to other authors and consider broadly why any of this might be significant. 
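For readers who want to try something similar, the counting procedure described above can be sketched roughly as follows. This is a minimal illustration and not the code used for the study: the function names `pronoun_stats` and `batch_stats` are hypothetical, the folder of `.txt` files is an assumed layout, and treating every run of letters as a word is an assumption about tokenization, so the percentages it produces may differ somewhat from the figures reported below.

```python
import re
from pathlib import Path

MASCULINE = {"he", "him", "his"}
FEMININE = {"she", "her", "hers"}


def pronoun_stats(text):
    """Count gendered pronouns in one text and derive percentages and ratios."""
    # Assumption: a "word" is any run of letters, so "he's" yields "he" and "s".
    tokens = re.findall(r"[a-z]+", text.lower())
    total = len(tokens)
    masc = sum(1 for t in tokens if t in MASCULINE)
    fem = sum(1 for t in tokens if t in FEMININE)
    return {
        "total_words": total,
        "masc_count": masc,
        "fem_count": fem,
        # Percentage of all words in the file that are gendered pronouns.
        "masc_pct": 100 * masc / total if total else 0.0,
        "fem_pct": 100 * fem / total if total else 0.0,
        # Ratios are left as None when the denominator is zero.
        "masc_to_fem": masc / fem if fem else None,
        "fem_to_masc": fem / masc if masc else None,
    }


def batch_stats(folder):
    """Run pronoun_stats over every .txt file in a folder (assumed layout)."""
    results = {}
    for path in sorted(Path(folder).glob("*.txt")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        results[path.name] = pronoun_stats(text)
    return results


sample = "He gave her his book. She thanked him."
stats = pronoun_stats(sample)
print(stats["masc_count"], stats["fem_count"], stats["masc_to_fem"])  # 3 2 1.5
```

Because both percentages share the same denominator (the file's total word count), the ratio of the two percentages is the same as the ratio of the raw counts. One caveat with any list like this: "her" does double duty as both object and possessive, where the masculine forms split into "him" and "his," so the two sets are not perfectly symmetrical.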


From my limited high modernist collection, what are some texts that are especially lopsided towards men? (If you expected to see Ernest Hemingway on this list, you would be right!)




| Text | Ratio of masculine to feminine pronouns |
| --- | --- |
| Ernest Hemingway: Men Without Women | 11.4 to 1 |
| Hemingway: In Our Time | 9.4 to 1 |
| James Joyce: A Portrait of the Artist as a Young Man | 9.2 to 1 |
| John Dos Passos: Three Soldiers | 7.3 to 1 |
| D.H. Lawrence: Kangaroo | 4.2 to 1 |
| Hemingway: The Sun Also Rises | 3.5 to 1 |
| James Joyce: Ulysses | 3.0 to 1 |
| James Joyce: Dubliners | 2.2 to 1 |
| E.M. Forster: A Passage to India | 2.2 to 1 |
| F. Scott Fitzgerald: The Great Gatsby | 2.0 to 1 |

What to make of the lopsided nature of some of these texts? I should say, off the bat, that I don’t think the lopsidedness necessarily serves as an indictment of someone like Hemingway. The relative absence of women in his various short stories is partly due to their settings (several in Men Without Women deal with soldiers and World War I, and “The Undefeated,” about an aging Spanish bullfighter out for a last hurrah, is a pretty marvelous critique of dysfunctional masculinity). Moreover, A Portrait of the Artist as a Young Man is a coming-of-age narrative for Stephen Dedalus at schools that only admit boys and men with teachers who are also only men, so it’s not a huge surprise that the social world represented in the text is also pretty lopsided. (The imbalance might have been less if Joyce had kept in more of the love interest/romantic sections that were in the original Stephen Hero version of his manuscript.) The lopsidedness of other writers (and other Joyce texts) is less extreme, though it’s striking to see novels by D.H. Lawrence and E.M. Forster here (especially since Forster, with Howards End, is also on my second list below). 


Again, I don’t see it as an indictment per se, or as a reason to drop Hemingway or Joyce from my syllabus, though it is still worth knowing. (Do readers want or need to see characters that match their own gender identity or expression in order to connect with a text? Probably not, but my hunch is that it might help...) Still, the pattern does appear to show that there is a pretty limited role for women in the social worlds we find in these texts. It is not as if the authors don’t know it, either: the title Men Without Women can be read as self-critique of a symptomatic nature. These are men without women, and perhaps that’s why they are so broken.


And what about woman-centered texts by writers of literary fiction typically associated with high modernism? 



| Text | Ratio of feminine to masculine pronouns |
| --- | --- |
| Dorothy Richardson: Pilgrimage 1: Pointed Roofs | 13.1 to 1 |
| Bryher: Development | 9.9 to 1 |
| Richardson: Pilgrimage (other volumes) | varies between 5 to 1 and 2 to 1 |
| Nella Larsen: Passing | 4.9 to 1 |
| Radclyffe Hall: The Unlit Lamp | 3.9 to 1 |
| Radclyffe Hall: The Well of Loneliness | 2.8 to 1 |
| Wallace Thurman: The Blacker the Berry | 2.8 to 1 |
| Gertrude Stein: Three Lives | 1.6 to 1 |
| Virginia Woolf: Mrs. Dalloway | 1.6 to 1 |
| Katherine Mansfield: The Garden Party And Other Stories | 1.6 to 1 |
| Mansfield: Bliss and Other Stories | 1.5 to 1 |
| Virginia Woolf: The Voyage Out | 1.5 to 1 |
| Woolf: Night and Day | 1.4 to 1 |
| Woolf: Orlando | 1.4 to 1 |
| Woolf: To the Lighthouse | 1.3 to 1 |
| Forster: A Room With a View | 1.2 to 1 |
| Forster: Howards End | 1.2 to 1 |

It was not hugely surprising to see Pilgrimage: Pointed Roofs as the most lopsided she/her centered text in the high modernist selection from my text corpus. Pointed Roofs is the story of a young woman teaching at a girls’ boarding school, so, as with Portrait of the Artist above, it makes sense that it reflects a homosocial world with largely girls and women as characters.


Also, anyone who has read Passing recently would not be surprised to see how prevalent she/her/hers pronouns are there: it really is a novel focused on the relationship between two women. (If anything, this finding only reconfirms readings that have stressed the homoerotic subtexts of that relationship.)


I was intrigued to see a book by a man, Wallace Thurman, come out fairly high on this list (2.8 to 1). I am not entirely sure what to make of it; the novel in question is a thoughtful and often bitter account of colorism within the Black community with a woman protagonist. 


The bigger takeaway might be that the pattern described by Underwood et al. appears to be in evidence with this small group of high modernist writers – writers who were women were, on the whole, less lopsided than were their peers who were men. Instead of a ratio of 10 to 1 or 4 to 1 or even 2 to 1, the median here for writers like Woolf and Mansfield – two of the core authors in the modern feminist canon – is closer to 1.5 to 1. 



Expanding the Range of Authors: Genre Fiction Writers


Now, let’s move to the broader dataset. The first discovery might be that the gendered pronoun disparity can be wildly lopsided in adventure fiction and westerns: 


| Text | Ratio of masculine to feminine pronouns |
| --- | --- |
| Zane Grey, The Young Pitcher | 630 to 1 |
| Zane Grey, Ken Ward in the Jungle | 433 to 1 |
| G. K. Chesterton, The Man Who Was Thursday | 139 to 1 |
| H. G. Wells, The First Men in the Moon | 104 to 1 |
| John Buchan, Prester John | 101 to 1 |
| Lord Dunsany, The Gods of Pegana | 57 to 1 |
| L. Frank Baum, The Master Key | 47 to 1 |
| Dhan Gopal Mukerji, Kari the Elephant | 44 to 1 |
| G. K. Chesterton, The Man Who Knew Too Much | 43 to 1 |
| Jack London, The Call of the Wild | 18 to 1 |
| John Buchan, The Thirty-Nine Steps | 15 to 1 |
| Dorothy Sayers, Lord Peter Views the Body | 5.8 to 1 |
| Agatha Christie, The Big Four | 4.0 to 1 |


The scale of lopsidedness is pretty vast – and consistent – with early 20th century men who wrote westerns, science fiction, and detective fiction all showing a highly lopsided, man-centered social world. (I ran hundreds of titles for this study, and am only including a few noteworthy titles on these tables; readers who want to see the raw data can find it here. Note that it contains texts that are not works of fiction -- I've been disregarding those in the present study.) Even women who wrote detective fiction tended to show a version of it, though Dorothy Sayers’ Lord Peter Views the Body (at 5.8 to 1) is still much less imbalanced than something like The Young Pitcher (another narrative of a young man at school, with no girls or women about).


And what about woman-centered genre fiction / popular fiction? 


| Text | Ratio of feminine to masculine pronouns |
| --- | --- |
| Rokeya Hossain, Sultana’s Dream | 10.8 to 1 |
| Vita Sackville-West, The King’s Daughter | 5.7 to 1 |
| Edith Wharton, The Old Maid | 5.4 to 1 |
| Elinor Glyn, Man and Maid | 5.0 to 1 |
| Gertrude Atherton, The Living Present | 3.8 to 1 |
| L. M. Montgomery, Anne of Green Gables | 3.2 to 1 |
| Somerset Maugham, Liza of Lambeth | 2.9 to 1 |
| Louis Bromfield, The Green Bay Tree | 2.4 to 1 |
| Edith Wharton, The House of Mirth | 2.4 to 1 |
| Zane Grey, The Call of the Canyon | 2.4 to 1 |
| Temple Bailey, Judy | 2.0 to 1 |
| H.G. Wells, Ann Veronica | 1.8 to 1 |



Again, while there are some texts that are highly woman-centered (Sultana’s Dream is, famously, a feminist utopia with men kept in enclosures, while women run the world), the imbalance for romance fiction writers like Elinor Glyn or girl-oriented children’s fiction writers like L.M. Montgomery (Anne of Green Gables) is considerably less pronounced than with their counterparts who were men. 


Given how lopsided Zane Grey generally is, it is interesting to see one of his novels here (a shell-shocked World War I veteran moves to Arizona and has to choose between two different women). It’s also noteworthy to see an instance of H.G. Wells’ “new woman” fiction here. (Again, if anyone would like to see the full / raw data, it is here.) 


Quick conclusions; Next steps in the analysis?

Admittedly, this is a fairly crude method. At most, it shows some general patterns and trends, and confirms (albeit with a very small sample of texts) what Underwood/Bamman/Lee claimed using a much larger statistical model. Here is Underwood in Distant Horizons:

“It turns out that women are consistently under-represented in books by men. On average, only a third of the words men use in characterization are used to describe feminine characters. Women writers, on the other hand, spend equal time on fictional men and fictional women. This difference remains depressingly constant across two centuries, and it may help explain why books by men tend to have more stereotyped gender roles.” (Distant Horizons, 127)

For me, the next steps might not be more quantitative queries. Rather, I am curious to look at the anomalies and exceptions in the early 20th-century corpus to try and learn more about what might have been going on, perhaps the old-fashioned way (i.e., actually reading the novels in question). For instance, for a writer who was otherwise so dramatically lopsided towards men, how did Zane Grey's The Call of the Canyon end up favoring feminine pronouns at a ratio of 2.4 to 1? What was he doing differently here? (This is especially curious since, for someone like Hemingway, fiction responding to the psychic effects of World War I was often overwhelmingly oriented to men.)

Also, for writers like E.M. Forster and Wallace Thurman, both writers of literary fiction who lived their lives as closeted gay men, it is intriguing to see that both wrote novels with women protagonists that appear on the second table above. It might be interesting to gather together other novels written by cis-identified men with women as protagonists. Are there any patterns that can be gleaned from them?

Finally, I'm curious about the representations of animals in the corpus. It's striking that a big part of the reason Hemingway's "The Undefeated" and Jack London's The Call of the Wild appear so lopsided in terms of gendered pronouns is that the vast majority of the animals are gendered male in both texts (alongside the human protagonists of those stories, of course). It might be interesting to make a small corpus of animal-oriented fiction from this corpus and study how animals are gendered (perhaps adding in more emphasis on non-gendered pronouns...).










2025: My Year in Books


1. General Interest Recommendations


Arundhati Roy, Mother Mary Comes to Me. This was a standout for me this year -- Roy's beautifully written memoir of her rocky relationship with her mother. It is also a compelling intellectual autobiography that follows the arc of Roy's career, from her early days (training as an architect; acting in and then writing for films and television), to her more contemporary social justice interventions. The God of Small Things was a work of fiction, but every major character was based on a real person, and many of the difficult things that happened to the children in the novel are based on events experienced by Roy and her family. I especially appreciated the section in Mother Mary Comes to Me on the architect Laurie Baker, someone I'd not heard of before.

Even now -- and after many, many years of teaching books like The God of Small Things -- I've still never seen Roy's early films (Massey Sahib, directed by Pradip Krishen; In Which Annie Gives It Those Ones, which Roy wrote; and Electric Moon, which, frankly, I'd never even heard of!)

Massey Sahib (1989) is a kind of loose adaptation of Joyce Cary's Mister Johnson transposed to India; there's a version of it up on YouTube here.

There's a version of In Which Annie Gives It Those Ones (1990) here. (This film, which is based on Roy's experience in a school of architecture in Delhi in the 1970s, seems like the place to start)

I don't see any versions of Electric Moon (1992) online. (Probably ok; in her account of it in the memoir, Roy suggests that this film, a hybrid British-Indian production made with BBC funding, was a bit of a misfire.)

Caoilinn Hughes, The Alternatives. File under: thoughtful climate fiction. A readable but somewhat idiosyncratic novel of ideas; what would it really mean to move to rural Ireland and drop off the grid? What sacrifices would it require, especially in terms of your personal relationships and your family? At the center of this smart novel are four sisters, each with a Ph.D. -- one a philosopher, one a geologist, one a caterer, and the fourth a political scientist. The debates between the sisters form the core of the novel. Some of the philosophy might be a little abstruse for readers (Kant!), though Hughes does find ways to make it accessible enough and relevant to the core ethical dilemmas she wants to explore. 

Charlotte McConaghy, Wild Dark Shore. File under: climate fiction + thriller. A novel set on a remote island outpost near Antarctica (Shearwater Island), with a group of caretakers whose main job is to protect a doomsday seed bank. The novel has the stylized language and lyricism of literary fiction, though in the second half it turns more into a thriller plot. Overall, it made me curious to visit the place itself, though given its remoteness that seems far-fetched. (Let's start by getting ourselves to Australia or New Zealand first...)

Percival Everett, James. I'm guessing most people in my circle have read this brilliant rewriting of Huckleberry Finn from Jim/James' point of view -- it was on everybody's top ten lists last year. I finally read it this year; it's very good. I especially liked the investment in James' interest in writing his own story: "With my pencil, I wrote myself into being. Wrote myself to here." Also: "I can tell you that I am a man who is cognizant of his world, a man who has a family, who loves a family, who has been torn from his family, a man who can read and write, a man who will not let his story be self-related, but self-written." This theme of the novel reminded me of other 'postcolonial' texts that write back to the Anglo-American Canon -- and that thematize the act of writing as a central part of coming to own one's subjectivity (see: J.M. Coetzee's Foe). I've never taught Uncle Tom's Cabin, but if I were to do that in the future, I would do it alongside James.

Introducing: A Modernism Text Corpus [aka Early 20th Century Literature Corpus]

I thought it might be worthwhile to create a textual corpus collecting out-of-copyright materials from a broad range of authors from the early 20th century -- 1900-1930. The idea is to include publication information, genre classifiers (literary fiction, romance fiction, detective fiction, etc), and some topical tags (World War I, gender/feminism, etc). I am also including in the new corpus selected texts from the earlier African American Literature and Literature of Colonial South Asia corpora I created a few years ago.

Short version:

  • The corpus containing out-of-copyright works 1900-1930 is here; the metadata file (also very important!) is here.  

  • As of May 2026, it contains about 1100 texts by about 215 authors -- British, American, Irish, Canadian, New Zealand, Australian, Indian, and a few others. Some dual nationality authors (Joseph Conrad was born in Poland but became a British citizen) are indicated with both of their nationalities. 

  • Of those 1100 texts, about 130 are by African American writers. 

  • Of those 1100 texts, 14 are by Native American/Indigenous writers.

  • Of those 1100 texts, 3 are by Latinx writers, including one text published in Spanish.

  • About 340 of the texts in the corpus are by women, and about 30-40 texts are by authors who may be today understood as Trans or Nonbinary (marked as NB in the metadata), including Radclyffe Hall, Bryher, and Gertrude Stein (on Stein's transmasculinity, see Chris Coffman's book). As I continue to expand the corpus, I will be prioritizing authors who were women (though I believe the current 3:1 ratio, imbalanced as it is, is fairly close to the historical average for this period). 

Precedents: There is a small modernist (mainly high modernist) corpus created by a group in the UK here. I corresponded a bit with the curators of that project, though in the end I created my own corpus from scratch. And the US Novel Corpus at the University of Chicago is here (1200 texts are open access). In truth, it is not very easy to use; I have looked at their metadata file for reference, but I have not used their texts in building this corpus. 


Basics: What is a Textual Corpus? 

A textual corpus is a collection of texts, typically in plain text format, arranged to be analyzed in various ways, sometimes using quantitative methods. (I also think text corpora can be used by scholars doing more traditional thematic and historicist research, especially if the materials are tagged. More about that below.)

The first major creators of textual corpora were computational linguists, who have studied large-scale linguistic phenomena in corpora constructed within a given language. More recently, digital humanities scholars have been working with corpora of specifically literary texts, often with methodologies that borrow from or gesture towards linguistics. For instance, can we infer author gender in a large corpus of novels to ascertain patterns in the demographics of fiction over time? Can we use certain linguistic patterns to ascertain the genres of novels within a larger corpus?  

While anthologies and archives (including digital archives) have traditionally been designed to represent the most important and meaningful texts in particular geographical, cultural, and historical contexts, textual corpora often eschew questions of literary value in the interest of maximal inclusivity. Many quantitative methods rely on large-scale corpora to achieve statistical viability, and to answer questions about patterns in language usage, the fact that a particular book of poetry was critically well-received and another was not might be less important than the fact that both were published at a certain time and place. In our collection, we have aspired to maximal inclusivity, incorporating materials that the editorial tradition might have overlooked, such as 'minor' texts by 'major' writers, as well as writing that has entirely fallen off the critical radar. 

What is in this Early 20th C. Text Corpus?

The idea is to collect materials from recognizable high modernists like Virginia Woolf and James Joyce, alongside African American writers, Indian writers like Rabindranath Tagore and Cornelia Sorabji, as well as a sampling of genre fiction (including detective fiction, historical fiction, adventure fiction, science fiction, romance, westerns, etc.). 

So: everything from Jack London to Edith Wharton to Georgette Heyer to Langston Hughes. 

The goal is to produce a collection that could be useful to people doing quantitative analyses of these materials, but also to scholars doing conventional historical scholarship on the literature of the period. 

I've been creating thematic tags and genre classifications as I go, so that people interested in just writing by modernist women, for instance, could sort the collection that way (see the metadata file). Similarly, people interested in just African American poetry could sort the collection that way as well (using the Af-Am poetry folder). Other topics I've started tracking are materials related to World War I, materials related to colonialism and empire, LGBTQIA materials, disability, and the environment. 

(Note: tagging is at a very early stage thus far. I would welcome help and contributions from any readers who have specialist knowledge about any of the topics mentioned above.) 

Having these topics represented in the metadata was important to me; it's one reason why I've found existing textual repositories online insufficient. Project Gutenberg, for instance, has in recent years dramatically improved its approach to data about original publication, but many texts in their collection continue to have no information about publication date or the publisher name. I wanted to make a collection where all of that information was added back in. 

How to access the corpus? 

This is a work in progress; it has been steadily growing over the course of 2025-2026. The whole corpus can be found here. You can either work with the materials in that Google Drive folder, or download the whole folder to your computer.

As I've been going, I've been drawing largely on digital files at Project Gutenberg, Archive.org, and HathiTrust. I have been cleaning the Gutenberg files of header and footer boilerplate, though admittedly there may be some files in the folders that have not been cleaned. 
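For anyone who runs into an uncleaned file, here is a minimal sketch of one way to strip the Project Gutenberg header and footer. It is not necessarily how the files in this corpus were cleaned. It assumes the file contains the usual "*** START OF THE PROJECT GUTENBERG EBOOK ... ***" and "*** END OF THE PROJECT GUTENBERG EBOOK ... ***" marker lines (older files say "THIS" instead of "THE"); the function name `strip_gutenberg_boilerplate` is hypothetical, and files with nonstandard markers would need to be handled by hand.

```python
import re

# Marker lines that bracket the body text in most Project Gutenberg files.
START_RE = re.compile(
    r"^\*\*\*\s*START OF (?:THE|THIS) PROJECT GUTENBERG EBOOK.*$",
    re.IGNORECASE | re.MULTILINE,
)
END_RE = re.compile(
    r"^\*\*\*\s*END OF (?:THE|THIS) PROJECT GUTENBERG EBOOK.*$",
    re.IGNORECASE | re.MULTILINE,
)


def strip_gutenberg_boilerplate(text):
    """Return the text between the START and END markers, if they are present."""
    start = START_RE.search(text)
    end = END_RE.search(text)
    # If a marker is missing, fall back to the beginning or end of the file.
    begin = start.end() if start else 0
    finish = end.start() if end else len(text)
    if finish < begin:
        finish = len(text)
    return text[begin:finish].strip()


sample = (
    "License header\n"
    "*** START OF THE PROJECT GUTENBERG EBOOK EXAMPLE ***\n"
    "\n"
    "Chapter 1\n"
    "It was a day.\n"
    "\n"
    "*** END OF THE PROJECT GUTENBERG EBOOK EXAMPLE ***\n"
    "License footer\n"
)
print(strip_gutenberg_boilerplate(sample))
```

Note that this only removes what falls outside the two markers; transcriber's notes or publisher front matter inside the markers would remain.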

As important (or more important) than the collection itself is the metadata file, with information about the texts. I'll say more about the metadata file below. 

Licensing and Use: This work is licensed on a CC-BY-NC basis, meaning you are free to download the whole project and use it for your own research, though I would like to be credited. Also, I ask that all uses of this corpus and associated metadata remain non-commercial. 


1. "File Under"/Folders: 

On the Google Drive, I have been subdividing files into folders to make them more useful to conventional, historically-minded scholars.

Literary Fiction-High Modernism. Essentially what you would expect -- texts from 30-40 prominent modernist writers from the UK, Ireland, and the U.S., with a few less well-known figures like Hope Mirrlees. Writers like Arnold Bennett would not have called themselves "modernists," but they were definitely in the ballpark of literary fiction and in conversation with modernists. One reason to have this classifier might be to distinguish / compare writers like Ford Madox Ford, Ernest Hemingway, Virginia Woolf, Sherwood Anderson, and so on, against writers associated with Genre Fiction.  

Genre Fiction, including Science Fiction, Detective Fiction, Adventure, Romance, Horror. This period was of course the Golden Age of Detective Fiction, with Arthur Conan Doyle writing at the fin de siecle and writers like Agatha Christie and Dorothy Sayers emerging in the 1920s. Writers like Doyle and Wells both straddled the late 19th and early 20th centuries; ultimately, I will probably aim to put their pre-1900 works in an appropriate folder for people doing author-based work. You'll also see out-of-copyright materials by people like A.E.W. Mason, H. Rider Haggard, Georgette Heyer, etc.

Drama. As of the present moment, I haven't been actively seeking out dramatists to include in this folder; it mostly consists of plays written by authors who were primarily not playwrights (such as Yeats), though there is a pretty good collection of Somerset Maugham plays. 

African American Fiction. For more on this collection, see this earlier description of my African American materials.

African American Poetry. See the link above.

Colonial South Asian Texts. For more on this collection, see here.

Nonfiction and Essays (including Travel narratives, Memoirs, and Literary Criticism).


2. Metadata File.

I've been collecting the following information about the texts as I go. 

Author's name (Last, first)

Title of work

Year of First Publication

Year of Author's Birth. This is interesting and probably important. Joseph Conrad, for instance, is often considered a "Modernist," but he was born in 1857. Most writers associated with high modernism were born between 1870 and 1890. Virginia Woolf and James Joyce were born in the same year! Quite a lot of writers associated with the Victorian period -- Henry James, Rudyard Kipling -- were still actively writing and publishing well into the early 20th century.

Publisher (first publisher). Publisher information could be really interesting to explore; and again, the absence of this information has been a major limitation of Project Gutenberg's collection. Modernist studies scholars have long been interested in small presses like the Woolfs' Hogarth Press or Elizabeth Yeats' Cuala Press; it's revealing to see how and when writers worked with these presses, and when they published with big commercial houses like Macmillan. This information could be useful to scholars interested in the business/publishing side of early 20th-century literature. (It's interesting to see that many African American writers before the Harlem Renaissance used small and local publishers, as the major houses were typically closed to them.) 

Publisher location. I have also been keeping track of publisher location. This is not always 100% accurate, especially when books may have been simultaneously published in London and New York. I have been going from how the publisher is described on the book's title page. Besides New York and London, it's interesting to see the publishers in other locations, including Chicago, Indianapolis, Toronto, San Francisco, Dublin, and Calcutta.  African American literature before the Harlem Renaissance was mostly self-published on local presses around the country. It became more "New York" focused in the 1920s.  

Genre or Mode: Fiction, Nonfiction, Poetry, Short Fiction, Drama

Author's inferred gender: M, F, NB. As of now, I am understanding writers like Bryher and Radclyffe Hall to be nonbinary (NB). Others of course have complex relationships to gender expression (one thinks of Gertrude Stein, who has historically been identified as a lesbian, but whom some scholars have more recently read as transmasculine or genderqueer). This category may be revised or rethought over time. 

Author's nationality

File Under: Broad classifier, equivalent to the shelf in a bookstore or library: Historical fiction, Romance Fiction, Literary Fiction-High Modernism, etc. 

Location of Publisher: London, New York, somewhere else? 

Tags and Themes: Some tags I have been tracking: WWI, Travel ("Italy," "India" etc), LGBTQIA, Gender / feminism / suffrage, Disability, Environmental, African American, South Asian, Indigenous, Interracial, Passing

Provenance of Text: Gutenberg, HathiTrust, Archive.org, etc.

Again, the metadata file is very much a work in progress. Completing it may take weeks or even months, but I hope that when it's complete it will be useful to researchers. 
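For anyone who wants to work with a file like this in Python, here is a minimal sketch of how one row might be represented and loaded. It assumes the metadata is saved as a CSV; the column names, the semicolon-separated tags, and the two sample rows are all hypothetical, written for illustration rather than copied from the actual file.

```python
import csv
import io
from dataclasses import dataclass, field
from typing import List


@dataclass
class TextRecord:
    """One row of the metadata file. All field names are hypothetical."""
    author: str              # "Last, First"
    title: str
    year_published: int      # year of first publication
    author_birth_year: int
    publisher: str           # first publisher
    publisher_location: str
    genre: str               # Fiction, Nonfiction, Poetry, Short Fiction, Drama
    author_gender: str       # M, F, or NB
    nationality: str
    file_under: str          # broad "shelf" classifier
    tags: List[str] = field(default_factory=list)
    provenance: str = ""     # Gutenberg, HathiTrust, Archive.org, etc.


def load_records(handle):
    """Read records from an open CSV file; tags are assumed to be ';'-separated."""
    records = []
    for row in csv.DictReader(handle):
        records.append(TextRecord(
            author=row["author"],
            title=row["title"],
            year_published=int(row["year_published"]),
            author_birth_year=int(row["author_birth_year"]),
            publisher=row["publisher"],
            publisher_location=row["publisher_location"],
            genre=row["genre"],
            author_gender=row["author_gender"],
            nationality=row["nationality"],
            file_under=row["file_under"],
            tags=[t.strip() for t in row["tags"].split(";") if t.strip()],
            provenance=row["provenance"],
        ))
    return records


def age_at_publication(record):
    """Author's approximate age when the work first appeared."""
    return record.year_published - record.author_birth_year


# Illustrative sample rows only; not taken from the real metadata file.
SAMPLE_CSV = '''author,title,year_published,author_birth_year,publisher,publisher_location,genre,author_gender,nationality,file_under,tags,provenance
"Woolf, Virginia",Mrs Dalloway,1925,1882,Hogarth Press,London,Fiction,F,British,Literary Fiction-High Modernism,WWI; Gender / feminism / suffrage,Gutenberg
"Conrad, Joseph",Lord Jim,1900,1857,William Blackwood and Sons,Edinburgh,Fiction,M,Polish-British,Literary Fiction,Travel,Gutenberg
'''

records = load_records(io.StringIO(SAMPLE_CSV))
hogarth_titles = [r.title for r in records if r.publisher == "Hogarth Press"]
```

Keeping birth year and publication year as integers makes it easy to ask generational questions (how old was the author when the book appeared?), and splitting the tags into a list makes it simple to filter by theme or by publisher.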

Modernist Studies Association 2024: A few notes

I was recently at the Modernist Studies Association Conference in Chicago. I've been going to the conference on and off for many years (going back to the early 2000s?). Lately, I've been going there to present on materials relevant to my digital projects. If interested, slides from my presentation are here.

I'm not going to try to give a comprehensive account of what I saw and did at MSA, but below are a few highlights. Overall, the vibe was good -- despite the wild week in US politics, everyone seemed eager to talk about their research. Indeed, in a few cases (especially with some of the material related to queer and trans writers), the work seemed to take on a more intense relevance in light of the growing anti-trans tendency in public discourse. 


Saturday Keynote: Nella Larsen's Passing

It was fun to have the Saturday keynote be a screening of the 2021 Netflix adaptation of Passing, followed by a panel discussing it. The film was great (I hadn't seen it!), and the panel discussion that followed, with Rafael Walker, Pardis Dabashi, and Cyraina Johnson-Roullier, was lively and enlightening. My main takeaway from the panelists was that the film is a pretty faithful adaptation of the novel, but it's more optimistic about love and less pessimistic about the effect of racism on personal relationships than Larsen's book. 

 

Queer and Trans Writing

Panel attended: Transing modernism/queering modernism

Jaime Harker, University of Mississippi 

Chris Coffman, University of Alaska, Fairbanks 

Aaron Stone, University of Virginia 

Mat Fournier, Ithaca College 

Marquis Bey, Northwestern University

This was a standout panel, with papers on Bryher, Virginia Woolf's Orlando, Djuna Barnes' Nightwood, and more.

The idea of thinking about Bryher as a trans figure seems especially worthwhile. Also, the paper on Orlando mentioned some recent adaptations of the novel, including a film called Orlando: My Political Biography as well as a 2019 opera adaptation by Olga Neuwirth.

There were also a couple of papers that were theoretical / auto-theory interventions on the concept of dysphoria, and the concept of gender itself (memorable phrase: "from gender dysphoria to gender euphoria"). 

At another panel I attended, I saw another paper dealing with trans issues, by Michael Mayne of Denison University. He had rewritten parts of his paper at the last minute to reflect the results of the election. (In recent years, 664 anti-trans bills have been proposed by state legislatures. In the recent election, 41% of the ads for Trump were anti-trans ads.) 

The Well of Loneliness is increasingly being read as a trans novel (including by scholars like Jack Halberstam and Leslie Feinberg). Mayne's emphasis was on the idea of transness as abjection in Hall's novel. He also mentioned Julia Serano's idea of "Effemimania" (a term I hadn't heard before), and Susan Stryker's idea of the prospect of trans writers reclaiming the "monster." 

*
At another panel, Pamela Caughie gave a presentation on "Bloomsbury's Gender Politics," where she alluded to the painter Dora Carrington, who was not quite trans, though she did engage in some transgressive gender play, and who was certainly queer and polyamorous (key line: "How I hate being a girl! Tied with female encumbrances and hanging flesh"). 

Caughie also mentioned several other writers who were new to me, including Rosamond Lehmann (Dusty Answer, 1927) and Denton Welch (Maiden Voyage, 1943). 


Early Postcolonials

For many years, the MSA has been a welcoming place for people doing work on what we might think of as "early postcolonial" literature (1950-1980, roughly). This is the era of people like Naipaul and Lamming, Khushwant Singh, Mulk Raj Anand, C.L.R. James, etc. 

*
On the panel where I presented, Ben Fried gave a paper on the relationship between V.S. Naipaul and his publisher, Andre Deutsch. Deutsch was a Jewish immigrant who fled from German-occupied Europe. Deutsch and Diana Athill worked together to form a new publishing house (Allan Wingate), which published Naipaul and many other postcolonial writers. Throughout his early career, Naipaul struggled with the tension of being a highly culturally grounded writer at a time when publishers were looking for "universal" appeal. 

*
On another panel, I saw a paper by Rochona Majumdar of the University of Chicago. She was interested in the dialogue between Mrinal Sen's early 1970s "Calcutta Trilogy" and the radical Latin American filmmaking of the "Third Cinema" movement -- specifically, between Fernando Solanas' revolutionary classic, La Hora de los Hornos (The Hour of the Furnaces), and Sen's Padatik (1973). There's an interesting moment of borrowing or appropriation in Sen's film, where he uses the exact footage of police beating protestors that also appears in Solanas' film. 

*
I also attended this panel:

R28. Mediating Empire: Comparative Colonialisms, Comparative Media Studies


Chair: Jessica Berman, University of Maryland, Baltimore County 

Daniel Morse, University of Nevada, Reno 

Stephen Pasqualina, University of Detroit-Mercy 

Abhipsa Chakraborty, SUNY Buffalo 

Nasia Anam, University of Nevada, Reno


This was another standout panel, with papers on radio adaptations of Raja Rao's Kanthapura, C.L.R. James's broadcasts on the BBC, and more. Recent scholarship on the BBC's radio broadcasts has really expanded our understanding of how postcolonial literature emerged as a new formation during and after World War II.


*
On Sunday morning, I was in a Digital Humanities Seminar, on "Modernism in/and as Data." It was a fun and productive discussion.