Margaret Sanger Papers Project

~ Research Annex

Topic Modeling and the Sanger Papers

02 Tuesday Apr 2013

Posted by Cathy Moran Hajo in Digital History, MSPP, Sanger

Tags

digital history, elizabeth drinker, margaret sanger, martha ballard, text mining, topic modeling

I recently attended the Women’s History in the Digital World conference, sponsored by Bryn Mawr College’s Albert M. Greenfield Digital Center for the History of Women’s Education. The sessions were packed with great papers and projects, many of which started the wheels turning on different ways that we might use digital research tools to better understand Sanger and her ideas.

In the very first panel I attended, Bridget Baird of Connecticut College and Cameron Blevins of Stanford University talked about topic modeling, the process of using a computer program to mine digital texts and build sets of words that frequently appear together. Their work compared the diaries of Martha Ballard and Elizabeth Drinker. The women lived about a century apart and in very different conditions, so there was an expectation that their diaries would describe very different lives. The sample comparisons shown at the panel demonstrated both similarity in word usage and contrasts that reflected differences in social class, location, and time period.

A visualization of gardening terms by month in the Ballard diary.

What topic modeling can offer a historian is an objective snapshot of the content of the collection. Rather than relying on our own readings of documents to group them into subject categories, we look instead to the words that appear together most frequently and then label those groups of words in ways that make sense to us. In the case of Martha Ballard, one cluster of words (birth deld safe morn receivd calld left cleverly pm labour fine reward arivd infant expected recd shee born patient) clearly related to her profession as a midwife. Others, regarding gardening (see image above), fall into predictable seasonal patterns. Still other groupings of words are less easy to label, and some may not at first make any cohesive sense. Yet we can study the frequencies with which certain groups of words occur.
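For readers who want a feel for what "words that appear together" means in practice, here is a minimal sketch in Python. It is not the statistical model that topic modeling software actually fits; it only counts how many entries each pair of words shares. The sample entries and the stopword list are invented for illustration and are not quoted from any diary.

```python
from collections import Counter
from itertools import combinations

# Hypothetical entries, invented for illustration (not quoted from any diary).
entries = [
    "calld to a patient in labour infant born safe",
    "planted beans and set cabbage in the garden",
    "patient deld of an infant at morn left her safe",
    "sowed peas and beans in the garden",
]

# A tiny, assumed stopword list; a real project would use a fuller one.
STOPWORDS = {"a", "an", "and", "at", "her", "in", "of", "the", "to"}


def cooccurrence_counts(texts):
    """Count how many entries each unordered word pair appears in together."""
    pairs = Counter()
    for text in texts:
        # Use a set so a word repeated within one entry is counted once.
        words = sorted(set(text.split()) - STOPWORDS)
        pairs.update(combinations(words, 2))
    return pairs


pairs = cooccurrence_counts(entries)

# Print the most frequently shared pairs.
for (first, second), count in pairs.most_common(5):
    print(first, second, count)
```

Even on four made-up entries, the midwifery words pair up with each other and the gardening words pair up with each other, which is the intuition behind the clusters described above.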

We cannot rely only on the computer-generated groups when analyzing texts. The next step is to look at the texts that contain repeating word patterns and conduct a close reading to see what we can learn about the topic. Plotting a topic over time enables us to locate trends in how important the topic was to the author. When we compare that author with others, we can investigate differences in how much each valued these topics or in the ways they expressed themselves.
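One way to pick texts for that close reading is to score each one by the share of its words that belong to a topic and read the highest-scoring ones first. The sketch below assumes a simple word-share score, which is cruder than the per-document weights a topic modeling program reports. The topic words are a subset of the midwifery cluster listed above; the entry identifiers and entry texts are hypothetical.

```python
# A subset of the midwifery cluster quoted earlier in this post.
MIDWIFERY_WORDS = {"birth", "deld", "safe", "labour", "infant", "born", "patient"}

# Hypothetical entries keyed by a made-up identifier.
documents = {
    "entry-1": "calld to a patient in labour",
    "entry-2": "planted beans in the garden",
    "entry-3": "infant born safe",
}


def topic_share(text, topic_words):
    """Return the fraction of words in `text` that are in `topic_words`."""
    words = text.lower().split()
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in topic_words)
    return hits / len(words)


def rank_for_close_reading(docs, topic_words, top_n=3):
    """Return up to `top_n` (doc_id, share) pairs, highest share first."""
    scored = [(topic_share(text, topic_words), doc_id) for doc_id, text in docs.items()]
    # Sort by descending share, breaking ties by identifier.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    # Drop texts that contain no topic words at all.
    return [(doc_id, share) for share, doc_id in scored[:top_n] if share > 0]


ranked = rank_for_close_reading(documents, MIDWIFERY_WORDS)
for doc_id, share in ranked:
    print(doc_id, round(share, 2))
```

The ranking only says where to look; deciding what the topic means still depends on reading the entries themselves.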

An example from the Ballard study is instructive, as Cameron Blevins discussed on his blog:

. . . topic modeling allows us a glimpse not only into Martha’s tangible world (such as weather or housework topics), but also into her abstract world. One topic in particular leaped out at me:

feel husband unwel warm feeble felt god great fatagud fatagued thro life time year dear rose famely bu good

The most descriptive label I could assign this topic would be EMOTION – a tricky and elusive concept for humans to analyze, much less computers. Yet MALLET did a largely impressive job in identifying when Ballard was discussing her emotional state. How does this topic appear over the course of the diary?

Like the housework topic, there is a broad increase over time. In this chart, the sharp changes are quite revealing. In particular, we see Martha more than double her use of EMOTION words between 1803 and 1804. What exactly was going on in her life at this time? Quite a bit. Her husband was imprisoned for debt and her son was indicted by a grand jury for fraud, causing a cascade effect on Martha’s own life – all of which Ulrich describes as “the family tumults of 1804-1805.” (285) Little wonder that Ballard increasingly invoked “God” or felt “fatagued” during this period.

Adopting topic modeling tools for the Sanger Papers’ Speeches and Articles project will be interesting because we have already spent a lot of time developing and affixing detailed subject terms to the texts in order to provide additional ways to search and display them. When you have over 600 speeches and articles, the vast majority of which discuss birth control, the trick is uncovering subtle differences among them. We create detailed index entries for each text in the edition, narrowing the focus so that our readers can use the subjects to cut through the documents and find the best ones on a specific issue. Topic modeling can offer us new groupings of documents that we might have overlooked, and it will give us the capacity to analyze Sanger’s rhetoric over time, looking for key changes.

An example might be the belief among women’s historians that Sanger abandoned her feminist rationales for birth control in the late 1910s and early 1920s as she sought support from experts in the fields of medicine, social work and eugenics. This comes from a qualitative reading of Sanger’s writings, not a strict quantitative one. If we can identify a cluster of words as “feminist,” we can then trace how frequently those words appeared in Sanger’s writings and whether the findings match our assumptions.
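Tracing such a cluster could look something like the following sketch, which computes the share of words per year that fall inside a chosen word list. Everything here is a placeholder: the word list is a hypothetical stand-in for whatever cluster a topic model might produce, and the dated texts are invented sentences, not Sanger's words.

```python
from collections import defaultdict

# Hypothetical cluster; a real one would come from the topic model's output.
CLUSTER_WORDS = {"women", "freedom", "rights"}

# Invented (year, text) pairs for illustration only, not quotations.
dated_texts = [
    (1914, "women demand freedom and rights"),
    (1914, "freedom for women"),
    (1925, "doctors and clinics serve women"),
]


def yearly_cluster_share(texts, cluster_words):
    """Return {year: fraction of that year's words found in cluster_words}."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for year, text in texts:
        for word in text.lower().split():
            totals[year] += 1
            if word in cluster_words:
                hits[year] += 1
    # Only years with at least one word appear in totals, so no division by zero.
    return {year: hits[year] / totals[year] for year in sorted(totals)}


shares = yearly_cluster_share(dated_texts, CLUSTER_WORDS)
for year, share in shares.items():
    print(year, share)
```

A falling share across the years would be consistent with the historians' reading and a flat or rising one would challenge it, though with real texts the choice of words in the cluster would need as much scrutiny as the numbers.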

Will we find clusters of words we can describe with terms like “feminism,” “eugenics,” or “reproductive health”? What words will we find clumped with “abortion” or with “birth control”? Will we be able to trace these clusters over time to see how they change over the course of Sanger’s life? Interesting questions, and ones that we hope to be able to ask our digital edition.

Now just to find a programmer to work with!

___________

For more information on the work being done on Martha Ballard’s diary and topic modeling, see Cameron Blevins’ blog post.
