Author Archives: Georgette Keane

Getting Started with TEI

For DH week, I attended the ‘Getting Started with TEI’ workshop hosted by Filipa Calado at the Grad Center. TEI, or the Text Encoding Initiative, is a markup schema for representing the structural, renditional, and conceptual features of texts. For anyone familiar with HTML, the two look similar. However, HTML encodes how a text should appear on a page, while TEI encodes the context of a text. Filipa gave a brief introduction to TEI and the guidelines for using it, and then we practiced encoding with pages from the Picture of Dorian Gray manuscript.

Since TEI has its roots in XML, many of the same rules apply when encoding in TEI, such as proper nesting: you must always close the most recently opened tag before closing any tag you opened earlier. Ex: <sentence> <emphasis> </emphasis> </sentence>. Structurally, a TEI document typically begins with an XML declaration (<?xml version="1.0" encoding="UTF-8"?>), which tells a parser that the file is XML. This is not the same thing as a DTD (Document Type Definition), which defines the elements a document is allowed to use. Everything else sits inside the root <TEI> element, and the TEI documents we worked with consist of two parts, the Head and the Body. The Head, <teiHeader>, describes the source text’s metadata and includes the following elements: <fileDesc>, <titleStmt>, <title>, <publicationStmt>, and <sourceDesc>. The Body is the main section of the document and is what you will see when the TEI document is transformed. In our exercise, the Body section begins with <sourceDoc> and features such elements as <add>, <del>, and <line>.
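To make the nesting rule and the Head/Body split concrete, here is a minimal sketch in Python. It is an illustration only, not a file from the workshop. It parses a tiny TEI skeleton with the standard library's xml.etree.ElementTree and shows that a badly nested pair of tags is rejected. The sample title and text are placeholders, the helper names is_well_formed and top_level_parts are made up for this example, and the check only covers XML well-formedness, not validity against the TEI schema.

```python
import xml.etree.ElementTree as ET

# Placeholder skeleton: metadata in <teiHeader>, transcription in <sourceDoc>.
# The title and text are invented for illustration.
SKELETON = """<?xml version="1.0" encoding="UTF-8"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample manuscript page</title></titleStmt>
      <publicationStmt><p>Unpublished practice file</p></publicationStmt>
      <sourceDesc><p>Transcribed from a manuscript image</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <sourceDoc>
    <surface>
      <zone>
        <line>first <del>draft</del> <add>revised</add> wording</line>
      </zone>
    </surface>
  </sourceDoc>
</TEI>
"""

# <emphasis> is opened last, so it must be closed first. Here it is not.
BAD_NESTING = "<sentence><emphasis></sentence></emphasis>"


def is_well_formed(xml_text: str) -> bool:
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text.encode("utf-8"))
        return True
    except ET.ParseError:
        return False


def top_level_parts(xml_text: str) -> list:
    """Return the names of the root element's children, without the namespace."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    return [child.tag.split("}")[-1] for child in root]


if __name__ == "__main__":
    print(is_well_formed(SKELETON))     # the skeleton parses
    print(is_well_formed(BAD_NESTING))  # the badly nested example does not
    print(top_level_parts(SKELETON))    # the two parts under <TEI>
```

A text editor with XML support will flag the same nesting mistakes as you type, so a script like this is only useful if you want to check a folder of files at once.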

Although it does sound like a lot of work, the end product is definitely worth it. Filipa shared pages from Mary Shelley’s Frankenstein manuscript from the Shelley-Godwin Archive. Using TEI, the project team was able to decipher Mary Shelley’s draft, while including Percy Shelley’s revisions in red. 

Before starting an encoding project, Filipa advises thinking about your goals for the project and considering not just the document, but your audience, your research goals, and how you wish to represent textual data. She offered some guiding questions from the Women Writers Project (WWP) for preliminary document and project analysis:

  1. As far as you can tell, how is the document structured?
  2. Are there any kinds of regularization or editorial amendment you will perform as you transcribe the text?
  3. How much information about the appearance of the document do you need to capture?

Keeping these questions in mind, we then evaluated pages from the Picture of Dorian Gray manuscript and tried to encode pages 20 and 21 in a text editor. We uploaded our files to a Google Drive and compared our results. I really enjoyed this process, but found myself getting stuck on the crossed-out sections, trying to decipher the writing underneath. Filipa recommended that I use the <gap reason="illegible"/> element when encoding these. Filipa then showed the section she has been working on so we could check our work. There were a number of revisions on the page, and Filipa was able to get most of them. Some writing is still illegible and may need to undergo a process similar to the one used on palimpsests to decipher the text.
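As a rough sketch of how the gap element can sit inside a crossed-out passage, the Python snippet below parses one encoded line and counts the gaps marked as illegible. The line of text is invented for illustration and is not a transcription of the manuscript, and the helper names parses and illegible_gaps are hypothetical. It also shows why the space between the element name and the attribute matters: without it, the tag is not well-formed XML at all.

```python
import xml.etree.ElementTree as ET

# Invented sample line: a deleted word that cannot be read, then an addition.
LINE = '<line>He was <del><gap reason="illegible"/></del> <add>quite</add> silent</line>'

# The same idea typed without the space and without closing the empty element.
BROKEN_LINE = "<line>He was <del><gapreason='illegible'></del> silent</line>"


def parses(fragment: str) -> bool:
    """Return True if the fragment is well-formed XML."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


def illegible_gaps(fragment: str) -> list:
    """Return every <gap> element whose reason attribute is 'illegible'."""
    root = ET.fromstring(fragment)
    return [gap for gap in root.iter("gap") if gap.get("reason") == "illegible"]


if __name__ == "__main__":
    print(parses(LINE), len(illegible_gaps(LINE)))
    print(parses(BROKEN_LINE))
```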

If anyone is interested in TEI, Filipa shared a number of links, like the DARC wiki on Text Editing, and The Letters of Vincent van Gogh project.

 

Georgette’s Skillset

Research: After graduating with my MLS with a concentration in Archives and Cultural Materials, I became the Library and Archives Curator at a historical society in NYC. As the only librarian on staff, I oversee all operations of the library and archive. In addition to basic cataloging and processing work, I contribute to outside research projects by combing through our collections for relevant materials. Requests usually come from researchers writing a book, filming a documentary, or creating a digital project. This usually leads to digitizing the selected materials and providing metadata. Two of the most recent projects I contributed to were Maynooth University’s Letters of 1916 Digital Project and a documentary called ‘De Valera in America.’

Outreach: A very important part of my job is outreach. I collaborate with the Board and Executive Director on fundraising, programming, and exhibitions based around the Society and its collections. I also create and update content on our website (WordPress) and social media platforms and would be happy to work on outreach for this project.

Project Management: As the only librarian on staff, I am used to working on projects solo or with interns and volunteers. This has led me to become extremely organized and understand the importance of time management and creating project goals that will allow for completion (or a change of staff) after a semester.

Design: I feel that this is my weakest point. I do have prior experience with WordPress, Omeka, and other archival software, and minimal knowledge of CSS and HTML. Since starting my DH coursework, I have some experience with ArcGIS StoryMaps and Tableau Public. I am eager to learn more during this semester.

Exploring Diversity in the Newbery Medal and Honor Books

Overview
Literacy is essential to a child’s development. Through reading, children expand not just their vocabulary but their understanding of the world around them. But can children really learn from books if a majority of groups and topics are misrepresented or ignored? Recent studies have shown that there is a lack of diversity in children’s books. And while there have been initiatives created to address this issue, the fact that children do not have access to all of these books is something to consider. But what about the books they do have access to?
This project will explore diversity in the most popular children’s books, the Newbery Medal and Honor Books. Data collected from the four hundred and fifteen Newbery Books will be used to answer the following questions: Do the Newbery Medal and Honor Books provide an accurate representation of diverse backgrounds and subject matter? If so, has this been a recent development? And are there any trends of note in the honorees? The project team will attempt to answer these questions by collecting the biographical data and subject matter of all four hundred and fifteen ‘Newbery Honorees’ (both Medal Winners and Honor Books), using Tableau Public to create a digital visualization of their findings, and sharing it with the project’s intended audience of librarians, educators, and the DH community.

Enhancing the Humanities
In a study by the Cooperative Children’s Book Center (CCBC) of three thousand books published in 2018, fifty percent of books featured a white main character. Twenty-seven percent of books featured an animal, and African American, Asian Pacific American, Latinx, and American Indian/First Nations characters were featured in a combined twenty-three percent. Sarah Park Dahlen and David Huyck, who presented these findings in an infographic to School Library Journal, argue that children’s literature continues to misrepresent underrepresented communities. But their hope is that their findings push conversations about this issue and lead to a change in publishing. And while there have been initiatives created by the American Library Association and children’s book publishers to address this issue, the fact that children do not have access to all of these books is something to consider.
School and public libraries offer children (and their caregivers) access to a vast number of books that they would never be able to purchase for themselves. And more people are going to public libraries each year. According to the 2016 Public Libraries Survey Report by the Institute of Museum and Library Services, more than 171 million registered users visited public libraries over 1.35 billion times in 2016. Even with this increase in patrons, librarians often deal with limited budgets and shelf space, so books must be carefully chosen. Librarians will often rely on book lists and reviews for guidance on purchasing, and the books usually topping these lists are the Newbery Medal and Honor books.
First awarded in 1922 to encourage original creative work in the field of books for children, the Newbery Medal is awarded to the author of the most distinguished contribution to American literature for children. The Newbery Medal is the most popular award presented to children’s books, and studies have shown that after the winners are announced, book sales can increase up to 1,000%. Honorees are highlighted on ALA websites and accompanying book lists, and librarians will often feature honorees in their display areas and programming. Children (and their caregivers) become exposed to these works that may or may not help them to understand and handle situations that deal with diversity in religion, race, gender, etc. And these books, for better or worse, usually stay on library shelves much longer than other books due to their status as honorees.
Since the Newbery Medal and Honor Books are so popular amongst the public and librarians, this project hopes to answer the following questions: Do these books provide an accurate representation of diverse backgrounds and subject matter? If so, has this been a recent development? And are there any trends of note in the honorees?

Environmental Scan/DH Context
Finding similar projects has been difficult, as projects tend to focus on analyzing diversity in the most recent children’s books published, or creating a book list that focuses on a particular group or theme. There are journals that investigate diversity, such as Research on Diversity in Youth Literature (RDYL). RDYL is a peer-reviewed online journal hosted by St. Catherine University’s Master of Library and Information Science Program and University Library. The publishing community has also recognized the general lack of diversity and has started new initiatives to tackle the issue. Scholastic created the catalog The Power of Story, which offers recommendations for books representing diversity of race, sexual orientation, gender identity, and physical and mental abilities. Among digital projects focusing on diversity in children’s books, a good example is the Diverse Book Finder. This site collects information on picture books that feature black and indigenous people and people of color (BIPOC) from 2002 to the present. The themes given on the site are Genre; Categories; Settings; Tribal Affiliation/Homelands; Immigration; Gender; and Race/Culture. A limitation of the site is that it only tracks fiction and narrative nonfiction picture books from 2002 onward, and only books with suggested reading levels of kindergarten through grade three. This visualization project will be unique in that it analyzes all four hundred and fifteen Newbery Honorees, and in that it will be an interactive visualization where users can search for specific information on authors, themes, and main characters.

Work Plan/Final Product
The project will consist of three stages: gathering the data, organizing the data into the pre-approved format, and then analyzing the data using the visualization software Tableau Public. The team will organize the data into the following eight categories: Year; Winner/Honor; Title; Author; Author’s Gender; Author’s Race; Main Character(s); and Themes. The first four categories are available on the ALA’s Newbery site. The team will have to find each author’s gender and race on the author’s website, the publisher’s site, or through an internet search (author interviews, etc.). The books’ main characters and themes will be found in the Library of Congress’ and New York Public Library’s bibliographic records.
Gathering the data will be the most time-consuming part of the project, so the project team will use existing software to display their results. Once the team has organized the data, they will use Tableau Public to create a data visualization of their findings. Tableau Public is a free service that allows users to create and publish data visualizations. Tableau Public users do not need programming experience, and there are many tutorials and a dedicated community available to assist the project team. Published visualizations are available to the public, and can easily be shared through email, social media, and on websites. Once the visualization is completed, the project team will analyze the findings and write a paper on their process and the results.
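One possible way to keep the eight categories consistent while gathering data is sketched below in Python. This is an assumption about how the spreadsheet could be laid out, not the team's approved format: the Honoree class, the to_csv helper, the "; " separator for multiple characters or themes, and the sample row are all hypothetical placeholders. The output is a plain CSV, on the assumption that the team loads a text file into Tableau Public.

```python
import csv
import io
from dataclasses import dataclass
from typing import List

# Column headers taken from the eight categories in the work plan.
COLUMNS = [
    "Year", "Winner/Honor", "Title", "Author",
    "Author's Gender", "Author's Race", "Main Character(s)", "Themes",
]


@dataclass
class Honoree:
    year: int
    award: str                  # "Winner" or "Honor"
    title: str
    author: str
    author_gender: str
    author_race: str
    main_characters: List[str]  # a book can have more than one
    themes: List[str]

    def to_row(self) -> dict:
        """Flatten the record into one spreadsheet row."""
        values = [
            self.year, self.award, self.title, self.author,
            self.author_gender, self.author_race,
            "; ".join(self.main_characters), "; ".join(self.themes),
        ]
        return dict(zip(COLUMNS, values))


def to_csv(records: List[Honoree]) -> str:
    """Write the records as CSV text with a header row."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for record in records:
        writer.writerow(record.to_row())
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder row, not real data about any Newbery book.
    sample = Honoree(2000, "Honor", "Example Title", "Example Author",
                     "unknown", "unknown",
                     ["Character A", "Character B"], ["theme one", "theme two"])
    print(to_csv([sample]))
```

Fixing the column names in one place means every team member's rows line up, whichever source each value came from.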