Newbery Group Update

In our group meeting, we reported our progress on our individual tasks, as well as the work that needed to be done due to our expansion of the project.

  • The team is very happy with the overall design of the website. Emily reviewed the suggested sites and recommended we create additional pages, including pages for our visualizations and infographics.
  • Meg wrote our first blog post and will post it on the website by the end of this week. We discussed possible content for future posts and how often we should publish. Besides recommending books, we will write project updates and in-depth features. For example, one post could examine how African Americans were portrayed in the Caldecott books, with selected illustrations. Meg will handle blogging, although any team member can contribute.
  • We discussed the importance of a strong social media presence for our project. So many award authors and organizations are using social media right now, which makes this the perfect time for us to connect with them and raise awareness of our project. Meg will handle our social media accounts.
  • Kelly provided an overview of the Caldecott data that she scraped and cleaned. The data suggests that there are earlier instances of diversity (in contributors and subject matter) in the Caldecott books as compared to the Newbery books. Also, several illustrators were honored multiple times so collecting identity information will be quicker than with the Newbery authors. We will follow the guidelines we set for the Newbery data when finding identity information for the Caldecott.
  • I discussed how my search for historical data on children’s publishing is faring. Besides the statistics from the CCBC, I found articles from 1965 and 1985 that include data about African Americans in children’s books. The data from the 1965 article seems to be flawed, because the questionnaire asked for books including African Americans instead of books about African Americans. For example, a biography of Abraham Lincoln was counted as a book including African Americans since Lincoln issued the Emancipation Proclamation. The CCBC data, however, covers books about African Americans. I will continue my search for data, but the 1965 article could be useful for a blog post.

This week the team will focus on the following:

Emily and I will collect Caldecott author and illustrator identity information. Kelly will go back to working on the Newbery visualizations, which we will eventually send out for feedback. Meg will create the slides for the class presentation next week and post content on our social media accounts.

Heritage Reconstructed research update


I want to reflect on what we have done in the research area since the beginning of the semester. I haven’t seen it on the website yet because our group meeting is tomorrow, but I know that Ashley and Chris uploaded the first two VRs to our website, which is exciting. One of the discussions we had at the beginning of the semester aimed to define the criteria we would use for searching for, and above all for selecting, the VRs to include in our database. We knew that availability alone is not a reason to include a VR, and that even if we wanted to include whatever is available, we needed a narrative that explains why availability justifies including a VR in our database.

Our aim was to create a Virtual Reconstruction database of archeological sites or objects. Since this was still too broad, we decided to focus on archeological sites in peril due to environmental damage, armed conflict or war, lack of investment in their preservation, earthquakes and other natural disasters, pollution, and poaching. Delimiting the scope of the sites we include is important because it gives our project a strong conceptual lens and a clear standpoint from which to engage with the VR projects already being developed by different individuals and institutions: scholars in universities, people in the game industry, artists, and start-ups, among others. The second discussion we had was whether we would focus on a particular country or region of the world, and whether we would focus on a particular historical period. We decided to leave these options open, and it turned out to be a good decision, as explained below. The third discussion we had was whether we would include natural sites in danger, e.g., parks, in addition to archeological sites or objects in peril. We decided to include only VRs of archeological sites, that is, sites that required human intervention and creation, for example, architecture, art, and/or religious buildings. Once we defined these criteria, the purpose of our VR digital project became much clearer for all of us, not only in a practical sense but also in giving us a better understanding of the project we wanted to create. We also needed a new name, which is: Heritage Reconstructed: Virtualizations of Sites in Peril.

We began by exploring UNESCO’s List of World Heritage in Danger. The list includes 53 sites in danger, both cultural (archeological) sites and natural sites. We focused only on archeological sites. We searched for VRs available online country by country, using different keywords. Having completed the search following UNESCO’s list, we identified two patterns. First, there are not many VRs for the sites UNESCO considers in danger. Second, the VRs that are available cover countries and archeological sites in the same geographical region, sites that are mostly in danger from, or were destroyed by, war and terrorist groups, e.g., Iraq, Libya, Syria, and Afghanistan.

We also included VR sites that appeared by chance in our search thanks to the work of a wise algorithm. We found VRs that are publicly available, but also VRs that were created by the game industry; by start-ups, such as ICONEM, which is dedicated to the digitization of endangered cultural heritage sites; by Rekrei, a crowdsourced project, which also creates 3D representations of sites in danger; and by CyArk, ICOMOS, and Google, which together created five VRs of sites taken from the UNESCO list, which are mostly in danger due to climate change. We still have to figure out whether we will be able to upload the VRs these organizations host on their sites to our digital project or whether we will have to include just the links.

As noted previously, Ashley and Chris uploaded two VRs to our website. Our next task is to review the VRs we have collected and decide which ones we will include, and to contact the organizations I mentioned above, which have many VRs of archeological sites in peril, to ask whether they will allow us to upload their VRs to our website.



Kelly’s Reflection: Week of 3/16 – 3/22

This week was all about bandwidth and RAM, both literal and metaphorical. I had trouble connecting to our class’s Tuesday Zoom session—certainly a matter of too many devices and too many programs demanding too much of a limited Internet connection in my getaway in Kentucky. I also had trouble working on the visualizations this week—certainly a matter of too many worries and too many plans to make demanding too much of a brain that would ordinarily be on spring break, my annual reboot.

So, eager not to hold my team back with my distractedness, I turned to low-level but time-consuming tasks: scraping and cleaning data. Our team is interested in comparing our current data set of Newbery authors and protagonists to data about other awards. So, I scraped the basics on the next most recognizable children’s book honor: the Caldecotts, which recognize excellence in picture books.

The scraping experience reminded me that most online Python tutorials work with best-case scenarios. The videos that taught me how to scrape earlier this semester drew from huge, well-established websites (the New York Times and monster.com) to demonstrate the power of the code. The few sites offering a full list of Caldecott winners were less established, and the HTML was erratic at best. The site from which I scraped the Caldecott honorees fortunately organized those winners in lists, so I could find all <li> elements, but within those lists, they only sometimes embedded the title in anchor tags, and they often included manual spacing and tabs for no apparent reason. About half of the time, the books were illustrated by someone other than the author, so splitting the results by the word “by,” as I was able to do for the Newberys, got tricky. So, my code just grabbed the titles and attempted to grab authors, resulting in a two-column .csv file in desperate need of real cleaning, a far cry from the comparatively tidy results of the Newbery scrape. But that kind of cleaning (split-screening the data and the original site and checking manually for errors) was exactly the kind of mindless labor I needed. Now, we’ve got the years, titles, authors, and illustrators, and already, without further research, we can see that the Caldecotts are much more diverse than the Newberys. But of course, further research is what we now need as we identify author and protagonist gender and race/ethnicity more precisely.
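A minimal sketch of the kind of scrape described above, using only Python's standard library. The sample markup, the role wording ("illustrated by", "text by"), and every function and class name here are assumptions for illustration; the post does not show the actual site or code. It collects the text of each <li> whether or not the title sits in an anchor tag, collapses stray spacing and tabs, and then splits on role markers rather than on a bare "by".

```python
import re
from html.parser import HTMLParser

# Invented sample markup, not taken from the site that was actually scraped.
SAMPLE_HTML = """
<ul>
  <li><a href="#">Make Way for Ducklings</a>   by Robert McCloskey</li>
  <li>Finders Keepers,\t illustrated by Nicolas
      Mordvinoff; text by William Lipkind</li>
</ul>
"""

# Role markers separating a title from the people credited (assumed wording).
ROLE = re.compile(r"[,;]?\s*\b(illustrated by|written by|text by|by)\s+", re.IGNORECASE)


class ListItemText(HTMLParser):
    """Collect the visible text of every <li>, with or without an inner <a>."""

    def __init__(self):
        super().__init__()
        self.items = []     # cleaned text of each finished <li>
        self._parts = None  # text fragments of the <li> currently open

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self._parts = []

    def handle_data(self, data):
        if self._parts is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == "li" and self._parts is not None:
            # Collapse manual spacing, tabs, and line breaks into single spaces.
            self.items.append(" ".join("".join(self._parts).split()))
            self._parts = None


def split_entry(text):
    """Return (title, author, illustrator) for one cleaned list item."""
    parts = ROLE.split(text)  # [title, role, name, role, name, ...]
    roles = {}
    for role, name in zip(parts[1::2], parts[2::2]):
        key = "illustrator" if role.lower().startswith("illustrated") else "author"
        roles[key] = name.strip(" ,;.")
    author = roles.get("author")
    # With no separate illustrator credit, assume the author also illustrated.
    return parts[0].strip(), author, roles.get("illustrator", author)


def scrape(html):
    parser = ListItemText()
    parser.feed(html)
    return [split_entry(item) for item in parser.items]


rows = scrape(SAMPLE_HTML)
for row in rows:
    print(row)
```

A title that itself contains the word "by" would still be split in the wrong place, so the manual check against the original page remains necessary.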

Another great boon this week: Meg reminded us that what we’re doing matters. She crafted an initial blog post for our website intended to remind our users that as children consume books in isolation (away from school and peers and the outside world), parents need to make sure that those pages reflect both the children themselves and others. If children only read from a small slice of the literature available, they will be isolated indeed. We hope our project can help parents make well-informed decisions this spring, when reading might be, in some ways, the only contact kids have with the outside world.

(On a side note, I’ve remembered this week what we learned last term in the Intro to DH class: that the data infrastructure in the US allows us access to our jobs and each other in this time of crisis in a way that few other countries’ infrastructure can. I’m wondering how we might use that access to support those without it. Thoughts?)

HR Update

Considering everything that is going on, Heritage Reconstructed has had a productive two weeks. Our site is live! Head on over to https://hreconstructed.github.io/ to check us out. In addition to our website, we are proud of the accounts we have for our project so far. We are on Gmail – heritagereconstructed@gmail.com, Omeka – https://hreconstructed.omeka.net/, GitHub – https://github.com/hreconstructed, and of course our most public medium, Twitter – @HReconstructed. I love our name, which took us hours of discussion and debate to come up with together, and the shortening to hreconstructed is totally logical. However, the shortened form was a side effect that we didn’t consider initially, and because of this, our media names are not 100% consistent. My initial thought when I noticed this was, “let’s just change our email address to hreconstructed@gmail,” but we have already started outreach, so that would possibly be counterproductive.

The website is a combination of Ashley’s and my efforts, and we, and the team, are proud of it. The site went through a couple of versions before it became the site that is live today, and through that process, Ashley and I learned and tightened up our HTML/CSS skills. We were also forced to home in on what we wanted the site to do for the project, and the site is now, for the most part, done.

The website is written in HTML and CSS with a link to our Omeka database. We initially included the CSS for each page within that page’s HTML, but this became daunting to change repeatedly across all the pages. So, we moved the CSS into its own file. A major reason for giving the CSS its own file was the lengthy code for our footer.

The pages of the website have a color and font scheme that is meant to replicate that of Omeka’s database, which is the main portion of our project, with the website acting as a landing page for details of our project. There are plugins for making pages in Omeka, but we are both past and current students of Patrick Smyth, so pushing the knowledge we began building in Software Design Lab is a sensible move for us. At times, it’s been difficult to get parts of the page to work when they would render easily on a pre-built website, but working out the code ourselves is not only rewarding, it also gives our page a one-of-a-kind look.

In addition to code learned through Software Design Lab, we utilized https://www.w3schools.com/ and https://webdevtrick.com for the footer. An issue we were having with the footer was that it was not responsive. The menu and body of our pages shifted when the browser window was narrowed, but the footer we were initially using did not. When the window is narrowed, the menu items stack on top of each other and the Twitter feed on the right side of our page drops below the text of the page. We looked for a footer that did the same and were impressed with the footer from https://webdevtrick.com/. Now the three panels of our footer become a list when the page is collapsed to a certain width.

For the rest of this week, we are working on a draft email that will be sent out for outreach and working on our Omeka database – which we are very excited about.

Newbery Group Update

Stage One (2/19-3/19)

  • Planning & Research: Research proper race/ethnic terms for protagonist. Research critical race theory and find articles on Newbery Awards and diversity in children’s literature.
  • Content Development: Complete the Google Spreadsheet of Author Breakdown. Complete the Google Spreadsheet for Protagonist Breakdown
  • Design: Sketch up outline of website: each page (Home, About, Methods, Data) using WordPress through Commons. Future Page Suggestions: Social Media, Suggested Reading, and Infographic.
  • Outreach & Publicity: Set up social media accounts for the project. RT news on diverse books (and other suggestions). Create an email address.

Referring to our Project Work Plan, we have met our milestones in Stage One. We researched critical race theory and overall diversity in children’s books, and found articles discussing the Newbery Awards. We completed the Author and Protagonist Breakdown of Newbery Medal and Honor Winners and are creating initial visualizations. We drafted a website and created social media accounts and an email address for outreach. (Instagram & Twitter: whowinswithbookawards; gmail: newbookaward@gmail.com.)

Since we are ahead of schedule, the team decided to expand the project to include Caldecott Medal and Honor Books in our analysis. We will finish collecting the data by next week, and will have visualizations for both Awards by the end of Stage Two (March 20-April 19).

Our primary focus this week is to scrape the remaining Caldecott data and manually collect author and protagonist identity information. We also hope to get a set of visualizations together that we feel are ready for outside feedback. We recognize that current global circumstances may slow the feedback process, so having the Caldecott data to play with will give us good purpose as we wait.

We will also publish our first blog post and create social media content promoting diverse award books parents can read with their children (thanks, Bret, for the suggestion). Regarding the website, we are researching accessibility and will make any changes necessary before asking others for feedback. This week we will look at the website examples Bret sent to determine if there are any additional pages or layouts we want to include.

A Preamble to Tomorrow’s Class Session

Hi Everyone,

One of you made an important point in an email to me that “we need to be aware that the goals we set up in February changed.” In a similar vein, Matt Gold said on Twitter this evening that we “need to … recalibrate academic expectations,” and that we can’t “continu[e] on as if nothing has changed but delivery methods.”

Let’s devote the first twenty minutes of class tomorrow to discussing what impact the pandemic is having on us and how we should address it in our course and groups and projects. Naturally, our work plans are going to have to be updated to reflect this new reality, both in ways we can predict and in ways we can’t yet foresee. The health and well-being of our class community needs to be a priority, and now is a good time to remember that while the products we produce in this class are important, the process (and above all the learning process) is what matters most.

I will convene a video-conference meeting of the entire class at 6:30 pm. We’ll meet for 20 minutes or half an hour at most and then break into our smaller team meetings for the rest of the session. Tomorrow, before class, I’ll post a link to a Zoom meeting via the Commons group. If we have trouble with that platform, we can switch to Google Hangouts.

I’m going to ask Micki to join each of your team meetings for a bit so you can consult with her. I will also join in with each group, but only so long as I’m not intruding on the important work you’re doing.

Stay well,


Ashley’s Reflection 3/15/2020

Well… this week has only been crazy due to the onset of the Coronavirus Pandemic. However, we are powering through. Luckily, as Digital Humanists, we have a slight advantage when it comes to handling the distance-learning model.

With class being online this week, our group was able to speak on the phone to go over what we had done in the past week and set goals for the week ahead. We were also able to meet for our last in-person meeting on Thursday to launch the website. There were a few hiccups but we eventually figured it out. If you are interested, the link to our website is hreconstructed.github.io. It is still a work in progress but it is a great step.

Chris and I are continuing to work on the website as well as starting to use the Omeka database. We will try to have one or two test cases before our next class so that we can see if there is more information that we want or need to add for each site, or if we have too much.

Since the pandemic began, I have seen more and more cultural institutions creating virtual tours for visitors who are quarantined. Although this specific project is focused on sites in peril, the pandemic is one more reason to have digital reconstructions of archaeological sites publicly available. This, of course, is in addition to the fact that there are people who cannot travel even in normal situations, though now everyone is forced to understand this feeling as well.

Kelly’s Reflection week of 3/9 – 3/16

Ode to Tableau, Part II

Last week, I extolled the virtues of Tableau, but I forgot one of my favorites: the open nature of Tableau Public. Thanks to the learn-with-us ethos of the site, the brilliant vizzes shared in the galleries are available to download and deconstruct. So, this week, I downloaded the public workbook of the site our team found most inspiring and germane to our work. I got to see how the data visualizer added her own text around the visualizations, something I had never done. I also got to explore the structure of her tooltips, which will allow me to include (eventually) some of the data worth exploring beyond our first glance.

This user’s work reminded me how simple and powerful waffle charts can be in conveying part-of-whole information to users. So, I used hers as models for how to share our data on author and protagonist gender and race/ethnicity. Then, in keeping with the playful nature of children’s literature and our audience of teachers, parents, and librarians, I created some icons in Illustrator to heighten engagement. (I did an online tutorial in Illustrator last summer, and was surprised by what came back to me. Still, I’m very new.) And, I borrowed the color scheme from Emily’s great choice of graphics on the WordPress site she created.

Here’s what I have at this point:

Newbery Award Waffle Charts

The waffle charts weren’t easy for me, even though I had created one once before. I had to watch the first few minutes of a tutorial to refresh my memory. Even then, I had typos, which made for some holey waffles. I also miscalculated, thinking that a 10 x 10 waffle wouldn’t show values under 10% well, so I made a 20 x 20 grid, which was more work than I needed to do (and which I will most likely undo this week).
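The grid arithmetic can be checked with a short sketch. This is plain Python, not Tableau, and the function names are hypothetical. In a 10 x 10 waffle each cell stands for 1%, so a share well under 10% still fills whole cells. A 20 x 20 grid only refines the step to 0.25% per cell.

```python
def waffle_cells(share, rows=10, cols=10):
    """Number of cells to fill in a rows x cols waffle for a share between 0 and 1."""
    return round(share * rows * cols)


def waffle_grid(share, rows=10, cols=10):
    """Grid of booleans, filled row by row, True where a cell is shaded."""
    filled = waffle_cells(share, rows, cols)
    return [[(r * cols + c) < filled for c in range(cols)] for r in range(rows)]


# A 7% share: 7 of 100 cells on a 10 x 10 grid, 28 of 400 on a 20 x 20 grid.
print(waffle_cells(0.07), waffle_cells(0.07, rows=20, cols=20))
```

The larger grid only pays off when quarter-percent differences matter to the reader.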

Further, because I knew the data from the bar charts, I was able to detect some flaws in the code that creates the calculated fields behind waffle charts. For example, the nonwhite protagonist percent is way too high. Right now, the code I created that decides if a protagonist is nonwhite reads:

IF [Protagonist Race/Ethnicity] != "white"
THEN "1"
END

Our intent with that chart is to show the number of human protagonists that aren’t white, but the code includes nonhuman protagonists and books with no protagonist in the count, inflating the percentage. The inverse is true of the female protagonist percentage, which is currently too low, as the code counts only books with a single female protagonist, not those with multiple protagonists that include a female. So, I’ll be tweaking this week, fixing these errors, incorporating feedback from my team, and exploring additional visualizations.
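The two fixes described above can be sketched in Python rather than in Tableau's calculation language. The category labels ("n/a", "none", "male and female") and the function names are assumptions for illustration, not the team's actual values. Using human protagonists as the denominator is also an assumption, since the post does not say which denominator the chart uses.

```python
# Labels assumed to mark a nonhuman protagonist or a book with no protagonist.
NOT_HUMAN = {"n/a", "none"}


def nonwhite_share(races):
    """Share of human protagonists who are not white."""
    human = [r for r in races if r not in NOT_HUMAN]
    if not human:
        return 0.0
    return sum(r != "white" for r in human) / len(human)


def female_share(genders):
    """Share of books whose protagonist(s) include a female."""
    counted = [g for g in genders if g not in NOT_HUMAN]
    if not counted:
        return 0.0
    # "female" in "male and female" is True; "female" in "male" is False.
    return sum("female" in g for g in counted) / len(counted)


races = ["white", "white", "black", "n/a", "none", "asian"]
genders = ["male", "female", "male and female", "n/a"]

# The flawed rule counts every label that is not "white", including "n/a" and "none".
flawed = sum(r != "white" for r in races) / len(races)
print(flawed, nonwhite_share(races), female_share(genders))
```

In this invented sample the flawed rule reports four of six books where the corrected one reports two of four human protagonists, which is the inflation the post describes.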

Kelly’s Reflection: Week of 3/2 – 3/9

This week’s post is an ode to Tableau, or at least to data visualization. If you haven’t gotten to play with the process yet, let me share why I’ve come to love automated data viz.

The Magical
It doesn’t matter how well you know your data: the viz can surprise you. The data we scraped with Python (title, author, year), we didn’t know very well. But, the gender and race/ethnicity info, for both authors and protagonists, we had gathered painstakingly by hand—researching, categorizing, rethinking. We knew that data like we knew our own selves, and we had already drawn two basic conclusions: first, that the slight majority of Newbery authors were female, and second, that a greater majority of authors were white. But, once popped into the simplest of bar graphs in Tableau, the data stunned me. The awards are suffocatingly white—authors and protagonists alike. There’s no chance a kid of color could find herself in many of the books—in the pages or on the spine. Worse, there are more protagonists of color than authors of color, and we’ve already started to rely on Tableau’s tool tips to call attention to surprisingly recent white authors speaking to a nonwhite experience.

Newbery authors and protagonists by race/ethnicity

Screenshot of authors and protagonists by Census Bureau race/ethnicity. Note that yellow merely indicates Newbery Medal status, while gray indicates Honor status. White authors occupy the top row of the upper graph; white protagonists occupy the top row of the lower graph.

The early vizzes also pointed out that while there are more female authors, there are more male protagonists than female, so boys can more often see themselves at the center of stories even though women are more often the ones spinning them, something we hadn’t thought about.

The early visualizations also spoke, sadly, to the truth that it didn’t matter which of the racial/ethnic lenses we chose (though we went with the Census Bureau’s, so that we could make a point about its limitation). No lens would change the fact that white is the reality of the Newbery authors and characters.

The Mundane
Even if your first vizzes don’t yield these startling insights, automating visualizations in programs like Tableau can help identify errors in your data. And boy, did we have errors. Some were because we are human and make typos. Because we recorded our data in separate Google sheets, I made different decisions than Georgette and Emily when it came to issues that arose beyond the data’s architecture we had planned. While I thought I had caught and adjusted for them all as I merged their medal winners with my honors winners, Tableau said otherwise. Its agnostic eye registered white, wite, and White as three distinct races. We had other variances, such as how we registered our own doubt (?, ??, ???, and not sure), how we managed nonhuman protagonists’ genders and races/ethnicities (n/a, NA, and none), and how we categorized books with multiple or no protagonists (multiple, many, family, or male and female).
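One way to catch such variants before they reach Tableau is to normalize labels when the sheets are merged. The sketch below is a hypothetical illustration in Python: the canonical targets ("unsure", "n/a", "multiple") are assumptions, not the team's actual coding scheme. "male and female" is deliberately left unmapped, because whether to fold it into a general "multiple" category is a project decision rather than a cleanup step.

```python
from collections import Counter

# Known variants mapped to one canonical label (targets are assumed).
CANONICAL = {
    "wite": "white",
    "na": "n/a",
    "none": "n/a",
    "not sure": "unsure",
    "many": "multiple",
    "family": "multiple",
}


def normalize(label):
    """Lowercase, collapse whitespace, and map known variants to one label."""
    cleaned = " ".join(label.split()).lower()
    if cleaned and set(cleaned) == {"?"}:
        return "unsure"  # "?", "??", and "???" all record the same doubt
    return CANONICAL.get(cleaned, cleaned)


raw = ["white", "wite", "White", "??", "not sure", "NA", "none", "family"]
print(Counter(normalize(label) for label in raw))
```

Anything the mapping does not recognize passes through unchanged, so new typos still surface in the counts instead of being silently merged.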

Yes, each exposed mistake meant more work. But often, the solution required returning to the very purpose of our project. What do we want to communicate, for example, about books with multiple protagonists? Is it more important to highlight the identities of center-stagers or ensemble casts? Did slicing the data of books with more than one protagonist into types of multiplicity clarify or fracture our findings? These are all great questions that we can start to answer with our user in mind.

The Message
This week, while I address the last of the data errors, I’ll begin the next fun (and scary) part: moving from simple visualizations to those that can really grip our audience, inviting them to explore for themselves. I browsed Tableau’s galleries for inspiration and found two in particular that I love: a woman’s personal reading history and a breakdown of gender and political affiliation in the House of Representatives. The former appeals to me because it is both interactive and powerful as a set of static images. Our group hopes to share with our users a printable poster to hang on a library or faculty room wall or to take to their principals, so this model looked pretty good. The government visualization was intriguing for its display of the same data in different ways, each geared to raise a different set of questions.

As we prepare to provide content for the website Emily is building, we know that the decisions we’ve made with the data and the early vizzes in Tableau will help shape our message and our direction. Already, in our weekly meeting, we’re discussing what other data we might gather (perhaps about other awards) that can help our users make good decisions, whether that’s the American Library Association, which bestows the annual awards, or librarians, educators, or parents as they make purchasing decisions.



The key to our social media strategy was identifying our primary audience. We are building a database to hold VR and digital reconstructions of sites and structures in peril. We knew that our topic targets a narrow but very important issue, and that the primary audience is limited to people who already work in VR and digital reconstruction as well as people who are actively concerned about the environmental threat to archeological and natural structures. So the initial step of our outreach efforts was simply to find the online community where these groups of people engage with one another. The most widely used platform for this type of public academic engagement was Twitter. So we created an email address to register for a Twitter account. The email address will also serve inquiries and subscriptions from our website. This allowed us to discover a lot about the communities that will be the primary audience for our project and to think ahead about which institutional need our project fulfills in those spaces.

Social media strategy 

Our social media strategy is to engage with the 3D and virtual reconstruction community and the environmental conservation community of scholars already on Twitter by interacting with their work through likes, retweets, and comments on a regular weekly basis. We want to make sure we are on top of news, discussions, and breakthroughs where a lot of conversation is being generated. We also generate conversation through new posts and add brief comments on new articles that come into our inbox from the Google Alerts we set up. We regularly update our audience on our project. We also make strategic use of hashtags to bring people searching those tags to our page. We will also create a small marketing blurb about our project to post in relevant Facebook groups, which we will identify as the project nears completion.


Email strategy

Our email strategy is to compile a sizable list of email addresses of public digital scholars and people working in the fields our project touches. We will gather these email addresses from academic communities that we have access to as well as public LinkedIn and Twitter profiles. By the end of this week, we will have a draft email introducing our project that we will share with these contacts. We will have an automatic response email that links inquirers to our social media page, where we update our project’s milestones.

Communication and Website

People will be able to reach out to us directly on Twitter, where our email address is listed, and through our website, where there will be a contact page that allows the public to ask questions and submit other requests. Our website will also contain an about page that introduces our team members and gives specific details about the goal of our project. In addition, our Twitter feed will appear on the side of our site to allow people to follow us on Twitter after they visit our site. The public will also be able to subscribe to our site to receive email updates about the project.

We will also create a promotional flyer with relevant information about our project to attach to emails or give out in person.

Search Engine Optimization

A technical aspect of our outreach is to optimize pages of our website for Google’s ranking algorithm. I’ve previously worked with SEO when I did marketing for a jewelry company, where I helped place close to 7000 pieces of jewelry online. Although I used proprietary software to optimize those pages, I think it is worth trying to work with the algorithm in hopes of reaching a secondary audience. It certainly can’t hurt to try. Backlinks are widely regarded as one of the most effective ways of increasing Google ranking, other than keyword optimization. Part of our strategy must be finding scholars who may be willing to link to our website from their blogs and other online public accounts (Facebook/Twitter).