HTRC UX audit

In 2021 I was hired as HTRC’s first UX designer. The remote research center was founded in 2011, and its digital products were designed through highly internal processes that placed little emphasis on external users’ needs (very few user tests, interviews, or assessments were conducted during the first decade of HTRC’s existence). My first challenge at HTRC was to lead a UX audit of HTRC’s online platform, HTRC Analytics.

TL;DR 

At the beginning of the project, I made several predictions about user pain points across the various sites and platforms hosted by HTRC online services: confusion about what HTRC ‘worksets’ are, how to use them, and why they are necessary on HTRC Analytics. Because the HTRC Analytics site had originally been designed for advanced programmers in the digital humanities field and for internal staff members, the foundational concept of the ‘workset’ was potentially lost on newcomers to the domain of text analysis. After conducting multiple layers of research, my hypothesis proved true, but for surprising reasons:

  1. The HTRC Analytics homepage was overly broad and confusingly organized

  2. HTRC’s connection to the HathiTrust Digital Library was murky in users’ minds

  3. HTRC’s user-facing documentation was poorly organized and scattered

The goal

The overarching goal of this initial UX audit was to help HTRC’s culture shift away from a largely internal design focus to an external, user-centric design philosophy. As an organization, we wanted human experiences and user satisfaction to shape our online presence (i.e., our HTRC Analytics site and corresponding wiki help documentation). This audit was designed to establish a base-level understanding of where HTRC stood on user experience and to identify the areas we would want to improve in the future.

My role

As HTRC’s UX designer, I was tasked with leading all audit and assessment activities. I designed five research phases for this project: uniting stakeholders around the common goal listed above; gathering and analyzing any preexisting user-related data sources from the previous ten years; forming assumptions about specific problems experienced by users; performing external research in the form of user interviews, tests, and surveys; and creating a list of recommended changes based on findings, alongside a new user persona highlighting the salient needs of a prominent demographic of HTRC users: academic librarians.

The problem

HTRC’s online tools and resources were originally designed to make internal job duties easier for staff; user needs and perspectives had become secondary. This audit was conducted in the spirit of fixing this broad problem: making a truly user-focused online presence.

Here are some images of the initial “current state” of the HTRC Analytics site at the time this research was conducted:

Images: HTRC Analytics homepage, top (1) and bottom (2)

Image: HTRC Analytics signed-in worksets list page; worksets are user-created collections of volumes held in the HathiTrust Digital Library and transformed to be treated as data for text analysis.

Image: HTRC Analytics Algorithms page, where signed-in users can access each point-and-click text analysis tool developed by HTRC. 

Image: A list of a user’s data capsules and usage amounts

Image: The remote desktop users can access when inside a started data capsule environment. In this secure environment users can create their own scripts for advanced text analysis methods to use on HTRC worksets downloaded into the capsule.

What users should be able to do easily on the HTRC Analytics site

The most basic journey a user must be able to complete in order to use the tools provided by HTRC Analytics (with the exception of the Bookworm+HathiTrust tool, which requires neither a workset nor an HTRC Analytics account) is to:

  1. either create their own HTRC workset, OR choose a publicly available pre-existing workset created by another researcher or HTRC

  2. run the workset (if it contains fewer than 3,000 volumes) through one of the provided point-and-click algorithms

  3. create a data capsule and import the workset into it, then run their own text analysis scripts (e.g., using a tool like MALLET) to apply robust statistical natural language processing to the workset (this is an advanced step, as it requires coding knowledge on the user’s part; see the sketch after this list)

Please note that this process is highly generalized, and since every research project is different, the only element common across nearly any user journey would be to create or locate a workset on which text analysis could be performed.
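To make step 3 concrete, here is a minimal sketch of the kind of topic-modeling script a user might run inside a data capsule on a downloaded workset. The workset directory, file format, and parameters are hypothetical, and scikit-learn’s LDA stands in for a tool like MALLET; this is an illustration, not HTRC’s actual tooling.

```python
# A minimal sketch of the kind of script a capsule user might write.
# The workset/ directory, file format, and parameters are hypothetical,
# and scikit-learn's LDA stands in here for a tool like MALLET.
from pathlib import Path

from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Assume the workset was downloaded into the capsule as plain-text volumes.
volumes = [p.read_text(encoding="utf-8") for p in Path("workset").glob("*.txt")]

# Build a bag-of-words matrix, dropping very rare and very common terms.
vectorizer = CountVectorizer(stop_words="english", min_df=2, max_df=0.9)
counts = vectorizer.fit_transform(volumes)

# Fit a 20-topic LDA model and print the top ten terms per topic.
lda = LatentDirichletAllocation(n_components=20, random_state=0)
lda.fit(counts)
terms = vectorizer.get_feature_names_out()
for i, weights in enumerate(lda.components_):
    top = [terms[j] for j in weights.argsort()[-10:][::-1]]
    print(f"Topic {i}: {', '.join(top)}")
```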

My hypothesis

Going into this project, I hypothesized that understanding what a workset is and how to create one would be a challenge for users, causing confusion and frustration (e.g., what is a workset, and how is it different from a dataset?) and inhibiting them from beginning to use the site or exploring it with free curiosity (see the image of my initial hypothesis map below). I also believed that the connection between the HathiTrust Digital Library and HTRC would be unclear to users. A resulting high attrition rate among potential new users and scholars would negatively impact the research center itself and its reputation in the non-profit text analysis community.

Image: A hypothesis map user journey showing suspected areas of frustrations and pain points when users try to get the data needed to create a visualization.

The current state of the website: content audit and internal data

Investigating the data: What the users were already saying

To test my hypothesis, I needed to conduct a thorough analysis and assessment of the current HTRC Analytics presence, from a UX and usability perspective:

Content audit of HTRC Analytics

This process included using a web-crawling tool to pull all website pages into a spreadsheet, where each page was described, categorized, and assessed against usability concepts like findability, intuitive use, descriptive labeling, relevance, and currency. This assessment was codified, as indicated in the spreadsheet snippet below:

Image: Snippet of the code book created in Google Sheets for the HTRC Analytics content audit
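For illustration, a crawl like the one below could produce the page list behind such a spreadsheet. This is a minimal sketch: the actual audit used a dedicated web-crawling tool, and the starting URL and same-domain rule here are simplifying assumptions.

```python
# A minimal sketch of the kind of crawl that could feed a content audit
# spreadsheet. The actual audit used a dedicated web-crawling tool; the
# starting URL and the same-domain rule below are simplifying assumptions.
import csv
from urllib.parse import urljoin, urlparse

import requests
from bs4 import BeautifulSoup

START = "https://analytics.hathitrust.org/"  # assumed site root
domain = urlparse(START).netloc

seen, queue = set(), [START]
with open("content_audit.csv", "w", newline="") as f:
    writer = csv.writer(f)
    # Audit columns (category, usability notes, etc.) were filled in by hand.
    writer.writerow(["url", "page_title"])
    while queue:
        url = queue.pop()
        if url in seen:
            continue
        seen.add(url)
        soup = BeautifulSoup(requests.get(url, timeout=10).text, "html.parser")
        title = soup.title.get_text(strip=True) if soup.title else ""
        writer.writerow([url, title])
        # Follow only same-domain links so the crawl stays on the audited site.
        for a in soup.find_all("a", href=True):
            link = urljoin(url, a["href"]).split("#")[0]
            if urlparse(link).netloc == domain and link not in seen:
                queue.append(link)
```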

Content audit findings

The most salient takeaways from the content audit analysis were the following:

  1. The differentiation between “dataset” and “workset” terminology is not clearly defined

  2. Home page graphics don’t relate to the site in any way

  3. Information is generally scattered and unorganized throughout the site

  4. Important pages like “Jobs”, “Create a Capsule”, and “Validate a Workset” are buried and therefore difficult to find

  5. Labels and page titles often do not match

Investigating the internal data

In addition to my own current-state assessment of HTRC Analytics, I wanted to see what our users had already been saying about it. I dug into the following:

User help tickets

    • Analyzed 908 Jira help tickets collected between 2016 and 2021

    • Coded and themed the ticket content, creating a total of 44 themes (and subthemes)

    • Here’s a snippet of that code book:

Image: Portion of the user help tickets code book created for gathering and analyzing user submitted help tickets

When aggregated, the most frequent issues found in these help tickets were the following (a short tallying sketch follows this list):

  • Questions that were intended to be sent to the HathiTrust Digital Library (i.e., mistakenly submitted to HTRC) – e.g., users requesting library materials through HTRC

  • Data capsule problems – e.g., requesting help downloading files into the data capsule environment

  • Account unlock requests – e.g., users asking for their inactive accounts to be unlocked by staff

  • Data Capsule ‘Request Access’ queries – e.g., users requesting full access to the entire HathiTrust corpus, which is restricted until a research proposal is submitted

  • Issues with HTRC online tools – e.g., users requesting staff to help them import a collection from the HathiTrust Digital Library platform

  • Questions about HTRC worksets – e.g., users having trouble uploading a file to create a workset
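Tallying theme frequencies from the coded tickets is straightforward once the codes live in a spreadsheet. Below is a minimal sketch assuming a hypothetical CSV export with one row per ticket and a "theme" column; the real code book was maintained by hand.

```python
# A minimal sketch of tallying coded ticket themes. The CSV name and its
# "theme" column are hypothetical; the actual code book was kept in a
# spreadsheet and coded by hand.
import pandas as pd

tickets = pd.read_csv("jira_tickets_coded.csv")  # one row per coded ticket

# Count tickets per theme and show the most frequent issues.
theme_counts = tickets["theme"].value_counts()
print(theme_counts.head(6))
```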

HTRC workshop feedback

  • I analyzed user feedback from surveys collected from participants in HTRC’s Outreach and Education workshops (data from 2018-2020)

  • Of the ten themes identified from this data pool, two corresponded with other internal data findings:

    • Help creating worksets

    • Interest in managing textual data – e.g., data cleaning

“HathiTrust Research Center User Requirements Study White Paper” (submitted 2018)

  • I analyzed information from a white paper written and submitted by HTRC’s then-Associate Director of Outreach and Education

  • Paper included a synthesized list of user interviewees’ responses

  • Findings here included:

    • Requests for HTRC-provided tips for creating useful worksets

    • Requests for more frequently released dataset lists

    • Requests for improved data cleaning options

    • Requests for expanded data capsule memory

Google Analytics

  • Page-visit rankings were calculated and averaged across 2020 and 2021

Image: Portion of Google Analytics page hits collected at various points over two years.

The most regularly visited pages that corresponded with other internal data findings were the Worksets page, the Datasets page, the Data Capsules info page, and the List Jobs page (which shows a user’s results after running one of the point-and-click text analysis tools offered on the site).
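The averaging itself was simple spreadsheet work; the sketch below shows an equivalent calculation in pandas, assuming hypothetical per-year CSV exports with "page" and "pageviews" columns.

```python
# A minimal sketch of the averaging step. The per-year CSV exports and
# their "page"/"pageviews" columns are hypothetical stand-ins for the
# actual Google Analytics report exports.
import pandas as pd

ga_2020 = pd.read_csv("ga_pageviews_2020.csv")
ga_2021 = pd.read_csv("ga_pageviews_2021.csv")

# Join the two years on page URL and average the visit counts.
merged = ga_2020.merge(ga_2021, on="page", suffixes=("_2020", "_2021"))
merged["avg_pageviews"] = merged[["pageviews_2020", "pageviews_2021"]].mean(axis=1)

# Rank pages by average visits across the two years.
print(merged.sort_values("avg_pageviews", ascending=False).head(10))
```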

Putting it all together: the internal data findings

Based on the findings listed above, I made several calculated assumptions about pain points shared by many of our users:

  1. Users are struggling to understand worksets, and how to create them (which matched my original hypothesis)

  2. Users cannot easily form a conceptual model of what HTRC Analytics allows them to do

  3. Because the data capsule is an advanced, open-ended tool, users struggle to understand its affordances and whether it is an appropriate tool for them to use


Based on these newfound assumptions, I decided to conduct interviews and user tests with, and gather surveys from, beginner-to-intermediate users (i.e., users who have little to no experience with text analysis coding methods) so I could understand how our site could be more welcoming to new, curious scholars and students. As a foundational concept for all users, worksets represent a base level of understanding of how to use the site. I wanted to know: do these types of users understand what worksets are and how to use them?

External data: surveys, interviews, and tests

Survey responses

Of the three main takeaways from examining HTRC’s pre-existing internal data, I decided to focus on how users understand, create, and use HTRC worksets. My first step was to email a Google survey to the HTRC user listserv. Respondents were asked how easy it is to create a workset, which of the three creation methods they prefer to use, and how easy it is to navigate between HTRC and the HathiTrust Digital Library to create worksets. A drawback of the survey was the small number of respondents: only nine people. Despite this small sample size, the responses showed notable disagreement over the process of creating a workset:

Image: Bar chart showing user responses to the question, “Count of Navigating among different online platforms (e.g., the HathiTrust Digital Library, HTRC Analytics, Workset Builder 2.0, etc.) to create HTRC worksets is an easy and seamless experience” – ~30% find necessary navigation between different platforms confusing.

Image: Bar chart showing that 60% of survey respondents use the “Import from HathiTrust” method for workset creation.

Because the survey’s response rate was so small, this feedback alone cannot establish the overall validity of users’ experiences. But it does suggest that pain points exist for users who first create a collection in the digital library and then import it into HTRC Analytics for workset creation. I wanted to learn more about what users experience as they create worksets in this and other ways.

Interview responses

I next moved into a user interview phase, asking researchers, digital strategy/humanities librarians, and graduate students questions that fell into the mind-map categories shown below:

Image: Mind map of questions posed to users during the user interview sessions

The questions were designed to understand how users build their mental models of the workset creation process, their process itself, the problems they encounter, and the positive aspects of the various steps undertaken in the course of successfully creating an HTRC workset:

  • Mental model:

    • Can you describe what a workset is?

    • Can you describe the difference between a workset and a dataset?

  • Process:

    • Can you describe the last time you created a workset?

    • How long does it take you to create a workset?

  • Problems:

    • Can you describe any problems you have encountered with creating worksets in the past?

    • What would make it easier for you to handle problems on your own?

    • What are the most time-consuming aspects of creating a workset?

  • Positives:

    • How does the site help you create worksets?

    • Which method do you prefer to create your worksets with, and why?

Interview feedback and analysis

Seven interviews were conducted over Zoom, recorded with user permission, transcribed, thematically coded, and analyzed. Once all the interview data was gathered, a code book and corresponding affinity diagram were created in tandem to bring clarity to the interview responses.

Image: Top portion of the code book created from user interview responses

Image: Page 9 of the affinity diagram created with draw.io to help assess coded themes in interviews. User quotes and synthesized responses included as evidence for each code (each ‘page’ is a thematic code)

Once interview responses had been thoroughly coded, I led a cross-functional team exercise to define the most salient user-identified problems with the HTRC Analytics website generally and the workset creation process specifically. The purpose of this activity was to collectively elucidate how the captured codes related to one another and to the research questions, and to add a new layer of interpretation that would reveal the most salient themes embedded in the data.

User interview themes

The following shows the distilled user-reported themes discovered from the analysis process described above:

  • Requesting more reliable ways to see, “preview”, or understand the data before it is added to a workset

  • Requesting more ways or instances to find relevant documentation

  • Complaining of an overabundance of platforms required to utilize HTRC Analytics to its full potential (e.g., HTRC Analytics itself, but also the HathiTrust Digital Library, wiki help documentation, beta tools hosted at the University of Illinois, and multiple metadata access points)

User tests

After conducting interviews, I set up a series of user tests with five users representative of a motivated beginner-level group, to see how difficult it is for a beginner to create worksets. The group consisted of one PhD student and four digital humanities or software development librarians. I asked each user to create worksets in all three ways possible via HTRC Analytics. These test sessions were conducted over Zoom, and each participant was asked to share their screen and think out loud as they worked through each general task (e.g., create an HTRC workset from an existing HathiTrust Digital Library collection). Extensive observational notes were taken during each test and then analyzed for themes in the form of a code book and corresponding affinity diagram, similar to how the user interview data was analyzed. Below are images of the resulting artifacts from the two activities:

Image: Top portion of the code book created from the user test sessions

Image: Page 3 of affinity diagram created in draw.io to help discover broad themes relating to user pain points in the workset creation process.

User test themes

Once all the data had been collected, analyzed, and coded, several distilled themes emerged concerning user difficulties in creating worksets or using HTRC Analytics generally:

  1. The design of the HTRC Analytics homepage is confusing and/or unhelpful in orienting the user to the capabilities of the site (e.g., the homepage images are strange and irrelevant, yet take up a lot of space; the bulk of the homepage text would make more sense on a separate About page).

  2. It is difficult for users to understand the connection between the HathiTrust Research Center and the HathiTrust Digital Library.

  3. When creating a workset by importing a digital library collection, the necessity for collections to first be made public is unclear to users (i.e., a user cannot make a workset from their private collection).

Putting it all together – major findings and conclusions

Through the surveys, interviews, and user tests, we learned some important specifics regarding user frustrations when navigating between the HathiTrust Digital Library and HTRC Analytics platforms, building worksets, and using HTRC online tools. Recommendations based on these findings were written and shared with HTRC leadership, and included the following:

  • Enhance HTRC Analytics documentation

    • Includes creating a long-term content management plan for user-facing documentation, creating and maintaining content templates for user-facing documentation, and making a new tutorial showing users how they can best create useful HTRC worksets

  • Work towards better integration between the HathiTrust Digital Library and HTRC so that it is easier for users to create worksets from their digital library collections

  • Redesign the HTRC Analytics homepage to make it more informative and descriptive of site and organization functionalities

User persona

In addition to the list of recommended changes, a new user persona was created based on the data collected from real HTRC users. The persona is Janine, a Humanities Research Librarian at a university library, where she provides patron services to faculty and graduate students engaged in both traditional and non-traditional research methodologies, including, but not limited to, computational text analysis and data mining. The following images show the newly created persona: the persistent intermediate, who loosely understands the basics of digital humanities research but is not an ‘expert’ in the field:

Image: Top section of the new HTRC ‘Persistent Intermediate’ user persona type, for HTRC staff to keep in mind when designing and developing online tools and services for the HTRC user base.

Image: Bottom portion of the ‘persistent intermediate’ user persona type 

Solving problems

Conducting an expansive UX audit is exploratory by nature: we can make hypotheses at the beginning of the process, but we also need to be open-minded and willing to adapt those hypotheses to newly gathered data. At the beginning of this audit, I predicted that users have difficulty understanding what worksets are and how and why they need to create them. I also hypothesized that without intuitive points of connection between the HathiTrust Digital Library and HTRC Analytics sites, users would struggle to understand the necessary relationship between the two organizations’ platforms.

By the time all the research was gathered and analyzed, my hypotheses had proved true, but for unexpected reasons. Both the HTRC Analytics homepage and the user-facing documentation were underperforming: users were routinely confused and foiled by the layout and placement of information across these two critical HTRC touchpoints. The homepage is often the first thing users encounter when trying to make sense of what HTRC is and offers, and the documentation is integral to understanding the site’s many affordances. With those two areas lacking, our users’ experiences of HTRC Analytics suffered.

All of this is to say: the desired outcome of this project was not to solve problems, but to articulate as accurately as possible what the real problems are and how best to remediate them for users. Each of the recommended changes above frames the broad aims of a necessary follow-up project, and that framing was the goal of this audit.

Challenges and what I learned

This was my first major UX project at HTRC, and I learned many things along the way. There are four main takeaways I now keep in mind for all UX projects I lead:

  1. Conduct stakeholder interviews or a heuristic evaluation with stakeholders prior to engaging with users. As this project moved forward, I engaged with staff members only in a happenstance, unorganized way, though even that led to meaningful conversations about individual expectations and assumptions that could be collectively discussed. Doing it deliberately and early is important because people think they are on the same page, but even when they are (and they aren’t always), they often think about the website in different ways that are useful to get on the table from the beginning.

  2. HTRC has a small staff, BUT: if possible (given staffers’ strained schedules), try to engage stakeholders in as many UX activities as possible -- UX design is really a communal process, and it works much better when everyone feels as involved as possible.

  3. If possible, have two staffers present during a user test or interview -- this ensures that one staffer takes quality notes while the other engages meaningfully with the user.

  4. Organize generated documents early and keep them in one place. It’s easier in the long run, and everyone needs access to them anyway, for transparency’s sake.


To read more about the corresponding homepage redesign project undertaken as a result of the recommended changes created during this audit, please read about that case study here.