Justin Grant
UX Researcher

Mozilla Common Voice

Driving more inclusive AI and Voice Datasets through mixed methods user research

The Impact of this Project

My research led to a practical, high-impact design recommendation that influenced the entire Common Voice community, including its business objectives.

I found that users were not complying with the contribution criteria outlined on the website, which is a problem for inclusive voice datasets: users were rejecting voice clips without recognizing their own biases.

Overall, my recommendations improve the user experience, reduce bias in AI, and fulfill a core business objective: building inclusive voice technologies for everyone.

Role

UX researcher and moderator

Project

Research how users listen to, judge, and either accept or reject a recorded prompt on Common Voice.

Duration

4 weeks of research, culminating in a 30-minute presentation to the Mozilla Product Lead.

Team

2 UX Researchers

Why

Mozilla Common Voice is an open source voice dataset used to train Deep Speech and other machine learning models for conversational AI and digital assistants.

This project involves real people who either record themselves speaking a prompt in their browser or listen to and verify prompts recorded by others. My focus on this project was to research how users listen to, judge, and either accept or reject a recorded prompt.

Developing artificial intelligence without bias is also important for product development, diversity, inclusion, and equality in machine learning models and AI applications.

Contribution Criteria Design Recommendation

Simple but effective

I suggested minor adjustments to the contribution criteria to help reduce biases in validating voice clips.

Through my research, I discovered that none of the interviewees scrolled beyond the first of seven contribution criteria. This is a problem, not only for Mozilla Common Voice, but for voice datasets, inclusivity and diversity.

Importance of non-biased AI and Voice Technology

Language Diversity

“What language hierarchies are we reinforcing if we don’t design them for linguistic diversity?” 

— Hillary Juma, Common Voice’s Community Manager

Race Gap

“[A study found that a] “race gap” was just as large when comparing the identical phrases uttered by both black and white people. This indicates that the problem lies in the way the systems are trained to recognize sound.”

—New York Times, There Is a Racial Divide in Speech-Recognition Systems, Researchers Say

Speech Recognition

“Certain words mean certain things when certain bodies say them, and these [speech] recognition systems really don't account for a lot of that.” 

— Safiya Noble, Associate Professor in the Departments of Information Studies and African American Studies at UCLA and author of best-selling title Algorithms of Oppression

Research Question: How can we reduce bias and drive more inclusive AI and Voice Datasets?

Natural language processing scientist performing a contextual inquiry.

Capturing Feedback

Through contextual inquiries and lightweight usability tests, I captured users' insights by documenting their responses through active listening and follow-up questions. For example, I had users navigate to Mozilla Common Voice by any means (most Googled "Common Voice"), then asked what they would do next without leaving the landing page. Then, as participants began interacting with the platform, I echoed some of their comments back to them as questions. This reinforced their responses and helped me generate insights from the user's perspective.

After watching, listening, and documenting each user's interaction with Common Voice, I returned to specific flows and asked the user to repeat particular actions (listening to voice clips or reviewing the contribution criteria). I discovered that users judged voice clips with bias and overlooked specific pages (for example, not scrolling down on the contribution criteria page).

Since all six interviewees failed to scroll through the full contribution criteria (designed for bias mitigation), toward the end of each interview I asked them to read all of the criteria, then followed up with a question about why they had denied voice clips. Five out of six interviewees admitted to judging clips on a speaker's accent or pronunciation, a violation of Common Voice's contribution criteria. While much more work is needed to mitigate bias, given the time constraints I recommended a simple navigation menu for the contribution criteria. All participants agreed that a visible navigation menu would help them validate voice clips.

Driving more inclusive AI and Voice Datasets requires reinforced learning. Currently, the contribution criteria live on a separate page, which does not reinforce learning: participants are required to remember the criteria instead of easily recalling them.

Bias Mitigation Insights

Driving more inclusive AI and Voice Datasets through mixed methods user research and design iteration is one of many places to start. After users realized their biases in accepting or rejecting voice clips on Mozilla Common Voice, they suggested adding the contribution criteria to the page where voice clips are displayed. Currently, the contribution criteria intended to reduce bias are located on a separate page, requiring users to move back and forth between the validation page and the criteria page. Adding the criteria to the voice clip page can reinforce the requirements and drive more inclusive AI and Voice Datasets.

Other Platform Recommendations

Trustworthiness and use of data collection

  • Transparency about the system, its real-world applications, and who uses the collected data

  • More mention of privacy and the importance of diversity in language

  • Add articles from academic researchers and outlets such as The New York Times or The Economist

Market exposure

  • Advertisements on Reddit and other social media platforms

Visible and upfront Contribution Criteria 

  • Navigation menu for the contribution criteria

  • Add a contribution criteria questionnaire to denied recordings to help with machine learning

  • Potential tutorial for new users to quickly review the criteria

  • Navigation indicator (e.g. pagination or a table of contents) so users can scroll beyond the first criterion or tab through

  • Video tutorial, interface walkthrough, and recorded voice examples (accept vs. reject)

Product expansion into educational platforms for second language learners

  • Listen to native speakers (users verified as fluent in their native language)

  • Include statistics (accepted and denied prompts)

  • Gamification (accomplishments, goals, and language proficiency)

Mozilla is a global nonprofit
Please support Mozilla.

Mozilla is a global nonprofit dedicated to keeping the Internet a global public resource that is open and accessible to all. I'm thankful to be part of this organization, and I'm grateful that my contributions to the platform are moving forward with the engineering team.

Hire me!
I can juggle.