Appen Provides Private, High-Quality Audio Data to Hugging Face's Open ASR Leaderboard

New datasets improve benchmark integrity for a more complete picture of real-world speech recognition performance

KIRKLAND, Wash., May 06, 2026 (GLOBE NEWSWIRE) -- Appen Limited (ASX: APX), a leading provider of high-quality data for the AI lifecycle, today announced a collaboration with Hugging Face to bring private, high-quality audio datasets to the Open ASR Leaderboard, one of the most widely used benchmarks in the speech recognition community.

Since its launch in September 2023, the Open ASR Leaderboard has been visited more than 700,000 times, underscoring its central role for researchers and enterprises evaluating automatic speech recognition (ASR) models. The leaderboard ranks models by word error rate (WER), a measure of transcription accuracy where lower scores indicate better performance.
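
For readers unfamiliar with the metric, WER is conventionally computed as the number of word-level substitutions, deletions and insertions needed to turn a model's transcript into the reference transcript, divided by the number of words in the reference. The Python sketch below illustrates that standard definition. The function name, the lowercase-and-whitespace normalization and the sample sentences are illustrative assumptions, not the leaderboard's evaluation code, which may normalize text differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference word count.

    Assumption: text is normalized only by lowercasing and splitting on
    whitespace. Real evaluation pipelines usually normalize more heavily.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcripts: one substituted word out of six reference words.
score = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
print(f"WER: {score:.3f}")
```

Because the denominator is the reference length, a transcript with many inserted words can score above 1.0, which is one reason the metric is read as "lower is better" rather than as a percentage of words correct.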

"The speech AI community has made huge strides in model performance, but the benchmarks used to measure that progress haven't kept pace," said Sergio Bruccoleri, vice president of Delivery at Appen. "Leaderboards only tell the full story when the underlying data reflects how speech technology is actually used. And that's exactly what this collaboration with Hugging Face is all about."

As the leaderboard has grown in prominence, so has the risk of "benchmaxxing," the practice of optimizing models specifically to score well on public test sets without achieving equivalent gains in real-world performance. To address this, Appen provides a suite of new, private English-language audio datasets that are incorporated into the leaderboard evaluation framework. Keeping these datasets private makes them significantly harder to game, which increases the trustworthiness of results across the board.

What Appen's Datasets Add
Appen's contribution covers both scripted and conversational speech across multiple accents, enabling the leaderboard to surface a more nuanced picture of model performance. Specifically, the new private data supports metrics including:

Average Scripted WER: Covering read speech across multiple controlled recordings.
Average Conversational WER: Capturing natural dialogue with interruptions, fillers and variation.
Average U.S. vs. Non-U.S. Accent WER: Highlighting performance gaps between American English and more diverse accent profiles.
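
As a rough illustration of how grouped averages like these can be derived, the Python sketch below takes one WER value per evaluation set and averages the values by speaking style and by accent group. Everything in it is an assumption made for illustration: the record layout, the field names, the unweighted averaging and the placeholder WER values, which are not actual leaderboard or Appen results.

```python
from collections import defaultdict
from statistics import mean


def average_wer_by(records, field):
    """Unweighted mean of per-dataset WER, grouped by the given field.

    Assumption: every dataset counts equally. A real pipeline might weight
    by audio duration or word count instead.
    """
    groups = defaultdict(list)
    for record in records:
        groups[record[field]].append(record["wer"])
    return {label: mean(values) for label, values in groups.items()}


# Hypothetical per-dataset scores for one model (placeholder values only).
records = [
    {"dataset": "scripted_us", "style": "scripted", "accent": "us", "wer": 0.04},
    {"dataset": "scripted_non_us", "style": "scripted", "accent": "non_us", "wer": 0.06},
    {"dataset": "conv_us", "style": "conversational", "accent": "us", "wer": 0.10},
    {"dataset": "conv_non_us", "style": "conversational", "accent": "non_us", "wer": 0.14},
]

by_style = average_wer_by(records, "style")
by_accent = average_wer_by(records, "accent")

for label, value in {**by_style, **by_accent}.items():
    print(f"{label}: {value:.3f}")
```

Reporting the groups separately, rather than folding them into one overall average, is what lets a gap between scripted and conversational speech, or between accent groups, show up in the results.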

These dimensions reflect a core finding from Appen's research: there is no single "catch-all" ASR model. Systems that excel on clean, American-accented audio may underperform on conversational speech or non-native speakers. These new metrics make those tradeoffs visible.

“Reliable AI evaluation starts with high-quality data, and we’re excited to partner with Appen to launch this new track in the Open ASR Leaderboard,” said Eric Bezzam, Audio ML Engineer at Hugging Face.

How Private Data Changes the Score
This leaderboard is part of a broader industry shift toward more rigorous, relevant benchmarks. Across Appen’s research, from multilingual evaluation to multimodal red‑teaming, one theme keeps reappearing: there are not enough benchmarks that truly reflect how models are used in the field.

By expanding audio coverage to real-world conditions, laying the groundwork for non-English and non-European languages, and transparently surfacing both accuracy and efficiency tradeoffs, the Appen and Hugging Face ASR Leaderboard helps enterprises, researchers, and builders make better-informed decisions about the speech technologies they rely on.

To read Appen's full perspective on building more trustworthy ASR benchmarks, visit the Appen blog. Explore the Open ASR Leaderboard and read Hugging Face’s blog post to see how models perform under real-world conditions. Connect with our experts to understand how tailored, human-in-the-loop evaluation can help you benchmark and improve your own speech systems.

About the Open ASR Leaderboard
The Open ASR Leaderboard, maintained by Hugging Face, is an open benchmarking resource for automatic speech recognition models. It standardizes evaluation across models and datasets, with open-source evaluation scripts and UI code available on GitHub and the Hugging Face Hub. Model developers can submit results via a pull request to the leaderboard's public GitHub repository.

About Appen
Appen (ASX: APX) is the global leader in data for the AI lifecycle with 30 years of experience in data sourcing, annotation and model evaluation. Through our expertise, platform and global crowd, we enable organizations to launch the world’s most innovative artificial intelligence products with speed and at scale. Appen maintains the industry’s most advanced AI-assisted data annotation platform and boasts a global crowd of more than 1 million contributors speaking more than 235 languages. Our products and services make Appen a trusted partner to leaders in technology, automotive, finance, retail, healthcare and government. Appen has customers and offices globally.

Contacts
BOCA Marketing for Appen
Appen@bocamarketing.com
