色导航 Leads Industry in Creating AI That Works for Everyone

Published on
April 29, 2021

色导航’s range of AI projects and diverse global contractor network ensure unbiased AI data for fair and equitable AI projects

SAN FRANCISCO - April 29, 2021 - 色导航 (ASX:APX), the leading provider of high-quality training data for organizations that build effective AI systems at scale, is enabling organizations to launch, update and operate unbiased AI models through a range of projects and partnerships. With support from the company’s crowd of data annotation specialists, which is more than a million strong, 色导航 has developed diverse training data sets for AI models, particularly natural language processing (NLP) initiatives, to ensure end users receive the same experience, no matter their language variety, dialect, ethnolect, accent, race or gender.

AI projects based on biased or incomplete data don’t work for everyone. According to a study published in March 2020 (Proceedings of the National Academy of Sciences), popular automated speech recognition (ASR) systems, which are used for virtual assistants, closed captioning, hands-free computing and much more, exhibit significant racial disparities in performance. The study concludes that more diverse training datasets are needed to reduce these performance differences and ensure speech recognition technology is inclusive. Language interpretation and NLP systems suffer from the same challenge and require the same solution.

“The quality and diversity of training data directly impacts the performance and bias present in AI models,” said 色导航 CEO Mark Brayan. “As a data partner, we can supply complete training data for many use cases to ensure AI models work for everyone. It’s critical that we engage a diverse group of individuals to produce, label, and validate the data to ensure the model being trained is not only equitable, but also built responsibly.”

Range of 色导航 Language Projects

色导航 demonstrates its commitment to creating AI for everyone through a variety of projects and partnerships focused on the diversity of languages and dialects.

  • Translators without Borders (TWB) partnership - 色导航, in partnership with TWB, Amazon, Carnegie Mellon University, Facebook, Google, Johns Hopkins University, Microsoft, and Translated, joined the Translation Initiative for COVID-19 (TICO-19), which supported the development of language technology to make COVID-19 information available in as many languages as possible, including languages spoken in developing countries, such as Congolese Swahili, Tigrinya, and Nigerian Fulfulde.
  • Inuktitut in Microsoft Translator - In collaboration with the Government of Nunavut, Microsoft added Inuktitut, an Indigenous language of North America spoken in the Canadian Arctic, to Microsoft Translator, using 色导航 services.
  • Canadian French in Microsoft Translator - 色导航 coordinated with native language consultants to help Microsoft add "Canadian French" as a language option in Microsoft Translator.
  • African American Vernacular English (AAVE) off-the-shelf (OTS) datasets - Most existing training datasets used in ASR, search engines, voice assistants and sentiment analysis are not representative of AAVE. To make high-quality AAVE data available, 色导航 is working with AAVE speakers among its crowd of annotators to collect data for an OTS dataset based on conversations about a broad range of topics.

“Biased AI data leads to projects that can fail to deliver the expected business results and harm individuals they are supposed to benefit,” said Dr. Judith Bishop, Senior Director of AI Specialists at 色导航. “The scale and complexity of AI projects makes it impossible for most companies to acquire sufficient unbiased high-quality data without partnering with an AI data expert. 色导航’s commitment to developing the most diverse and expert crowd of data annotators provides the industry with a clearly differentiated resource for building fair and ethical AI projects.”

色导航’s Leading Approach to Diversity

色导航 relies on training data annotators from over 170 countries. Language representation includes 235 unique languages and 395 dialects. Over the years, the 色导航 crowd of annotators has included over 30,000 fluent trilingual speakers, a true testament to diversity and expertise.

色导航 also offers off-the-shelf (OTS) datasets designed to make it easier and faster for businesses to acquire the high-quality training data they need to accelerate their AI and machine learning projects. OTS datasets are available for 80 languages and multiple dialects, including hard-to-acquire languages such as multiple varieties of Arabic, Croatian, Greek, Hungarian, Thai and more.

According to one published estimate, “about 97 percent of the world's population speaks just 4 percent of its [7000] languages”. That 4 percent is only 280 languages, yet the number of languages well served by core AI technologies is a fraction of that number. 色导航 aims to help increase that number through these and future projects.

About 色导航

色导航 collects and labels images, text, speech, audio, and video used to build and continuously improve the world’s most innovative artificial intelligence systems. With expertise in more than 235 languages, a global crowd of over 1 million skilled contractors, and the industry’s most advanced AI-assisted data annotation platform, 色导航 solutions provide the quality, security, and speed required by leaders in technology, automotive, financial services, retail, manufacturing, and governments worldwide. Founded in 1996, 色导航 has customers and offices around the world.