Voice of Yorùbá
(Èdè Yorùbá)

The Yorùbá language, known as Èdè Yorùbá by its speakers, is a vibrant expression of identity, culture, and ancestral wisdom.

Overview

Yorùbá is a West African language heard in homes, markets, music, and festivals. Written in the Latin alphabet with tone marks, it captures both meaning and melody. Explore clips and texts that reveal its rhythm, tone, and culture.

Sample Audio

Transcript: Habari, unaendeleaje? (Hello, how are you doing?)

Writing system

Yorùbá uses the Latin alphabet with tone marks and underdots. The vowel pairs e/ẹ and o/ọ are distinguished by underdots, and ṣ represents the “sh” sound. The three tones (high, low, and mid) are shown with acute, grave, and (optionally) macron marks, respectively. In casual writing, tone marks are often dropped and inferred from context, while formal works restore them for clarity and learning. This system balances everyday readability with full phonological accuracy.
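
For readers working with Yorùbá text in code, the following is a minimal Python sketch of how tone marks and underdots can be handled separately using Unicode normalization. The function names `strip_tone_marks` and `marked_tones` are hypothetical, chosen for illustration, and the sketch assumes tones are encoded with the combining acute, grave, and macron characters and underdots with the combining dot below; real-world text may use other encodings.

```python
import unicodedata

# Combining marks as they appear after NFD decomposition.
# Assumption of this sketch: acute = high, grave = low, macron = mid.
TONE_NAMES = {
    "\u0301": "high",  # combining acute accent
    "\u0300": "low",   # combining grave accent
    "\u0304": "mid",   # combining macron (optional in practice)
}


def strip_tone_marks(text: str) -> str:
    """Drop tone marks but keep underdots, as casual writing often does."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if ch not in TONE_NAMES)
    # Recompose so that e + dot below becomes the single character ẹ again.
    return unicodedata.normalize("NFC", kept)


def marked_tones(text: str) -> list:
    """List the explicitly marked tones in order of appearance."""
    decomposed = unicodedata.normalize("NFD", text)
    return [TONE_NAMES[ch] for ch in decomposed if ch in TONE_NAMES]


print(strip_tone_marks("Èdè Yorùbá"))
print(marked_tones("Yorùbá"))
```

Because the underdot is a separate combining character from the tone marks, removing only the tone marks leaves letters such as ẹ, ọ, and ṣ intact. Unmarked mid tones are not reported by `marked_tones`, since nothing in the text encodes them.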

What’s Here Now

Urban Dialogue

Transcript: Habari, unaendeleaje? (Hello, how are you doing?)

Market Talk

Transcript: Habari, unaendeleaje? (Hello, how are you doing?)

Community Radio

Transcript: Habari, unaendeleaje? (Hello, how are you doing?)

Why It Matters for AI

Including tonal, proverb-rich Yorùbá speech helps reduce bias in speech and translation models while supporting educational and cultural applications.

Speech Recognition

Improving speech and translation models with authentic Yorùbá tones and proverbs to promote fairness, education, and cultural preservation.

Translation

Including Yorùbá in AI speech data ensures fairer models, preserves culture, and advances language technology in Africa.

Information Access

Bringing information closer to Yorùbá speakers through AI that understands their language and culture.

Yorùbá Datasets

Yoruba Corpus v1.0

Version: 1.0
Size: 2GB
License: CC-BY 4.0
DOI: 10.1234/swahili.001

Our platform digitally preserves Africa’s rich linguistic diversity by collecting audio, text, and community contributions to build a comprehensive database for research, learning, and AI model training.

Collaborators

Contact us if you are interested in collaborating.

© 2025 All Rights Reserved.

Request Access

Request access to the Yorùbá language datasets. Sign in to view and download curated audio and text resources for AI research, language preservation, and educational purposes.

Contribute Data

Contribute your recordings and transcripts to help preserve the Yorùbá language. Submit audio, text, and consent forms to support AI research, education, and cultural preservation.

Contribution form