Voice of Yorùbá
(Èdè Yorùbá)

The Yorùbá language, known as Èdè Yorùbá by its speakers, is a vibrant expression of identity, culture, and ancestral wisdom.

Overview

Yorùbá is a vibrant West African language heard in homes, markets, music, and festivals. Written in the Latin alphabet with tone marks, it captures both meaning and melody. Explore clips and texts that reveal its rich rhythm, tone, and culture.

    Sample Audio

    Writing system

    Yorùbá uses the Latin alphabet with tone marks and underdots for precise expression. The vowel pairs e/ẹ and o/ọ are distinguished with underdots, while ṣ represents the “sh” sound. The three tones, high, low, and mid, are shown with acute, grave, and (optionally) macron marks, respectively. In casual writing, tone marks are often dropped and inferred from context, while formal works restore them for clarity and learning. This system balances everyday readability with full phonological accuracy.
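    The difference between fully marked and casual spelling can be illustrated with a short Python sketch. It is only an illustration, not part of any tooling described on this page: the function name strip_tone_marks is hypothetical, and the sketch assumes tones are encoded with the standard Unicode combining acute, grave, and macron marks. It removes those tone marks while keeping the underdots, since the underdots distinguish different letters rather than tones.

```python
import unicodedata

# Combining marks assumed to carry tone:
# acute (high), grave (low), macron (mid, optional).
TONE_MARKS = {"\u0301", "\u0300", "\u0304"}


def strip_tone_marks(text: str) -> str:
    """Drop tone diacritics but keep underdots (as in ẹ, ọ, ṣ).

    Hypothetical helper for illustration only.
    """
    # NFD splits each letter into a base character plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    # NFC recombines the base letters with their remaining underdots.
    return unicodedata.normalize("NFC", kept)


print(strip_tone_marks("Èdè Yorùbá"))  # Ede Yoruba
print(strip_tone_marks("ọ̀rẹ́"))        # ọrẹ
```

    Going the other way, from unmarked text back to fully marked text, cannot be done by a simple rule like this, because the tones have to be inferred from context.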

    What’s Here Now

    Urban Dialogue

    Transcript: Habari, unaendeleaje? (Hello, how are you doing?)

    Market Talk

    Transcript: Habari, unaendeleaje? (Hello, how are you doing?)

    Community Radio

    Transcript: Habari, unaendeleaje? (Hello, how are you doing?)

    Why It Matters for AI

    Including tonal, proverb-rich Yorùbá speech helps reduce bias in speech and translation models while supporting educational and cultural applications.

    Speech Recognition

    Authentic Yorùbá tones and proverbs improve speech and translation models, promoting fairness, education, and cultural preservation.

    Translation

    Including Yorùbá in AI speech data ensures fairer models, preserves culture, and advances language technology in Africa.

    Information Access

    Bringing information closer to Yorùbá speakers through AI that understands their language and culture.

    Yoruba Datasets

    Yoruba Corpus v1.0

    Version: 1.0
    Size: 2GB
    License: CC-BY 4.0
    DOI: 10.1234/swahili.001

    Our platform digitally preserves Africa’s rich linguistic diversity by collecting audio, text, and community contributions to build a comprehensive database for research, learning, and AI model training.

    Collaborators

    Contact us if you are interested in collaborating.

    © 2026 All Rights Reserved.

    Request Access

    Request access to the Yorùbá language datasets. Sign in to view and download curated audio and text resources for AI research, language preservation, and educational purposes.

    Contribute Data

    Contribute your recordings and transcripts to help preserve the Yorùbá language. Submit audio, text, and consent forms to support AI research, education, and cultural preservation.

    Contribution form