Voice of Ogiek

Spoken in Kenya (notably the Mau Forest region and Mt. Elgon) and in northern Tanzania (Akiek).

Overview

The Ogiek language (also written Okiek or Akiek) is spoken primarily by the Ogiek people, who live in forested areas of the Mau Forest Complex and around Mount Elgon. The 2009 Kenya census listed them at around 79,000.

Sample Audio

Transcript: Habari, unaendeleaje? (Hello, how are you doing?)

Writing system

Because Ogiek varieties are under-documented and many speakers have shifted to neighboring languages, writing practices are still being standardized through community-led, researcher-supported efforts. Most current materials use a Latin alphabet with practical conventions such as ny, ng’ and occasional vowel doubling to mark length; tone is typically left unmarked outside linguistic work. Where community groups develop orthography guides, the priority is usability for local readers, with enough consistency to support dictionaries, literacy materials, and digital search.
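As one illustration of what consistency for digital search can involve, the sketch below folds the different apostrophe characters people type after ng into a single form before comparing text. It is a minimal example and not part of any published Ogiek orthography guide: the function name search_key, the list of apostrophe variants, and the sample strings are all assumptions, and the sample strings are placeholders rather than attested Ogiek vocabulary.

```python
import unicodedata

# Apostrophe-like characters that often stand in for the mark in "ng'".
# This list is an assumption; extend it to match what contributors type.
APOSTROPHE_VARIANTS = "\u2019\u2018\u02bc\u0060\u00b4"
_TO_ASCII_APOSTROPHE = str.maketrans({ch: "'" for ch in APOSTROPHE_VARIANTS})


def search_key(text: str) -> str:
    """Return a normalized key for matching queries against stored entries."""
    # Compose characters so visually identical strings compare equal.
    text = unicodedata.normalize("NFC", text)
    # Map every apostrophe variant to the plain ASCII apostrophe.
    text = text.translate(_TO_ASCII_APOSTROPHE)
    # Ignore letter case and surrounding whitespace.
    return text.casefold().strip()


# Placeholder strings only, not real Ogiek words.
typed_on_phone = "Ng\u2019aan "
typed_on_laptop = "ng'aan"
same_entry = search_key(typed_on_phone) == search_key(typed_on_laptop)
```

Doubled vowels are deliberately left untouched, so a long vowel written aa stays distinct from a short a in the search key.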

What’s Here Now

Urban Dialogue

Transcript: Habari, unaendeleaje? (Hello, how are you doing?)

Market Talk

Transcript: Habari, unaendeleaje? (Hello, how are you doing?)

Community Radio

Transcript: Habari, unaendeleaje? (Hello, how are you doing?)

Why It Matters for AI

The Ogiek language has a limited digital presence. Contributing well-curated recordings supports AI systems that reflect local voices and improves access to learning, communication, and cultural preservation.

Speech Recognition

Speech recognition data helps machines understand how Ogiek speakers sound in real life, improving voice assistants, transcription tools, and language preservation systems.

Translation

Translation datasets enable AI to learn how Ogiek words and meanings connect across languages, supporting accurate communication, education, and cultural exchange.

Information Access

Information access datasets help AI provide Ogiek speakers with relevant content, resources, and services in their language, promoting inclusivity and bridging digital gaps.

Ogiek Datasets

Ogiek Corpus v1.0

Version: 1.0
Size: 2GB
License: CC-BY 4.0
DOI: 10.1234/swahili.001

Our platform digitally preserves Africa’s rich linguistic diversity by collecting audio, text, and community contributions to build a comprehensive database for research, learning, and AI model training.

Collaborators

Contact us if you are interested in collaborating.

© 2025 All Rights Reserved.

Request Access

Request access to the Ogiek language datasets. Sign in to view and download curated audio and text resources for AI research, language preservation, and educational purposes.

Contribute Data

Contribute your recordings and transcripts to help preserve the Ogiek language. Submit audio, text, and consent forms to support AI research, education, and cultural preservation.

Contribution form