Low-resource language roadmap

Mapuzungun first. Community pathway before product claims.

OlyLive LangTech Lab is the research and deployment track for low-resource languages inside a real communication platform. The first phase centers on Mapuzungun, combining corpus access, community-grounded consultation and a staged validation path toward usable STT/TTS and translation tools.

Why this belongs inside OlyLive

The thesis is not philanthropic decoration. If OlyLive solves multilingual communication under real operational constraints, it can become the delivery layer for broader language access and preservation efforts.

Product realism

Low-resource language work should connect to deployed communication flows, not only offline demos or academic benchmarks.

Institutional value

Public entities, NGOs and educational systems need usable infrastructure, not merely symbolic support for inclusion.

Sustainable engine

A commercial communication platform can subsidize the cost of infrastructure, traffic and iteration required to serve underserved languages over time.

Current workstreams

This section separates what we already have, what we are preparing now and what remains on the longer-horizon roadmap.

In progress

Mapuzungun

The first phase combines access to a Mapuzungun audio corpus of roughly 120 hours with a technical path toward low-resource STT/TTS fine-tuning and real-time translation experiments.

  • Corpus preparation, cleaning and speech-model readiness
  • Community consultation pathway with local stakeholders before any stronger claims are made
  • Regional fieldwork, because Chubut variants should not be collapsed into assumptions drawn from purely Chilean data

Exploration

Aymara

Aymara is the next scaling step once the first low-resource pipeline is operational and institutionally grounded.

  • Designed for Andean regional scale, not as a one-off academic demo
  • Relevant to public service, education and cross-border coordination
  • Built after the Mapuzungun phase proves technical and institutional discipline

Roadmap

Patagonian Welsh

Patagonian Welsh represents a later-stage preservation and access track with a distinctive cultural case inside Patagonia itself.

  • Community and institutional relevance over vanity support
  • Potential voice and meeting access layer inside OlyLive
  • Built as a continuation of the low-resource language sequence

What we can say now, and what we still have to earn

Institutional readers should see discipline here: real assets, real conversations, no fake closure.

What is already concrete

  • OlyLive has access to a Mapuzungun audio corpus of roughly 120 hours for model preparation
  • There is an active community pathway in Chubut through local contacts
  • Public statements are limited to milestones that OlyLive can disclose without waiting on third-party confirmation

What remains in progress

  • Formal community consultation and informed local presentation still need to happen
  • Regional fieldwork is still needed to avoid flattening Chubut variants into external corpus assumptions
  • Production-grade support comes only after model training, evaluation and responsible deployment

Roadmap sequence

The path is staged so the lab can accumulate real language assets, institutional trust and product integration capacity over time.

1. Now

Develop the Mapuzungun data pipeline, define concrete use cases and keep the work tied to deployable communication scenarios.

2. Next

Extend the inclusion layer toward Aymara with institution-facing use cases in education, public service and NGO coordination.

3. Later

Move into Patagonian Welsh as a product-integrated preservation and access track for communities and institutions.

How this connects to the company model

The long-term idea is to let public institutions and NGOs finance VM capacity, traffic and inclusive deployments while the commercial product keeps the infrastructure alive and improving.

Commercial engine

Revenue from multilingual communication for organizations funds the operational base: infrastructure, product iteration and reliability.

Inclusion engine

Institution-backed deployments create a practical path for language access and educational inclusion without separating research from real-world delivery.

Interested in collaboration, corpus strategy or institutional pilots?

This page is the research-facing tab of OlyLive. It is meant to signal serious intent, technical prudence and room for collaboration.

Write to hello@olylive.org