Voice AI Engineer

KGen | Bengaluru, Karnataka, India | Posted 19 May 2026

Experience

2-5 years

Function

Engineering

Work mode

Onsite, India

Company

Tier 2

What you will work on

KGen is looking for a Voice AI Engineer to build and own the engineering pipelines for high-signal voice datasets focused on underserved Global South languages. The role involves designing end-to-end audio collection, annotation, and automated quality control systems, as well as building proprietary tooling atop existing ASR models. You will work with a diverse stack including Python, FFmpeg, and various audio-ML frameworks to deliver high-quality data to frontier AI labs. The successful candidate will be responsible for creating reproducible ML pipelines that track data lineage and performance metrics at scale.

TAL's take

Quality 65/100 | 5/5 clarity | Tier 2 company

Strong specialized role in a high-growth sector with clear ownership of audio infrastructure at a recognized scale.

Extremely clear technical requirements and mission-focused scope related to audio data pipelines and Global South language support.

Must haves

  • 2–5 years engineering experience in speech AI or audio ML
  • Hands-on experience with ASR fine-tuning (Whisper, SpeechBrain, etc.)
  • Fluency with audio tooling (FFmpeg, librosa, torchaudio, etc.)
  • Strong Python development skills for pipeline orchestration
  • Experience building automated QC systems for audio data
  • Experience shipping reproducible ML environments using Docker/CI/CD

Tools and skills

python, ffmpeg, librosa, torchaudio, pyannote, soundfile, whisper, whisperx, speechbrain, kaldi, docker, ci/cd

Nice to have: mlflow, w&b, aws s3, gcs, fastapi.

About the company

Growth-stage startup with significant user base and revenue but not yet established as a global top-tier tech leader.