AI Voice Synthesis Developer

Company: IF Recruitment Ltd
Location: Cambridge
Closing Date: 27/11/2024
Hours: Full Time
Type: Permanent
Job Requirements / Description

We're seeking an exceptional AI Voice Synthesis Developer to join an innovative start-up. The ideal candidate will combine deep technical expertise in text-to-speech (TTS) systems with a passion for creating efficient, production-ready solutions that push the boundaries of what's possible in voice synthesis.

Key Responsibilities

  • Design and implement low-latency TTS systems optimised for minimal computing resources
  • Develop and optimise AI models for real-time voice synthesis
  • Create efficient architectures that balance quality, speed, and resource utilisation
  • Collaborate with team members to integrate voice synthesis capabilities into our products
  • Research and implement state-of-the-art techniques in speech synthesis
  • Contribute to technical architecture decisions and product strategy

Skills Required

  • Strong programming skills with demonstrated experience in AI/ML frameworks (PyTorch, TensorFlow)
  • Expertise in speech processing, digital signal processing, and audio engineering
  • Advanced Python programming
  • Experience with Azure
  • Proficiency in real-time audio processing within a target latency
  • Experience optimising models for edge deployment
  • Knowledge of audio compression techniques and formats
  • Familiarity with audio quality metrics
  • Experience with audio processing libraries
  • Proficiency in version control (Git) and CI/CD pipelines
  • Previous work on TTS systems (commercial or lab)
  • Background in voice conversion or voice cloning technologies

AI/ML Platform Experience

  • Experience with Groq for high-performance inference
  • Familiarity with Deepgram's API and speech-to-text capabilities
  • Knowledge of large language model deployment and optimisation

Speech Technology Expertise

  • Deep understanding of modern TTS architectures:
    • Non-autoregressive models (FastSpeech 2, Glow-TTS)
    • Autoregressive models (Tacotron 2, YourTTS)
    • Flow-based models (Flow-TTS, WaveFlow)
  • Experience with vocoders:
    • HiFi-GAN
    • WaveNet
    • UnivNet
    • BigVGAN