
Catapult Solutions Group

Job Type: Contract
Location Type: Remote (open to all US residents)

 

Deep Learning Scientist – Speech Synthesis

Location: 100% Remote (Anywhere in the U.S.)

Duration: 6-Month Contract

Position Overview

We are seeking a Deep Learning Scientist – Speech Synthesis to support the development of next-generation speech AI technologies. This role focuses on training and optimizing speech models, improving model performance, and solving complex machine learning challenges related to speech applications.

The ideal candidate has strong experience in speech synthesis (Text-to-Speech) or Speech-to-Text, deep learning, and Python development. Success in this role requires the ability to analyze model behavior, diagnose training issues, and improve model performance, not just collect or evaluate data.

Key Responsibilities

  • Train and optimize speech synthesis models, including mel spectrogram and vocoder models.
  • Analyze training metrics, validation losses, and model performance to identify root causes of model issues and recommend improvements.
  • Benchmark and optimize speech models across multiple use cases.
  • Improve speech data preparation, augmentation, filtering, and dataset quality.
  • Develop and refine high-quality training datasets for speech AI models.
  • Measure and characterize model accuracy, quality, and bias.
  • Collaborate with cross-functional teams to develop and deliver new speech AI features.
  • Participate in software development, design reviews, testing, and code reviews.
  • Troubleshoot technical issues and contribute to continuous model improvements.

Required Qualifications

  • Master’s degree or Ph.D. in Computer Science, Electrical Engineering, Artificial Intelligence, Applied Mathematics, Linguistics, Computational Linguistics, or a related field (or equivalent experience).
  • 3+ years of relevant industry experience.
  • Strong Python programming skills.
  • Strong understanding of machine learning and deep learning concepts.
  • Experience with Text-to-Speech (TTS), Speech Synthesis, or Speech-to-Text (STT) technologies.
  • Hands-on experience training deep learning models using PyTorch.
  • Ability to analyze training behavior, validation losses, and model performance to troubleshoot and improve machine learning models.
  • Knowledge of speech signal processing concepts, including FFT, MFCC, and mel spectrograms.
  • Strong understanding of software development fundamentals.
  • Experience using version control and code review tools such as Git, Gerrit, or GitLab.
  • Excellent communication and collaboration skills.

Preferred Qualifications

  • Experience with deep learning architectures such as CNNs, RNNs, LSTMs, and Transformers.
  • Experience with voice cloning or multilingual speech systems.
  • Knowledge of text normalization (TN), inverse text normalization (ITN), or grapheme-to-phoneme (G2P) systems.
  • Fluency in one or more languages such as Spanish, Mandarin, German, Japanese, Russian, French, Arabic, Hindi, Korean, Italian, or Portuguese.
  • Interest in linguistics, phonetics, and speech technologies.
  • Strong C++ programming skills.
  • Familiarity with GPU technologies such as CUDA, cuDNN, or TensorRT.
  • Experience deploying machine learning models to cloud, data center, or embedded environments.

What We’re Looking For

The ideal candidate is someone who enjoys solving difficult machine learning problems and has hands-on experience training speech models. Beyond building models, we’re looking for someone who can investigate why a model is underperforming, analyze validation losses, identify root causes, and improve overall model quality and performance.

Additional Information

  • 100% remote position within the United States.
  • No specific U.S. time zone requirement.
  • This is a contract opportunity.
  • Opportunity to contribute to cutting-edge speech AI and deep learning technologies.