Catapult Solutions Group

Job Type: Contract

Location Type: Remote (open to all U.S. residents)

Deep Learning Scientist – Speech Synthesis

Location: 100% Remote (Anywhere in the U.S.)

Duration: 6-Month Contract

Position Overview

We are seeking a Deep Learning Scientist – Speech Synthesis to support the development of next-generation speech AI technologies. This role focuses on training and optimizing speech models, improving model performance, and solving complex machine learning challenges related to speech applications.

The ideal candidate has strong experience in speech synthesis (Text-to-Speech) or Speech-to-Text, deep learning, and Python development. Success in this role requires the ability to analyze model behavior, diagnose training issues, and improve model performance, not just collect or evaluate data.

Key Responsibilities

Train and optimize speech synthesis models, including mel spectrogram and vocoder models.
Analyze training metrics, validation losses, and model performance to identify root causes of model issues and recommend improvements.
Benchmark and optimize speech models across multiple use cases.
Improve speech data preparation, augmentation, filtering, and dataset quality.
Develop and refine high-quality training datasets for speech AI models.
Measure and characterize model accuracy, quality, and bias.
Collaborate with cross-functional teams to develop and deliver new speech AI features.
Participate in software development, design reviews, testing, and code reviews.
Troubleshoot technical issues and contribute to continuous model improvements.

Required Qualifications

Master’s degree or Ph.D. in Computer Science, Electrical Engineering, Artificial Intelligence, Applied Mathematics, Linguistics, Computational Linguistics, or a related field (or equivalent experience).
3+ years of relevant industry experience.
Strong Python programming skills.
Strong understanding of machine learning and deep learning concepts.
Experience with Text-to-Speech (TTS), Speech Synthesis, or Speech-to-Text (STT) technologies.
Hands-on experience training deep learning models using PyTorch.
Ability to analyze training behavior, validation losses, and model performance to troubleshoot and improve machine learning models.
Knowledge of speech signal processing concepts, including FFT, MFCC, and mel spectrograms.
Strong understanding of software development fundamentals.
Experience using version control and code review tools such as Git, Gerrit, or GitLab.
Excellent communication and collaboration skills.

Preferred Qualifications

Experience with deep learning architectures such as CNNs, RNNs, LSTMs, and Transformers.
Experience with voice cloning or multilingual speech systems.
Knowledge of text normalization (TN), inverse text normalization (ITN), or grapheme-to-phoneme (G2P) systems.
Fluency in one or more languages such as Spanish, Mandarin, German, Japanese, Russian, French, Arabic, Hindi, Korean, Italian, or Portuguese.
Interest in linguistics, phonetics, and speech technologies.
Strong C++ programming skills.
Familiarity with GPU technologies such as CUDA, cuDNN, or TensorRT.
Experience deploying machine learning models to cloud, data center, or embedded environments.

What We’re Looking For

The ideal candidate is someone who enjoys solving difficult machine learning problems and has hands-on experience training speech models. Beyond building models, we’re looking for someone who can investigate why a model is underperforming, analyze validation losses, identify root causes, and improve overall model quality and performance.

Additional Information

100% remote position within the United States.
No specific U.S. time zone requirement.
This is a contract opportunity.
Opportunity to contribute to cutting-edge speech AI and deep learning technologies.
