Altera Recruiting - Speech AI Engineer

Open job
AI Engineer
💰 Negotiable
📍 Tokyo, Japan
Closing date: 11 days left

An AI Engineer is needed in Tokyo, Japan.

We are looking for an AI Engineer to join our “AI and Robotics” team. In this role, you will work on adding new AI-enabled features to our mobile hardware platforms. The team focuses on improving service efficiency for business partners through technologies such as Classical and Deep Learning-based Computer Vision, Automatic Speech Recognition (ASR), Text-to-Speech (TTS), and Retrieval-Augmented Generation (RAG).

As an expert in Speech AI, you will handle tasks involving ASR models, Voice Activity Detection, Language Detection, Emotion Detection, Speaker Diarization, and Audio Cleaning. While our AI codebases are primarily in Python, programs running on edge hardware (Jetson boards) are written in C++ for seamless integration.

Responsibilities

  • Pipeline Development: Implement end-to-end speech processing pipelines for client-facing projects.
  • Research: Stay current with the latest advances and papers in Machine Learning and Speech AI.
  • Deployment: Write performant, scalable code that can be deployed to a large fleet of remote hardware units.

Requirements

Must-Have Skills:

  • Programming: Proficiency in Python and solid knowledge of C/C++.
  • DevOps/Tools: Experience with version control (Git) and containerization (Docker or Podman).
  • Deep Learning Fundamentals:
      • Architectures: Encoder-Decoder, Transformers, RNNs.
      • Core Concepts: Supervised/Unsupervised training, classification, regression.
      • Evaluation Metrics: WER/CER (Word/Character Error Rate), Cross Entropy.
  • Speech AI Fundamentals:
      • Audio preprocessing and Voice Activity Detection (VAD).
      • Speaker Diarization.
  • Specialized Libraries: Proficiency with the HuggingFace ecosystem, OpenAI Whisper, and NVIDIA NeMo.
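
For candidates less familiar with the evaluation metrics named above, here is a minimal Python sketch of WER and CER. It is an illustration only, not the team's evaluation code: the function names are hypothetical, text is split on whitespace with no normalization, and CER is computed with spaces removed, which is one of several conventions in use.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))  # distance from an empty reference
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word Error Rate: word-level edit distance / reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character Error Rate, ignoring spaces (an assumed convention)."""
    ref_chars = list(reference.replace(" ", ""))
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    hyp_chars = list(hypothesis.replace(" ", ""))
    return edit_distance(ref_chars, hyp_chars) / len(ref_chars)
```

For example, wer("the cat sat on the mat", "the cat sit on mat") counts one substitution and one deletion against six reference words, giving 2/6. Because insertions are counted, both metrics can exceed 1.0 when the hypothesis is much longer than the reference.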

Nice-to-Have Skills:

  • Education: Master’s degree in Computer Science or a Deep Learning-related field.
  • Practical Experience: Deploying ASR systems, Emotion Detection, or Speaker Diarization in real-world environments.
  • Advanced ASR Knowledge: Model distillation, fine-tuning strategies, and specialized evaluation.
  • Infrastructure: Knowledge of distributed systems, cloud computing, and high-performance computing (HPC).
  • Software Engineering: Strong system design, testing, and debugging fundamentals.
  • Hardware Acceleration: Familiarity with NVIDIA technologies (CUDA, TensorRT, Triton Inference Server).
  • Language: Ability to read/write Japanese.

Benefits

Work Schedule:

  • Flex Time: 8 hours/day, 5 days/week (between 07:00 and 22:00).
  • Remote Work: 2 days/week remote (up to 4 days based on performance).
  • Extended Leave: Long holiday policy allowing up to 1 month of continuous leave.

Environment:

  • Language: Fully English-speaking work environment within the Technology team.
  • Social: Company-sponsored monthly/quarterly team meals and recreational events (BBQs, training camps, etc.).

Financial Benefits:

  • Paid Leave: 15 days annually (cumulative up to 2 years).
  • Allowances: Full commuter allowance, Housing Allowance, Child Allowance, Late-night Allowance
  • Growth: Learning Development Credit Program
  • Insurance: Comprehensive Health, Pension, and Employment insurance
  • Family Support: Maternity and Paternity leave
Posted 17 days ago
