

ML engineer

AI Engineer

💰 Negotiable

📍 Boulder, United States

Twine Jobs

Based in Manchester, United Kingdom


An AI Engineer is needed in Boulder, United States.

Client: Apple

Location: Boulder, CO


Job Description

Do you have a passion for computer vision and solving deep learning problems? The Video Engineering Data Analytics and Quality group is seeking an expert in evaluating machine learning and deep learning models, including foundation models and multimodal systems.

This role will play a critical part in crafting robust evaluation frameworks, using both traditional statistical methods and modern techniques like LLM-as-a-Judge! The ideal candidate combines strong analytical thinking, expertise in Python, and advanced knowledge of statistical methodologies and data quality standards.

This role involves collaborating with teams at Apple that are passionate about developing foundation models, including ML engineers, data scientists, and ML infrastructure engineers, to deliver amazing user experiences!

Key Responsibilities include:

Develop robust methodologies to assess the performance of foundation models (e.g., LLMs and vision-language models) across diverse tasks.
Leverage LLMs as judges to perform subjective and open-ended model evaluations (e.g., for summarization, reasoning, or multimodal generation tasks).
Build, curate, and maintain evaluation datasets and benchmarks.
Collaborate with research, engineering, and product teams to define evaluation goals aligned with user experience and product quality.
Conduct failure analysis and uncover edge cases to improve model robustness.
Contribute to our tools and infrastructure to automate and scale evaluation processes.

Requirements

Preferred Qualifications:

Experience working with open-source evaluation tools like OpenEval, Elo-based ranking, or LLM-as-a-Judge frameworks.
Familiarity with prompt engineering and few-shot or zero-shot evaluation techniques.
Experience evaluating generative models (e.g., text generation, image generation).
Prior contributions to ML benchmarks or public evaluations.
Strong interpersonal skills.
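
For readers unfamiliar with the Elo-based ranking mentioned above, here is a minimal sketch of how pairwise model comparisons can be turned into ratings. It is illustrative only and not taken from the posting: the function names, the starting rating of 1000, and the K-factor of 32 are assumptions chosen for the example.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return updated (rating_a, rating_b) after one comparison.

    score_a is 1.0 if model A was preferred, 0.0 if model B was preferred,
    and 0.5 for a tie. k (assumed 32 here) controls the update step size.
    """
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - exp_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b


# Hypothetical example: two models start at 1000 and a judge prefers model A.
model_a, model_b = update_elo(1000.0, 1000.0, score_a=1.0)
print(model_a, model_b)
```

In practice the preference label would come from human raters or an LLM judge, and the comparison order is usually shuffled because Elo updates are order-dependent.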

Minimum Qualifications:

BS and a minimum of 3 years of relevant industry experience.
Strong experience in evaluating supervised, unsupervised, and deep learning models.
Hands-on experience evaluating LLMs and using them as scoring/judging mechanisms.
Familiarity with multimodal models (e.g., image + text, video + audio) and related evaluation challenges.
Proficiency in Python and libraries such as NumPy, pandas, scikit-learn, PyTorch, or TensorFlow.
Solid understanding of statistical testing, sampling, confidence intervals, and metrics (e.g., precision/recall, BLEU, ROUGE, FID).
Strong documentation skills, including the ability to write technical reports and present to non-technical audiences.
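
As a rough illustration of the statistical skills listed above, the sketch below computes precision and recall for binary labels and attaches a percentile bootstrap confidence interval. It is a sketch under stated assumptions, not a method described in the posting: the function names, the toy labels, the 1000 resamples, and the 95% level are all hypothetical choices.

```python
import random


def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels, where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def bootstrap_ci(y_true, y_pred, metric_index=0, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for precision (index 0) or recall (index 1)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(y_true)
    stats = []
    for _ in range(n_resamples):
        # Resample examples with replacement, keeping label/prediction pairs together.
        idx = [rng.randrange(n) for _ in range(n)]
        sample_true = [y_true[i] for i in idx]
        sample_pred = [y_pred[i] for i in idx]
        stats.append(precision_recall(sample_true, sample_pred)[metric_index])
    stats.sort()
    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lower, upper


# Toy labels, invented purely for illustration.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]
print(precision_recall(y_true, y_pred))
print(bootstrap_ci(y_true, y_pred, metric_index=0))
```

With only eight examples the interval will be very wide, which is the point of reporting it: a single metric value on a small evaluation set says little about true model quality.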

Posted 6 months ago

No longer accepting applications
