Samyukth Challa

Available to hire

I’m Samyukth Challa, an AI/ML Engineer based in the United States with a track record of building enterprise GenAI platforms, including LLM orchestration, RAG pipelines, and scalable inference systems for regulated environments. I thrive at the intersection of data, ML, and platform engineering, delivering secure, low-latency AI solutions for enterprise customers.

I have hands-on experience with cloud GPU infrastructure, Kubernetes, MLOps, and embedding-rich pipelines, including real-time AI integration across finance and IT domains. I enjoy translating complex requirements into reliable, scalable AI solutions and collaborating across teams to accelerate business impact.

Experience Level

Expert

Work Experience

AI/ML Engineer at Scale AI
July 1, 2024 - Present
Designed and maintained LLM orchestration layers enabling dynamic routing across OpenAI, Claude, Llama, and fine-tuned proprietary models, enforcing latency, cost, and compliance constraints for enterprise workloads. Built and optimized retrieval-augmented generation (RAG) pipelines using FAISS, Pinecone, and pgvector; improved enterprise query response latency by 35 percent via embedding caching, request batching, and optimized vector search. Developed scalable data ingestion and preprocessing pipelines for PDFs, tables, audio, and logs; integrated real-time inference services via REST and gRPC with autoscaling for multi-tenant AI applications. Implemented cost-aware routing, selective retrieval, and fallback strategies that reduced LLM operating costs by 25 percent; supported tool calls and stateful workflows with human-in-the-loop controls. Collaborated with cloud, SRE, and security teams to deploy GPU-backed inference infrastructure on AWS and GCP; built evaluation and canary deplo
Data Scientist at NVIDIA
June 1, 2019 - December 1, 2022
Supported enterprise-scale data preparation and ETL pipelines using Python, SQL, Pandas, and Spark, processing multi-terabyte structured and streaming datasets for internal AI services and analytics platforms. Improved data quality and pipeline reliability by implementing validation checks, anomaly detection rules, and schema monitoring across batch and near-real-time data feeds. Assisted in integrating ML model outputs into production microservices by exposing models via REST/gRPC APIs for analytics, recommendation, and anomaly detection. Contributed to feature engineering pipelines for real-time and batch scoring, enabling 20–25 percent faster model iteration cycles. Supported GPU-enabled cloud workflows on AWS EC2 GPU instances by automating dataset preparation and batch inference jobs, reducing idle GPU costs by 15 percent. Worked with DevOps and SRE teams to align data pipelines with Docker- and Kubernetes-based environments, ensuring compatibility with internal AI platform standards. Built s

Education

Master’s Degree in Computer Science at University of Texas at Arlington
January 11, 2030 - April 9, 2026
Bachelor’s Degree in Information Technology at Sreenidhi Institute of Science and Technology
January 11, 2030 - April 9, 2026

Industry Experience

Financial Services, Software & Internet, Professional Services