

Samyukth Challa

AI Engineer, Data Scientist, Full Stack Developer






Available to hire

I’m Samyukth Challa, an AI/ML Engineer based in the United States with a track record of building enterprise GenAI platforms, including LLM orchestration, RAG pipelines, and scalable inference systems for regulated environments. I thrive at the intersection of data, ML, and platform engineering, delivering secure, low-latency AI solutions for enterprise customers.

I have hands-on experience with cloud GPU infrastructure, Kubernetes, MLOps, and embedding-rich pipelines, including real-time AI integration across finance and IT domains. I enjoy translating complex requirements into reliable, scalable AI solutions and collaborating across teams to accelerate business impact.

Skills

Experience Level

Expert

Expert

Expert

Expert

Expert

Expert

Expert

Expert

Work Experience

AI/ML Engineer at Scale AI

July 1, 2024 - Present

Designed and maintained LLM orchestration layers enabling dynamic routing across OpenAI, Claude, Llama, and fine-tuned proprietary models, enforcing latency, cost, and compliance constraints for enterprise workloads. Built and optimized retrieval-augmented generation (RAG) pipelines using FAISS, Pinecone, and pgvector; reduced enterprise query response latency by 35 percent via embedding caching, request batching, and optimized vector search. Developed scalable data ingestion and preprocessing pipelines for PDFs, tables, audio, and logs; integrated real-time inference services via REST and gRPC with autoscaling for multi-tenant AI applications. Implemented cost-aware routing, selective retrieval, and fallback strategies to reduce LLM operating costs by 25 percent; supported tool calls and stateful workflows with human-in-the-loop controls. Collaborated with cloud, SRE, and security teams to deploy GPU-backed inference infrastructure on AWS and GCP; built evaluation and canary deplo

Data Scientist at NVIDIA

June 1, 2019 - December 1, 2022

Supported enterprise-scale data preparation and ETL pipelines using Python, SQL, Pandas, and Spark, processing multi-terabyte structured and streaming datasets for internal AI services and analytics platforms. Improved data quality and pipeline reliability by implementing validation checks, anomaly detection rules, and schema monitoring across batch and near-real-time data feeds. Helped integrate ML model outputs into production microservices by exposing models via REST/gRPC APIs for analytics, recommendation, and anomaly detection. Contributed to feature engineering pipelines for real-time and batch scoring, enabling 20–25 percent faster model iteration cycles. Supported GPU-enabled cloud workflows on AWS EC2 GPU instances by automating dataset preparation and batch inference jobs, reducing idle GPU costs by 15 percent. Worked with DevOps and SRE to align data pipelines with Docker- and Kubernetes-based environments, ensuring compatibility with internal AI platform standards. Built s