Monica Challa

Available to hire

Hi, I’m Monica Challa, a machine learning engineer focused on production-grade LLMs, retrieval-augmented generation, and secure, scalable MLOps on AWS. Over the past 5+ years, I’ve built end-to-end AI systems across financial services, enterprise software, and AdTech, driving measurable gains in model accuracy, latency, and adoption. I thrive in cross-functional teams and enjoy turning complex ML concepts into reliable, production-ready solutions with human-in-the-loop safety.

In my current role at Scale AI, I lead end-to-end ML deployments, optimize real-time inference, and implement secure personalization and payments integrations. Previously at Accenture, I delivered real-time ML in SynOps, boosted CTR and fraud detection, and built large-scale data pipelines. I’m passionate about robust ML engineering, secure AI systems, and scalable cloud-native architectures.

Experience Level

Expert

Language

English
Advanced

Work Experience

Machine Learning Engineer at Scale AI
January 1, 2025 - Present
Led end-to-end ML deployment for enterprise-grade LLMs, achieving a 32% increase in accuracy via RLHF fine-tuning and domain-specific retrieval pipelines. Reduced real-time inference latency by 45% through containerized deployments, GPU autoscaling, and prompt-chain compression on Kubernetes (EKS) and EC2. Increased AI Copilot adoption by 60% through secure, vector-based personalization with user embeddings. Built scalable RAG systems using FAISS and OpenSearch, integrated with Hugging Face Transformers and PyTorch for multimodal LLMs. Implemented real-time, fault-tolerant inference services with Triton Inference Server, TorchServe, and ONNX Runtime. Established CI/CD pipelines for ML deployment and blue/green rollouts. Integrated PCI-aligned Stripe and Plaid payments and AI-driven recommendations. Set up monitoring with CloudWatch, Prometheus, and Grafana. Collaborated on automated model validation with human-in-the-loop safety thresholds.
Software Engineer at Accenture
September 1, 2019 - November 1, 2023
Delivered a 41% reduction in operational processing time by deploying real-time ML models into SynOps workflows for finance and procurement. Improved ad personalization CTR by 28% with a Transformer-based recommender served via SageMaker endpoints and Redis. Increased fraud detection accuracy by 35% using gradient-boosted models and anomaly detection. Built NLP and deep learning models with PyTorch, Transformers, and TensorFlow for content routing, document parsing, and ad relevance scoring. Engineered high-volume data pipelines with Spark, Airflow, and AWS Glue handling 50M+ daily events. Implemented real-time, event-driven inference with Lambda, Kinesis, and Kafka. Managed the MLOps lifecycle with MLflow, SageMaker Pipelines, and Kubernetes. Architected multimodal systems combining Textract, SageMaker, and S3 for OCR and invoice processing. Optimized distributed training on SageMaker and EC2 GPUs. Integrated Stripe, Razorpay, and ad platforms to enable secure monetization.

Education

Master’s in Information Technology Management at Webster University
January 11, 2030 - February 5, 2026

Industry Experience

Financial Services, Software & Internet, Professional Services, Media & Entertainment, Government
