Hi, I'm Akhil Reddy Ramana, an AI engineer with 5+ years of experience building scalable AI/ML and Generative AI solutions across cloud-native and enterprise environments. I specialize in LLMs, RAG architectures, vector databases, multi-agent systems, AI microservices, and MLOps pipelines. I enjoy turning complex business needs into robust production-grade AI systems using Python, LangChain, LangGraph, AutoGen, and platforms like AWS, GCP, and Azure ML. I focus on reducing latency, optimizing models, and automating end-to-end AI workflows to drive impact. I've collaborated with engineering, research, product, and business teams to ship AI features that improve operational efficiency, customer experience, and decision intelligence. I have experience building data architectures, ETL pipelines, and real-time streaming with Spark, Airflow, Kafka, and SQL, and I am committed to governance, observability, and responsible AI across cloud providers.

Akhil Reddy Ramana

Available to hire

Experience Level

Expert

Language

English
Fluent

Work Experience

Gen AI Developer at Carilion Clinic
July 1, 2024 - Present
Architected LLM-based clinical and operational AI systems using LangChain, LangGraph, and AutoGen to support intelligent multi-agent collaboration for tasks such as prior authorization support, care-plan generation, chart summarization, and automated patient communication. Built scalable RAG pipelines integrating GPT-4, T5, FAISS, and Pinecone to retrieve patient history, clinical guidelines, and medical knowledge bases, enabling context-grounded reasoning with high factual accuracy for healthcare workflows. Developed AI microservices using FastAPI, async inference, Docker, and Kubernetes to serve low-latency clinical decision-support recommendations and automated documentation tools across high-traffic healthcare environments. Designed AI workflows on AWS Bedrock, Vertex AI, and Lambda to orchestrate medical data routing, HIPAA-compliant prompt flows, and multi-model pipelines for claims processing, triage, and physician-support tools. Applied advanced prompt engineering, instruction t
AI/ML Engineer at Virtusa – Client: Texas Department of Transportation (TxDOT)
March 1, 2023 - May 1, 2024
Built multi-agent LLM orchestration systems with LangChain, LangGraph, and AutoGen to handle complex tasks such as identifying relevant sections of the Texas Transportation Code, validating engineering requirements, recommending corrective actions, and retrieving cross-agency transportation data (TxDPS, TCEQ, FHWA). Designed and deployed scalable RAG ecosystems using FAISS and Pinecone capable of indexing millions of transportation assets, including project histories, maintenance manuals, bid proposals, safety bulletins, and construction standards—ensuring LLM responses are consistently grounded in verified TxDOT data sources. Developed highly responsive FastAPI-based inference services that power internal TxDOT portals used for permitting, right-of-way processing, fleet maintenance insights, project risk evaluations, and automated safety assessments. Conducted deep learning experimentation using PyTorch and TensorFlow to build classification and prediction models for roadway inciden
AI/Systems Engineer at Tata Consultancy Services
March 1, 2021 - August 1, 2022
Designed and implemented AI components for large banking platforms, integrating ML models with secure enterprise data pipelines, real-time APIs, and risk analytics systems used across retail and commercial banking operations. Built advanced text-processing and retrieval workflows that combined early LLMs with ML algorithms to analyze loan documents, KYC records, customer communications, regulatory guidelines, and financial statements, significantly improving context relevance, fraud detection accuracy, and compliance review efficiency. Developed scalable backend microservices using FastAPI and Django REST, enabling low-latency ML inference for credit decisioning, transaction risk scoring, AML alert triaging, and automated customer-support insights. Built automated ETL pipelines using AWS Glue, S3, and Redshift, supporting high-volume banking data ingestion for credit risk models, AML rule engines, and financial forecasting workflows. Integrated Snowflake with cloud data layers to suppor
Machine Learning Engineer at Tech Mahindra Ltd.
December 1, 2020 - March 1, 2021
Engineered a variety of predictive and analytical ML models using PyTorch and TensorFlow, covering classification, anomaly detection, time-series forecasting, and custom neural architectures tailored to operational and analytical use cases. Designed fully automated data ingestion and transformation pipelines leveraging Airflow, Spark, and SQL, enabling consistent dataset preparation, large-scale preprocessing, and scheduled model training cycles. Developed distributed ETL pipelines using Spark to process millions of records, optimize feature computation, and support downstream analytics with significantly reduced processing time. Refactored and containerized ML workloads using Docker, packaging models and dependencies into reproducible environments and orchestrating scalable deployments on Kubernetes. Implemented event-driven, real-time inference systems using Kafka and Spark Streaming, allowing models to react instantly to incoming data streams and support adaptive, continuously updat

Education

Master of Science at University of Memphis
August 1, 2022 - May 1, 2024

Qualifications

Google Professional Machine Learning Engineer
NVIDIA Deep Learning Institute (DLI) Certifications
DeepLearning.AI Machine Learning Specialization

Industry Experience

Computers & Electronics, Healthcare, Professional Services