Veera Venkata Sai Kalyan

Available to hire

I am an AI/ML engineer with 4 years of experience across Adobe, IBM, and academic research, delivering end-to-end LLM features from problem framing and data preparation to deployment. I focus on practical, scalable solutions that combine solid data foundations with robust ML production pipelines. I thrive in cross-functional teams and enjoy turning complex constraints into actionable, measurable outcomes.

I’m passionate about building safe, efficient AI systems and continuously improving them with robust evaluation, observability, and clear tooling. My strengths lie in PyTorch, Transformers, RAG, and MLOps, and I enjoy crafting elegant, maintainable solutions that scale across environments (AWS/Azure) and platforms (FastAPI, Docker, vLLM, Redis).

Experience Level

Expert (11 skills), Intermediate (1 skill)

Work Experience

AI Engineer at Adobe
January 1, 2025 - November 6, 2025
Shipped an LLM assistant with RAG over Pinecone/Elasticsearch and Transformers, lifting task completion by 11% in a controlled A/B test. Served it via FastAPI on AWS (ECS/ECR, S3) with Redis for caching and batching. Reduced p95 latency from 1.2 s to 760 ms by serving with vLLM (KV caching), tuning batch windows, and coalescing requests. Lowered inference cost per request by 18% through QLoRA instruction tuning and 8-bit/4-bit quantization (bitsandbytes/ONNX Runtime) while maintaining quality metrics (ROUGE/BERTScore). Implemented safety guardrails (PII redaction, policy filters) and OpenTelemetry/Prometheus dashboards. Introduced LangChain to orchestrate retrieval and reasoning flows (retriever → re-ranker → tool calls → generator), standardizing prompt templates and chains. Adopted the Model Context Protocol (MCP) to encapsulate connectors and enforce per-tenant scoping and audit trails, reducing integration defects by 20%.
Graduate Data Research Intern at California State University
December 1, 2024 - December 1, 2024
Prototyped domain Q&A with Llama-2 and supervised fine-tuning (LoRA/QLoRA) in PyTorch/Hugging Face, improving ROUGE-L from 0.31 to 0.38 on held-out sets. Built a small-scale RAG demo (FAISS + Elasticsearch hybrid) and optimized chunk size and overlap. Engineered data pipelines in Spark/Airflow, stored artifacts in S3/Snowflake, and validated inputs with Great Expectations. Implemented incremental ingestion and transformation from PostgreSQL to S3 to Snowflake with partitioning and schema evolution.
Associate Machine Learning Engineer at IBM
August 1, 2023 - August 1, 2023
Built ingestion and ETL pipelines (Python/SQL/Spark) orchestrated with Airflow, landing raw and curated data on S3 and Snowflake with incremental loads and partitioning. Modeled the data warehouse (star schemas) and optimized load patterns, reducing per-query cost by 15% and BI query latency by 20%. Trained and tuned models for risk scoring and forecasting (XGBoost, Random Forest, Isolation Forest), improving AUC to 0.83 and reducing MAPE to 12% on production datasets. Deployed services via FastAPI/Docker on AWS ECS/ECR with Redis caching, improving p95 latency from 850 ms to 620 ms. Monitored drift and performance with Evidently and pipeline telemetry, enabling retraining with small-bucket rollouts.

Education

Master of Science in Data Science at California State University
August 1, 2023 - May 1, 2025
Bachelor of Engineering in Mechanical Engineering at CVR College of Engineering
August 1, 2018 - May 1, 2022

Qualifications

Microsoft Certified: Azure Data Scientist Associate
January 11, 2030 - November 6, 2025
Databricks Generative AI Fundamentals
January 11, 2030 - November 6, 2025
AWS Certified AI Practitioner
January 11, 2030 - November 6, 2025

Industry Experience

Software & Internet, Media & Entertainment, Professional Services, Education