Veera Venkata Sai Kalyan

Data Scientist, AI Engineer, Web Developer, +2





Available to hire

I am an AI/ML engineer with 4 years of experience across Adobe, IBM, and academic research, delivering end-to-end LLM features from problem framing and data preparation to deployment. I focus on practical, scalable solutions that combine solid data foundations with robust ML production pipelines. I thrive in cross-functional teams and enjoy turning complex constraints into actionable, measurable outcomes.

I’m passionate about building safe, efficient AI systems and continuously improving them with robust evaluation, observability, and clear tooling. My strengths lie in PyTorch, Transformers, RAG, and MLOps, and I enjoy crafting elegant, maintainable solutions that scale across environments (AWS/Azure) and platforms (FastAPI, Docker, vLLM, Redis).

Work Experience

AI Engineer at Adobe

January 1, 2025 - November 6, 2025

Shipped an LLM assistant with RAG over Pinecone/Elasticsearch and Transformers, lifting task completion by 11% in a controlled A/B test. Served it via FastAPI on AWS (ECS/ECR, S3) with Redis for caching and batching. Reduced p95 latency from 1.2s to 760ms by enabling vLLM with KV caching, tuning batch windows, and coalescing requests. Lowered inference cost per request by 18% through QLoRA instruction tuning and 8-bit/4-bit quantization (bitsandbytes/ONNX Runtime) while maintaining quality metrics (ROUGE/BERTScore). Implemented safety guardrails (PII redaction, policy filters) and OpenTelemetry/Prometheus dashboards. Introduced LangChain to orchestrate retrieval and reasoning flows (retriever → re-ranker → tool calls → generator), standardizing prompt templates and chains. Adopted the Model Context Protocol (MCP) to encapsulate connectors and enforce per-tenant scoping and audit trails, reducing integration defects by 20%.

Graduate Data Research Intern at California State University

December 1, 2024 - December 1, 2024

Prototyped domain Q&A with Llama-2 + SFT (LoRA/QLoRA) in PyTorch/Hugging Face, improving ROUGE-L from 0.31 to 0.38 on held-out sets. Built a small-scale RAG demo (FAISS + Elasticsearch hybrid) and optimized chunking and overlap. Engineered data pipelines in Spark/Airflow, stored artifacts in S3/Snowflake, and validated inputs with Great Expectations. Implemented incremental ingestion and transformation from PostgreSQL to S3 to Snowflake with partitioning and schema evolution.

Associate Machine Learning Engineer at IBM

August 1, 2023 - August 1, 2023

Built ingestion and ETL pipelines (Python/SQL/Spark) orchestrated with Airflow, landing raw and curated data on S3 and Snowflake with incremental loads and partitioning. Modeled the data warehouse (star schemas) and optimized load patterns, reducing per-query cost by 15% and BI query latency by 20%. Trained and tuned models for risk scoring and forecasting (XGBoost, Random Forest, Isolation Forest), improving AUC to 0.83 and MAPE to 12% on production datasets. Deployed services via FastAPI/Docker on AWS ECS/ECR with Redis caching, improving p95 latency from 850ms to 620ms. Monitored drift and performance with Evidently and pipeline telemetry, enabling retraining with small-bucket rollouts.