Keerthi Reddy Bokka

Available to hire

I am a Generative AI Engineer with over 9 years of experience designing, fine-tuning, and deploying large language models and multimodal AI systems across finance, retail, and healthcare. I specialize in building scalable AI pipelines, vector search systems, and hybrid AI architectures for regulated industries, applying modern modeling techniques and cloud platforms to balance performance with compliance.

My expertise spans end-to-end GenAI application development, including content moderation, benchmark testing of frontier LLMs, and reinforcement learning for human-aligned outputs. I focus on building AI solutions with explainability, modular design, and seamless integration into enterprise environments, and I contribute to open-source projects in the AI engineering ecosystem.


Experience Level

Expert

Work Experience

Gen AI Engineer at Target
April 1, 2024 - Present
- Led distributed training of large language models with DeepSpeed and Hugging Face Accelerate, reducing training time by 40%.
- Improved multilingual LLM translation accuracy by 25% via adapter-based fine-tuning across 10+ languages.
- Developed educational content generation systems tailored to curricula and personalized learning.
- Built full-stack GenAI applications with React and Streamlit frontends and FastAPI backends.
- Created document Q&A platforms that reduced manual searches by 60%.
- Conducted A/B testing across model families and contributed to open-source LangChain modules.
- Collaborated across teams to integrate GenAI into enterprise SaaS platforms using CI/CD pipelines, Docker, and Kubernetes.
- Developed inference services on Triton Inference Server and AI agents with function calling and persistent memory.
- Curated datasets with RLHF-style preference modeling, automated business intelligence reporting, and linked GenAI outputs to knowledge graphs for explainability.
AI/ML Engineer at McKesson
March 31, 2024 - August 26, 2025
- Developed and deployed supervised and unsupervised ML models with Scikit-learn, TensorFlow, and PyTorch for clinical analytics on Azure ML and GCP Vertex AI.
- Fine-tuned CNNs, RNNs, and transformers for image classification and sequence tasks, reducing error rates by 18%.
- Built NLP pipelines for sentiment analysis, named entity recognition, and summarization, cutting manual review by 40%.
- Designed anomaly detection systems that increased fraud recall by 22%.
- Delivered scalable REST APIs serving ML models, with real-time prediction services processing over 10 million records daily.
- Automated model retraining and feature store updates to maintain high uptime.
- Applied hyperparameter tuning and transfer learning to improve model performance.
- Developed generative models for medical imaging augmentation and attention-based forecasting models that outperformed baselines by 20%.
- Deployed AI models to edge devices with monitoring dashboards.
- Orchestrated ETL pipelines integrating APIs, databases, and streaming sources.
AI/ML Engineer at UBS
August 31, 2022 - August 26, 2025
- Automated ETL pipelines using Apache Airflow, AWS Glue, Lambda, and S3 to ingest high-volume financial data.
- Built financial data marts in GCP BigQuery and deployed ML models for fraud detection and credit risk scoring with AWS SageMaker and GCP Vertex AI.
- Conducted real-time loan default risk analysis with Spark on AWS clusters, improving decision latency.
- Developed event-driven fraud detection workflows with AWS Lambda and GCP Cloud Functions, increasing processing speeds.
- Created interactive dashboards in Tableau and Power BI connected to AWS Redshift and GCP BigQuery.
- Applied cost optimization strategies to data storage and query processing.
- Conducted A/B testing and uplift modeling on cloud datasets.
- Developed predictive models for customer segmentation and churn, deployed to cloud inference endpoints.
- Automated fraud alerts with Lambda triggers and GCP Pub/Sub.
- Built monitoring dashboards tracking model drift and performance.
- Ensured pipeline security and compliance.
Data Scientist at Marathon Petroleum
February 29, 2020 - August 26, 2025
- Designed predictive analytics models with AWS SageMaker for equipment failure forecasting and refinery throughput optimization.
- Automated ETL workflows with AWS Glue, Lambda, and S3 for near real-time ingestion of refinery sensor data.
- Built data warehouses on Amazon Redshift and used GCP Dataflow and BigQuery for high-throughput analytics on refinery SCADA and IoT data.
- Developed time-series forecasting models using Prophet, LSTM, and ARIMA on GCP Vertex AI.
- Integrated Kafka, AWS Lambda, and GCP Pub/Sub for event-driven anomaly detection.
- Created computer vision workflows to automate pipeline and equipment inspection.
- Applied explainable AI techniques (SHAP, LIME) for transparency and compliance in operations.
- Deployed serverless APIs exposing predictive maintenance insights to operator dashboards.
- Optimized BigQuery with partitioning and clustering to reduce query latency.
- Deployed containerized ML models on AWS ECS and GCP AI Platform for scalable inference across refinery sites.
Python Developer at Apollo Tyres
July 31, 2018 - August 26, 2025
- Designed and automated ETL workflows on AWS Lambda and S3, reducing data processing time by 40%.
- Developed Python-based data pipelines integrated with AWS services to streamline data ingestion and transformation.
- Built and deployed machine learning models with Scikit-learn on data stored in S3.
- Integrated CI/CD pipelines on AWS for automated testing and deployment, decreasing release errors by 30%.
- Created event-driven workflows with AWS Lambda and cron jobs for automated data refresh and alerting.
- Managed version control and containerized Python applications for scalable execution.
- Optimized AWS S3 storage and lifecycle policies, cutting storage costs by 20% while maintaining compliance.
- Deployed serverless APIs on AWS Lambda for low-latency data access integrated with internal systems.


Industry Experience

Financial Services, Retail, Healthcare, Manufacturing