Rahul Teja B

I am a data scientist with nearly 5 years of experience building and deploying ML/GenAI systems in production. I have led end-to-end development across data pipelines, model training, CI/CD, and monitoring, delivering impact on cost, latency, and business outcomes. I thrive on owning delivery, collaborating across teams, and maintaining models after deployment. In my roles, I have migrated infrastructure, built AI-powered search, fine-tuned LLMs, and automated data workflows to drive measurable results.

Available to hire

Language

English
Fluent

Work Experience

Data Scientist at Pitney Bowes
July 1, 2023 - Present
Led migration of GenAI infrastructure from Azure ML to Microsoft Fabric, enhancing scalability and governance; cut cloud costs by 30% and reduced pipeline runtimes by 35% through CI/CD optimization.
Deployed RAG-based generative search with FAISS and Pinecone, cutting retrieval latency by 40% and improving multi-hop query accuracy by 25%.
Fine-tuned LLMs (LLaMA, Falcon, Mistral) with LoRA and Hugging Face; established prompt safety protocols reducing unsafe outputs by 90%, enabling secure enterprise rollout.
Automated data workflows with LangChain/LangGraph, integrated multimodal inputs, and reduced manual reporting by 25%.
Built churn prediction and marketing attribution models (150+ features) deployed via a Streamlit app, delivering 15%+ uplift in campaigns and 30-40% efficiency gains.
Predicted loan defaults using XGBoost (AUC 0.76) on 12M+ Snowflake records; collaborated to build scalable ELT pipelines and BI dashboards for KPI monitoring.
Graduate Machine Learning Engineer at Michigan Technological University
August 1, 2022 - April 1, 2023
Launched a campus-facing Q&A assistant (RAG) that answers student/faculty questions from approved sources (syllabi, policies, lecture notes, lab manuals), with source citations in every reply.
Built a lightweight document processing pipeline in Python to parse PDFs/HTML, normalize text, and emit chunked, metadata-tagged content for indexing.
Implemented embeddings + FAISS for indexing and retrieval; tuned chunk size/overlap and metadata filters to improve answer quality while keeping lookup latency low.
Published a FastAPI service with OpenAPI docs used by a web UI/chat widget; added request/response logging and trace IDs for basic observability.
Set up an evaluation harness with an instructor-curated golden set and periodic spot checks; tracked accuracy and p95 latency.
Added safety and grounding controls (allowlisted sources, basic PII filters, provenance tags) to ensure responses remain compliant with department guidelines.
Machine Learning Engineer at Digital Nirvana (Client: Bloomberg USA)
January 1, 2021 - July 1, 2022
Delivered broadcast signal and metadata analysis for 150+ live channels on MonitorIQ, ensuring 100% FCC and EBU compliance and supporting regulatory and QA requirements.
Designed and automated 40+ Power BI dashboards deployed via EC2/Redshift, tracking audio/video loss, caption errors, and SCTE anomalies, reducing manual QA effort by 30% and accelerating issue detection.
Processed 5+ TB of metadata (subtitles, ads, logs) using SQL/Python workflows with AWS Glue, Athena, and S3, enabling real-time insights for internal and external stakeholders.
Optimized data pipelines across Lambda and Glue, improving ad detection accuracy by 15% and boosting marketing analytics efficiency.
Collaborated with AI/ML teams to annotate 20K+ media segments, accelerating training for facial recognition, logo detection, and transcript verification models.

Education

Master’s in Data Science at Michigan Technological University
January 11, 2030 - January 7, 2026

Industry Experience

Software & Internet, Media & Entertainment, Education, Financial Services, Professional Services