Arun Prakash Kata

Available to hire

I am a data scientist with 8+ years of experience delivering scalable ML and GenAI solutions across healthcare, financial services, and media. I design end-to-end data pipelines, build predictive models, and deploy production-grade AI systems on Azure and AWS to drive measurable business impact. I thrive in cross-functional teams and enjoy turning complex data into actionable insights, governance-ready deployments, and delightful user experiences.

I specialize in RAG-powered chatbots, vector databases, and large language model workflows, including model fine-tuning, retrieval augmentation, and real-time analytics. I am passionate about mentoring colleagues, embracing responsible AI practices, and continuously learning new technology to solve real-world problems.

Language

English
Fluent

Work Experience

Data Scientist at Gainwell Technologies
September 1, 2024 - November 6, 2025
Designed and implemented a Retrieval-Augmented Generation (RAG) based healthcare chatbot to assist clinicians by retrieving patient journey summaries, answering medical queries, and recommending next-best actions. Built end-to-end ingestion, embedding generation, retrieval (FAISS), and LLM inference via AWS Bedrock; deployed via FastAPI on EKS; implemented HIPAA-compliant data handling and model monitoring.
Sr Data Scientist at Optum
August 1, 2024 - August 1, 2024
Developed risk management, fraud detection, and anomaly detection models for trading and investment operations, achieving up to 30% improvement in risk assessment accuracy. Built RAG pipelines for financial knowledge retrieval using FAISS/Pinecone with Azure OpenAI; deployed secure endpoints via Azure API Management and Functions; modernized cloud data workflows with Azure Data Factory and Synapse; implemented LoRA/QLoRA fine-tuning; mentored junior team members.
Data Scientist at NBCUniversal
February 1, 2022 - February 1, 2022
Built AI-powered content recommendation systems for NBC's streaming platform; developed NLP pipelines for closed-caption transcripts; designed computer vision models for ad insertion and brand safety; created predictive audience segmentation and real-time dashboards in Tableau.
Machine Learning Engineer at British Telecom
July 1, 2020 - July 1, 2020
Built predictive churn models, NLP pipelines for ticket routing, and real-time network anomaly detection; designed cross-selling recommendations; engineered Hadoop/Spark data pipelines; ensured GDPR data anonymization; deployed AI-powered chatbots integrated with customer support; established dashboards for monitoring.
Machine Learning Engineer at DXC Technologies
December 1, 2017 - December 1, 2017
Explored SAP user data to identify trends; built predictive models to forecast access needs and risks; documented business requirements (BRDs) and collaborated with cross-functional teams to productionize models; mentored junior data scientists and analysts.
Data Scientist at DXC Technologies
December 1, 2017 - December 1, 2017
Explored and analyzed SAP user data to identify trends and optimize access controls using clustering and classification, improving security and governance. Developed predictive models to forecast user access needs, coordinated with stakeholders to define requirements and write BRDs, and defined data mappings to integrate ML models into existing systems. Led mirror-to-production testing and User Acceptance Testing (UAT), mentoring junior data scientists and analysts while improving reporting capabilities.

Qualifications

Master’s in Computer Science
January 11, 2030 - January 1, 2021

Industry Experience

Healthcare, Media & Entertainment, Financial Services, Professional Services, Software & Internet, Telecommunications