I'm W.P. Roshan, a Data Science Engineer with 11+ years in the AI/ML industry. I specialize in building and deploying ML and DL models, including LLMs and RAG systems, and I love turning complex data into actionable insights. I enjoy solving real-world problems across Healthcare, Digital Marketing, Telco, Insurance, and EdTech with scalable AI/ML solutions. I'm proficient in Python and MLOps, including Docker and Kubernetes, and I focus on optimizing ML pipelines and governance. I'm excited to bring my experience to AI-driven innovation.

W.P. Roshan

Available to hire

Experience Level

Expert
Expert
Expert
Expert
Expert
Intermediate
Intermediate

Language

English
Fluent
Sinhala, Sinhalese
Advanced

Work Experience

Lead Data Science Engineer at Andela
January 1, 2024 - November 27, 2025
Led the design and deployment of scalable ML systems on serverless platforms, including anomaly/outlier detection, churn prediction, and automated valuation models. Built multi-modal, agentic RAG pipelines and CDP integrations, and implemented ML observability, data lineage, and model monitoring using ZenML, MLflow, BentoML, and CometML. Replaced legacy pipelines with a modern FTI-based (feature/training/inference) architecture, boosting usability and performance by ~40%. Implemented data versioning and governance with DVC, Great Expectations, and Apache Atlas to improve data quality. Developed LLM-backed AI apps with LangChain and LlamaIndex.
Senior Data Science Engineer at Gapstars (Client: Spotr.ai, Netherlands)
December 1, 2023 - December 1, 2023
Built a remote property inspection AI system processing 500+ properties monthly. Developed CV-based risk assessment and underwriting models, and automated damage assessment for claims using Mask R-CNN and Faster R-CNN. Implemented roof material classification using CNNs on LiDAR data and deployed remote sensing workflows. Re-engineered Airflow pipelines with mage.ai, increasing performance by ~60%, and improved data governance by ~75% with Great Expectations and Apache Atlas. Introduced LLM- and GPT-enhanced agentic RAG pipelines for the real estate and finance sectors.
Senior Data Science Engineer at Adventus Education
May 1, 2023 - May 1, 2023
Architected big data pipelines and ETL processes that improved EdTech platform performance by 40%. Built ML models that identify at-risk students with ~92% accuracy, contributing to a 23% reduction in at-risk student levels. Improved churn prediction to ~93% accuracy in support of retention strategies.
Lead Data & BI Engineer at DataMTX Labs
March 1, 2022 - March 1, 2022
Led a high-impact data engineering team, delivering projects 90% faster and improving data processing efficiency by 60%. Implemented a data warehouse (Redshift) and a data lake (S3), driving a 70% efficiency gain. Built time-series forecasting models (ARIMA, Auto-ARIMA, SARIMA, SARIMAX) with 89–94% accuracy, improving forecast precision by 40%.
Senior Data Science Engineer at DigitalXLabs
October 1, 2021 - October 1, 2021
Architected the AdStudio platform for AdTech using Spark, Kafka, Druid, and Cassandra, increasing efficiency and scalability by ~30%. Developed price optimization and recommendation models on CDP, DMP, and CRM data with 97% accuracy, boosting conversions by 7%.
Big Data Team Lead at Cloud Solutions International
July 1, 2019 - July 1, 2019
Automated data workflows with Apache Airflow, improving data accuracy by 30% and reducing manual errors. Enhanced IoT metrics visibility by 50% via the ELK stack, enabling faster decision-making.
Senior Engineer (AI/ML) at IQVIA (formerly IMS Health)
January 1, 2019 - January 1, 2019
Built real-time drug adverse event tracking using Spark, Hadoop, Kafka, NiFi, and Nutch, and created KPI dashboards that improved response times by ~40%. Developed OCR/HTR services with PyTorch (CTC), achieving ~85% transcription accuracy and 20% faster processing. Implemented NLP/NLU systems with CoreNLP, SparkNLP, spaCy, and NLTK, achieving strong F1 scores and entity recognition performance.

Education

MSc in Big Data Analytics (Reading) at Robert Gordon University
Bachelor of Technology in Software Engineering (Level 5) at Open University of Sri Lanka
Certified Scrum Master at Scrum Alliance
Diploma in Software Engineering at Technical Engineering College, Sri Lanka

Qualifications

Certified Scrum Master

Industry Experience

Healthcare, Education, Media & Entertainment, Financial Services, Software & Internet, Professional Services, Real Estate & Construction, Other
