Saksham Taneja

Available to hire

I am a Senior Data Scientist with strong expertise in Python, NLP, and large language models, focused on building robust, scalable AI-driven data products. I have designed, built, and deployed AI-powered platforms using LangChain, retrieval-augmented generation (RAG) architectures, and Google Cloud Platform, with an emphasis on clean, testable code and reliable system design.

I am highly curious about emerging AI technologies and tools, and I actively experiment with new approaches to improve model behaviour, retrieval quality, and user-facing performance. I am a proven technical leader and mentor who communicates complex ideas clearly and delivers innovative data science solutions for high-impact, real-world applications.

Experience Level

Expert

Language

English
Fluent

Work Experience

Senior Data Scientist at Office for Statistics Regulation
June 1, 2024 - Present
Led high-impact AI and data science projects using Python, NLP, and large language models (LLMs), focusing on RAG pipelines, model optimisation, and scalable deployment. Designed, built, and deployed HorizonScan-AI, an end-to-end AI platform using LangChain, the Gemini API, FAISS, Docker, and Google Cloud, enabling secure, multi-user access. Designed and deployed scalable AI applications on Google Cloud Platform (GCP), including Cloud Run, App Engine, Virtual Machines, and Cloud Storage, within Linux-based production environments. Improved model accuracy and relevance through advanced prompt engineering, retrieval strategies, and evaluation workflows, drawing on an in-depth understanding of foundation models and large language model architectures. Managed and mentored team members, providing technical guidance, code reviews, and support across AI development workflows. Applied Python-based AI frameworks (TensorFlow, PyTorch, Keras) for model
Data Scientist at Office for National Statistics
June 1, 2022 - June 1, 2024
Core contributor to the transformation of the Consumer Prices Index, delivering scalable data pipelines supporting nationally critical economic statistics. Developed reproducible, production-ready data pipelines processing billions of rows of retailer data, converting unstructured data into structured analytical outputs. Applied Python, R, SQL, Google Cloud Platform, Airflow, Git, and Terraform to build these pipelines. Worked closely with Data Analysts, Data Engineers, and Product Owners to deliver robust analytics systems tailored to business and stakeholder needs and embedded within operational workflows. Implemented cloud-based DevOps practices, including deployment, monitoring, and system integration for production-ready tools. Led and mentored team members on effective use of engineering and data science tools. Appointed Retailer Pipeline Lead in recognition of strong technical delivery and rapid skill progression. Applied Agile project management to plan, track, and d
Data Science Graduate Programme at Office for National Statistics
October 1, 2022 - October 1, 2024
Selected for a highly competitive civil service graduate programme specialising in data science and machine learning. Gained hands-on experience across effective programming, statistics, ML, NLP, and data visualisation, with strong emphasis on reproducibility and code quality. Led a team in a hackathon to deliver a machine learning solution using scikit-learn and TensorFlow, receiving strong positive feedback for both technical execution and clear communication. Mentored junior cohort members, supporting their technical and professional development.

Education

MSc Data Science & Artificial Intelligence at University of Liverpool
MSc Neuroscience at University of Sussex

Industry Experience

Media & Entertainment, Government, Professional Services