Sudhanshu Akarshe

Available to hire

Experience Level

Expert
Expert
Expert
Expert
Expert
Expert
Expert
Expert
Expert
Expert
Expert
Expert
Intermediate

Language

English
Fluent

Work Experience

Data Engineer at Capgemini Technology Services
April 1, 2023 - September 8, 2025
Orchestrated ETL and ELT projects in Azure, achieving a 40% reduction in cloud infrastructure costs and 25% faster data processing. Engineered complex SQL queries and views for data extracted from Oracle, Snowflake, Sybase, and S3, improving query performance by 45% and report accuracy by 25%. Developed PySpark/Spark SQL scripts in Databricks to transform staging data from Delta Lake on ADLS, cutting data integration time by 30%. Built a KQL-driven log analytics dashboard for real-time monitoring of Azure pipelines, identifying 95% of data quality issues within 10 minutes. Established semantic data models in Tableau, reducing report load times by 35%, eliminating manual client reporting, and increasing delivery efficiency by 40%. Managed real-time ingestion from Apache Kafka into Delta Lake using Structured Streaming in Databricks, processing 100k+ events daily with sub-10s latency and 99.9% availability. Streamlined CI/CD with Azure DevOps, automating 85% of ELT deployments via YAML configurations and environment triggers. Enforced strict version control through fe
Junior Data Engineer at Virag Enterprises
November 1, 2020 - September 8, 2025
Developed and deployed SSIS-based ETL pipelines to ingest and transform sales and operational data, reducing manual workflows by 30% and improving report delivery turnaround. Automated ELT pipelines into Redshift using AWS Glue and S3, enabling near real-time analytics and cutting manual ingestion time by 60%. Optimized complex SQL queries in SSMS to validate data integrity across staging and production environments, reducing execution time by 25% and improving reporting reliability. Implemented PowerShell scripts for log cleanup, permission updates, and monitoring tasks, reducing manual admin workload by 15 hours per week and improving operational efficiency by 35%. Designed standardized Excel-SQL reporting templates, reducing manual reporting dependencies and improving insight accessibility by 15% through self-service dashboards. Applied GDPR compliance checks in the reporting framework, masking or anonymizing sensitive data and preventing 15+ potential regulatory violations annually.
ML Engineer at Capgemini Technology Services
April 1, 2023 - September 8, 2025
Led NLP pipeline development and deployment for regulatory medical datasets. Engineered document retrieval and Q&A systems using Elasticsearch and FAISS to give researchers quick access to medical records, reducing manual search time by 6–8 hours weekly. Built retrieval-augmented search pipelines combining TF-IDF, BM25, and FAISS embeddings, increasing accuracy on unstructured pharma datasets by 25%. Fine-tuned BERT, BioBERT, and RoBERTa for regulatory document analysis, cutting classification and entity extraction errors by 15%. Developed Python-based API integrations for spaCy with custom NLP workflows, reducing model inference latency by 30% and supporting high-volume queries with near-zero downtime. Configured Azure ML pipelines for training and deploying NLP models, achieving average deployment times under 20 minutes. Constructed CI/CD pipelines in GitLab and Jenkins for ML model versioning, testing, and deployment to Kubernetes clusters, reducing release cycles by 45% and imp
Cloud & Automation Engineer at Virag Enterprises
November 1, 2020 - September 8, 2025
Developed ML forecasting models (scikit-learn, pandas) to predict weekly production demand with 85% accuracy, improving inventory planning efficiency by 20%. Engineered supervised ML models in Python (scikit-learn) to predict sales trends, improving forecast accuracy by 18% across 3 business units. Built NLP-based email classification tool to auto-categorize 200+ client communications weekly, reducing response times by 30% and improving client satisfaction scores by 15%. Designed and deployed interactive dashboards for real-time production and sales monitoring, cutting decision-making delays from days to hours and improving operational agility by 40%.

Education

MSc. in Business Analytics at Dublin Business School
April 1, 2023 - April 1, 2024
B.E in Computer at Pune University
August 1, 2016 - April 1, 2020

Qualifications

Microsoft Azure Fabric Data Engineer Associate
January 11, 2030 - September 8, 2025
Microsoft Azure Data Engineer Associate
January 11, 2030 - September 8, 2025
Microsoft Azure AI Fundamentals
January 11, 2030 - September 8, 2025
Microsoft Power BI Associate
January 11, 2030 - September 8, 2025
Python (HackerRank)
January 11, 2030 - September 8, 2025
Microsoft Azure Data Fundamentals
January 11, 2030 - September 8, 2025

Industry Experience

Software & Internet, Professional Services, Computers & Electronics, Media & Entertainment, Other, Healthcare, Life Sciences