Aditya Dubey

Available to hire

Hi, I’m Aditya Dubey, an experienced Data Engineer with over a decade in software development and more than five years specializing in cloud-based data pipelines, PySpark, and modern data platforms. I enjoy designing scalable ETL pipelines and automating workflows to help businesses process large volumes of data efficiently and securely.

I have worked extensively with cloud platforms like Azure and AWS, and have experience managing distributed computing environments, containerization with Docker, and orchestration tools like Airflow. I’m always eager to collaborate on challenging projects and continue growing my skills in data engineering and cloud technology.



Language

English
Fluent
Hindi
Fluent

Work Experience

Freelance Python Developer at Self Employed
July 1, 2023 - Present
Design, develop, and maintain scalable and robust backend applications using Python frameworks like Django and Flask. Work with relational and NoSQL databases. Deploy and manage web applications on AWS services such as EC2, S3, RDS, Lambda, API Gateway, and CloudWatch. Set up CI/CD pipelines using AWS CodePipeline, GitHub Actions, or Jenkins. Implement best practices for authentication, authorization, and data protection. Handle load testing, performance tuning, and code optimization. Package applications using Docker and deploy in containerized environments. Integrate third-party APIs and handle asynchronous tasks using Celery and Redis. Maintain documentation and collaborate with cross-functional teams using Agile methodologies.
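The "best practices for authentication … and data protection" mentioned above can be sketched with a standard-library password-hashing helper. This is an illustrative sketch, not the exact implementation used in those projects; the iteration count and storage format are assumptions.

```python
import hashlib
import hmac
import secrets

def hash_password(password: str, *, iterations: int = 600_000) -> str:
    """Derive a PBKDF2-HMAC-SHA256 hash; store salt and iteration count with it."""
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return f"pbkdf2_sha256${iterations}${salt.hex()}${digest.hex()}"

def verify_password(password: str, stored: str) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    _, iterations, salt_hex, digest_hex = stored.split("$")
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode(), bytes.fromhex(salt_hex), int(iterations)
    )
    return hmac.compare_digest(digest.hex(), digest_hex)
```

In a Django or Flask app this logic is normally delegated to the framework's own password hashers; the sketch shows the underlying pattern (random salt, slow KDF, constant-time comparison).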
Freelance Data Engineer | PySpark Developer at Freelance
July 1, 2023 - Present
Designed and implemented scalable ETL pipelines using PySpark on Azure Databricks for processing large volumes of structured and semi-structured data. Built and orchestrated data pipelines integrating with Azure Data Lake Storage, Azure SQL Database, and Azure Synapse Analytics. Managed Databricks clusters, jobs, and permissions for optimized compute and security. Automated pipeline execution using Databricks Jobs API integrated with schedulers like Airflow and Azure Data Factory. Wrote efficient PySpark code for data cleaning, transformation, aggregation, and joining using DataFrame and SQL APIs. Optimized Spark jobs with partitioning, caching, broadcast joins, and other tuning techniques. Developed reusable PySpark notebooks and Python scripts for batch and streaming use cases. Monitored job executions using Airflow's logging and UI.
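The DataFrame-style cleaning and aggregation described above can be sketched framework-free so the logic is visible without a Spark cluster; the PySpark equivalents are noted in comments, and the column names are hypothetical.

```python
from collections import defaultdict

def clean_and_aggregate(rows):
    """Drop rows missing a customer_id, then sum amounts per customer.

    In PySpark this would be roughly:
        df.dropna(subset=["customer_id"]) \
          .groupBy("customer_id") \
          .agg(F.sum("amount"))
    """
    totals = defaultdict(float)
    for row in rows:
        if row.get("customer_id") is None:      # dropna(subset=["customer_id"])
            continue
        totals[row["customer_id"]] += row.get("amount", 0.0)  # groupBy + sum
    return dict(totals)
```

At Spark scale the same shape of transformation benefits from the tuning techniques listed above (partitioning on the grouping key, broadcast joins for small dimension tables).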
Senior Software Engineer at JobsForHer Pvt Ltd
June 30, 2023 - July 28, 2025
Designed and developed scalable ETL pipelines using PySpark for processing structured and unstructured data. Maintained PySpark scripts for data transformation, cleaning, and enrichment to support analytics and reporting. Containerized PySpark and Airflow apps with Docker for consistent development and deployment environments. Developed and maintained DAGs in Apache Airflow to automate end-to-end data pipelines and workflows. Contributed to the JobsForHer Portal and Herkey web app projects.
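Before running a DAG like the ones described above, Airflow resolves task dependencies into a valid execution order. That ordering can be illustrated with the standard library alone (task names here are hypothetical, not from the actual pipelines):

```python
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

# Dependencies as Airflow would express them with
# extract >> [transform, validate] >> load
deps = {
    "transform": {"extract"},
    "validate": {"extract"},
    "load": {"transform", "validate"},
}

order = list(TopologicalSorter(deps).static_order())
```

Airflow additionally handles scheduling, retries, and parallelism on top of this ordering; the sketch only shows the dependency-resolution step.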
Senior Software Engineer at Volansys Technologies Pvt Ltd
January 31, 2023 - July 28, 2025
Designed and built scalable ETL pipelines using PySpark for large volume structured and unstructured data. Maintained PySpark scripts for data transformation and enrichment. Containerized PySpark and Airflow applications with Docker. Developed web application modules for hardware device installation and testing. Managed AWS cloud changes, deployed AWS Lambda functions, and handled EC2 instance deployments. Automated workflows using Python scripts with Testrail tools and AWS SDK. Projects included Onelink and FirstAlertDB.
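A Lambda function of the kind described above is, at its core, a plain Python handler; this is a minimal sketch with a hypothetical event shape (an API Gateway-style request carrying a device identifier), not the actual Onelink/FirstAlertDB code.

```python
import json

def lambda_handler(event, context):
    """Validate the request body and return an API Gateway-style response."""
    body = json.loads(event.get("body") or "{}")
    device_id = body.get("device_id")
    if not device_id:
        return {"statusCode": 400,
                "body": json.dumps({"error": "device_id required"})}
    return {"statusCode": 200,
            "body": json.dumps({"device_id": device_id, "status": "ok"})}
```

Because the handler is a plain function, it can be unit-tested locally with synthetic events before being deployed behind API Gateway.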
Senior Software Engineer at DB Xento System Pvt Ltd
June 30, 2020 - July 28, 2025
Developed Python scripts on Raspberry Pi to control and manage Z-Wave and Wi-Fi devices and logged their activity. Implemented MQTT communication and AWS IoT Core services, including Alexa skills integration, Lambda functions, and IAM account management for storing device states. Used Google Assistant and GCP services (IoT, Pub/Sub, OAuth 2.0, Smart Device Management APIs). Wrote PostgreSQL stored procedures and functions. Managed RabbitMQ and New Relic APM analytics. Mentored junior team members, ensured adherence to the Agile SDLC, maintained project documentation, and developed APIs to control IoT and third-party devices. Projects included Entratamation.
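Device-state reporting over MQTT, as described above, usually means publishing a JSON payload to a per-device topic. This sketch builds such a message; the topic hierarchy and fields are hypothetical, and the actual publish call (e.g. via a paho-mqtt client) is omitted so the logic stays testable offline.

```python
import json
import time

def device_state_message(device_id: str, state: dict):
    """Build an MQTT topic and JSON payload for reporting a device's state."""
    topic = f"devices/{device_id}/state"      # hypothetical topic hierarchy
    payload = json.dumps(
        {"device_id": device_id, "ts": int(time.time()), **state}
    )
    return topic, payload
```

A real client would pass the returned topic and payload to its publish method, typically with a QoS level chosen for the device's reliability needs.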
Technical Assistant (Web Developer) at Indian Institute of Science Education and Research, Bhopal
December 31, 2017 - July 28, 2025
Gathered requirements from departments and provided progress updates. Developed Python scripts for networking modules. Tested modules and websites for the institute and departments. Created and maintained MySQL database schemas. Managed project documentation, website hosting, Linux server configuration, and code deployment. Projects included IISER website, Service Request Form, LDAP registration.
Software Developer at Mobiweb Technology Pvt. Ltd.
May 31, 2016 - July 28, 2025
Managed teams and communicated directly with US clients for requirement gathering and progress updates. Developed and tested modules and websites using Agile methodology. Created and maintained MySQL schemas. Managed project documentation. Integrated third-party services such as PayPal, PubNub, mPDF, and FPDF. Handled website hosting, Linux server configuration, and code deployment. Used GitHub and Bitbucket for source code management. Projects included Myfanwagon, Belibitv, and Classified business.
Software Developer at Sahayog Microfinance Pvt. Ltd
September 30, 2015 - July 28, 2025
Gathered requirements from end users. Developed and tested different ERP application modules. Created and maintained database schemas and generated reports as per requirements. Utilized third party APIs and libraries to create PDF and Excel reports. Projects included Kamdhenu, Doc collection.

Education

Bachelor of Engineering at Truba Institute of Engineering and Information Technology, Bhopal
January 1, 2009 - December 31, 2013


Industry Experience

Software & Internet, Telecommunications, Education, Financial Services, Real Estate & Construction, Healthcare, Computers & Electronics, Professional Services, Manufacturing
Projects

adventure-work-data-engineer-project

This project demonstrates the design and implementation of a dynamic, scalable ETL pipeline using Azure Data Factory (ADF), Databricks, and Synapse Analytics. The pipeline follows the medallion architecture pattern, refining data through bronze, silver, and gold layers. It uses dynamic control flows, such as Lookup and ForEach activities in ADF, to orchestrate multiple data sources. The transformed data is exposed as external tables in Synapse for efficient analytical querying and reporting.

GitHub: https://www.twine.net/signin
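The medallion flow described above can be sketched, framework-free, as successive refinement steps. In the actual project these steps are Databricks notebooks over data-lake storage; the record fields here are hypothetical.

```python
def to_silver(bronze_rows):
    """Bronze -> silver: drop malformed records and normalize types."""
    silver = []
    for row in bronze_rows:
        if row.get("order_id") is None:          # reject records missing the key
            continue
        silver.append({
            "order_id": str(row["order_id"]),
            "amount": float(row.get("amount", 0)),
        })
    return silver

def to_gold(silver_rows):
    """Silver -> gold: aggregate into a reporting-ready shape."""
    total = sum(r["amount"] for r in silver_rows)
    return {"order_count": len(silver_rows), "revenue": total}
```

The gold-layer output corresponds to what the project exposes as external tables in Synapse for analytical querying.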