Available to hire
I’m Sameer Nimse, a data engineer with 5+ years of experience designing and implementing cloud-native, large-scale data platforms. I specialize in building ETL/ELT pipelines, optimizing big data processing with Spark and PySpark, and integrating ML workflows to deliver actionable insights across AWS, Azure, Databricks, and Snowflake.
I focus on making data accessible and actionable by partnering with cross-functional teams to deliver dashboards and models that inform business decisions. I’m passionate about scalable data architectures, data quality, and driving operational efficiency through automation and robust monitoring.
Language
English
Fluent
Work Experience
Data Engineer at The Keelworks Foundation
February 1, 2025 - November 5, 2025
Engineered data pipelines leveraging Gmail API and Google Workspace metadata to process 30,000+ user interactions, enabling real-time anomaly detection and improved visibility. Automated Google Workspace account management and Excel data extraction workflows to accelerate phishing incident resolution. Designed Power BI dashboards visualizing user activity and phishing trends; integrated AWS CloudWatch alerts to achieve 94.9% uptime and reduce issue detection time. Implemented Airflow-based ETL workflows to automate log processing, cutting pipeline execution time by 35%.
Data Engineer at Physis Investments
August 1, 2024 - August 1, 2024
Established scalable sustainability data pipelines integrating 330+ ESG indicators into a centralized ecosystem, improving data accessibility for dashboards, stakeholders, and chatbot interfaces. Built advanced text embedding and vectorization workflows to enhance an LLM-powered scoring algorithm, boosting classification accuracy and response relevance by 90%. Incorporated LangChain and OpenAI-powered Q&A workflows, enabling intelligent query handling and increasing chatbot engagement by 77% across users. Deployed API monitoring solutions with AWS CloudWatch and Boto3, reducing issue diagnosis time by 85% and improving overall system stability and uptime. Created data quality validation frameworks using Python and SQL, ensuring 99% accuracy in ESG datasets and improving trust in sustainability reporting.
Data Engineer at AT&T
August 1, 2023 - August 1, 2023
Formulated and initiated ETL pipelines on AWS to process and cleanse 5TB+ of daily network performance data, improving downstream data quality and reliability by 25%. Automated large-scale data ingestion workflows using Kafka and Spark, reducing reporting latency by 40% and enabling near real-time analytics. Optimized complex SQL queries and implemented advanced data warehouse partitioning techniques, cutting report generation time from hours to minutes and lowering cloud compute costs by 15%. Authored comprehensive data models and architecture documentation, ensuring clarity, consistency, and knowledge transfer for 10+ cross-functional data scientists and analysts. Introduced robust data validation and monitoring frameworks with AWS CloudWatch and custom anomaly detection scripts, increasing system reliability and reducing data pipeline failures by 30%.
Data Engineer at St. Louis University
December 1, 2021 - December 1, 2021
Boosted campaign engagement and conversions by 37% by constructing ETL pipelines with Python and Snowflake, streamlining performance tracking and reporting for digital advertising campaigns. Elevated campaign efficiency by analyzing 11 active Facebook Superhero U ads in Tableau, enabling discontinuation of underperforming ads and optimizing overall budget allocation. Secured 92% data accuracy by designing a robust validation framework and building performance dashboards in Tableau, equipping stakeholders with reliable insights for strategic decision-making.
Data Engineer at Sigma Galvanizing Pvt. Ltd.
April 1, 2021 - April 1, 2021
Developed interactive Power BI dashboards to monitor zinc consumption and energy efficiency, enabling data-driven tracking and reducing overall material waste by 20%. Consolidated supply chain, production, and financial data through Excel automation, improving reporting workflows and boosting overall reporting efficiency by 30%. Translated complex dashboard insights into actionable strategies for stakeholders, strengthening cross-departmental alignment and driving better procurement decisions. Launched SQL-based data pipelines to integrate production and financial datasets, ensuring higher accuracy, consistency, and reliability across reporting systems. Streamlined monthly inventory and cost analysis reports by automating workflows, reducing manual processing time by 40% and accelerating strategic planning.
Education
Master of Science in Information Systems at Northeastern University, Boston, MA
September 1, 2022 - December 1, 2024
Bachelor of Engineering in Electronics and Telecommunication at University of Mumbai, Mumbai, India
August 1, 2018 - June 1, 2022
Qualifications
Google Data Analytics Professional Certification
Python Specialization
DP-900 Azure Data Fundamentals
Databricks Fundamentals
Industry Experience
Telecommunications, Software & Internet, Professional Services, Education, Financial Services