Available to hire
Hi, I’m Vinod Bodduppally, a data engineer who loves turning messy data into reliable insights. With 5 years focused on payroll, IoT, and travel analytics, I build scalable ETL pipelines, real-time streaming solutions, and AI/ML workflows that drive measurable business impact.
Along the way, I’ve become proficient in Python, SQL, Spark, Airflow, Kafka, Snowflake, AWS, Azure, and Tableau, and I enjoy collaborating across teams to solve complex data problems with clarity and creativity.
Skills
Language
English: Fluent
Work Experience
Data Engineer at Paychex
June 1, 2025 - Present
Led AI-assisted ETL pipelines using AWS Glue, Snowflake, and Python, processing 12M+ payroll records daily; leveraged LLMs to detect schema drift, increasing load stability and saving ~15 engineer-hours per week. Integrated Kafka streams and Airflow DAGs with anomaly-detection models in SageMaker to provide real-time payroll error alerts across 8 enterprise systems, reducing incident resolution time from 4 hours to under 45 minutes. Deployed PySpark-based data validation enhanced with LLM prompts, generating QA scripts that cut data quality issues by ~1,200 per month and boosted stakeholder trust. Implemented metadata enrichment with LangChain embeddings and AWS Glue Catalog for semantic tagging of 2 TB+ of daily data, improving governance traceability across 7 business domains.
Data Engineer at Honeywell
December 1, 2024 - June 1, 2025
Built data pipelines in Azure Data Factory and Databricks, merging 9M+ IoT events daily and reducing latency from 6 hours to 90 minutes. Integrated Kafka streams with PySpark and MLflow, powering anomaly models that analyzed 1.5 TB of telemetry data and reduced downtime by 230 cases per quarter. Deployed LLM-based tagging via LangChain and Python, classifying 40K+ equipment logs and accelerating root-cause discovery by 8 hours per incident.
Data Analyst at Airbnb
December 1, 2018 - August 1, 2022
Designed and automated ETL workflows using Apache Spark, Airflow, and SQL, consolidating 30M+ booking, transaction, and telemetry records daily for finance, operations, and system diagnostics. Integrated data sets from AWS S3, Redshift, PostgreSQL, and Snowflake, enabling real-time analytics across 10+ global markets and reducing reporting latency from 9.2 s to 1.9 s. Built 45+ Power BI and Tableau dashboards tracking customer behavior and operational KPIs, uncovering $7M+ in new opportunities and influencing retention and up-sell strategies. Developed Python automations (Pandas, NumPy, PySpark) to cleanse and analyze 5M+ records/month, improving data integrity and saving 15+ staff hours weekly. Modeled and optimized data pipelines on Azure Synapse and Data Factory, supporting cloud migration and segmentation analysis for 400K+ customer cohorts.
Education
Master's in Information Technology Management at Campbellsville University, USA
January 1, 2023 - August 1, 2024
Bachelor of Technology in Mechanical Engineering at Sphoorthy Engineering College, India
June 1, 2015 - December 1, 2019
Qualifications
Industry Experience
Software & Internet, Travel & Hospitality, Financial Services, Professional Services, Telecommunications