I am a Senior AI Data Engineer with over 10 years of experience in designing, developing, and deploying scalable, secure, and cloud-native data pipelines and platforms. I have worked extensively across insurance, healthcare, banking, and retail domains, delivering real-time and batch data solutions that align with business outcomes. My expertise spans Big Data technologies, cloud platforms like AWS and Azure, and programming languages including Python, SQL, and R. Passionate about modernization and automation, I specialize in building data architectures that support AI and machine learning pipelines. I thrive in Agile environments and am skilled at collaborating with cross-functional teams to create high-performance data solutions that drive predictive analytics, fraud detection, claims scoring, and customer insights.

Venugopal Reddy




Experience Level

Expert

Work Experience

Senior AI Data Engineer at Liberty Mutual
May 1, 2023 - Present
Led the transformation of claims operations by developing a modern data and intelligence platform supporting automation, real-time insights, and risk-based decisioning. Engineered a Delta Lakehouse architecture on Azure Data Lake Storage Gen2 and Databricks for scalable data ingestion, processing, and curation. Built batch and streaming pipelines using Azure Data Factory and Event Hubs, and developed PySpark ETL workflows to normalize and enrich claims data. Provisioned Azure Synapse and Power BI datasets enabling near-real-time KPI access. Architected and optimized Snowflake financial data marts and implemented data governance with Unity Catalog and Azure Purview. Automated infrastructure provisioning with Azure DevOps and Terraform, and integrated Azure Cognitive Services and Azure OpenAI for AI-driven summarization and semantic search. Collaborated with Responsible AI teams to ensure governance compliance.
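The normalization step described above can be illustrated with a minimal sketch. The production workflows run in PySpark; this pure-Python version shows the same kind of per-record cleanup (trimming, date standardization, type coercion), and all field names here are hypothetical, not the actual claims schema.

```python
from datetime import datetime

def normalize_claim(raw: dict) -> dict:
    """Normalize one raw claims record: trim and upcase identifiers,
    convert the loss date to ISO-8601, and coerce the amount to float.
    Field names are illustrative only, not a real production schema."""
    return {
        "claim_id": raw["claim_id"].strip().upper(),
        "loss_date": datetime.strptime(raw["loss_date"], "%m/%d/%Y").date().isoformat(),
        "claim_amount": float(raw["claim_amount"].strip()),
        "status": raw.get("status", "OPEN").strip().upper(),
    }

record = {"claim_id": " clm-1042 ", "loss_date": "03/07/2024", "claim_amount": "1250.45"}
print(normalize_claim(record))
```

In a PySpark pipeline the same logic would typically live in column expressions or a mapped function applied across the ingested claims DataFrame.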
ML Data Engineer at HCSC
May 1, 2023 - August 26, 2025
Led the design and deployment of a HIPAA-compliant, cloud-native data and ML platform on AWS to modernize healthcare operations, enabling predictive modeling for clinical and claims insights. Built high-volume ingestion pipelines using Informatica Cloud, Kafka, and AWS Lambda with sub-two-second latency. Orchestrated pipelines using Apache Airflow and built transformation pipelines on Spark on EMR to normalize and standardize clinical and claims data. Developed validation frameworks improving data quality by 35%. Created dimensional marts using dbt in Snowflake and Redshift with CI/CD promotion. Managed experiment tracking and model lifecycle via MLflow and deployed models with Docker and ECS. Implemented GenAI pipelines using Amazon Bedrock and LangChain for compliance explanations. Delivered Tableau dashboards enabling actionable insights for care managers. Automated infrastructure provisioning with Terraform and CloudFormation.
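A validation framework of the kind mentioned above can be sketched as a small rule registry: each named rule is a predicate, and a record's failures are the rules it does not satisfy. The rule names and fields below are illustrative assumptions, not the actual clinical rule set.

```python
def validate(record: dict, rules: dict) -> list:
    """Run named validation rules against one record and return the
    names of the rules that failed. Rules are simple predicates."""
    return [name for name, check in rules.items() if not check(record)]

# Hypothetical rule set for a claims record (not the production rules).
RULES = {
    "member_id_present": lambda r: bool(r.get("member_id")),
    "amount_non_negative": lambda r: r.get("billed_amount", 0) >= 0,
    "valid_gender_code": lambda r: r.get("gender") in {"M", "F", "U"},
}

good = {"member_id": "M123", "billed_amount": 80.0, "gender": "F"}
bad = {"member_id": "", "billed_amount": -5, "gender": "X"}
print(validate(good, RULES))  # []
print(validate(bad, RULES))
```

Keeping rules as data rather than inline conditionals makes it straightforward to report failure counts per rule, which is how a quality-improvement figure like the 35% above would typically be measured.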
Data Engineer at Bank of America
May 1, 2021 - August 26, 2025
Designed and deployed hybrid data architectures on AWS S3, Redshift, and Snowflake to centralize ingestion and analytics workflows supporting real-time monitoring and regulatory compliance. Built Kafka and Spark Structured Streaming pipelines for high-volume transaction data, enabling sub-second fraud detection. Developed AML risk-scoring logic in PySpark and SQL, accelerating investigator decision-making. Orchestrated workflows using Glue, Step Functions, and DynamoDB for KYC updates. Modernized legacy batch pipelines by migrating them from Hadoop to Snowflake and S3. Automated CI/CD pipelines with Terraform, Jenkins, and GitHub Actions for consistent deployments. Implemented data governance and audit trails supporting SOX and GDPR compliance. Delivered Power BI dashboards providing fraud exposure and data quality metrics to compliance teams.
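Rule-based AML risk scoring of the kind described above can be sketched as a weighted sum of triggered rules. This is a toy illustration only: the thresholds, weights, and country codes are placeholder assumptions, not Bank of America's actual scoring model.

```python
def aml_risk_score(txn: dict) -> int:
    """Toy rule-based AML score: each triggered rule adds points.
    All thresholds and weights here are illustrative placeholders."""
    score = 0
    if txn["amount"] >= 10_000:              # large transfer over reporting threshold
        score += 40
    if txn.get("country") in {"XX", "YY"}:   # placeholder high-risk jurisdictions
        score += 30
    if txn.get("structuring_flag"):          # pattern of just-under-threshold transfers
        score += 30
    return score

txn = {"amount": 12_500, "country": "XX", "structuring_flag": False}
print(aml_risk_score(txn))  # 70
```

In a streaming deployment, logic like this would run inside a Spark Structured Streaming `foreachBatch` or mapped transformation so each micro-batch of transactions is scored within the latency budget.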
Data Engineer at Safeway
January 1, 2019 - August 26, 2025
Architected a distributed AWS-based data platform using S3, EMR, Glue, and Redshift Spectrum to support real-time and batch data processing for retail inventory, sales forecasting, and pricing analytics. Established Hadoop ecosystem infrastructure with HDFS, Hive, and Spark, enabling scalable data storage and schema-on-read transformations. Ingested data from Teradata, Oracle, and SQL Server into the data lake using Talend and Sqoop, converting it to efficient Parquet and Avro formats. Built PySpark pipelines on EMR for inventory and pricing data aggregation. Orchestrated ETL pipelines with Glue and Talend, applying data cleansing and validation. Automated infrastructure provisioning with Terraform and containerized workloads with Docker deployed on ECS. Managed MongoDB and Elasticsearch NoSQL stores for product metadata and search capabilities. Automated CI/CD deployments with Jenkins and GitHub and developed Tableau dashboards for monitoring retail business metrics.
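The aggregation work described above amounts to a grouped sum over sales rows. Here is the pure-Python analogue of a PySpark `groupBy(...).agg(sum(...))`, shown as a minimal sketch; the column names (`store`, `sku`, `units`) are illustrative assumptions.

```python
from collections import defaultdict

def aggregate_sales(rows):
    """Sum units sold per (store, sku) pair — the plain-Python analogue
    of a PySpark groupBy().agg(sum()) over an inventory/sales table."""
    totals = defaultdict(int)
    for row in rows:
        totals[(row["store"], row["sku"])] += row["units"]
    return dict(totals)

rows = [
    {"store": "S1", "sku": "A", "units": 3},
    {"store": "S1", "sku": "A", "units": 2},
    {"store": "S2", "sku": "A", "units": 1},
]
print(aggregate_sales(rows))
```

On EMR the same shape of query runs distributed across partitions, with the shuffle doing the grouping that the dictionary does here.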
Python Developer at FuGenX Technologies
June 1, 2017 - August 26, 2025
Designed Python-based ETL workflows to automate extraction, transformation, and loading from files, APIs, and RDBMS, reducing manual processing time by 40%. Implemented SSIS packages and scheduling to integrate SQL Server and Oracle data for finance and operations reporting. Optimized SQL queries and stored procedures for faster reporting system responses. Developed log parsing frameworks improving system troubleshooting. Built Python APIs for data integration ensuring timely data synchronization between internal and partner systems. Enhanced data quality with validation and cleansing routines. Established version control with Git and GitLab and automated builds and deployments using Jenkins. Collaborated with business analysts to deliver ad-hoc reporting solutions supporting campaign tracking and operational decisions.
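An end-to-end Python ETL workflow like the one described can be sketched with the standard library alone: extract from a CSV source, transform (trim, coerce types), and load into a relational table. The in-memory CSV string and SQLite database are stand-ins for the real file/API sources and SQL Server/Oracle targets.

```python
import csv
import io
import sqlite3

# Extract: parse CSV rows (in-memory stand-in for file or API sources).
raw = "order_id,amount\n1, 10.5 \n2,20\n"
rows = list(csv.DictReader(io.StringIO(raw)))

# Transform: strip whitespace and coerce fields to proper types.
clean = [(int(r["order_id"]), float(r["amount"].strip())) for r in rows]

# Load: insert into an in-memory SQLite table (stand-in for SQL Server/Oracle).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (order_id INTEGER, amount REAL)")
con.executemany("INSERT INTO orders VALUES (?, ?)", clean)
total = con.execute("SELECT SUM(amount) FROM orders").fetchone()[0]
print(total)  # 30.5
```

Wrapping each stage in its own function and scheduling the script (cron, SSIS, or Jenkins) is what turns a sketch like this into the automated workflows that cut manual processing time.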

Education

Bachelor of Technology in Computer Science at GITAM University, Hyderabad
August 1, 2011 - June 1, 2015

Industry Experience

Financial Services, Healthcare, Retail, Software & Internet, Professional Services