Available to hire
Hi, I’m Sean Lancaster. I’m a Senior Data Engineer focused on ML, delivering scalable, AI-powered data solutions for healthcare, finance, and insurance. I work with Python, Scala, Spark, TensorFlow, PyTorch, and cloud platforms to build end-to-end data pipelines, NLP, CV models, and enterprise analytics that speed decision-making and improve outcomes.
I enjoy collaborating with clinicians and stakeholders to turn complex data into actionable insights, design robust data architectures, and guide projects from concept to deployment while upholding data privacy and governance.
Skills
Experience Level
Expert
Language
English
Fluent
Work Experience
Senior Data Engineer (AI-Driven) at Uncommon Analytics
December 1, 2023 - Present
Managed the development of AI-focused healthcare data solutions and ETL pipelines for Cigna, built with Python, Scala, Apache Spark, TensorFlow, PyTorch, Google Cloud AI, AWS, and Databricks to achieve a 45% gain in document processing and a 20% reduction in processing effort. Designed scalable ETL workflows to load large clinical datasets into analytical environments integrated with downstream AI models to speed clinicians' access to records and decision making. Built real-time data integration pipelines using Python, Kafka, Scala, Dataflow, and Databricks, reducing integration latency by 60% and enabling real-time access to patient data for 500+ clinicians. Implemented OCR-based data extraction for insurance cards and referral forms using OpenCV, TensorFlow, and Java, improving intake data accuracy by 40%. Optimized LLM-based summarization on healthcare data to reduce clinician reading time by 30%. Deployed FastAPI-based microservices exposing NLP/ML models for real-time processing.
Data Scientist at Vention
November 30, 2023 - September 9, 2025
Designed and applied predictive models to forecast asset prices using XGBoost and ARIMA, increasing prediction accuracy by 25% and enabling higher-leverage investments; scaled datasets with Apache Spark. Built portfolio optimization solutions using Scikit-learn and Pandas to improve risk-adjusted performance by 20%. Created generative AI models (GANs) to simulate financial markets, improving predictive accuracy by 30% leveraging Azure Synapse Analytics, Azure Data Factory, and Azure Data Lake. Built financial decision engines using Azure Event Hubs for data ingestion and Spark for streaming, reducing decision latencies from 20+ minutes to seconds. Optimized large-scale data transformations leveraging Azure Data Lake Storage Gen2 in Azure Databricks, increasing processing speed by 35%. Developed ETL pipelines with Scala/Spark to load data into Snowflake, completing data prep 30% faster. Developed Monte Carlo risk models and random forests on Azure Synapse and Apache Spark, improving all
Data Engineer at TWST Events
August 31, 2021 - September 9, 2025
Designed and oversaw end-to-end ETL pipelines connecting internal insurance systems, third-party vendors, and customer-engagement platforms, reducing processing time by 30% and improving downstream data quality for analytics. Improved relational databases (MySQL/PostgreSQL) for fast storage/retrieval with 25% performance gains. Built scalable data lakes on AWS S3 with Apache Kafka to support real-time ingestion of telematics and mobile app data for claims processing. Developed analytical data warehouse using Amazon Redshift and Google BigQuery, enabling fast ad hoc querying and a 27% reduction in time-to-insight. Implemented Hadoop and Apache Spark for big data processing with a 21% performance improvement. Built real-time pipelines with Apache Kafka and Apache Flink, reducing pipeline latency by 20%. Ensured regulatory compliance and security; redesigned data processing flows to cut delivery load time by 30%.
Software Engineer at Signature Technology Group
July 31, 2015 - September 9, 2025
Enhanced system responsiveness by designing and implementing scalable backend services in Java with RESTful APIs; modernized front-end with JavaScript/Node.js for better user experience. Implemented a microservices-based architecture to improve modularity and achieve a 20% performance gain. Improved query performance through SQL optimization and schema design for MySQL/PostgreSQL. Reduced application load times by 25% and improved maintainability by migrating from monolithic codebases to modular services. Implemented secure authentication/authorization with OAuth2/JWT and integrated third-party services via REST/SOAP. Automated build/deploy pipelines with Jenkins, Docker, and Git to cut deployment time by 40% and boost release reliability.
Senior Data Engineer (AI-Driven) at Uncommon Analytics
December 1, 2023 - Present
Managed the development of AI-focused healthcare data solutions and ETL pipelines using Python, Scala, Spark, TensorFlow, PyTorch, Google Cloud AI, AWS, and Databricks to improve document processing by 45% and reduce processing effort by 20%. Designed ETL workflows to scale data loads of large clinical datasets into analytical environments and integrate with downstream AI models to speed clinicians' access to records and decision making. Built and optimized real-time data integration pipelines with Python, Kafka, Scala, Google Cloud Dataflow, and Databricks to reduce latency by 60% for more than 500 clinicians. Implemented OCR-based data extraction with OpenCV and TensorFlow for insurance cards and forms, improving intake data accuracy by 40%. Deployed FastAPI-based microservices exposing NLP/ML models for real-time processing and integrated with internal tools, reducing response latency by 30%. Led HIPAA-compliant NLP pipelines using spaCy and Hugging Face transformers to extract enti
Data Scientist at Vention
November 1, 2023 - September 9, 2025
Designed predictive models to forecast asset prices using XGBoost and ARIMA, boosting predictive accuracy and enabling high-leverage investments. Scaled data processing with Apache Spark for larger datasets. Developed portfolio optimization models to improve risk-adjusted performance by ~20%. Built generative AI models (GANs) for functional market simulations to enhance strategic forecasting by ~30% using Azure Synapse, Data Factory, and Data Lake. Implemented live decision engines with Azure Event Hubs and Spark for rapid market analysis, cutting decision latency from minutes to seconds. Optimized data transformations in Azure Databricks and migrated data loads to Snowflake, speeding ETL by ~35%. Developed CLV models with XGBoost to inform targeted marketing and improved retention of high-value customers. Created dashboards in Tableau/Power BI integrated with Azure Synapse for real-time market visibility.
Data Engineer at TWST Events
August 1, 2021 - September 9, 2025
Designed and built end-to-end ETL pipelines integrating internal insurance systems, third-party vendors, and customer engagement platforms, reducing overall processing time by 30% and improving downstream data quality. Optimized relational databases (MySQL, PostgreSQL) for fast query performance (≈25%). Built and deployed scalable data lakes on AWS S3 with Apache Kafka to enable real-time telemetry and claims analytics. Developed a data warehouse using Amazon Redshift and Google BigQuery to enable fast, ad hoc reporting for actuarial and underwriting teams (27% faster). Implemented real-time data pipelines with Kafka and Flink, reducing latency by 20%. Ensured regulatory compliance and security, and redesigned data flows to improve delivery by ~30%.
Software Engineer at Signature Technology Group
July 1, 2015 - September 9, 2025
Built scalable backend services in Java with RESTful APIs and modern frontend features. Migrated monolithic code to modular microservices, improving performance and maintainability by ~20-30%. Optimized SQL queries and database schemas (MySQL, PostgreSQL) to achieve up to 30% faster query performance. Implemented OAuth2/JWT-based security, REST/SOAP integrations, and CI/CD pipelines with Jenkins and Git. Participated in Agile ceremonies and code reviews to improve team collaboration.
Senior Data Engineer (AI-Driven) at Uncommon Analytics
December 1, 2023 - Present
Led AI-focused healthcare data solutions and ETL pipelines using Python, Scala, Spark, TensorFlow, PyTorch on GCP/AWS/Databricks. Achieved 45% improvement in document processing and 20% reduction in manual processing. Implemented real-time data integration pipelines with Python, Kafka, Scala, and Dataflow, reducing latency for 500+ clinicians. Built HIPAA-compliant NLP pipelines with spaCy and transformer models to support clinical workflows.
Data Scientist at Vention
November 1, 2023 - September 9, 2025
Designed predictive models for asset price using XGBoost and ARIMA; improved trend prediction by 25%. Developed portfolio optimization and risk modeling solutions with Python (scikit-learn, Pandas); boosted portfolio performance by 20%. Built GAN-based market simulations and real-time decision engines on Azure/Spark, cutting decision latencies from minutes to seconds. Implemented ETL pipelines loading data into Snowflake; enhanced data prep by ~30%.
Data Engineer at TWST Events
August 1, 2021 - September 9, 2025
Designed end-to-end ETL pipelines linking internal systems, vendors, and engagement platforms; reduced processing time by 30% and improved data quality for analytics. Optimized MySQL/PostgreSQL schemas for faster queries; built scalable data lakes on AWS S3 with Kafka for real-time ingestion. Created Redshift/BigQuery data warehouse for actuarial and underwriting reporting; reduced time to insight by 27%. Implemented real-time pipelines with Kafka & Flink; reduced latency by 20%. Ensured regulatory/security controls for insurance data.
Software Engineer at Signature Technology Group
July 1, 2015 - September 9, 2025
Developed scalable backend services with Java and microservices; improved modularity and performance. Built front-end features with JavaScript/Node.js and integrated with back-end APIs. Implemented secure authentication (OAuth2/JWT) and REST/SOAP services for third-party integrations. Automated build/deploy pipelines with Jenkins, Docker, and Git; reduced deployment time by 40% and increased release reliability. Participated in Agile ceremonies and code reviews.
Senior Data Engineer (AI-Driven) at Uncommon Analytics
December 1, 2023 - Present
Led the development of AI-focused healthcare data solutions and ETL pipelines for Cigna, using Python, Scala, Spark, TensorFlow and PyTorch, on Google Cloud, AWS and Databricks, achieving a 45% gain in document processing and a 20% reduction in processing effort. Designed scalable ETL workflows to load large clinical datasets into analytics environments and integrated downstream AI models to speed clinicians' access to records and decision-making.
Data Scientist at Vention
November 1, 2023 - September 9, 2025
Designed and applied predictive models to forecast asset prices using XGBoost and ARIMA; built data pipelines with Apache Spark; developed portfolio optimization strategies with Python; explored generative AI approaches to market simulation; deployed decision engines on Azure Event Hubs and Databricks to reduce decision latencies and improve market timing.
Data Engineer at TWST Events
August 1, 2021 - September 9, 2025
Designed and oversaw end-to-end ETL pipelines integrating internal insurance systems, third-party vendors, and customer-engagement platforms; built scalable data lakes on AWS S3 with Kafka and Flink-style streaming, and an analytical warehouse on Redshift and BigQuery, enabling faster actuarial and underwriting insights and 21% performance improvements in data processing.
Software Engineer at Signature Technology Group
July 1, 2015 - September 9, 2025
Developed scalable backend services in Java with microservices and RESTful APIs; built frontend components with JavaScript; improved performance and security, implemented CI/CD pipelines with Jenkins and Docker; migrated from monolithic code to modular services, leading to faster releases.
Senior Data Engineer (AI-Driven) at Uncommon Analytics
December 1, 2023 - Present
Led development of AI-driven healthcare data solutions and ETL pipelines for payer/provider ecosystems, leveraging Python, Scala, Spark, TensorFlow, PyTorch, Google Cloud, AWS, and Databricks. Achieved 45% improvement in document processing and a 20% reduction in manual processing effort. Designed scalable ETL workflows to ingest large clinical datasets and expose AI models to clinicians, accelerating decision making. Built real-time data integration pipelines with Python, Kafka, Scala, Dataflow, and Databricks, reducing data integration latency by 60% and delivering real-time patient data to 500+ clinicians. Implemented OCR-based data extraction from insurance cards and forms using OpenCV, TensorFlow, and Java, increasing intake accuracy by 40%. Optimized LLM-based summarization on unstructured notes to reduce clinician review time by 30%. Deployed FastAPI microservices to expose NLP/ML models, enabling integration with internal tools and reducing response latency by 30%. Collaborated
Data Scientist at Vention
November 1, 2023 - September 9, 2025
Designed and applied predictive models to forecast asset prices using XGBoost and ARIMA, achieving ~25% improvement in predictive accuracy. Scaled data processing with Apache Spark and Azure Synapse/Data Lake using Azure Data Factory to handle larger datasets. Built portfolio optimization tools in Python that improved risk-adjusted returns by ~20%. Explored generative AI models (GANs) to simulate market scenarios, increasing predictive accuracy by ~30%. Implemented streaming data ingestion and live decisioning via Azure Event Hubs and Spark in Databricks; built dashboards in Tableau/Power BI connected to Azure Synapse and Data Explorer.
Data Engineer at TWST Events
August 1, 2021 - September 9, 2025
Designed end-to-end ETL pipelines integrating internal insurance systems, third-party vendors, and customer-engagement platforms, reducing processing time by 30% and improving data quality for analytics. Built scalable data lakes on AWS S3 with Kafka for real-time ingestion, enabling low-latency insights for actuarial and underwriting teams. Developed analytics data warehouse using Amazon Redshift and Google BigQuery; aided big data processing with Hadoop and Spark, achieving 21% performance improvements. Created real-time data pipelines with Kafka and Flink, reducing latency by 20%. Implemented data security protocols and compliance measures, and improved deployment pipelines with Jenkins and GitLab.
Software Engineer at Signature Technology Group
July 1, 2015 - September 9, 2025
Developed scalable backend services in Java with microservices and REST APIs; modernized front-end with JavaScript and Node.js. Implemented secure authentication (OAuth2, JWT); integrated third-party services via REST and SOAP. Reworked monolithic code into modular services, reducing load times by 25% and improving maintainability. Led CI/CD with Jenkins, Docker, and Git workflows, cutting deployment time by 40% and improving release reliability. Participated in Agile ceremonies and code reviews.
Senior Data Engineer (AI-Driven) at Uncommon Analytics
December 1, 2023 - Present
Led the development of AI-focused healthcare data solutions and ETL pipelines using Python, Scala, Apache Spark, TensorFlow, PyTorch, Google Cloud AI, AWS, and Databricks, resulting in a 45% gain in document processing and a 20% reduction in manual effort. Designed scalable ETL workflows to load large clinical datasets and integrate downstream AI models to speed clinicians' access to records. Built real-time data integration pipelines with Python, Apache Kafka, Scala, Google Cloud Dataflow, and Databricks, reducing integration latency by 60% and enabling real-time access to patient data for 500+ clinicians. Implemented an OCR-based solution using OpenCV, TensorFlow, and Java to retrieve and validate data from insurance cards and referral forms, improving intake data accuracy by 40%. Optimized LLM-based summarization on clinical notes to reduce clinician reading time by 30%. Deployed FastAPI-based microservices exposing NLP/ML models for real-time processing and designed HIPAA-compliant
Data Scientist at Vention
November 1, 2023 - September 9, 2025
Designed and applied predictive models to forecast asset prices using XGBoost and ARIMA, achieving a 25% improvement in forecasting accuracy and enabling high-leverage investments. Developed portfolio optimization models with Scikit-learn and Pandas, increasing portfolio performance by 20%. Built generative AI models (GANs) to simulate market scenarios, enhancing predictive insight by 30% via Azure Synapse Analytics, Azure Data Factory, and Azure Data Lake. Implemented streaming and decision engines with Azure Event Hubs and Apache Spark, reducing decision latencies from minutes to seconds. Optimized data transformations on Azure Data Lake Storage Gen2 in Azure Databricks, boosting processing speed by 35%. Developed ETL pipelines in Scala/Spark to load data into Snowflake for fast analytics. Created Monte Carlo-based risk models and random forests in Azure Synapse and Spark, yielding 20% better risk allocations. Built customer lifetime value models using XGBoost to target high-value cu
Data Engineer at TWST Events
August 1, 2021 - September 9, 2025
Designed and oversaw end-to-end ETL pipelines integrating internal insurance systems, third-party vendors, and customer-engagement platforms, reducing overall processing time by 30% and improving downstream data quality for analytics. Built scalable data lakes on AWS S3 with Apache Kafka to support real-time ingestion of telematics and mobile app data. Developed an analytical data warehouse using Amazon Redshift and Google BigQuery to enable fast ad hoc queries and reporting, reducing time to insight for actuarial and underwriting teams by 27%. Leveraged Hadoop and Apache Spark for large data processing; built real-time pipelines with Kafka and Flink, lowering pipeline latency by 20%. Strengthened data security and regulatory compliance; automated deployment pipelines with Jenkins and GitLab CI to reduce release times by 18%.
Software Engineer at Signature Technology Group
July 1, 2015 - September 9, 2025
Developed scalable backend services in Java with microservices and RESTful APIs; migrated monolithic code to modular services, achieving a ~20-25% performance and resource-efficiency gain. Improved query performance through SQL optimization on MySQL and PostgreSQL. Implemented OAuth2/JWT authentication and secure data handling; automated build and deployment pipelines with Jenkins and Docker to cut deployment time by 40%. Participated in Agile ceremonies and code reviews to promote quality and collaboration.
Senior Data Engineer (AI-Driven) at Uncommon Analytics
December 1, 2023 - Present
Led the development of AI-focused healthcare data solutions and ETL pipelines for Cigna, leveraging Python, Scala, Spark, TensorFlow, PyTorch, Google Cloud AI, AWS, and Databricks to drive a 45% gain in document processing and 20% reduction in processing effort. Designed scalable ETL workflows to load large clinical datasets for real-time analytics, reducing clinician access time. Built real-time data pipelines using Python, Kafka, Scala, Google Cloud Dataflow, and Databricks, achieving a 60% latency reduction and enabling real-time access for 500+ clinicians. Implemented OCR data capture with Python, OpenCV, and TensorFlow to improve insurance intake data accuracy by 40% and eliminate manual entry. Optimized LLM-based summarization on healthcare notes to reduce clinician reading time by 30%. Exposed NLP models via FastAPI microservices and deployed cloud-native ingestion pipelines with Dataflow and Cloud Functions, cutting manual ETL tasks by 35%. Ensured HIPAA-compliant NLP workflows
Data Scientist at Vention
November 1, 2023 - October 10, 2025
Designed predictive models to forecast asset prices using XGBoost and ARIMA, achieving a 25% improvement in market trend predictions and scaling datasets with Apache Spark. Developed portfolio optimization solutions with Scikit-learn and Pandas, increasing portfolio performance by 20%. Built generative AI models (GANs) to simulate markets, boosting prediction accuracy by up to 30% when combined with Azure Data Factory and Azure Data Lake. Implemented live decision engines using Azure Event Hubs and Spark streaming, reducing decision latencies from minutes to seconds. Optimized data processing on Azure Data Lake Gen2 with Databricks, accelerating data prep by ~30%. Implemented Monte Carlo risk modeling and CLV models with XGBoost for targeted marketing, and delivered real-time dashboards via Tableau and Power BI.
Data Engineer at TWST Events
August 1, 2021 - October 10, 2025
Designed and oversaw end-to-end ETL pipelines integrating internal insurance systems, third-party vendors, and customer engagement platforms, reducing processing time by 30% and improving downstream data quality for analytics. Optimized MySQL and PostgreSQL schemas for fast data storage and retrieval (≈25% performance gains). Built scalable data lakes on AWS S3 with Kafka for real-time telematics and mobile data, enabling low-latency insights for claims processing. Developed an analytical data warehouse on Redshift and BigQuery for actuarial and underwriting reporting, reducing time-to-insight by 27%. Supported big data processing with Hadoop and Spark, and built real-time data pipelines with Kafka and Flink to reduce latency by 20%. Ensured regulatory compliance and security for sensitive data and redesigned data flows to improve delivery by 30%.
Software Engineer at Signature Technology Group
July 1, 2015 - October 10, 2025
Enhanced system responsiveness by designing scalable backend services in Java and RESTful APIs; migrated from monolithic to modular microservices, achieving ~20% performance gains. Implemented front-end features with JavaScript/Node.js to improve user experience and integration with back-end services. Used OAuth2/JWT for secure authentication, built and maintained CI/CD pipelines with Jenkins and Docker, and promoted Agile practices with code reviews and ceremonies. Improved query performance through optimized SQL and relational databases (MySQL, PostgreSQL). Reduced deployment times by ~40% and strengthened release reliability.
Senior Data Engineer (AI-Driven) at Uncommon Analytics
December 1, 2023 - Present
Led the development of AI-focused healthcare data solutions and ETL pipelines for Cigna, leveraging Python, Scala, Apache Spark, TensorFlow, PyTorch, Google Cloud AI, AWS, and Databricks to achieve a 45% gain in document processing and reduce processing effort by 20%. Designed scalable ETL workloads to load large clinical datasets into analytical environments and accelerate clinician access to records. Built real-time data pipelines using Python, Apache Kafka, Scala, Google Cloud Dataflow, and Databricks, reducing latency by 60% so 500+ clinicians access patient data in real time. Implemented an OCR solution with Python, OpenCV, and TensorFlow to boost insurance intake data accuracy by 40% and eliminate manual data-entry bottlenecks. Optimized LLM-based summarization (GPT-3, BERT) on healthcare notes to cut clinician reading time by 30%. Deployed FastAPI-based microservices exposing NLP/ML models for real-time processing and integrated Epic Clarity data models for cross-system analytic
Data Scientist at Vention
November 1, 2023 - October 10, 2025
Developed predictive and generative AI-driven analytics to advance asset pricing and portfolio insights. Built models to forecast asset prices with XGBoost and ARIMA, achieving a 25% improvement in trend predictions. Scaled data processing with Apache Spark to support larger datasets. Created portfolio optimization models in Python (scikit-learn, pandas) that improved portfolio performance by 20% for clients. Built GAN-based simulations to model complex market dynamics, achieving a 30% gain in predictive accuracy when deployed in Azure Synapse Analytics, Azure Data Factory, and Azure Data Lake. Implemented live decision engines using Azure Event Hubs and Spark streaming to reduce investment decision latencies from minutes to seconds. Optimized ETL from Azure Data Lake Gen2 to Databricks to accelerate model training by ~35%. Developed risk modeling frameworks with Monte Carlo simulations and random forests on Azure Synapse/Azure ML and Spark, improving allocation decisions by ~20%. Buil
Data Engineer at TWST Events
August 1, 2021 - October 10, 2025
Designed and oversaw end-to-end ETL pipelines integrating internal insurance systems, third-party vendors, and customer engagement platforms, reducing overall processing time by 30% and improving downstream data quality for analytics. Modeled and optimized MySQL/PostgreSQL databases for fast data storage and retrieval, achieving ~25% performance improvements. Built scalable data lakes on AWS S3 with Kafka for real-time telematics and mobile data, enabling low-latency, automated analytics for claims processing. Developed a data warehouse using Amazon Redshift and Google BigQuery for fast ad hoc querying, cutting time to insight by 27%. Supported large-scale big data processing with Hadoop and Spark (21% performance gain). Created real-time data pipelines with Kafka and Flink to process risk data with 20% lower latency. Ensured regulatory compliance and data security through cross-functional governance and security protocols. Reengineered data flows to simplify delivery pipelines, achiev
Software Engineer at Signature Technology Group
July 1, 2015 - October 10, 2025
Enhanced backend services with Java and RESTful APIs; implemented a microservices-based architecture to improve modularity and reduce resource usage by 20%. Modernized front-end features with JavaScript/Node.js to improve user experiences and integration with back-end systems. Implemented OAuth2/JWT-based authentication to safeguard data and comply with security standards. Replaced monolithic code with modular services to reduce deployment times by 40%. Used Jenkins, Docker, and Git-based pipelines for continuous integration and deployment, increasing release reliability. Contributed to Agile ceremonies and peer code reviews to promote best practices in software development.
Senior Data Engineer (AI-Driven) at Uncommon Analytics
December 1, 2023 - November 4, 2025
Led the development of AI-driven healthcare data solutions and ETL pipelines using Python, Scala, Spark, TensorFlow, PyTorch, and cloud platforms (GCP, AWS, Databricks). Achieved a 45% gain in document processing and a 20% reduction in processing effort. Designed scalable ETL workflows to accelerate loading of large clinical datasets and enable real-time AI model access for clinicians. Built real-time data pipelines with Kafka, Dataflow, and Databricks reducing latency by 60% and enabling real-time access for over 500 clinicians. Implemented HIPAA-compliant NLP pipelines (spaCy, HuggingFace, BERT/GPT) for entity extraction and automated summaries, improving coding and billing accuracy. Created FastAPI microservices for model exposure, integrated Epic Clarity data models, and delivered dashboards with Power BI. Automated model updates with Kubeflow and GitHub CI/CD, cutting deployment overhead by 18% and improving regulatory reporting efficiency.
Data Scientist at Vention
November 1, 2023 - November 1, 2023
Designed and applied predictive models (XGBoost, ARIMA) to forecast asset prices, achieving a 25% improvement in market-trend forecasting. Built portfolio optimization solutions and demonstrated a 20% performance uplift. Developed generative AI models (GANs) for market simulations and deployed data pipelines on Azure Databricks and Azure Data Lake. Created risk and CLV models, and delivered real-time dashboards using Tableau/Power BI integrated with Azure Synapse and Data Explorer. Strengthened data security with IAM and governance practices.
Data Engineer at TWST Events
August 1, 2021 - August 1, 2021
Designed end-to-end ETL pipelines connecting internal insurance systems, third-party vendors, and customer engagement platforms, reducing processing time by 30% and improving downstream data quality. Built scalable data lakes on AWS S3 with Kafka and Flink for real-time processing. Implemented analytical data warehouse solutions on Amazon Redshift and Google BigQuery for actuarial and underwriting analytics, enabling faster insights by 27%. Ensured data security and regulatory compliance while standardizing data flows and improving data delivery performance by 30%.
Software Engineer at Signature Technology Group
July 1, 2015 - July 1, 2015
Enhanced system responsiveness by designing scalable backend services in Java and RESTful APIs, and modernizing monolithic codebases into modular microservices for improved performance. Implemented secure authentication (OAuth2, JWT) and integrated third-party services via REST and SOAP. Led CI/CD pipelines with Jenkins and Docker, reducing deployment time by 40% and improving release reliability. Participated in Agile ceremonies and code reviews to maintain code quality.
Senior Data Engineer (AI-Driven) at Uncommon Analytics
December 1, 2023 - November 11, 2025
Led development of AI-focused healthcare data solutions and ETL pipelines for clients including Cigna, leveraging Python, Scala, Apache Spark, TensorFlow, PyTorch, Google Cloud AI, AWS, and Databricks to improve document processing by 45% and reduce processing effort by 20%. Designed real-time data pipelines, HIPAA-compliant NLP solutions using spaCy and Hugging Face, and FastAPI microservices to expose models; implemented cloud-native ingestion and governance practices.
Senior Data Engineer (AI-Driven) at Uncommon Analytics
December 1, 2023 - November 26, 2025
Led the development of AI-focused healthcare data solutions and ETL pipelines for Cigna, leveraging Python, Scala, Apache Spark, TensorFlow, PyTorch, Google Cloud AI, AWS, and Databricks to deliver a 45% gain in document processing and reduce processing effort by 20%. Designed scalable ETL workflows to load large clinical datasets into analytics environments and integrated downstream AI models to speed clinicians' access to records. Built real-time data pipelines using Python, Apache Kafka, Scala, Google Cloud Dataflow, and Databricks, cutting latency by 60% and enabling real-time access for 500+ clinicians. Developed an OCR solution with Python, OpenCV, and TensorFlow to improve insurance intake data accuracy by 40% and eliminate manual data-entry bottlenecks. Optimized LLM-based summarization models on healthcare data to reduce clinician reading time by ~30%. Implemented HIPAA-compliant NLP pipelines with spaCy and Hugging Face transformers for NER and automated summaries; deployed F
Data Scientist at Vention
November 30, 2023 - November 30, 2023
Designed predictive models to forecast asset prices using XGBoost and ARIMA, achieving a 25% improvement in market trend predictions and enabling higher-value investments through scalable Spark-based pipelines. Developed portfolio optimization solutions in Python (scikit-learn, Pandas) to support risk-balanced, diversified portfolios, boosting performance by 20%. Created generative AI models (GANs) to simulate markets, increasing forecast accuracy by 30% with Azure Synapse Analytics, Azure Data Factory, and Azure Data Lake. Built financial decision engines using Azure Event Hubs and Spark streaming, reducing decision latencies from minutes to seconds. Optimized data transformations on Azure Data Lake Gen2 in Azure Databricks, accelerating processing by 35%. Implemented ETL pipelines in Scala and Spark to load data into Snowflake; completed data prep 30% faster. Developed Monte Carlo risk frameworks and CLV models with XGBoost; improved risk allocations and marketing targeting. Built da
Data Engineer at TWST Events
August 31, 2021 - August 31, 2021
Designed and oversaw end-to-end ETL pipelines integrating insurance systems, third-party vendors, and customer engagement platforms; reduced processing time by 30% and improved data quality for analytics. Improved MySQL and PostgreSQL performance by 25%. Built scalable data lakes on AWS S3 with Kafka for real-time ingestion of telematics and mobile data, enabling low-latency insights for claims processing. Developed analytics data warehouses on Redshift and BigQuery, reducing time to insight by 27%. Used Hadoop and Spark for big data processing, improving performance by 21%. Implemented real-time pipelines with Kafka and Flink, reducing latency by 20%. Ensured regulatory compliance and security, and redesigned data flows to reduce delivery load time by 30%.
Software Engineer at Signature Technology Group
July 31, 2015 - July 31, 2015
Enhanced Java backend services with scalable, asynchronous microservices, achieving 20% performance gains and better modularity. Implemented front-end features with JavaScript/Node.js to improve user experience. Improved MySQL/PostgreSQL query performance by 30% through indexing and optimization. Reduced load times by 25% by reworking monolithic codebases into modular services. Implemented OAuth2/JWT security and integrated REST and SOAP APIs for third-party services. Established CI/CD pipelines with Jenkins, Docker, and Git, cutting deployment times by 40% and improving release reliability. Participated in Agile ceremonies and code reviews.
Data Scientist at Vention
September 1, 2021 - November 1, 2023
Designed predictive models to forecast asset prices using XGBoost and ARIMA; developed portfolio optimization solutions to improve client portfolios, leading to enhanced performance. Built generative AI models (GANs) to simulate market scenarios and accelerate strategic planning. Constructed risk models with Monte Carlo simulations and CLV models for targeted marketing; delivered dashboards via Tableau and Power BI with Azure Synapse, Azure Data Factory, and Azure Data Lake. Strengthened data security and compliance across cloud environments and reduced decision latency from minutes to seconds.
Data Engineer at TWST Events
April 1, 2016 - August 1, 2021
Designed end-to-end ETL pipelines integrating internal insurance systems, third-party vendors, and customer platforms; reduced processing time by 30% and improved downstream data quality. Built scalable data lakes on AWS S3 with Kafka, and analytical warehouses in Redshift and BigQuery for actuarial and underwriting teams. Developed real-time data pipelines with Kafka and Flink to support risk assessment and regulatory reporting, and implemented data security controls to maintain compliance.
Software Engineer at Signature Technology Group
October 1, 2012 - July 1, 2015
Designed scalable backend services in Java with RESTful APIs; developed frontend features using JavaScript/Node.js; created a microservices-based architecture to improve modularity and performance. Implemented OAuth2/JWT security, automated build and deployment pipelines with Jenkins, Docker, and Git-based workflows, and contributed to Agile Scrum ceremonies to improve delivery.
Education
Bachelor of Science in Computer Science at University of Phoenix
September 1, 2008 - May 1, 2012
Master of Science in Computer Science at University of Phoenix
April 1, 2012 - June 1, 2014
Qualifications
Industry Experience
Healthcare, Financial Services, Professional Services, Software & Internet, Other, Life Sciences, Education, Media & Entertainment