Yan Cheng Liu

Available to hire

I am a Senior Data Engineer with 8+ years of experience designing and delivering end-to-end data solutions for real-time analytics, business intelligence, and AI applications. I have a strong background in data pipelines, modeling, and platform architecture across AWS, Azure, and GCP environments. I thrive on transforming complex data into actionable insights and reliable reporting that empower smarter decisions.

I enjoy collaborating with product, data science, and UX teams to align data models with key KPIs and ship measurable business impact. I excel at building scalable data platforms, automating workflows, and delivering dashboards that drive engagement and revenue.

Experience Level

Expert

Language

Bashkir
Advanced

Work Experience

Senior Data Engineer at Airbnb
January 1, 2022 - Present
Built the Host Profile Data Service for Airbnb Rooms using Scala, Kafka, and Flink, integrating host activity into Snowflake pipelines that powered the Host Passport feature and improved data freshness by ~23%. Designed the Atlas Schema and the Databricks + Spark pipelines that consolidated Airbnb Services & Experiences data into a unified model, reducing query latency by ~20% and simplifying business reporting. Created Looker dashboards on top of dbt and Snowflake, providing actionable engagement and revenue metrics for product and business teams. Developed ML feature pipelines with Python, TensorFlow, and Vertex AI to enhance personalization and boost trip-planning engagement. Automated 200+ Airflow workflows in AWS with Terraform, maintaining 99% reliability and full observability via Prometheus and Grafana. Tuned Delta Lake performance on Databricks, reducing refresh times and compute costs through incremental processing and optimized partitioning. Collaborated with PMs, Data Scientists, and UX team
Senior Data Engineer at Meta
January 1, 2019 - January 1, 2021
Rebuilt a large-scale A/B testing pipeline using Spark and Snowflake, improving data freshness by 16% and reliability across thousands of experiments. Modernized the Metric Monitor with new streaming and batch pipelines, reducing alert latency by 17% and improving anomaly detection accuracy. Optimized Spark + Presto jobs for experimentation analysis, improving compute efficiency and reducing job failures through query tuning and parallel execution. Built Airflow + Snowflake pipelines for experiment metrics, automating ingestion and reporting to serve product and analytics teams. Partnered with Product Science and Infrastructure teams to standardize experiment schemas and develop interactive dashboards in Scuba, improving data visibility and decision speed.
Senior Data Engineer at Instagram
January 1, 2017 - January 1, 2019
Built a new client-side logging pipeline for Instagram Stories, capturing high-volume engagement with improved throughput and data consistency. Designed and deployed an end-to-end data solution for Explore Brand Safety, integrating content integrity models with ad-delivery metrics for safer recommendations. Developed exposure-based A/B testing metrics for Instagram Ads, enabling fine-grained performance attribution and automated experiment analysis. Collaborated with cross-functional ML and Ads Engineering teams to align event schemas and experiment frameworks across Instagram and Facebook. Drove adoption of modern data modeling and monitoring standards, improving visibility and reliability of metrics powering Instagram's ad-ranking systems.
Senior MTS / Developer, Analytics Platform at Athenahealth
January 1, 2016 - January 1, 2017
Built a large-scale data ingestion system to transport distributed relational data into a cloud-based MPP warehouse, driving a more than 10x reduction in query latency for analytics and reporting workloads. Developed a HIPAA-compliant transformation service that enabled secure, self-service data access for client analytics while maintaining regulatory integrity.

Education

Master of Science in Computer Science at Georgia Institute of Technology
January 1, 2013 - January 1, 2015

Industry Experience

Software & Internet, Professional Services, Media & Entertainment