Roland Wilson

Available to hire

I am a Senior Data Engineer with 10+ years of experience building scalable data platforms and AI training pipelines. I specialize in delivering reliable data systems and robust data workflows that enable scalable analytics and AI initiatives. I thrive in detail-oriented environments, applying strong Python and SQL skills to drive data quality, governance, and actionable insights.

In my roles across leading tech companies, I’ve built end-to-end data pipelines, championed data observability, and partnered with product and engineering teams to improve data consistency and downstream usability. I’m passionate about translating complex data challenges into maintainable solutions and guiding teams toward impactful data-driven decisions.

Experience Level

Expert

Work Experience

Senior Data Engineer at Databricks
December 4, 2024 - Present
Led design and implementation of a Lakehouse architecture using Databricks, Delta Lake, and Iceberg to support telemetry, product analytics, and internal reporting across engineering and business teams. Owned real-time and batch data pipelines using Kafka and Spark Structured Streaming, enabling low-latency ingestion for operational monitoring. Built ETL/ELT pipelines from microservices and APIs into S3- and ADLS-backed storage, standardizing data contracts and schema evolution. Implemented bronze, silver, and gold data modeling layers and dbt-based transformations for analytics datasets. Improved pipeline reliability with data observability, monitoring, alerting, and Great Expectations data quality checks. Optimized Spark workloads for performance and cost. Collaborated on telemetry schemas and event contracts and integrated multi-cloud workflows for downstream reporting using BigQuery.
Senior Data Engineer at Palantir Technologies
March 1, 2022 - November 30, 2024
Built scalable data pipelines using Python, Spark, and distributed systems to ingest structured and unstructured data into a unified platform. Designed transformation layers producing canonical datasets aligned to shared data models, enabling reusable data products for analytics and operational workflows. Developed document processing pipelines including parsing, chunking, metadata enrichment, and indexing to support semantic search and unstructured data processing. Built embedding pipelines and feature engineering workflows enabling retrieval-based AI systems and semantic search applications. Owned streaming and batch data pipelines using Kafka and Spark to support real-time ingestion and scheduled refreshes for operational intelligence use cases. Implemented data lineage, governance, RBAC, and metadata management to meet compliance requirements in regulated environments. Developed REST API integrations and API data pipelines connecting internal and external systems into the platform.
Data Engineer at Thoughtworks
October 1, 2019 - February 28, 2022
Designed and implemented ETL and ELT pipelines using Azure Data Factory and Azure Databricks to integrate ERP, operational systems, and legacy reporting data into a centralized data platform. Built PySpark-based distributed data processing workflows to handle large-scale structured and semi-structured enterprise datasets. Developed dimensional data models and data warehousing solutions supporting financial reporting, reconciliation, and audit-ready analytics. Implemented workflow orchestration using Airflow and ADF, managing dependencies, scheduling, and monitoring across pipelines. Established data quality frameworks including validation checks, schema enforcement, and anomaly detection to improve trust in reporting datasets. Implemented data governance, RBAC, and metadata management to ensure secure and compliant access across enterprise systems. Standardized ingestion patterns and reusable transformation components across domains, improving maintainability and delivery speed.
Backend Engineer at HatchWorks
June 1, 2016 - September 30, 2019
Led development of Python-based backend services and REST APIs for data ingestion of user events, transactions, and application logs, integrating third-party APIs and internal systems into unified data pipelines. Owned ETL pipelines using Airflow for batch processing of clickstream and session-level data, implementing data transformation, schema validation, and data quality checks to produce analytics datasets. Designed PostgreSQL data models and analytics engineering datasets, including KPI modeling tables supporting Tableau and Power BI dashboards for customer analytics and reporting. Developed feature engineering pipelines for machine learning use cases such as churn prediction and customer segmentation. Integrated API data pipelines across internal services and third-party systems to consolidate fragmented product and customer data. Improved analytics query performance through SQL tuning, indexing strategies, and query optimization, reducing dashboard latency and improving usability.

Education

Bachelor's degree at New Mexico State University
January 1, 2012 - January 1, 2016

Industry Experience

Software & Internet