Ghanasham Saravaiahgari

Available to hire

I’m a data engineer who loves turning complex data into reliable, governed data products. I specialize in end-to-end dbt development and Lakehouse architectures, with a strong focus on SQL and operational excellence. I enjoy collaborating with cross-functional teams to define data contracts and deliver production-ready datasets.

In Azure Databricks environments, I lead with CI/CD, automated testing, and robust data governance. I’m passionate about building scalable data platforms that empower ML, BI, and analytics while optimizing performance and cost.

Experience Level

Expert

Language

English: Fluent
Swedish: Intermediate

Work Experience

Data Engineer at Alfa Laval
December 1, 2024 - March 1, 2025
Operationalized a full Lakehouse architecture based on Delta Lake and Unity Catalog, applying Medallion layering to standardize and govern enterprise data models. Led day-to-day dbt model development, testing, documentation, and incremental builds for 50+ models. Implemented Bronze/Silver/Gold layers and ingested Dynamics 365 data via ADF and Databricks notebooks with schema-drift handling and metadata governance. Harmonized reference and master data across regions and published curated datasets to the Gold layer for BI consumption in Microsoft Fabric Warehouse. Enforced RBAC and catalog-level permissions, GDPR masking rules, and CI/CD via Azure DevOps for automated dbt build/test/deploy. Optimized performance with Z-ORDER, partition pruning, Delta caching, and incremental refresh strategies. Collaborated with AI/ML engineers to define data contracts and validate feature stores.
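
As a rough illustration of the incremental Bronze-to-Silver pattern behind this work, here is a minimal PySpark/Delta sketch; every table, column, and watermark value below is a hypothetical placeholder rather than something taken from the actual project.

    # Minimal sketch of an incremental Bronze -> Silver merge on Delta Lake.
    # All table and column names are hypothetical placeholders.
    from pyspark.sql import SparkSession, functions as F
    from delta.tables import DeltaTable

    spark = SparkSession.builder.getOrCreate()

    # Read only rows that arrived since the last processed watermark.
    last_watermark = "2025-01-01T00:00:00"
    bronze = (
        spark.read.table("catalog.bronze.d365_sales_orders")
        .where(F.col("_ingested_at") > F.lit(last_watermark))
    )

    # Upsert changed records so the Silver table stays idempotent under reruns.
    silver = DeltaTable.forName(spark, "catalog.silver.sales_orders")
    (
        silver.alias("t")
        .merge(bronze.alias("s"), "t.order_id = s.order_id")
        .whenMatchedUpdateAll()
        .whenNotMatchedInsertAll()
        .execute()
    )
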
Data Engineer at Diggibyte
December 1, 2022 - July 1, 2023
Built and maintained data pipelines in Databricks using PySpark and SQL to process structured and semi-structured datasets into analytics-ready, governed tables. Developed dbt staging models and tests, documented transformation logic, and aligned with Lakehouse modelling standards. Implemented a Medallion architecture on Parquet and Delta Lake, standardized raw ingestion, and harmonized data into business-ready dimensional models. Managed ingestion via ADF pipelines and Airflow DAGs on automated, reliable schedules. Created modular ETL utilities and improved pipeline observability; collaborated with BI analysts to enable reporting and ML training.
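
A minimal sketch of the kind of Airflow scheduling mentioned above, assuming Airflow 2.4+; the DAG id, schedule, and task bodies are hypothetical placeholders, and the real DAGs orchestrated ADF and Databricks jobs rather than these stub callables.

    # Minimal Airflow DAG sketch: ingest raw data, then build Silver tables.
    # DAG id, schedule, and task bodies are hypothetical placeholders.
    from datetime import datetime
    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def ingest_raw():
        # Placeholder: land source files into the Bronze layer.
        pass

    def build_silver():
        # Placeholder: run PySpark/dbt transformations into Silver tables.
        pass

    with DAG(
        dag_id="daily_lakehouse_refresh",
        start_date=datetime(2023, 1, 1),
        schedule="@daily",
        catchup=False,
    ):
        ingest = PythonOperator(task_id="ingest_raw", python_callable=ingest_raw)
        transform = PythonOperator(task_id="build_silver", python_callable=build_silver)
        ingest >> transform
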
Data Engineer (Freelance) at Aprobo AI
August 1, 2025 - Present
Developed a production-grade AI chatbot with a retrieval-augmented generation (RAG) architecture, Kafka-based event streaming to decouple microservices, Dockerized FastAPI components, and vector embeddings in ChromaDB. Implemented memory/context carryover for natural multi-turn interactions, enabling scalable customer support with minimal human intervention. Deployed as containerized services and orchestrated the message flow from inbound queries through retrieval and LLM inference.
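
A minimal sketch of the retrieval step in a RAG endpoint of this kind, using FastAPI and ChromaDB; the collection name, route, and payload shape are hypothetical, and the actual LLM call and Kafka plumbing are omitted.

    # Minimal FastAPI + ChromaDB retrieval sketch for a RAG-style endpoint.
    # Collection name, route, and payload shape are hypothetical placeholders.
    import chromadb
    from fastapi import FastAPI
    from pydantic import BaseModel

    app = FastAPI()
    client = chromadb.PersistentClient(path="./chroma")
    collection = client.get_or_create_collection("support_docs")

    class Query(BaseModel):
        question: str

    @app.post("/chat")
    def chat(query: Query):
        # Fetch the chunks most relevant to the incoming question.
        hits = collection.query(query_texts=[query.question], n_results=3)
        context = "\n".join(hits["documents"][0])
        # The real service would pass this context plus the question to an LLM;
        # here the retrieved context is returned directly as a stand-in.
        return {"context": context}
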
Data Engineer (Freelance) at Reciva
August 1, 2025 - Present
Designed an event-driven document processing pipeline for automated receipt classification (vendor, amount, date, tax). Implemented a Redis-backed job queue to decouple Go-based upload services from Python OCR workers, developed Python OCR/parsing modules, and orchestrated ingestion, validation, and reconciliation with Airflow. Stored processed data in PostgreSQL and delivered a React UI for bulk uploads and manual corrections; containerized components with Docker.
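
A minimal sketch of the Python worker side of such a Redis-backed queue; the queue key, payload fields, and OCR call are hypothetical, and the Go upload service is assumed to push JSON jobs onto the same list.

    # Minimal Redis-backed worker sketch: consume receipt jobs pushed by the
    # upload service and hand them to OCR/parsing. All names are hypothetical.
    import json
    import redis

    r = redis.Redis(host="localhost", port=6379, db=0)

    def process_receipt(payload: dict) -> None:
        # Placeholder for OCR + parsing of vendor, amount, date, and tax fields.
        print(f"processing receipt {payload.get('receipt_id')}")

    while True:
        # BLPOP blocks until the upload service pushes the next job onto the list.
        _, raw = r.blpop("receipt_jobs")
        process_receipt(json.loads(raw))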

Industry Experience

Software & Internet, Professional Services, Media & Entertainment
