

Shaileja Kuthuru

Data Scientist, Data Analyst, Developer, +2






Available to hire

I am a data engineer with over 3 years of experience owning production ETL/ELT pipelines on Databricks and AWS. I have designed Spark-based data pipelines processing 5M+ records/day across finance and reporting domains with a 99.9% job SLA, and I have implemented Delta Lake schemas, partitioning strategies, and automated data quality controls to ensure reliable analytics workloads.

I prioritize production reliability, observability, CI/CD, and cost-aware performance tuning. I enjoy collaborating with business stakeholders to translate reporting requirements into scalable data models and reusable pipelines, and I contribute to on-call rotations, incident RCA, and permanent fixes to reduce repeat issues.

Skills

Experience Level

Expert

Expert

Expert

Expert

Expert

Expert

Expert

Expert

Expert

Expert

Expert

Expert

Expert

Expert

Expert

Intermediate

Work Experience

Data Engineer at Citco Group Limited

January 1, 2024 - Present

Owned 6+ production ETL/ELT pipelines on Databricks (Spark + Delta Lake), processing 5M+ records/day across finance and reporting domains with 99.9% SLA. Designed Delta Lake schemas, partitioning strategies, and incremental MERGE-based upserts to support scalable analytics workloads. Implemented an automated data quality framework (schema validation, null thresholds, reconciliation checks), reducing recurring production defects by ~85%. Tuned Spark SQL workloads (partition pruning, join strategies, caching), reducing average query latency by ~30% (8.0s to 5.6s) and lowering compute cost. Built monitoring and alerting using CloudWatch; participated in on-call rotation, performed RCA, and delivered permanent fixes to reduce repeat incidents. Standardized pipeline templates and reusable modules, cutting manual effort by ~60% and saving ~120 engineering hours/month. Managed CI/CD using Git + Jenkins, enforcing code reviews and automated testing to improve deployment reliability across environments.

Python Data Engineer at Getida (An SIB Company)

November 1, 2022 - December 1, 2023

Built Python + SQL data pipelines processing 10M+ records across multiple data sources, delivering analytics-ready datasets for finance and operations teams. Automated recurring data preparation and reporting workflows, reducing processing time by ~70% and saving ~15 hours/week of analyst effort. Designed SQL data models and Power BI dashboards supporting 10+ KPIs used in executive weekly/monthly reviews and operational decision-making. Implemented reconciliation logic and audit checks achieving 99.5% accuracy across multi-source reporting pipelines. Developed NLP-based sentiment analysis on 50K+ customer reviews, driving product improvements that increased customer satisfaction by ~18%. Partnered with business stakeholders to translate reporting requirements into scalable data models and reusable pipelines.

Python Developer (Product & Insights) at NJR Infotech Private Limited

August 1, 2021 - July 1, 2022

Developed Python-based REST APIs handling 200+ automated requests/day for internal reporting and analytics applications. Built extraction and transformation pipelines using Python and SQL, producing standardized datasets consumed by analysts and business teams. Optimized SQL queries and backend processing, improving report stability and reducing long-running queries. Led code reviews and improved internal documentation, reducing onboarding time and improving long-term maintainability.

Python Intern at Defence Research and Development Organisation (DRDO)

January 1, 2021 - March 1, 2021

Wrote Python scripts to automate cleaning, analysis, and visualization of experimental datasets (500K+ data points), improving analysis efficiency by ~40%. Built KPI dashboards and reporting outputs (Power BI/Tableau) for research tracking and project reviews.