Available to hire
Hi, I’m Anjana Jha. I’m a Lead Big Data Engineer with 12+ years of experience delivering end-to-end data platforms that combine streaming, batch, ML, and analytics. I’ve helped financial services, healthcare, energy, and telecom organizations modernize legacy systems and migrate workloads to the cloud.
I thrive on turning complex requirements into secure, scalable solutions using cloud-native, hybrid architectures across AWS, Azure, GCP, and on-prem Hadoop. I enjoy mentoring teams, collaborating with analysts and data scientists, and continuously improving data governance, security, and operational excellence.
Skills
Language
English
Fluent
Work Experience
Lead Cloud & Data Engineer at Exelon Corporation
January 1, 2023 - Present
- Architected secure, high-throughput streaming pipelines using Apache Kafka (on-prem) and Azure Event Hubs to ingest real-time meter and grid telemetry (>150K events/sec) with minimal latency; implemented end-to-end encryption and access controls (TLS, SASL, and Azure RBAC).
- Built a centralized data lake on Azure Data Lake Storage Gen2 and curated warehouse layers in Azure Synapse, enabling sub-second queries and a <5-minute SLA for analytics.
- Optimized Databricks Spark ETL to process streaming data into analytics-ready tables, cutting batch times ~40% while handling >1 PB of historical data.
- Orchestrated pipelines with Airflow and Azure Data Factory; implemented data lineage; performed large-scale backfills on autoscaling Databricks clusters with spot instances, reducing compute costs by ~50%.
- Enforced governance with Azure AD RBAC and NSGs; codified infrastructure with Terraform and ARM templates; monitored end-to-end with Azure Monitor.
- Implemented serverless Azure Functions to automate …
Senior Data & Cloud Engineer at Ameriprise Financial
May 1, 2020 - December 1, 2022
- Designed end-to-end ETL pipelines using Azure Data Factory and Azure Databricks (PySpark/Scala) to ingest, cleanse, and standardize data into an Azure Data Lake.
- Optimized Spark jobs with partitioning and caching, reducing daily processing times by >40%.
- Modeled data warehouses in Azure Synapse Analytics, later integrating with Snowflake hosted on AWS for advanced analytics.
- Implemented secure cross-cloud data access policies (Azure RBAC and AWS IAM/KMS) with encryption in transit and at rest; automated policy audits via CI/CD.
- Established proactive monitoring with Azure Monitor and AWS CloudWatch; automated deployments with Azure DevOps, Terraform, and ARM; trained teams on Databricks and Snowflake, boosting release velocity by ~60%.
Senior Cloud & Data Engineer at US Cellular
February 1, 2018 - April 1, 2020
- Designed ETL with AWS Glue, EMR, and PySpark/Scala, transforming terabytes of telecom data daily.
- Built real-time streaming pipelines with Kafka and Spark Streaming on EMR for low-latency insights.
- Designed data lakes on S3 with partitioning; enabled efficient querying via Athena and Redshift Spectrum; modeled enterprise data warehouses in Redshift.
- Implemented AWS Step Functions and AWS Data Pipeline for orchestration with robust error handling.
- Monitored Spark jobs, crawlers, and query performance with CloudWatch; delivered Power BI dashboards backed by Redshift and Athena.
- Authored runbooks and best practices; achieved ~50% ETL runtime reduction and supported telecom compliance requirements.
Data Engineer at Amgen
February 1, 2015 - January 1, 2018
- Designed scalable ingestion pipelines with Kafka and Pub/Sub; used PySpark on Dataproc and Dataflow to clean and transform petabyte-scale genomic and clinical trial data.
- Used Cloud Dataprep for pre-processing and anomaly detection.
- Integrated on-prem HDFS/Hive/NiFi with GCP services during a phased cloud migration.
- Modeled schemas in BigQuery for low-latency multi-terabyte queries; deployed Bigtable and HBase for scalable semi-structured storage; implemented Cloud Functions for event-driven transformations.
- Wrote Spark SQL queries for financial and clinical trial data extraction; implemented CI/CD with Jenkins and Git.
- Tuned Kafka clusters for throughput and exactly-once ingestion.
- Ensured data governance with IAM and VPC Service Controls; delivered BI dashboards in Tableau and Power BI for compliance and research.
Hadoop & Big Data Engineer at Nike
January 1, 2014 - January 1, 2015
- Administered Hadoop clusters (Cloudera CDH, Hortonworks HDP) and migrated MapReduce jobs to Spark (PySpark/Scala) for retail analytics.
- Built high-throughput Kafka pipelines for real-time e-commerce events with secure ACLs.
- Designed data lakes on S3 with partitioned schemas; queried via Athena and Redshift Spectrum; modeled warehouses in Redshift; delivered dashboards in Power BI.
- Implemented Oozie-based ETL; used Hive and Impala for interactive analytics.
- Built disaster recovery and data quality checks; established CI/CD pipelines with Jenkins and Git; performed data migrations to ensure resilience and scalability.
BI Developer at Cosm Inc.
March 1, 2013 - December 1, 2013
- BI Developer supporting a major financial services client.
- Collected, cleaned, and integrated data from multiple sources; built Tableau dashboards; performed data analysis to identify trends and opportunities.
- Developed predictive models in Python to forecast customer churn and assess credit risk.
Education
Qualifications
Master of Science in Data Science and Computational Intelligence
Bachelor of Computer Applications (BCA)
Cloudera Certified Associate (CCA) Data Analyst
Confluent Certified Developer for Apache Kafka (CCDAK) Associate
Industry Experience
Energy & Utilities, Financial Services, Healthcare, Retail, Telecommunications, Life Sciences, Software & Internet, Professional Services, Manufacturing