I am a Sr. Python / PySpark Developer with 9+ years of backend/data engineering experience, focused on building GenAI-enabled data pipelines, LLM integrations, and scalable ETL architectures. I design and deploy production-grade AI-powered data platforms in regulated environments, emphasizing safety, privacy, and observability. My work blends hands-on engineering with architectural leadership to deliver grounded, scalable solutions. I thrive in cross-functional teams, translating business needs into robust data & AI workflows, mentoring colleagues, and championing responsible AI practices in healthcare and financial services projects.

Venkanna Babu Kolla

Available to hire

Experience Level

Expert

Work Experience

Sr. Python Developer/Agentic AI Engineer at Prudential Financial
September 1, 2023 - Present
Designed GenAI architectures on AWS EMR; integrated OpenAI (GPT-4) and Cohere APIs; built multi-agent workflows with LangGraph/AutoGen for document analysis and data enrichment; implemented retrieval-augmented generation with FAISS/ChromaDB for grounded responses; exposed outputs via secure FastAPI microservices; established guardrails, schema validation, and monitoring; implemented CI/CD and automated security validation.
Python Developer/Agentic AI Engineer at Dignity Health
August 1, 2022 - September 1, 2023
Architected AI-enabled ETL pipelines on Azure for healthcare data enrichment; deployed GenAI-enabled APIs with FastAPI; implemented HL7 v2 to FHIR transformations and PHI masking; designed multi-step clinical document processing workflows with LangGraph; ensured HIPAA compliance and monitoring; contributed to real-time data interoperability.
PySpark Developer at Amazon
September 1, 2019 - August 1, 2021
Developed customer segmentation models; built CI/CD pipelines for EMR deployments; implemented CDC with Debezium and Kafka; tuned Spark jobs with broadcast joins, partitioning, and caching; integrated data into Redshift, Snowflake, and PostgreSQL; built scalable PySpark ETL pipelines and data serving components.
PySpark Developer at Technologies Innovations Pvt Ltd-India
August 1, 2017 - September 1, 2019
Developed reusable PySpark ETL framework for metadata-driven ingestion; migrated on-prem Informatica workloads to PySpark on AWS EMR; implemented data lake governance, SCD/CDC patterns with Delta Lake; built streaming pipelines using Kafka/Kinesis; enabled metadata lineage and CI/CD.
Software Developer at Hudda InfoTech Private Limited
June 1, 2016 - May 1, 2017
Developed multithreaded PySide6 GUI applications; built Django web services and real-time fraud detection pipelines with PySpark; automated testing with PyTest; implemented SSO and web integrations; contributed to Terraform/AWS-based deployments.
Python/PySpark Developer at Hexxen
August 1, 2021 - June 1, 2022
Developed PySpark ETL pipelines; built real-time streaming with Kafka/AWS Kinesis; created reusable PySpark libraries and Golang utilities; automated validation of SCD Type 1 & 2; orchestrated Spark jobs via Airflow; implemented metadata governance and data exploration using OpenSearch.

Education

Bachelor's in Electronics and Communication Engineering at Jain University, India
Until January 1, 2017
Master’s in Information Technology and Management at Webster University, USA
Until January 1, 2022

Qualifications

PySpark Certification Associate Developer
February 17, 2026

Industry Experience

Healthcare, Financial Services, Software & Internet, Professional Services, Life Sciences