Available to hire
I’m an AIML engineer with over 4 years of hands-on experience designing, training, and deploying machine learning and generative AI systems across enterprise and healthcare domains.
I’m proficient in Python, PyTorch, Airflow, and AWS. I’ve built predictive models using XGBoost and Random Forest, developed NLP pipelines, and optimized LLM inference with vLLM, TensorRT-LLM, and DeepSpeed. At MetLife I boosted large-language-model throughput by 42% while reducing GPU costs through optimized multi-GPU orchestration.
Experience Level
Work Experience
AIML Engineer - LLM Optimization & Inference Platform at MetLife
September 1, 2024 - November 27, 2025
Led optimization of large-language-model inference using vLLM and TensorRT-LLM, reducing latency by 42% for 3B-parameter variants. Deployed quantized INT8 GPTQ pipelines with KV-cache streaming and head pruning, cutting GPU memory by 35% and enabling parallel serving on commodity A100 clusters. Integrated DeepSpeed MII for multi-GPU orchestration and gradient-free inference, achieving near-linear scalability on 8-GPU nodes via NCCL optimization and CUDA Graphs. Built an inference monitoring stack with Triton Server, Prometheus, and CloudWatch for real-time latency metrics and auto-scaling. Converted fine-tuned models to ONNX Runtime for edge-ready deployments, accelerating token throughput by 1.7x. Developed modular retrieval and prompt-evaluation workflows using LangChain aligned with compliance. Automated CI/CD for model packaging and rollout via Docker and GitHub Actions, shortening release cycles from weekly to daily and boosting deployment reliability by 50%.
Machine Learning Engineer - Predictive Health Risk Modeling & ML Pipeline Automation Platform at Sage Softtech
July 1, 2023 - July 1, 2023
Designed and deployed patient readmission risk models using Random Forest and XGBoost on 50M+ historical claims and EHR data, improving early detection of high-risk patients by 18%. Built an end-to-end ML pipeline in Python and Airflow automating data ingestion, feature generation, model training, and evaluation, which reduced retraining time from 4 hours to under 45 minutes. Deployed inference APIs via FastAPI and Docker on AWS EC2, integrating real-time predictions into the care management dashboard accessed by 200+ physicians daily. Implemented model lifecycle tracking and drift monitoring with MLflow and Prometheus, enabling reproducible experiments and timely retraining. Processed unstructured physician notes with spaCy and NER to extract comorbidities, medication patterns, and discharge summaries; the enriched data boosted predictive AUC by 0.09 and informed chronic-care management programs.
Education
Master of Science in Artificial Intelligence at Yeshiva University
January 11, 2030 - May 1, 2025
Qualifications
Industry Experience
Healthcare, Software & Internet, Professional Services