Available to hire
Hi, I’m Omar Shehab, a computer science and applied mathematics student at NYU Abu Dhabi. I am passionate about natural language processing, machine learning, and building end-to-end systems that blend linguistics and data science.
I thrive in collaborative research environments and enjoy turning complex data into practical insights. Through my work across CAMEL Lab, Kirmizialtin Lab, and the Computer-Human Intelligence Lab, I have developed scalable ML pipelines, robust software, and a taste for tackling real-world NLP and ML challenges.
Skills
Experience Level
Expert
Expert
Expert
Expert
Expert
Expert
Expert
Expert
Expert
Expert
Expert
Expert
Expert
Expert
Intermediate
Intermediate
Intermediate
Language
English
Fluent
Work Experience
Software Engineering Intern at Kirmizialtin Lab
May 1, 2025 - Present
Designed and implemented distributed computing algorithms for large-scale molecular data processing, using parallel processing to optimize performance on datasets of over 10,000 molecular structures. Built a scalable data pipeline with community detection algorithms that achieved a 0.94 modularity score. Developed fault-tolerant systems for molecular simulation workflows, with error handling and recovery mechanisms for reliable processing of critical scientific computations.
Software Engineering Intern at Computer-Human Intelligence Lab
April 1, 2024 - Present
Engineered a scalable video processing system with a multi-tiered architecture supporting speaker diarization and real-time transcription, serving concurrent users with 95% uptime and low-latency response. Optimized distributed system performance through 4-core parallel processing and caching strategies, achieving a 70% performance improvement and a 40% cost reduction in production. Implemented a 3-layer caching architecture (Memory → Redis → PostgreSQL) with fault-tolerance mechanisms, reaching a 95% cache hit rate under high-throughput query processing. Designed and ran system evaluations with 20+ users, demonstrating a 67%-89% reduction in query response time through algorithmic optimization and system design improvements.
NLP Research Assistant at CAMEL Lab, NYU Abu Dhabi
January 1, 2025 - October 28, 2025
Built a TF-IDF pipeline with scikit-learn (character/word n-grams, OneVsRestClassifier, MultinomialNB) to predict dialect neutrality across 26 MADAR dialects. Analyzed dialect confusability with Shannon entropy metrics and built visualization pipelines for regional clustering heatmaps. Collaborated in a 2-person modeling team within a 6-person research group on feature engineering and evaluation.
Machine Learning Engineering Intern at Kirmizialtin Lab
May 1, 2025 - October 28, 2025
Engineered iterative feature selection for 6,000 MOFs, reducing 2,200 features to 300 via ensemble learning (Random Forest, XGBoost) across 4 rounds. Shortened downstream regressor iteration cycles from 1 month to weeks, enabling faster experimentation. Collaborated in a 4-person subteam within a 15-person lab, coordinating outputs with the regressor team.
Software Engineering Intern at Computer-Human Intelligence Lab
October 1, 2024 - October 1, 2024
Built a RAG system with three-tier caching (L1 in-memory, L2 Redis, L3 Postgres), achieving a 95.8% hit rate and queries under 500 ms. Developed a Node.js streaming pipeline that processes 1-hour videos in ~157 s with 4 parallel jobs, cutting processing time by 65% and costs to under $2 per dataset. Designed and implemented the end-to-end system architecture independently, from requirements to deployment.
Education
B.Sc. at New York University (NYU) Abu Dhabi
September 1, 2022 - July 23, 2025
B.Sc. at New York University (NYU) Abu Dhabi
September 1, 2022 - August 6, 2025
B.Sc. Computer Science & Applied Mathematics at New York University Abu Dhabi
September 1, 2022 - May 1, 2026
Qualifications
Advancement Opportunity Grant
January 11, 2030 - October 28, 2025
HackHarvard at Harvard University
October 1, 2025 - October 28, 2025
Summer Research Grant
June 1, 2025 - October 28, 2025
Industry Experience
Software & Internet, Computers & Electronics, Life Sciences, Education