Meenakshi Verma

Available to hire

Hello,

I’m a Hindi linguist, voice professional, and AI data specialist with experience contributing both OCR-ready Hindi text-image datasets and Hindi/Indian English voice data for AI model training.

On the OCR side, I create and annotate Hindi (Devanagari) text images suitable for training and evaluating OCR systems. This includes clean image capture, accurate ground-truth transcription, and structured annotations aligned with model-training requirements.

On the voice / TTS side, I’m a native Hindi female speaker with extensive experience recording high-quality Hindi and Hinglish speech data for text-to-speech, speech recognition, IVR, and AI voice systems. I’ve delivered enterprise-grade voice datasets following strict audio, consistency, and script-adherence guidelines.

What I offer:
• Hindi OCR datasets – printed text images with verified transcriptions and optional line- or word-level bounding boxes
• Hindi & Hinglish voice datasets – neutral and expressive female voice suitable for TTS and ASR
• Strong understanding of AI data quality standards, accuracy checks, and common OCR/speech failure cases
• Ability to deliver pilot samples first, then scale to larger structured datasets
• Clear documentation and unambiguous usage rights for commercial AI training

In addition, I’ve worked as an AI trainer and response evaluator, which helps me align data creation with real model-training needs rather than surface-level annotation.

I’d be happy to share samples or discuss dataset specifications such as duration, annotation format, or delivery structure.

Thank you for your time, and I look forward to collaborating.

Best regards,
Meenakshi Verma
Hindi Linguist | OCR & Voice TTS Data Contributor

Language

Hindi: Fluent
English: Fluent

Work Experience

Head of Department (English) at Lotus Valley International School
April 1, 2010 - February 28, 2023
Mentored 60+ teachers and 10,000+ students. Created ESOL and English assessments for multiple organizations. Supported a Canadian firm in developing CELBAN tests. Created TESOL assessments and taught learners for Progression Formation (France). Taught English to students from Algeria, France, Spain, and Russia. Designed an English curriculum (ages 6–12) for Awesome Productions (Singapore). Assessed English audio for candidate evaluation. Worked as a spoken English educator with Fluency Talks (Brazil). Served as a resource person for educational workshops.
Expert – New Business & Courses at Fluency Academy
July 1, 2022 - February 28, 2023
Market research for India, course design, lesson planning, scripting, proofreading, and video production.
Prompt Engineer, Response Evaluator at DataAnnotation.tech
November 1, 2023 - May 1, 2024
Annotated and labeled over 5,000 Hindi data points following strict quality guidelines. Trained 3 ML models. Worked on research, safety, factuality, and brevity models.
Data Annotation Expert (Hindi – Visual Cultural Knowledge) at SALT Lab, Stanford University
November 1, 2024 - January 1, 2025
Annotated data related to cultural practices, traditions, and insights, ensuring accuracy and relevance.
AI Trainer at Turing
October 1, 2025 - Present
AI training, evaluation, and prompt design for language models; micro tasks.
Content Review Expert at Mercor
October 1, 2025 - Present
Content quality review: safety, factual accuracy, readability.

Education

Diploma in Creative Writing at Scholastic Publishers
Certification in Public Speaking at University of Washington
TESL at Arizona State University
Course in Prompt Engineering at Udemy
Data Science: Building Machine Learning Models at Harvard University

Qualifications

Master's in English
Master's in Education
Bachelor's in Science
Post Graduate Diploma in Computer Applications
Diploma in Creative Writing
Certification in Public Speaking
TESL
Course in Prompt Engineering
Data Science: Building Machine Learning Models

Industry Experience

Education, Media & Entertainment, Software & Internet, Professional Services