
Meenakshi Verma

Voiceover Artist, Writer, Content Creator, +2






Available to hire

Noida, India

Hello,

I’m a Hindi linguist, voice professional, and AI data specialist with experience contributing both OCR-ready Hindi text-image datasets and Hindi/Indian English voice data for AI model training.

On the OCR side, I create and annotate Hindi (Devanagari) text images suitable for training and evaluating OCR systems. This includes clean image capture, accurate ground-truth transcription, and structured annotations aligned with model-training requirements.

On the voice / TTS side, I’m a native Hindi female speaker with extensive experience recording high-quality Hindi and Hinglish speech data for text-to-speech, speech recognition, IVR, and AI voice systems. I’ve delivered enterprise-grade voice datasets following strict audio, consistency, and script-adherence guidelines.

What I offer:
• Hindi OCR datasets – printed text images with verified transcriptions and optional line- or word-level bounding boxes
• Hindi & Hinglish voice datasets – neutral and expressive female voice suitable for TTS and ASR
• Strong understanding of AI data quality standards, accuracy checks, and common OCR/speech failure cases
• Ability to deliver pilot samples first, then scale to larger structured datasets
• Clear documentation and clearly defined usage rights for commercial AI training

In addition, I’ve worked as an AI trainer and response evaluator, which helps me align data creation with real model-training needs rather than surface-level annotation.

I’d be happy to share samples or discuss dataset specifications such as duration, annotation format, or delivery structure.

Thank you for your time, and I look forward to collaborating.

Best regards,
Meenakshi Verma
Hindi Linguist | OCR & Voice TTS Data Contributor

Skills

Copywriting (Experience Level: Expert)

Language

Hindi: Fluent
English: Fluent

Work Experience

Head of Department (English) at Lotus Valley International School

April 1, 2010 - February 28, 2023

Mentored 60+ teachers and 10,000+ students. Created ESOL and English assessments for multiple organizations. Supported a Canadian firm in developing CELBAN tests. Created TESOL assessments and taught learners for Progression Formation (France). Taught English to students from Algeria, France, Spain, and Russia. Designed an English curriculum (ages 6–12) for Awesome Productions (Singapore). Assessed English audio for candidate evaluation. Spoken English educator for Fluency Talks (Brazil). Resource person for educational workshops.

Expert – New Business & Courses at Fluency Academy

July 1, 2022 - February 28, 2023

Market research for India, course design, lesson planning, scripting, proofreading, and video production.

Prompt Engineer, Response Evaluator at DataAnnotation.tech

November 1, 2023 - May 1, 2024

Annotated and labeled over 5,000 Hindi data points following strict quality guidelines. Trained 3 ML models. Worked on research, safety, factuality, and brevity models.

Data Annotation Expert (Hindi – Visual Cultural Knowledge) at SALT Lab, Stanford University

November 1, 2024 - January 1, 2025

Annotated data related to cultural practices, traditions, and insights, ensuring accuracy and relevance.

AI Trainer at Turing

October 1, 2025 - Present

AI training, evaluation, and prompt design for language models; micro tasks.

Content Review Expert at Mercor

October 1, 2025 - Present

Content quality review: safety, factual accuracy, readability.