LLM Evaluation Engineer (Remote)

13 - 23 years

9 - 19 Lacs

Posted: 14 hours ago | Platform: Naukri

Work Mode: Remote

Job Type: Full Time

Job Description

Crossing Hurdles

Position: LLM Evaluation Engineer (Remote)

Role Responsibilities:

  • Frame and design high-quality machine learning tasks to enhance LLM capabilities.
  • Build and optimize ML models for NLP, classification, prediction, recommendation, or generative tasks.
  • Run rapid experimentation cycles, evaluate model performance, and iterate on improvements.
  • Conduct advanced feature engineering and preprocessing for large-scale datasets.
  • Implement adversarial testing, robustness checks, and bias evaluations.
  • Fine-tune, evaluate, and deploy transformer-based models when required.
  • Create datasets, evaluation rubrics, and benchmarking pipelines for ML tasks.
  • Maintain documentation for experiments, datasets, and modelling decisions.
  • Stay updated with cutting-edge ML research, tools, and competition-grade methodologies.

Requirements:

  • 3–5+ years of experience in machine learning model development.
  • Degree in Computer Science, Engineering, Statistics, Mathematics, or a related field.
  • Proven competitive ML background (Kaggle/DrivenData) with medals or strong rankings preferred.
  • Strong proficiency in Python with PyTorch/TensorFlow.
  • Solid understanding of ML fundamentals: statistics, optimisation, and model evaluation.
  • Experience building reproducible ML pipelines and experiment tracking.
  • Familiarity with benchmarking, scoring methodologies, and evaluation frameworks.
  • Experience with cloud environments (AWS/GCP/Azure).
  • Strong problem-solving ability, analytical mindset, and clear communication skills.
  • Fluency in English.

Preferred:

  • Kaggle Master/Grandmaster or multiple gold medals.
  • Experience with LLMs, generative models, or multimodal learning.
  • Knowledge of vector databases, distributed training, and scalable deployments.
  • MLOps exposure (W&B, MLflow, Airflow, Docker).
  • Publications, open-source contributions, or technical writing.
  • Prior mentorship/leadership experience.

Application Process:

  1. Apply for the job role.
  2. Await the official message/email from our recruitment team (typically within 1–2 days).

Crossing Hurdles

Consulting

Atlanta
