Python Developer (AI & Data Evaluation)

Experience: 3 years

Salary: 0 Lacs

Posted: 17 hours ago | Platform: LinkedIn


Work Mode: Remote

Job Type: Contractual

Job Description

Job Summary

We are partnering with one of the foundational Large Language Model (LLM) companies to help enhance next-generation AI systems. As a Python Developer, you will play a critical role in generating high-quality proprietary datasets, designing evaluation frameworks, and refining AI outputs. This role focuses on data-driven contributions to model fine-tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF), enabling measurable improvements in LLM performance and reliability.


Please note that this is a contractual position with an immediate joining requirement.


Key Responsibilities

  • Develop and maintain high-quality Python code for dataset creation, evaluation, and automation.
  • Design and execute evaluation strategies (Evals) to benchmark AI model performance.
  • Generate, rank, and critique AI responses across technical and general domains.
  • Build task-specific datasets for Supervised Fine-Tuning (SFT) and support RLHF pipelines.
  • Collaborate with annotators, researchers, and product teams to refine reward models.
  • Provide clear, well-documented rationales for model evaluations and feedback.
  • Conduct peer reviews of code and documentation, driving adherence to best practices.
  • Continuously explore new tools and methods to enhance AI training workflows.
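The evaluation work described above can be pictured with a minimal sketch. Everything here is illustrative: `model_answer` is a hypothetical stand-in for a real LLM call, and exact-match scoring is just one simple benchmarking strategy, not the client's actual framework.

```python
# Minimal eval-harness sketch. `model_answer` stands in for a real
# LLM call; the task set and exact-match metric are toy examples.

def model_answer(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call; returns canned replies."""
    canned = {"2 + 2 = ?": "4", "Capital of France?": "Paris"}
    return canned.get(prompt, "unknown")

def run_eval(tasks: list[dict]) -> float:
    """Score exact-match accuracy of model answers against references."""
    correct = sum(
        1 for t in tasks if model_answer(t["prompt"]).strip() == t["reference"]
    )
    return correct / len(tasks)

tasks = [
    {"prompt": "2 + 2 = ?", "reference": "4"},
    {"prompt": "Capital of France?", "reference": "Paris"},
]
accuracy = run_eval(tasks)
print(accuracy)  # fraction of tasks answered exactly right
```

Real eval frameworks replace exact match with graded rubrics, model-based judges, or pairwise ranking, but the loop (prompt → answer → score → aggregate) stays the same shape.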


Required Skills and Experience

  • 3+ years of strong hands-on experience with Python.
  • Proficiency in multi-threading, async programming, and debugging concurrency/memory issues.
  • Strong knowledge of Python testing frameworks (unit, integration, property-based testing).
  • Ability to refactor code and work with architectural patterns.
  • Industry experience in maintaining code quality, formatting, and clean design.
  • Excellent analytical and reasoning skills to evaluate LLM outputs.
  • Fluency in written and spoken English.
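As an illustration of the async style listed above, the sketch below scores several candidate responses concurrently with `asyncio.gather`. The `score_response` coroutine and its toy length-based score are assumptions for the example, not part of the role's actual tooling.

```python
import asyncio

async def score_response(response: str) -> float:
    """Hypothetical I/O-bound grading call (latency simulated)."""
    await asyncio.sleep(0.01)
    return min(len(response) / 10, 1.0)  # toy length-based score

async def score_all(responses: list[str]) -> list[float]:
    # gather() schedules all coroutines concurrently on one event loop,
    # so total wall time is roughly one call's latency, not the sum.
    return await asyncio.gather(*(score_response(r) for r in responses))

scores = asyncio.run(score_all(["short", "a much longer response"]))
print(scores)  # one score per candidate, in input order
```

In practice the same pattern applies to any I/O-bound fan-out (API calls to a model endpoint, dataset fetches), which is where async proficiency matters most in evaluation pipelines.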


Type of Projects & Hands-On Experience

  • AI Training Data Generation: Writing code, prompts, and responses for SFT.
  • Evaluation Frameworks: Designing processes to measure and benchmark model accuracy, safety, and alignment.
  • RLHF Projects: Comparing outputs of different LLM versions, ranking quality, and providing human feedback.
  • Production-Quality Coding: Writing maintainable, tested, and scalable Python solutions.


Expected depth: hands-on coding, dataset design, evaluation strategy creation, and active contribution to LLM training loops.
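The SFT and RLHF records described above are commonly stored one per line as JSONL. The field names below (`prompt`/`completion` for SFT, `chosen`/`rejected` for preference pairs) follow widespread convention but are assumptions here, not the client's actual schema.

```python
import json

# Hedged sketch of typical training-data records. Field names are
# conventional, not a confirmed schema.

sft_record = {
    "prompt": "Write a Python function that reverses a string.",
    "completion": "def reverse(s: str) -> str:\n    return s[::-1]",
}

rlhf_pair = {
    "prompt": "Explain list comprehensions.",
    "chosen": "A concise, correct explanation with a worked example.",
    "rejected": "A vague answer with no example.",
}

# Serialize one record per line (JSONL), the usual on-disk format.
line = json.dumps(sft_record)
print(line)
```

Ranking and critique work then operates over many such records: an annotator or script compares `chosen` against `rejected` and the judgment feeds the reward model.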


Preferred Qualifications

  • Experience with AI/ML workflows (fine-tuning, eval pipelines, reward models).
  • Familiarity with PyTorch, Hugging Face, or similar ML frameworks.
  • Exposure to AI ethics, alignment research, or model safety practices.
  • Advanced degree in Computer Science, Data Science, or a related field (optional).



Location & Shift Details

  • Remote (Global) – fully distributed team.
  • Flexible engagement with a required 4-hour overlap with PST.
  • Options available: 20, 30, or 40 hours/week.
