Backend Engineer (2-Month Contract)

0 years

0 Lacs

Posted: 2 weeks ago | Platform: LinkedIn

Work Mode

Remote

Job Type

Contractual

Job Description

Computer Vision & Backend Engineer (60-Day Build)

Company:

Type:

Location:


Mission (60 days)


Deliver a production-ready photo recognition system that powers a calorie-counting app end-to-end:

  • Upload → Analyze → Nutrition: From a food photo, return { name, grams, confidence, tags, ingredients, macros } per item, with meal totals and remaining daily targets.


  • Retraining option: Design and ship the infrastructure that learns from user corrections (renames, grams/macros edits) and can retrain/evaluate safely.



What you will build (end-to-end scope)


  • Public APIs


  • POST /api/vision/upload (multipart JPEG/PNG/WebP) → { name, grams, confidence, tags }[]


  • POST /api/coach/photo → persist image, call vision, run lookupFood, return items, meal totals, remainingDaily, and coachReply
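To make the /api/coach/photo response concrete, here is a minimal sketch of how meal totals and remainingDaily might be derived from the recognized items. The macro keys, the `mealTotals` field name, and the `daily_targets` / `consumed_today` inputs are assumptions for illustration; the actual app contract is defined elsewhere, and `coachReply` is left out.

```python
from dataclasses import dataclass, asdict

# Assumed macro fields; the real contract may name or split these differently.
MACRO_KEYS = ("kcal", "protein_g", "carbs_g", "fat_g")


@dataclass
class Item:
    name: str
    grams: float
    confidence: float
    tags: list
    macros: dict  # macros already scaled to `grams`


def build_coach_response(items, daily_targets, consumed_today):
    """Sum item macros into meal totals, then subtract from the daily targets."""
    totals = {
        k: round(sum(i.macros.get(k, 0.0) for i in items), 1) for k in MACRO_KEYS
    }
    remaining = {
        k: round(daily_targets[k] - consumed_today.get(k, 0.0) - totals[k], 1)
        for k in MACRO_KEYS
    }
    return {
        "items": [asdict(i) for i in items],
        "mealTotals": totals,        # hypothetical field name
        "remainingDaily": remaining,
    }
```

A negative value in `remainingDaily` would mean the user is over target for that macro; whether to clamp it at zero is a product decision this sketch does not make.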


  • Food analysis (multi-cuisine)


  • Gate + Instances: YOLOv8/11 detect (food vs distractors) → YOLO-seg (retina masks)


  • Naming: SigLIP/CLIP (or compact ViT) on mask crops, synonyms/taxonomy aware


  • Safety: OOD detector + low-confidence suggestions; safe abstain (no hallucinations)
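One simple way to read "safe abstain" is sketched below: temperature-scale the classifier logits, then refuse to name the item when the top probability or the top-2 margin is too low, returning ranked suggestions instead. This is a max-softmax baseline, not necessarily the OOD detector the role will ship, and the temperature and thresholds are placeholder assumptions that would normally be fitted on a validation set. It assumes at least two candidate labels.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    m = max(logits)
    exps = [math.exp((x - m) / temperature) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def name_or_abstain(labels, logits, temperature=1.5,
                    min_conf=0.6, min_margin=0.15, k=3):
    """Return a name only when the prediction is confident and well separated."""
    probs = softmax(logits, temperature)
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    (top_label, top_p), (_, second_p) = ranked[0], ranked[1]
    confident = top_p >= min_conf and (top_p - second_p) >= min_margin
    return {
        "name": top_label if confident else None,  # None means abstain
        "confidence": top_p,
        "suggestions": [label for label, _ in ranked[:k]],
    }
```

Low-margin results from a function like this are also natural candidates for the feedback store described under the retraining loop.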


  • Portioning (grams)


  • Device-depth first (if present), monocular fallback (MiDaS/ZoeDepth), tabletop plane-fit, coverage %, density lookup (Redis), portion_source=device|mono|heuristic
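As a rough illustration of the last step in that chain, the sketch below turns per-pixel food heights (already measured above the fitted table plane, inside one item's mask) into grams using a density value. The function name, its inputs, and the idea that pixel footprint is uniform across the mask are simplifying assumptions; a real pipeline would account for camera intrinsics and perspective.

```python
VALID_SOURCES = {"device", "mono", "heuristic"}


def estimate_grams(heights_cm, pixel_area_cm2, density_g_per_cm3, portion_source):
    """Integrate height over the mask to get volume, then scale by density.

    heights_cm: height above the table plane for each pixel in the item mask.
    pixel_area_cm2: assumed constant footprint of one pixel on the table.
    """
    if portion_source not in VALID_SOURCES:
        raise ValueError(f"unknown portion_source: {portion_source}")
    # Clamp negatives: depth noise can place pixels slightly below the plane.
    volume_cm3 = sum(max(h, 0.0) for h in heights_cm) * pixel_area_cm2
    grams = volume_cm3 * density_g_per_cm3
    return {"grams": round(grams, 1), "portion_source": portion_source}
```

In this picture the density would come from the Redis lookup keyed by the canonical dish name, and `portion_source` records which depth path produced the heights.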


  • Nutrition & ingredients


  • Map labels → canonical taxonomy (≤400 dishes)


  • Query our nutrition DB or external sources (e.g., FDC) to assemble ingredients + per-ingredient macros, scale by grams, compute meal totals
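The "scale by grams" step can be sketched as below, assuming each canonical dish has a recipe expressed as weight fractions per ingredient and that the nutrition source returns macros per 100 g. Both of those shapes, and the function name, are assumptions for illustration rather than the schema of our DB or of FDC.

```python
def scale_ingredients(dish_grams, recipe, per_100g):
    """Split a dish into ingredients and scale per-100 g macros by weight.

    recipe: {ingredient: weight fraction of the dish}; fractions should sum to 1.
    per_100g: {ingredient: {macro_name: amount per 100 g}}.
    """
    if abs(sum(recipe.values()) - 1.0) > 0.01:
        raise ValueError("recipe fractions must sum to 1")
    ingredients, totals = [], {}
    for name, fraction in recipe.items():
        grams = dish_grams * fraction
        macros = {k: round(v * grams / 100.0, 1) for k, v in per_100g[name].items()}
        ingredients.append({"name": name, "grams": round(grams, 1), "macros": macros})
        for k, v in macros.items():
            totals[k] = round(totals.get(k, 0.0) + v, 1)
    return {"ingredients": ingredients, "macros": totals}
```

Summing the returned `macros` across all items in a photo gives the meal totals.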


  • Retraining loop (feedback → model)


  • Capture user edits & low-margin/OOD crops → store to ClickHouse/S3


  • Scripts & jobs to rebuild datasets, fine-tune, evaluate with metric gates, and publish new artifacts safely


  • Ops & safety


  • CI evaluator (Top-1/Top-5, OOD FP rate, Portion MAPE, latency SLOs) that blocks regressions


  • Observability: structured logs, per-stage ms, model/taxonomy versions


  • Privacy: consent gate, retention/“delete my images” flow
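For a sense of what the CI evaluator involves, here is a minimal sketch of Top-k accuracy, Portion MAPE, and a gate that lists every metric missing its target. The metric names and the example targets in the usage note are hypothetical, since the real targets are set at kickoff.

```python
HIGHER_IS_BETTER = {"top1", "top5"}
LOWER_IS_BETTER = {"ood_fp_rate", "portion_mape", "p95_ms"}


def top_k_accuracy(ranked_predictions, truths, k):
    """Fraction of samples whose true label is among the top-k predictions."""
    hits = sum(1 for ranked, truth in zip(ranked_predictions, truths)
               if truth in ranked[:k])
    return hits / len(truths)


def mape(predicted, actual):
    """Mean absolute percentage error, in percent; actual values must be non-zero."""
    return 100.0 * sum(abs(p - a) / a for p, a in zip(predicted, actual)) / len(actual)


def failed_gates(metrics, targets):
    """Return a human-readable line for every metric that misses its target."""
    failures = []
    for name, target in targets.items():
        value = metrics[name]
        if name in HIGHER_IS_BETTER:
            ok = value >= target
        elif name in LOWER_IS_BETTER:
            ok = value <= target
        else:
            raise ValueError(f"no direction defined for metric {name!r}")
        if not ok:
            failures.append(f"{name}: {value} vs target {target}")
    return failures
```

In a GitHub Actions job, a script could write the metrics to a JSON report and exit non-zero when `failed_gates(...)` is non-empty, which is what blocks the merge.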


60-Day milestone plan (acceptance-driven)

Week 1–2 (Foundation & API)

  • Stand up GPU FastAPI /infer-v2 + Node /api/coach/photo


  • Return stubbed payload matching contract; basic telemetry; dockerized


  • Demo: curl upload → JSON schema exactly matches app contract
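The Week 1–2 demo hinges on the stubbed payload matching the contract exactly. A check along these lines could run against the curl output; it assumes the /api/vision/upload item shape listed above, plus an assumed rule that confidence lies in [0, 1], which the posting does not state.

```python
import json

# Field -> accepted Python type(s) after json.loads; mirrors the upload contract.
ITEM_FIELDS = {"name": str, "grams": (int, float),
               "confidence": (int, float), "tags": list}


def contract_errors(payload):
    """Return a list of contract violations; an empty list means the shape matches."""
    if not isinstance(payload, list):
        return ["payload must be a JSON array"]
    errors = []
    for idx, item in enumerate(payload):
        if not isinstance(item, dict):
            errors.append(f"item {idx}: must be an object")
            continue
        if set(item) != set(ITEM_FIELDS):
            errors.append(f"item {idx}: keys {sorted(item)} != {sorted(ITEM_FIELDS)}")
            continue
        for key, expected in ITEM_FIELDS.items():
            value = item[key]
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, expected):
                errors.append(f"item {idx}: {key} has wrong type")
        conf = item["confidence"]
        if (isinstance(conf, (int, float)) and not isinstance(conf, bool)
                and not 0.0 <= conf <= 1.0):
            errors.append(f"item {idx}: confidence outside [0, 1]")
    return errors


# A stub response of the kind the Week 1-2 endpoint might return.
STUB = '[{"name": "stub", "grams": 0, "confidence": 0.0, "tags": []}]'
```

Calling `contract_errors(json.loads(STUB))` on the stub yields an empty list, while extra or missing keys are reported per item.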


Week 3–4 (Models & Portions)

  • YOLO gate+seg (export ONNX); CLIP/SigLIP naming with temperature scaling


  • Depth-aware grams (device depth) + mono fallback; density via Redis


  • Demo: multi-cuisine sample set returns names + grams within sanity bounds


Week 5 (Nutrition & Safety)

  • Taxonomy (≤400) + nutrition mapping (our DB / FDC)


  • OOD abstain with suggestions; ingredients + per-ingredient macros scaled by grams


  • Demo: App-ready payload { name, grams, confidence, tags, ingredients, macros } per item; meal totals & remainingDaily

Week 6–8 (Retraining + CI gates + Canary)

  • Feedback capture from user edits; dataset rebuild scripts; fine-tune path

  • Evaluator + CI gates (JSON report) and shadow/canary rollout toggles

  • Privacy & retention wired; runbook + handover docs

  • Final Demo (Day 60): end-to-end flow on staging GPU; retrain on a small corrected set; CI passes; canary toggle ready
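A canary toggle can be as small as a deterministic bucket per user, so the same person always hits the same model version while the rollout percentage is raised. The sketch below is one common pattern, not a prescribed design; the function name, the user-id key, and the 100-bucket split are assumptions.

```python
import hashlib


def route(user_id, canary_percent, canary_enabled=True):
    """Return "canary" for a stable slice of users, "stable" for everyone else."""
    if not canary_enabled or canary_percent <= 0:
        return "stable"
    # Hash the id so bucket assignment is uniform and repeatable across processes.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return "canary" if bucket < canary_percent else "stable"
```

Shadow mode would differ in that both versions run on the request and only the stable result is returned, with the candidate's output logged for comparison.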


Success metrics (set at kickoff; used by CI gate)

  • Quality: Top-1 on core ≥ target; OOD FP ≤ target; Portion MAPE ≤ target on depth images

  • Latency: p50 ≤ 350 ms, p95 ≤ 800 ms on our staging GPU

  • Reliability: CI gate prevents regressions; logs/metrics complete; consent & retention enforced
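The latency targets can be checked from a list of per-request timings. The sketch below uses the nearest-rank percentile definition, which is one of several in common use; which definition the benchmark packs use is not stated here, so treat that choice as an assumption.

```python
import math


def percentile(samples, q):
    """Nearest-rank percentile: smallest sample with at least q% of data at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def latency_slo_ok(samples_ms, p50_max=350.0, p95_max=800.0):
    """True when both the p50 and p95 latency targets are met."""
    return (percentile(samples_ms, 50) <= p50_max
            and percentile(samples_ms, 95) <= p95_max)
```

With 100 samples, p95 is the 95th smallest timing, so up to five slow outliers can exceed 800 ms without failing the check.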


Minimum qualifications

  • Shipped computer-vision systems to production (beyond notebooks)

  • YOLO detect/seg training or fine-tuning; export to ONNX/TensorRT and debug opsets/dynamic shapes

  • CLIP/SigLIP or ViT classifier work (fine-tune + temperature scaling); OOD thresholding

  • Depth pipelines (device + monocular), geometric reasoning (plane fitting, coverage)

  • Production APIs (FastAPI/Node), Redis/ClickHouse (or similar), Docker, GitHub Actions

  • Obs/ops: structured logging, latency profiling, privacy/retention patterns

Nice-to-haves

  • Triton Inference Server, FAISS/ANN, K8s/Helm, W&B/MLflow

  • Nutrition data integration (FDC or equivalent), taxonomy design


Tech you’ll touch

PyTorch, Ultralytics YOLOv8/11, SAM/SAM2, SigLIP/CLIP, MiDaS/ZoeDepth, ONNX Runtime (CUDA EP), TensorRT (nice-to-have), FastAPI, Node/Express, Redis, ClickHouse, Docker, GitHub Actions.


What we provide

  • GPU access (cloud, H100/A10/T4), seed datasets & taxonomy draft, staging infra, and rapid product feedback

  • Clear API contract and benchmark packs for CI gating


How to apply

Email hello@wownom.com with:

  • A shipped CV project (repo/demo) + one latency and one accuracy number you achieved and how

  • Availability to start within 1–2 weeks and timezone

  • (Optional) A brief note on grams estimation from depth vs. monocular on plated dishes

