
Applied AI Engineer- Generative & Cognitive Technologies

Posted: 21 hours ago | Platform: LinkedIn


Work Mode

Remote

Job Type

Full Time

Job Description

About the Role

You’ll join a small, fast team turning cutting-edge AI research into shippable products across text, vision, and multimodal domains. One sprint you’ll be distilling an LLM for WhatsApp chat-ops; the next you’ll be converting CAD drawings to BOM stories, or training a computer-vision model that flags onsite safety risks. You own the model life-cycle end to end: data prep → fine-tune/distil → evaluate → deploy → monitor.

Key Responsibilities

Model Engineering
• Fine-tune and quantise open-weight LLMs (Llama 3, Mistral, Gemma) and SLMs for low-latency edge inference.
• Train or adapt computer-vision models (YOLO, Segment Anything, SAM-DINO) to detect site hazards, drawing anomalies, or asset states.

Multimodal Pipelines
• Build retrieval-augmented-generation (RAG) stacks: loaders → vector DB (FAISS / OpenSearch) → ranking prompts.
• Combine vision + language outputs into single “scene → story” responses for dashboards and WhatsApp bots.

Serving & MLOps
• Package models as Docker images, SageMaker endpoints, or ONNX edge bundles; expose FastAPI/gRPC handlers with auth, rate-limiting, and telemetry.
• Automate CI/CD: GitHub Actions → Terraform → blue-green deploys.

Evaluation & Guardrails
• Design automatic eval harnesses (BLEU, BERTScore, CLIP similarity, toxicity and bias checks).
• Monitor drift, hallucination, and latency; implement rollback triggers.

Enablement & Storytelling
• Write prompt playbooks and model cards so other teams can reuse your work.
• Run internal workshops: “From design drawing to narrative” / “LLM safety by example”.

Required Skills & Experience
• 3+ years of ML/NLP/CV in production; at least 1 year hands-on with Generative AI.
• Strong Python (FastAPI, Pydantic, asyncio) and Hugging Face Transformers or Diffusers.
• Experience with minimal-footprint models (LoRA, QLoRA, GGUF, INT4) and vector search.
• Comfortable on AWS/GCP/Azure for GPU instances, serverless endpoints, and IaC.
• Solid grasp of evaluation/guardrail frameworks (HELM, PromptLayer, Guardrails AI, Triton metrics).

Bonus Points
• Built a RAG or function-calling agent used by 500+ users.
• Prior CV pipeline (object detection, segmentation) or real-time speech-to-text project.
• Live examples of creative prompt engineering or story generation.
• Familiarity with LangChain, LlamaIndex, or BentoML.

Why You’ll Love It
• Multidomain playground – text, vision, storytelling, decision support.
• Tech freedom – pick the right model and stack; justify it; ship it.
• Remote-first – work anywhere within ±4 hrs of IST; quarterly hack-weeks in Hyderabad.
• Top-quartile pay – base + milestone bonus + conference stipend.

How to Apply
• Send a resume and a link to GitHub / Hugging Face / Kaggle showcasing LLM or CV work.
• Include a 200-word note describing your favourite prompt or model tweak and the impact it had.
• Short-listed candidates complete a practical take-home (fine-tune a tiny model, build a RAG or vision demo, brief write-up) and a 45-minute technical chat.

We hire builders, not resume keywords. Show us you can ship AI that works in the real world – and explain it clearly – and you’re in.
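The RAG stack described in the responsibilities (loaders → vector DB → ranking prompts) can be sketched end to end. This is a minimal illustration only: a toy in-memory cosine-similarity index stands in for FAISS/OpenSearch, bag-of-words counts stand in for a real embedding model, and the documents and query are invented for the example.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": a term-frequency vector.
    # A real stack would call a sentence-embedding model here.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class ToyVectorDB:
    """In-memory stand-in for FAISS / OpenSearch."""
    def __init__(self):
        self.docs = []

    def add(self, text: str):
        self.docs.append((embed(text), text))

    def search(self, query: str, k: int = 2):
        qv = embed(query)
        ranked = sorted(self.docs, key=lambda d: cosine(qv, d[0]), reverse=True)
        return [text for _, text in ranked[:k]]

def build_prompt(query: str, passages) -> str:
    # "Ranking prompt" step: stitch retrieved context into the LLM prompt.
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Loader step: ingest documents into the index.
db = ToyVectorDB()
for doc in ["Hard hats are mandatory on site.",
            "The BOM lists 40 steel beams.",
            "Scaffolding must be inspected weekly."]:
    db.add(doc)

prompt = build_prompt("What does the BOM list?", db.search("BOM steel"))
```

Swapping `ToyVectorDB` for a FAISS index and `embed` for a transformer encoder preserves the same loader → search → prompt shape.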
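The serving bullets mention exposing handlers with auth, rate-limiting, and telemetry. One common rate-limiting approach (an illustrative choice, not something the posting specifies) is a token bucket per API key; a framework-agnostic sketch with an injectable clock so it can be tested deterministically:

```python
import time

class TokenBucket:
    """Allows `rate` requests/second with bursts up to `capacity` tokens."""

    def __init__(self, rate: float, capacity: int, now=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.now = now                  # injectable clock, handy for tests
        self.tokens = float(capacity)   # start full: burst allowed immediately
        self.last = now()

    def allow(self) -> bool:
        t = self.now()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (t - self.last) * self.rate)
        self.last = t
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

# A FastAPI or gRPC handler would look up the caller's bucket by API key,
# call bucket.allow(), and return HTTP 429 / RESOURCE_EXHAUSTED when False.
```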
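The evaluation bullets pair automatic metrics with rollback triggers on drift and latency. As a hedged sketch of that pattern (the metric, the p95 budget of 800 ms, and the 0.5 quality floor are all invented thresholds for illustration), a unigram-overlap F1 score with a simple rollback check:

```python
import math
from collections import Counter

def unigram_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1: a cheap smoke-test stand-in for BLEU/BERTScore."""
    p = Counter(prediction.lower().split())
    r = Counter(reference.lower().split())
    overlap = sum((p & r).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(p.values())
    recall = overlap / sum(r.values())
    return 2 * precision * recall / (precision + recall)

def should_rollback(latencies_ms, scores, p95_budget_ms=800.0, min_score=0.5):
    """Fire a rollback when p95 latency blows the budget or mean quality drops."""
    ordered = sorted(latencies_ms)
    idx = min(len(ordered) - 1, math.ceil(0.95 * len(ordered)) - 1)
    p95 = ordered[idx]
    mean_score = sum(scores) / len(scores)
    return p95 > p95_budget_ms or mean_score < min_score
```

In a real harness the scores would come from a proper metric suite and the trigger would flip traffic back to the previous blue-green deployment.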
