Posted: 1 week ago
On-site
Full Time
Hello,
We are looking for AI/ML Engineers / Interns who can handle an end-to-end data-to-AI pipeline.
Responsibilities:
- Convert scanned legal PDFs into clean, structured text using OCR tools (Tesseract, PaddleOCR, DocTR).
- Perform data cleaning, preprocessing, and chunking (300–600 tokens with overlap, section-wise).
- Generate embeddings and build Vector DB indexes (FAISS / Qdrant / Weaviate).
- Implement RAG pipelines (LangChain / LlamaIndex) to connect legal data with GPT models (OpenAI GPT-4o Mini / GPT-4.1 Mini).
- Ensure source-grounded answers with Act/Section citations and prevent hallucinations.
- Evaluate system performance (Recall@k, latency, faithfulness, hallucination rate).
- Work with legal researchers to align AI output with actual legal practice.
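For candidates who want a concrete picture of two of the tasks above, here is a minimal sketch of overlapping chunking and Recall@k. It is an illustration under stated assumptions, not our production code: the function names `chunk_tokens` and `recall_at_k` are hypothetical, the default chunk size and overlap are arbitrary values within the 300–600 token range, and a plain list of tokens stands in for the output of a real tokenizer.

```python
def chunk_tokens(tokens, size=400, overlap=50):
    """Split a token list into windows of at most `size` tokens.

    Each window starts `size - overlap` tokens after the previous one,
    so consecutive chunks share `overlap` tokens.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        # Stop once a window has reached the end of the document.
        if start + size >= len(tokens):
            break
    return chunks


def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the relevant items that appear in the top-k retrieved."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = len(relevant & set(retrieved_ids[:k]))
    return hits / len(relevant)


if __name__ == "__main__":
    # Whitespace splitting is only a stand-in for a model tokenizer.
    text = "Section 1. Short title. " * 300
    pieces = chunk_tokens(text.split(), size=400, overlap=50)
    print(len(pieces), [len(p) for p in pieces])
    print(recall_at_k(["s12", "s3", "s40"], ["s3", "s7"], k=3))
```

In practice the token counts would come from the tokenizer of the chosen embedding model, and chunk boundaries would follow Act/Section headings before falling back to fixed windows.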
Required Skills:
- Strong Python (Pandas, Regex, JSON handling).
- Experience with OCR and PDF text-extraction tools (Tesseract, PaddleOCR, pdfplumber, PyMuPDF).
- Knowledge of NLP basics (tokenization, embeddings, transformers).
- Hands-on experience with vector DBs (FAISS, Qdrant, Weaviate).
- Familiarity with LangChain / LlamaIndex for RAG.
- OpenAI API integration (prompting, structured outputs).
- Basic knowledge of Git + Docker for deployment.
Location: NCR/Delhi.
Job Type: Full-time
Pay: ₹50,000.00 - ₹90,000.00 per month
Work Location: In person
Best Regards,
MJ Global