Machine Learning Engineer

20 years

0 Lacs

Posted: 1 day ago | Platform: LinkedIn

Work Mode

On-site

Job Type

Full Time

Job Description

About Us

For over 20 years, Smart Data Solutions (SDS) has partnered with leading payer organizations to provide automation and technology solutions that enable data standardization and workflow automation. The company offers a comprehensive set of turn-key services to handle all claims and claims-related information regardless of format (paper, fax, electronic), digitizing and normalizing it for seamless use by payer clients. Solutions include intelligent data capture, conversion and digitization, mailroom management, comprehensive clearinghouse services, and proprietary workflow offerings. SDS is headquartered just outside of St. Paul, MN, and leverages dedicated onshore and offshore resources as part of its service delivery model. The company counts over 420 healthcare organizations as clients, including multiple Blue Cross Blue Shield state plans, large regional health plans, and leading independent TPAs, and handles over 500 million transactions of varying types annually with a 98%+ customer retention rate. SDS has also invested meaningfully in automation and machine learning capabilities across its tech-enabled processes to drive scalability and greater internal operating efficiency while also improving client results.

SDS recently partnered with a leading growth-oriented investment firm, Parthenon Capital, to further accelerate expansion and product innovation.

Location: 6th Floor, Block 4A, Millenia Business Park, Phase II MGR Salai, Kandanchavadi, Perungudi, Chennai 600096, India.

Smart Data Solutions is an equal opportunity employer.

All qualified applicants will receive consideration for employment without regard to race, color, sex, sexual orientation, gender identity, religion, national origin, disability, veteran status, age, marital status, pregnancy, genetic information, or other legally protected status.

To perform this job successfully, an individual must be able to perform each essential duty satisfactorily. The requirements listed in this posting are representative of the knowledge, skill, and/or ability required. Reasonable accommodation may be made to enable individuals with disabilities to perform essential job functions.

Due to access to Protected Healthcare Information, employees in this role must be free of felony convictions on a background check report.

Responsibilities

Duties and Responsibilities include but are not limited to:
  • Design and build ML pipelines for OCR extraction, document image processing, and text classification tasks.
  • Fine-tune or prompt large language models (LLMs) (e.g., Qwen, GPT, LLaMA, Mistral) for domain-specific use cases.
  • Develop systems to extract structured data from scanned or unstructured documents (PDFs, images, TIFs).
  • Integrate OCR engines (Tesseract, EasyOCR, AWS Textract, etc.) and improve their accuracy via pre-/post-processing.
  • Handle natural language processing (NLP) tasks such as named entity recognition (NER), summarization, classification, and semantic similarity.
  • Collaborate with product managers, data engineers, and backend teams to productionize ML models.
  • Evaluate models using metrics like precision, recall, F1-score, and confusion matrix, and improve model robustness and generalizability.
  • Maintain proper versioning, reproducibility, and monitoring of ML models in production.
The duties set forth above are essential job functions for the role. Reasonable accommodations may be made to enable individuals with disabilities to perform essential job functions.

Skills And Qualifications

  • 4–5 years of experience in machine learning, NLP, or AI roles.
  • Proficiency with Python and ML libraries such as PyTorch, TensorFlow, scikit-learn, and Hugging Face Transformers.
  • Experience with LLMs (open-source or proprietary), including fine-tuning or prompt engineering.
  • Solid experience with OCR tools (Tesseract, PaddleOCR, etc.) and document parsing.
  • Strong background in text classification, tokenization, and vectorization techniques (TF-IDF, embeddings, etc.).
  • Knowledge of handling unstructured data (text, scanned images, forms).
  • Familiarity with MLOps tools: MLflow, Docker, Git, and model serving frameworks.
  • Ability to write clean, modular, and production-ready code.
  • Experience working with medical, legal, or financial document processing.
  • Exposure to vector databases (e.g., FAISS, Pinecone, Weaviate) and semantic search.
  • Understanding of document layout analysis (e.g., LayoutLM, Donut, DocTR).
  • Familiarity with cloud platforms (AWS, GCP, Azure) and deploying models at scale.
