Posted: 1 week ago
On-site
Full Time
Job Summary

As a Data Scientist specializing in NLP, Generative AI, and Cloud technologies, you will design, develop, and deploy pipelines that extract data from documents. You will use state-of-the-art machine learning models, NLP techniques, and cloud platforms to improve automation, data quality, and decision-making. This role requires strong technical expertise, creative problem-solving, and hands-on experience with cloud technologies.

Key Responsibilities

- Design and implement advanced NLP models to extract structured data from unstructured document formats (e.g., PDFs, Word files, scanned images, emails).
- Leverage Generative AI techniques for data enhancement, content summarization, and document generation where necessary.
- Develop, fine-tune, and deploy machine learning models to enhance document understanding and automate data extraction processes.
- Collaborate with engineering teams to integrate NLP models into cloud-based data pipelines and workflows (AWS, Azure, or GCP).
- Build scalable and efficient data extraction workflows, ensuring high model accuracy and performance.
- Conduct end-to-end data science activities, from data collection, cleaning, and exploration to feature engineering and model deployment.
- Ensure the security, scalability, and compliance of data processing solutions deployed in the cloud.
- Evaluate and improve existing document extraction tools and processes, suggesting innovative solutions.
- Stay updated on the latest trends in NLP, Generative AI, and Cloud technologies, applying these advancements to enhance model performance and operational efficiency.

Required Skills & Qualifications

- Minimum of 5 years of hands-on experience in Data Science, with a focus on NLP, machine learning, and AI.
- Strong proficiency in Python and libraries like spaCy, NLTK, Hugging Face Transformers, and TensorFlow/PyTorch.
- Deep knowledge of document processing techniques, including OCR, text extraction, and document classification.
- Experience with Generative AI models (e.g., GPT, BERT) and their application in data extraction or document processing tasks.
- Expertise in cloud technologies (AWS, Azure, GCP) for building and deploying data-driven solutions.
- Proficiency in data manipulation and analysis using Pandas, NumPy, and SQL.
- Hands-on experience with model deployment frameworks and tools like Docker, Kubernetes, or MLflow.
- Familiarity with version control (Git), CI/CD processes, and Agile development practices.
- Strong problem-solving skills, with the ability to design innovative solutions for complex document extraction challenges.
- Excellent communication skills and ability to work in cross-functional teams.

Preferred Qualifications

- Master’s or PhD in Computer Science, Data Science, Artificial Intelligence, or a related field.
- Experience with Large Language Models (LLMs) and advanced NLP techniques such as transfer learning and few-shot learning.
- Familiarity with document management systems (DMS) or enterprise content management (ECM) platforms.
- Experience with deploying and scaling machine learning models in production environments.
- Understanding of data privacy regulations and secure processing of sensitive information.
EXL
Gurgaon, Haryana, India
Salary: Not disclosed