Data & AI Specialist - Data Scraping, Enrichment & Quality Assurance

Experience

0 years

Salary

0 Lacs

Posted: 16 hours ago | Platform: LinkedIn


Work Mode

Remote

Job Type

Full Time

Job Description

Overview

We're looking for a data-obsessed explorer who can build and maintain pipelines that collect, clean, and enhance large volumes of data, then apply AI tools to keep it accurate, useful, and ready for analysis. This is initially a project-based role, with the possibility of evolving into a full-time contract based on performance and business needs.

Key Responsibilities

  • Data Acquisition & Scraping
    • Design, develop, and maintain scalable web-scraping systems and APIs to collect structured and unstructured data from diverse sources.
    • Ensure compliance with data privacy laws (GDPR, CCPA) and site-specific terms of service.
  • Data Enrichment & Transformation
    • Implement pipelines to clean, normalize, and enrich raw data using third-party datasets, NLP (natural language processing), and machine learning techniques.
    • Build automated matching and deduplication processes to maintain a unified source of truth.
  • Quality Assurance & Monitoring
    • Create automated QA checks to validate data accuracy, completeness, and consistency.
    • Set up monitoring and alert systems to catch anomalies or pipeline failures early.
  • AI & Process Optimization
    • Integrate AI models for entity extraction, text classification, and predictive enrichment.
    • Work with the data science team to design features that feed analytics and machine learning models.
  • Collaboration & Documentation
    • Partner with product, engineering, and analytics teams to define data requirements and priorities.
    • Maintain clear technical documentation and data lineage records.
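To give candidates a concrete sense of the enrichment-and-QA work described above, here is a minimal Python sketch. The field names, matching key, and QA rule are all hypothetical, invented for illustration only:

```python
# Hypothetical mini-pipeline: normalize raw records, deduplicate on a
# derived key, and run a simple completeness check -- the same shape of
# work described under "Data Enrichment" and "Quality Assurance" above.

def normalize(record):
    """Lowercase and strip the string fields used for matching."""
    return {k: v.strip().lower() if isinstance(v, str) else v
            for k, v in record.items()}

def deduplicate(records, key_fields=("name", "email")):
    """Keep the first record seen for each derived key."""
    seen, unique = set(), []
    for rec in map(normalize, records):
        key = tuple(rec.get(f) for f in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique

def qa_report(records, required=("name", "email")):
    """Count records missing any required field."""
    missing = sum(1 for r in records
                  if any(not r.get(f) for f in required))
    return {"total": len(records), "missing_required": missing}

raw = [
    {"name": "Acme Corp ", "email": "INFO@ACME.COM"},
    {"name": "acme corp", "email": "info@acme.com"},   # near-duplicate
    {"name": "Beta Ltd", "email": ""},                 # incomplete
]
clean = deduplicate(raw)
report = qa_report(clean)  # flags the record with a missing email
```

In production, the same pattern would run inside an orchestrated pipeline (e.g. an Airflow task) with the QA report feeding the monitoring and alerting described above.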

Requirements

  • Strong programming skills in Python (Scrapy, BeautifulSoup, Selenium, Playwright) or equivalent languages.
  • Experience with data pipelines and ETL tools (Airflow, Prefect, or similar).
  • Proficiency in SQL/NoSQL databases and data warehousing (e.g., BigQuery, Snowflake).
  • Familiarity with cloud platforms (AWS, GCP, or Azure) and containerization (Docker/Kubernetes).
  • Knowledge of machine learning workflows and libraries (scikit-learn, spaCy, Hugging Face) is a big plus.
  • Solid understanding of data privacy and ethical data collection practices.
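As a small taste of the scraping-side parsing this role involves, the sketch below extracts fields from an HTML fragment using only the standard library. In practice, tools like Scrapy or BeautifulSoup (listed above) replace this hand-rolled parser; the markup and class names here are invented for illustration:

```python
from html.parser import HTMLParser

class ProductParser(HTMLParser):
    """Collect the text inside <span class="product"> elements."""

    def __init__(self):
        super().__init__()
        self.in_product = False
        self.products = []

    def handle_starttag(self, tag, attrs):
        # Enter "capture" mode when a product span opens.
        if tag == "span" and ("class", "product") in attrs:
            self.in_product = True

    def handle_data(self, data):
        if self.in_product:
            self.products.append(data.strip())

    def handle_endtag(self, tag):
        if tag == "span":
            self.in_product = False

html = ('<ul><li><span class="product">Widget A</span></li>'
        '<li><span class="product">Widget B</span></li></ul>')
parser = ProductParser()
parser.feed(html)
# parser.products now holds the extracted names
```

A real scraper would add request handling, rate limiting, and retries on top of the parsing step, and would respect robots.txt and the privacy constraints noted above.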

Nice-to-Have

  • Experience with LLMs (large language models) for text enrichment.
  • Background in data visualization or BI tools (Tableau, Looker, Power BI).
  • Familiarity with real-time streaming data (Kafka, Kinesis).

Traits for Success

  • Detail-oriented with a knack for spotting hidden data issues.
  • Curious problem solver who loves automation and efficiency.
  • Comfortable in a fast-paced environment where requirements evolve quickly.

Benefits

  • Remote work.
  • Flexible work schedule.
  • Opportunity for a long-term contract.
