Python Developer for Web Scraping

6 years

0 Lacs

Posted: 1 week ago | Platform: LinkedIn

Work Mode

Remote

Job Type

Full Time

Job Description

Python Developer (Web Scraping)

Location: Remote

Type: Full Time

Experience: 6 years

About the Role

We’re looking for a Python developer who lives and breathes web scraping—someone who can design reliable, scalable crawlers, extract clean data from messy sites, and ship pipelines that run in production every day.

What You’ll Do
  • Build and maintain high-quality scrapers using Python (Scrapy/Playwright/Requests + lxml/BS4).
  • Handle anti-bot defenses: rotating proxies, headless browsers, session management, retries, backoff, fingerprinting hygiene.
  • Parse complex HTML/JSON/XML; normalize and validate data; design resilient selectors.
  • Orchestrate jobs with Airflow/Cron; containerize with Docker; deploy on AWS/GCP.
  • Store and serve data via PostgreSQL/MongoDB/Elasticsearch/S3; build ETL steps.
  • Add logging/metrics (Prometheus/Grafana/ELK) and alerts; write tests and docs.
  • Collaborate with product/legal to ensure ethical, compliant scraping (robots.txt, ToS, data/privacy laws).
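As a rough illustration of the "retries, backoff" item above, here is a minimal sketch of exponential backoff with jitter using only the standard library. The names `backoff_delays` and `fetch_with_retries`, the default delays, and the injectable `fetch`/`sleep` callables are assumptions made for this example, not part of any tooling named in this posting.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def backoff_delays(retries: int, base: float = 0.5, cap: float = 30.0) -> list[float]:
    """Un-jittered delay (seconds) before each retry: base * 2**n, capped at `cap`."""
    return [min(cap, base * (2 ** attempt)) for attempt in range(retries)]


def fetch_with_retries(
    fetch: Callable[[], T],
    retries: int = 4,
    base: float = 0.5,
    cap: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch(); on failure, wait a jittered delay and retry up to `retries` times."""
    delays = backoff_delays(retries, base, cap)
    for attempt in range(retries + 1):
        try:
            return fetch()
        except Exception:
            # Illustrative only: real code would catch specific network errors.
            if attempt == retries:
                raise  # out of retries, surface the last error
            # "Full jitter": sleep a random amount up to the exponential delay.
            sleep(random.uniform(0, delays[attempt]))
    raise RuntimeError("unreachable")
```

In practice `fetch` would wrap an HTTP call (for example a Requests or httpx request); passing it in as a callable keeps the retry logic testable without network access.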
Must-Have Qualifications
  • Strong Python (3.9+), async patterns (asyncio/aiohttp) and standard libs.
  • Hands-on with at least two: Scrapy, Playwright/Selenium, Requests/httpx, lxml/BS4.
  • Battle-tested techniques for rate limits, CAPTCHAs (via approved providers), pagination, infinite scroll, JS-heavy sites.
  • Solid SQL + one NoSQL store; data modeling and deduping strategies.
  • CI/CD basics (GitHub Actions/GitLab CI), Docker, Linux command-line.
  • Clear communication; habit of writing clean, maintainable code with tests.
Nice to Have
  • Experience with headless browser stealth (playwright-extra, CDP), Scrapy Cluster, queueing (Kafka/SQS/RabbitMQ).
  • Data quality checks (Great Expectations), Pydantic, typing.
  • AWS (EC2/ECS/Lambda, S3, CloudWatch), Terraform.
  • Basic ML/NLP for extraction (NER, rule + model hybrids).
  • Security awareness and compliance mindset (privacy, licensing, rate-limit etiquette).
KPIs You’ll Own
  • Freshness & Coverage: % of targets crawled on schedule.
  • Data Quality: accuracy/duplication/error rates.
  • Reliability: job success rate, mean time to recover, alert responsiveness.
  • Efficiency: cost per million records, crawl duration.
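For a sense of how a few of these KPIs might be computed from run statistics, here is a small sketch. The `CrawlRun` fields and function names are hypothetical, chosen for this example; the posting does not define the exact formulas.

```python
from dataclasses import dataclass


@dataclass
class CrawlRun:
    """Hypothetical summary of one reporting period."""
    targets_scheduled: int
    targets_crawled_on_time: int
    jobs_total: int
    jobs_succeeded: int
    records: int
    cost_usd: float


def freshness_pct(run: CrawlRun) -> float:
    """Percentage of scheduled targets crawled on schedule."""
    if run.targets_scheduled == 0:
        return 0.0
    return 100 * run.targets_crawled_on_time / run.targets_scheduled


def job_success_rate_pct(run: CrawlRun) -> float:
    """Percentage of jobs that finished successfully."""
    if run.jobs_total == 0:
        return 0.0
    return 100 * run.jobs_succeeded / run.jobs_total


def cost_per_million_records(run: CrawlRun) -> float:
    """Cost in USD, normalized to one million extracted records."""
    if run.records == 0:
        return 0.0
    return run.cost_usd * 1_000_000 / run.records
```

For example, a period with 190 of 200 targets crawled on time, 48 of 50 jobs succeeding, and 2.5 million records at a cost of 12.50 USD gives 95% freshness, a 96% job success rate, and 5 USD per million records.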

