Web Scraper Developer

4 years

0 Lacs

Posted: 3 days ago | Platform: LinkedIn

Work Mode

Remote

Job Type

Full Time

Job Description

About the Job:

You will be responsible for developing, deploying, and maintaining high-performance web crawlers and data extraction pipelines that source threat intelligence, leaked datasets, and cybersecurity-related data from the surface, deep, and dark web.

This role requires strong technical knowledge of Python-based scraping frameworks, distributed data pipelines, and automation systems to collect and normalize large-scale datasets with minimal manual intervention. Your work will directly support HEROIC’s mission to make the internet safer through intelligent, data-driven cybersecurity insights.

What you will do:

  • Design, develop, and maintain large-scale, distributed web crawlers and data extraction pipelines.
  • Build automated systems to scrape, clean, and normalize structured and unstructured data from multiple web sources (surface, deep, and dark web).
  • Develop resilient scraping solutions using frameworks like Scrapy, Selenium, Playwright, or custom Python-based tools.
  • Implement strategies to overcome anti-bot challenges (e.g., proxy rotation, CAPTCHA handling, user-agent management).
  • Integrate scraped data into centralized databases (PostgreSQL, MySQL, etc.).
  • Collaborate with the backend team to design ingestion workflows that feed into HEROIC’s cybersecurity intelligence platform.
  • Monitor and optimize scraping performance, reliability, and compliance with data usage policies.
  • Automate deployment and scaling of crawler clusters using Docker, Kubernetes, or cloud infrastructure (AWS/GCP).
  • Write and maintain APIs, scripts, and ETL components for downstream data processing.
  • Collaborate closely with the software development team to ensure seamless data flow and usability.

Requirements:

  • Bachelor's degree in Computer Science, Information Technology, or a related field.
  • Minimum 4 years of hands-on experience in web scraping, data crawling, or data pipeline development.
  • Strong proficiency in Python and scraping frameworks such as Scrapy, Selenium, Playwright, or BeautifulSoup.
  • Proven experience building scalable crawlers capable of handling high-volume, dynamic, or JavaScript-rendered sites.
  • Deep understanding of HTTP, DOM structures, XPath/CSS selectors, and data parsing.
  • Experience managing asynchronous/concurrent scraping tasks and distributed crawling architectures.
  • Knowledge of data pipelines, ETL workflows, and API integrations.
  • Familiarity with NoSQL and SQL databases (e.g., MongoDB, PostgreSQL, Elasticsearch, Cassandra).
  • Strong command of Linux/Unix systems, shell scripting, and version control (Git).
  • Experience with containerization and cloud-based deployments (Docker, Kubernetes, AWS, or GCP).
  • Excellent problem-solving, analytical, and debugging skills.
  • Strong written and verbal communication in English.
  • Prior experience in cybersecurity, data intelligence, or dark web data collection (preferred but not required).

Benefits:

  • Position Type: Full-time
  • Location: India (Remote – Work from anywhere)
  • Salary: Competitive salary based on experience
  • Other Benefits: PTOs & National Holidays
  • Professional Growth: Work with cutting-edge AI, cybersecurity, and SaaS technologies
  • Culture: Fast-paced, innovative, mission-driven team
