
139 Web Scraping Jobs - Page 6

JobPe aggregates listings for easy access, but you apply directly on the original job portal.

3 - 7 years

20 - 25 Lacs

Gurugram

Work from Office

Develop the capability to efficiently scrape web data from multiple sources. Scrape difficult websites by deploying anti-blocking and anti-captcha tools. Knowledge of mobile request monitoring and of using the robots.txt file is required.

Required candidate profile:
- Proxy rotation and user-agent rotation (see the sketch below).
- Knowledge of request monitoring with tools like Fiddler and Charles.
- Knowledge of setting up and rooting virtual devices (e.g., MuMu Player).
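
As a loose illustration of the proxy and user-agent rotation this listing asks for (a minimal sketch, not any particular company's setup; the proxy addresses and user-agent strings are placeholders):

import random
import requests

# Hypothetical pools; real deployments load these from config or a proxy provider.
PROXIES = ["http://proxy1.example.com:8080", "http://proxy2.example.com:8080"]
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36",
]

def fetch(url: str) -> requests.Response:
    # Rotate the proxy and user-agent on every request to spread traffic
    # across identities and reduce the chance of blocking.
    proxy = random.choice(PROXIES)
    headers = {"User-Agent": random.choice(USER_AGENTS)}
    return requests.get(url, headers=headers,
                        proxies={"http": proxy, "https": proxy}, timeout=30)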

Posted 3 months ago


2 - 3 years

4 - 5 Lacs

Bengaluru

Work from Office

As a skilled Developer, you will be responsible for building tools and applications that use the data held within company databases. Your primary responsibility will be to design and develop these layers of our applications and to coordinate with the rest of the team working on the other layers of the IT infrastructure. A commitment to collaborative problem solving, sophisticated design, and quality products is essential.

Python Developer - necessary skills:
- Experience in data wrangling and manipulation with Python/Pandas.
- Experience with Docker containers.
- Knowledge of data structures, algorithms, and data modeling.
- Experience with version control (Git, Azure DevOps).
- Design and implementation of ETL/ELT pipelines.
- Good knowledge of and experience with web scraping (Scrapy, BeautifulSoup, Selenium); a short scraping sketch follows this listing.
- Expertise in at least one popular Python framework (such as Django, Flask, or Pyramid).
- Ability to design, build, and maintain efficient, reusable, and reliable Python code (SOLID design principles).
- Experience with SQL databases (views, stored procedures, etc.).

Responsibilities and activities: aside from the core development role, this position includes auxiliary roles that are not related to development. The role includes, but is not limited to:
- Support and maintenance of custom and previously developed tools, as well as ensuring the performance and responsiveness of new applications.
- Delivering high-quality, reliable applications, covering both development and front-end work; maintaining code quality, prioritizing organization, and driving automation.
- Participating in peer review of plans, technical solutions, and related documentation (mapping/documenting technical procedures).
- Identifying security issues, bottlenecks, and bugs, and implementing solutions to mitigate and address issues of service data security and data breaches.
- Working with SQL/Postgres databases: installing and maintaining database systems and supporting server management, including backups, in addition to troubleshooting issues raised by the Data Processing team.
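
By way of illustration of the scraping-plus-Pandas stack named above (a minimal sketch assuming a simple static page; the table layout is a placeholder, not this employer's code):

import pandas as pd
import requests
from bs4 import BeautifulSoup

def scrape_table(url: str) -> pd.DataFrame:
    # Fetch the page and parse its first HTML table into a DataFrame.
    # Assumes the page is static and contains at least one <table>.
    html = requests.get(url, timeout=30).text
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.select("table tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all(["td", "th"])]
        if cells:
            rows.append(cells)
    # Treat the first row as the header.
    return pd.DataFrame(rows[1:], columns=rows[0])

Dynamic, JavaScript-rendered pages would need Selenium instead, as a later listing illustrates.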

Posted 3 months ago


2 - 5 years

5 - 10 Lacs

Ahmedabad

Work from Office

Roles & responsibilities:
- Develop and maintain applications using JavaScript.
- Collaborate with product managers and other developers to translate requirements into functional solutions.
- Write clean, efficient, and reusable code that adheres to industry best practices and coding standards.
- Troubleshoot and debug issues reported by clients or internal teams, ensuring timely resolution.
- Participate in agile development processes, including sprint planning, task estimation, and daily stand-ups.
- Participate in team-building and organizational activities.
- Be a supportive team member, adaptive to the company culture.
- Contribute towards achieving organizational goals and help foster a healthy work culture and an awesome workplace.

Key skills required:
- Minimum of 2 years of experience in JavaScript development (with any libraries or frameworks).
- Proficiency in HTML and CSS.
- Strong understanding of core JavaScript concepts, including DOM manipulation, asynchronous programming, and event handling.
- Proficient debugging and troubleshooting skills.
- Understanding of software development principles, including modularization, separation of concerns, and code reusability.
- Excellent problem-solving skills and attention to detail.
- Ability to work independently as well as part of a team.
- Good communication skills and the ability to collaborate effectively with team members.
- Must be a fast learner, able to adapt to changes in technologies.
- Familiarity with Chrome extension APIs; experience building Chrome extensions is a plus.
- Knowledge of C# development is good to have.

Perks and benefits: excellent base salary, flexible working hours, a 5-day work week, a work-life-balance culture, an annual performance bonus, company outings, family health insurance, and lunch, snacks, and other benefits.

Posted 3 months ago


5 - 10 years

10 - 20 Lacs

Hyderabad

Work from Office

Role & responsibilities

Mandatory skills:
- 5+ years of experience as a Python full-stack backend developer.
- Expertise in native APIs, web applications, RESTful APIs, and FastAPI (a minimal endpoint sketch follows below).
- Hands-on experience with web scraping, Selenium, SQL, NoSQL, and MongoDB.
- Working knowledge of Docker and Kubernetes.
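
A minimal FastAPI sketch of the RESTful API skills named above (illustrative only; the resource name and in-memory store are made up, standing in for a real MongoDB/SQL backend):

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class Item(BaseModel):
    name: str
    price: float

# In-memory store standing in for a real database in this sketch.
ITEMS = {}

@app.post("/items/{item_id}")
def create_item(item_id: int, item: Item) -> dict:
    ITEMS[item_id] = item
    return {"id": item_id, "name": item.name, "price": item.price}

@app.get("/items/{item_id}")
def read_item(item_id: int) -> Item:
    return ITEMS[item_id]

Run with uvicorn (e.g., uvicorn main:app --reload); FastAPI serves interactive docs at /docs.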

Posted Date not available


3 - 7 years

20 - 35 Lacs

Hyderabad, Gurugram

Hybrid

Key Responsibilities
- Design, develop, and deploy ML-powered products and pipelines.
- Play a central role in all stages of the data science project life cycle, including:
  - Identification of suitable data science project opportunities.
  - Partnering with business leaders, domain experts, and end-users to gain business understanding and data understanding, and to collect requirements.
  - Evaluation/interpretation of results and presentation to business leaders.
- Perform exploratory data analysis, proof-of-concept modelling, and model benchmarking, and set up model validation experiments.
- Train large models, both for experimentation and production.
- Develop production-ready pipelines for enterprise-scale projects.
- Perform code reviews and optimization for your projects and team.
- Spearhead deployment and model-scaling strategies.
- Manage stakeholders and represent the team in front of our leadership.
- Lead and mentor by example, including in project scrums.

What We're Looking For:
- 3 - 7 years of professional experience in the data science domain.
- Expertise in Python (NumPy, Pandas, spaCy, scikit-learn, PyTorch/TF2, Hugging Face, etc.).
- Experience with SOTA NLP models and expertise in text-matching techniques, including sentence transformers, word embeddings, and similarity measures (see the sketch after this listing).
- Expertise in probabilistic machine learning models for classification, regression, and clustering.
- Strong experience in feature engineering, data preprocessing, and building machine learning models for large datasets.
- Exposure to information retrieval, web scraping, and data extraction at scale.
- OOP design patterns, test-driven development, and enterprise system design.
- SQL (any variant; a big-data variant is a bonus).
- Linux OS (e.g., the bash toolset and other utilities).
- Version control experience with Git, GitHub, or Azure DevOps.
- Problem-solving and debugging skills.
- Software craftsmanship, adherence to Agile principles, and pride in writing good code.
- Techniques to communicate change to non-technical people.

Nice to have:
- Prior work to show on GitHub, Kaggle, StackOverflow, etc.
- Cloud expertise (AWS and GCP preferably).
- Expertise in deploying machine learning models in cloud environments.
- Familiarity with working with LLMs.
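
A minimal illustration of the sentence-transformer text matching named above, assuming the sentence-transformers package and the public all-MiniLM-L6-v2 checkpoint (the example strings are made up):

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

queries = ["python web scraping engineer"]
corpus = [
    "Data scientist with NLP experience",
    "Scrapy and Selenium developer",
]

# Encode both sides into dense embeddings, then rank by cosine similarity.
q_emb = model.encode(queries, convert_to_tensor=True)
c_emb = model.encode(corpus, convert_to_tensor=True)
print(util.cos_sim(q_emb, c_emb))  # higher score = closer semantic match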

Posted Date not available


5 - 8 years

10 - 20 Lacs

Hyderabad

Work from Office

Dear candidate! We are looking for a Python Automation Developer for our Hyderabad location. Kindly find the details below and revert with a suitable resume.

1. Backend Python developer with web development and automation experience.
2. Web crawling/scraping experience (3 to 5 years).
3. Hands-on experience with RESTful APIs (FastAPI preferred).
4. MS SQL Server and MongoDB (1 to 2 years).
5. Docker/Kubernetes (1 to 2 years).
6. Selenium (3 to 5 years).

Detailed job description, roles, and responsibilities:
- Strong backend development experience using Python (3-5 years).
- Hands-on experience building and managing RESTful APIs (FastAPI preferred).
- Proficient in web development and automation, including dynamic web scraping using Selenium (see the sketch below).
- Database expertise: MS SQL Server and MongoDB (1-2 years).
- Experience with Docker/Kubernetes for containerization and deployment workflows.
- Ability to design and implement scalable backend systems and data pipelines.
- Comfortable working independently on API development, scraping tasks, and backend integration.

Interested candidates, kindly share your resume with Ybalakrishnan@sonata-software.com.
Regards, Yamini
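
A minimal Selenium sketch of the dynamic scraping this listing emphasizes (assumes Selenium 4, which manages the Chrome driver automatically; the target URL is a placeholder):

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

options = webdriver.ChromeOptions()
options.add_argument("--headless=new")  # run without a visible browser window
driver = webdriver.Chrome(options=options)
try:
    driver.get("https://example.com")  # placeholder URL
    # Wait for JavaScript to render the element before reading it --
    # the key difference from scraping static HTML with requests alone.
    heading = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.TAG_NAME, "h1"))
    )
    print(heading.text)
finally:
    driver.quit()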

Posted Date not available


2 - 4 years

0 - 0 Lacs

Mumbai

Work from Office

Role & responsibilities

Technical skills:
• Proficiency in Python and libraries like BeautifulSoup, Scrapy, and Selenium.
• Experience with regular expressions (regex) for data parsing (see the sketch below).
• Strong knowledge of HTTP protocols, cookies, headers, and user-agent rotation.
• Familiarity with databases (SQL and NoSQL) for storing scraped data.
• Hands-on experience with data manipulation libraries such as Pandas and NumPy.
• Experience working with APIs and managing third-party integrations.
• Familiarity with version control systems like Git.

Bonus skills:
• Knowledge of containerization tools like Docker.

Preferred candidate profile:
• Develop and maintain automated web scraping scripts using Python libraries such as BeautifulSoup, Scrapy, and Selenium.
• Optimize scraping pipelines for performance, scalability, and resource efficiency.
• Handle dynamic websites and CAPTCHA solving, and implement IP rotation techniques for uninterrupted scraping.
• Process and clean raw data, ensuring accuracy and integrity in extracted datasets.
• Collaborate with cross-functional teams to understand data requirements and deliver actionable insights.
• Leverage APIs when web scraping is not feasible, managing authentication and request optimization.
• Document processes, pipelines, and troubleshooting steps for maintainable and reusable scraping solutions.
• Ensure compliance with legal and ethical web scraping practices, implementing security safeguards.
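
A small sketch of regex-based data parsing as mentioned above, run against a made-up scraped string (not the team's actual parser):

import re

raw = "Listed at Rs. 4,50,000 on 12/03/2024 - contact sales@example.com"

# Pull the price, date, and email out of unstructured text with targeted patterns.
price = re.search(r"Rs\.\s*([\d,]+)", raw)
date = re.search(r"\b(\d{2}/\d{2}/\d{4})\b", raw)
email = re.search(r"[\w.+-]+@[\w-]+\.[\w.]+", raw)

print(price.group(1))  # 4,50,000
print(date.group(1))   # 12/03/2024
print(email.group(0))  # sales@example.com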

Posted Date not available


9 - 14 years

25 - 30 Lacs

Noida, Hyderabad, Pune

Hybrid

A Python Lead with expertise in Django and AWS holds a pivotal role in the development and deployment of web applications, encompassing both technical leadership and hands-on contributions.

Required candidate profile (key responsibility areas):
- Python development and implementation
- Architecture and design
- Cloud infrastructure management
- Team management and collaboration

Posted Date not available


0 - 2 years

2 - 3 Lacs

Gurugram

Work from Office

Job Description: Trainee - Application Developer (Gurgaon, India)

We are looking for a Trainee - Application Developer to join our team in Gurgaon, India. In this role, you will learn, develop, and maintain data acquisition solutions for our clients. Your main responsibility will be to ensure the applications you work on meet company and industry standards for quality and timelines. Full training will be provided to help you succeed.

Key responsibilities:
- Learn and use Python and the company's technology to collect data from different websites (a minimal fetch example follows below).
- Configure solutions on the company's platform.
- Work with other teams to improve and enhance existing applications.
- Help other divisions to improve customer satisfaction.
- Create and perform unit tests to ensure code quality.
- Communicate clearly with teams locally and remotely.

Required skills:
- Basic understanding of web data extraction and its challenges.
- Familiarity with Python (preferred).
- Knowledge of HTTP (the protocol used for web communication).
- Familiarity with JavaScript or other scripting languages is a plus.
- Strong SQL skills and database knowledge.
- Ability to learn new technologies and recommend improvements.
- Strong communication and interpersonal skills.
- Ability to analyze and solve problems.

If you're eager to learn and grow in the field of application development, this is a great opportunity to start your career!
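
For a trainee-level sense of what collecting data over HTTP looks like in Python, a minimal sketch with the requests library (the URL is a placeholder):

import requests

# A GET request is the basic unit of web data collection.
response = requests.get("https://example.com", timeout=30)

print(response.status_code)              # 200 means the server returned the page
print(response.headers["Content-Type"])  # e.g. text/html; charset=UTF-8
print(response.text[:200])               # first 200 characters of the HTML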

Posted Date not available


3 - 5 years

3 - 7 Lacs

Chennai

Work from Office

Good knowledge of and experience in Python, SQL, and Perl, with 6+ years of experience. Good problem-solving skills. Ability to understand the data and its relations. Capability to learn new technologies in a short span of time. Should be able to work in sprints and meet deadlines. Flexible work time.

Mandatory skills:
- Python: basics, Pandas, web scraping, file and XML handling, extracting/manipulating Excel/CSV/any file formats (a short Pandas sketch follows below).
- Perl: basics, CPAN modules, file handling, and web scraping.

** Work-from-home option is available.
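
A tiny sketch of the Excel/CSV/XML extraction and manipulation this listing mentions (file and column names are made up; assumes Pandas with the usual openpyxl/lxml backends installed):

import pandas as pd

# Read a CSV, filter rows, and write the result to Excel.
df = pd.read_csv("input.csv")
filtered = df[df["price"] > 100]  # assumes a numeric 'price' column
filtered.to_excel("output.xlsx", index=False)

# Simple XML documents can be flattened into a DataFrame directly.
xml_df = pd.read_xml("records.xml")
print(xml_df.head())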

Posted Date not available


6 - 11 years

6 - 10 Lacs

Pune

Work from Office

Position Description:

Founded in 1976, CGI is among the world's largest independent IT and business consulting services firms. With 94,000 consultants and professionals globally, CGI delivers an end-to-end portfolio of capabilities, from strategic IT and business consulting to systems integration, managed IT and business process services, and intellectual property solutions. CGI works with clients through a local relationship model complemented by a global delivery network that helps clients digitally transform their organizations and accelerate results. CGI's Fiscal 2024 reported revenue is CA$14.68 billion, and CGI shares are listed on the TSX (GIB.A) and the NYSE (GIB). Learn more at cgi.com.

Job Title: Senior Data Engineer
Position: SSE LA AC
Experience: 6+ years
Category: Software Development
Job location: Pune
Position ID: J0825-0171
Work Type: Hybrid
Employment Type: Full Time Permanent
Qualification: Bachelor's or Master's degree in Computer Science, Engineering, or a related field.

Required Skills & Experience:
- Proficient in Python, with deep experience using Pandas or Polars.
- Strong understanding of ETL development, data extraction, and transformation.
- Hands-on experience with SQL and querying large datasets.
- Experience deploying workflows on Apache Airflow (a minimal DAG sketch follows below).
- Familiar with web scraping techniques (Selenium is a plus).
- Comfortable working with various data formats and large-scale datasets.
- Experience with Azure DevOps, including pipeline configuration and automation.
- Familiarity with Pytest or equivalent test frameworks.
- Strong communication skills and a team-first attitude.
- Experience with Databricks.
- Familiarity with AWS services.
- Working knowledge of Jenkins and advanced ADO Pipelines.

Key Responsibilities:
- Design, build, and maintain pipelines in Python to collect data from a wide range of sources (APIs, SFTP servers, websites, emails, PDFs, etc.).
- Deploy and orchestrate workflows using Apache Airflow.
- Perform web scraping using libraries like requests, BeautifulSoup, and Selenium.
- Handle structured, semi-structured, and unstructured data efficiently.
- Transform datasets using Pandas and/or Polars.
- Write unit and component tests using Pytest.
- Collaborate with platform teams to improve the data scraping framework.
- Query and analyze data using SQL (PostgreSQL, MSSQL, Databricks).
- Conduct code reviews, support best practices, and improve coding standards across the team.
- Manage and maintain CI/CD pipelines (Azure DevOps Pipelines, Jenkins).

Tech stack:
- Main/essential: Python (Pandas and/or Polars), SQL, Azure DevOps, Airflow.
- Additional: Databricks, AWS, Jenkins, ADO Pipelines.

Skills: DevOps, Pandas, Python
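
A minimal Airflow sketch of the workflow deployment named above (assumes Airflow 2.x; the DAG id and task bodies are placeholders, not CGI's pipelines):

from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    # Placeholder: a real task would pull from an API, SFTP server, or website.
    print("extracting...")

def transform():
    # Placeholder: a real task would reshape the data with Pandas or Polars.
    print("transforming...")

with DAG(
    dag_id="example_scrape_pipeline",  # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    t1 = PythonOperator(task_id="extract", python_callable=extract)
    t2 = PythonOperator(task_id="transform", python_callable=transform)
    t1 >> t2  # run extract before transform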

Posted Date not available


4 - 7 years

20 - 35 Lacs

Chennai

Remote

Role & responsibilities:
- Architect, deploy, and manage scalable cloud environments (AWS/GCP/DO) to support distributed data processing solutions that handle terabyte-scale datasets and billions of records efficiently.
- Automate infrastructure provisioning, monitoring, and disaster recovery using tools like Terraform, Kubernetes, and Prometheus.
- Optimize CI/CD pipelines to ensure seamless deployment of web scraping workflows and infrastructure updates.
- Develop and maintain stealthy web scrapers using Puppeteer, Playwright, and headless Chromium browsers (a Playwright sketch follows below).
- Reverse-engineer bot-detection mechanisms (e.g., TLS fingerprinting, CAPTCHA solving) and implement evasion strategies.
- Monitor system health, troubleshoot bottlenecks, and ensure 99.99% uptime for data collection and processing pipelines.
- Implement security best practices for cloud infrastructure, including intrusion detection, data encryption, and compliance audits.
- Partner with the data collection, ML, and SaaS teams to align infrastructure scalability with evolving data needs.

Preferred candidate profile:
- 4-7 years of experience in site reliability engineering and cloud infrastructure management.
- Proficiency in Python and JavaScript for scripting and automation.
- Hands-on experience with Puppeteer/Playwright, headless browsers, and anti-bot evasion techniques.
- Knowledge of networking protocols, TLS fingerprinting, and CAPTCHA-solving frameworks.
- Experience with monitoring and observability tools such as Grafana, Prometheus, and Elasticsearch, and familiarity with monitoring and optimizing resource utilization in distributed systems.
- Experience with data lake architectures and optimizing storage using formats such as Parquet, Avro, or ORC.
- Strong proficiency in cloud platforms (AWS, GCP, or Azure) and containerization/orchestration (Docker, Kubernetes).
- Deep understanding of infrastructure-as-code tools (Terraform, Ansible).
- Deep experience in designing resilient data systems with a focus on fault tolerance, data replication, and disaster recovery strategies in distributed environments.
- Experience implementing observability frameworks, distributed tracing, and real-time monitoring tools.
- Excellent problem-solving abilities, with a collaborative mindset and strong communication skills.
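
A bare-bones Playwright sketch of headless Chromium scraping as named above (Python API; real stealth setups layer fingerprint patching and proxy rotation on top, which this deliberately omits; the URL and user agent are placeholders):

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)
    # Present a browser-like identity; values here are illustrative.
    context = browser.new_context(
        user_agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
        viewport={"width": 1366, "height": 768},
    )
    page = context.new_page()
    page.goto("https://example.com")  # placeholder target
    page.wait_for_selector("h1")      # wait for JS-rendered content
    print(page.inner_text("h1"))
    browser.close()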

Posted Date not available


2 - 5 years

2 - 5 Lacs

Noida, Delhi / NCR

Work from Office

Job Summary

We are looking for a Techno-Functional Data Engineer who is passionate about solving real-world problems through data-driven systems. While prior e-commerce experience is a plus, it is not mandatory; we welcome engineers, tinkerers, and builders who are eager to challenge themselves, build scalable systems, and work closely with product and business teams. In this role, you will be at the intersection of data engineering, automation, and product strategy, contributing to a modern SaaS platform that supports diverse and dynamic customer needs.

Key Responsibilities

Data Engineering & Automation:
- Build and maintain data pipelines and automated workflows for data ingestion, transformation, and delivery.
- Integrate structured and semi-structured data from APIs, external sources, and internal systems using Python and SQL.
- Work on core platform modules like data connectors, product catalogs, inventory sync, and channel integrations.
- Implement data quality, logging, and alerting mechanisms to ensure pipeline reliability.
- Build internal APIs and microservices using Flask or Django to expose enriched datasets (a Flask sketch follows below).

Functional & Analytical Contribution:
- Collaborate with Product and Engineering teams to understand use cases and translate them into data-backed features.
- Analyze data using Pandas, NumPy, and SQL to support roadmap decisions and customer insights.
- Build bots, automation scripts, or scraping tools to handle repetitive data operations or integrate with third-party systems.
- Participate in designing reporting frameworks, dashboards, and analytics services for internal and client use.

Mindset & Growth:
- Be open to learning the dynamics of e-commerce, catalog structures, order flows, and marketplace ecosystems.
- Take ownership of problems beyond your immediate knowledge area and drive them to closure.
- Engage with a product-first engineering culture where outcomes matter more than the tech stack, and impact matters most.

Required Skills & Qualifications
- 2+ years of experience in data engineering, backend development, or technical product analytics.
- Strong Python skills, with experience in:
  - Data libraries: Pandas, NumPy
  - Web frameworks: Flask, Django
  - Automation: Requests, BeautifulSoup, Scrapy, bot frameworks
  - Image processing: Pillow, OpenCV (a plus)
- Proficiency in SQL and hands-on experience with MySQL, PostgreSQL, or MongoDB.
- Experience building or consuming REST APIs.
- Familiarity with version control tools like Git and collaborative workflows (CI/CD, Agile).
- Strong problem-solving mindset and willingness to learn domain-specific complexities.

Nice to Have (But Not Required)
- Exposure to cloud data platforms like AWS, GCP, or Azure.
- Experience with workflow orchestration tools like Airflow, DBT, or Luigi.
- Basic knowledge of BI tools (Power BI, Tableau, Looker).
- Prior work on data-centric products or SaaS tools.
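
A minimal Flask sketch of an internal API exposing an enriched dataset, as described above (the catalog data and route are invented for illustration):

from flask import Flask, jsonify

app = Flask(__name__)

# Stand-in for an enriched dataset; real data would come from the warehouse.
CATALOG = {
    "A1": {"sku": "A1", "stock": 12},
    "B2": {"sku": "B2", "stock": 0},
}

@app.route("/catalog/<sku>")
def get_item(sku: str):
    item = CATALOG.get(sku)
    if item is None:
        return jsonify({"error": "not found"}), 404
    return jsonify(item)

if __name__ == "__main__":
    app.run(debug=True)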

Posted Date not available


1 - 3 years

3 - 8 Lacs

Bengaluru

Hybrid

About the Role:
Grade Level (for internal use): 08
Job Title: Associate Data Engineer

The Team: The Automotive Insights - Supply Chain and Technology and IMR department at S&P Global is dedicated to delivering critical intelligence and comprehensive analysis of the automotive industry's supply chain and technology. Our team provides actionable insights and data-driven solutions that empower clients to navigate the complexities of the automotive ecosystem, from manufacturing and logistics to technological innovations and market dynamics. We collaborate closely with industry stakeholders to ensure our research supports strategic decision-making and drives growth within the automotive sector. Join us to be at the forefront of transforming the automotive landscape with cutting-edge insights and expertise.

Responsibilities and Impact:
- Develop and maintain automated data pipelines to extract, transform, and load data from diverse online sources, ensuring high data quality (a minimal ETL sketch follows below).
- Build, optimize, and document web scraping tools using Python and related libraries to support ongoing research and analytics.
- Implement DevOps practices for deploying, monitoring, and maintaining machine learning workflows in production environments.
- Collaborate with data scientists and analysts to deliver reliable, well-structured data for analytics and modeling.
- Perform data quality checks, troubleshoot pipeline issues, and ensure alignment with internal taxonomies and standards.
- Stay current with advancements in data engineering, DevOps, and web scraping technologies, contributing to team knowledge and best practices.

What We're Looking For:

Basic Required Qualifications:
- Bachelor's degree in Computer Science, Engineering, or a related field.
- 1 to 3 years of hands-on experience in data engineering, including web scraping and ETL pipeline development using Python.
- Proficiency with Python programming and libraries such as Pandas, BeautifulSoup, Selenium, or Scrapy.
- Exposure to implementing and maintaining DevOps workflows, including model deployment and monitoring.
- Familiarity with containerization technologies (e.g., Docker) and CI/CD pipelines for data and ML workflows.
- Familiarity with cloud platforms (preferably AWS).

Key Soft Skills:
- Strong analytical and problem-solving skills, with attention to detail.
- Excellent communication and collaboration abilities for effective teamwork.
- Ability to work independently and manage multiple priorities.
- Curiosity and a proactive approach to learning and applying new technologies.
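
A minimal extract-transform-load sketch with a built-in quality check, in the spirit of the pipeline work described above (the source URL and the 'id' column are assumptions; writing Parquet requires pyarrow):

import logging

import pandas as pd
import requests

logging.basicConfig(level=logging.INFO)

def extract(url: str) -> pd.DataFrame:
    # Pull JSON records from a source API (placeholder URL).
    return pd.DataFrame(requests.get(url, timeout=30).json())

def transform(df: pd.DataFrame) -> pd.DataFrame:
    # Basic quality checks: drop duplicates and rows missing the key field.
    before = len(df)
    df = df.drop_duplicates().dropna(subset=["id"])
    logging.info("dropped %d bad rows", before - len(df))
    return df

def load(df: pd.DataFrame, path: str) -> None:
    df.to_parquet(path)  # columnar output for downstream analytics

if __name__ == "__main__":
    load(transform(extract("https://api.example.com/records")), "records.parquet")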

Posted Date not available
