Python Web Scraper

2 - 6 years

0 Lacs

Posted: 1 week ago | Platform: Shine

Work Mode

On-site

Job Type

Full Time

Job Description

As a Python Web Scraper at HiLabs, you will be a key player in designing and building scalable, reliable web scraping solutions using Python/PySpark. You will develop enterprise-grade scraping services, work with large volumes of structured and unstructured data, implement robust data validation and monitoring processes, and optimize data workflows for performance and scalability. You will also collaborate with data scientists, analysts, and engineers to integrate data from disparate sources and ensure smooth data flow between systems.

**Key Responsibilities:**

- Design and build scalable, reliable web scraping solutions using Python/PySpark.
- Develop enterprise-grade scraping services that are robust, fault-tolerant, and production-ready.
- Work with large volumes of structured and unstructured data; parse, clean, and transform it as required.
- Implement robust data validation and monitoring processes to ensure accuracy, consistency, and availability.
- Write clean, modular code with proper logging, retries, error handling, and documentation.
- Automate repetitive scraping tasks and optimize data workflows for performance and scalability.
- Optimize and manage databases (SQL/NoSQL) to ensure efficient storage, retrieval, and manipulation of both structured and unstructured data.
- Analyze and identify data sources relevant to the business.
- Collaborate with data scientists, analysts, and engineers to integrate data from disparate sources and ensure smooth data flow between systems.

**Qualifications Required:**

- Bachelor's or Master's degree in Computer Science, Information Technology, or a related field.
- 2-4 years of experience in web scraping, data crawling, or data.
- Proficiency in Python with web scraping tools and libraries (e.g., Beautiful Soup, Scrapy, or Selenium).
- Basic working knowledge of PySpark and data tools like Apache Airflow and EMR.
- Experience with cloud-based platforms (AWS, Google Cloud, Azure) and familiarity with cloud-native data tools like Apache Airflow and EMR.
- Expertise in SQL and NoSQL databases (e.g., MySQL, PostgreSQL, MongoDB, Cassandra).
- Understanding of data governance, data security best practices, and data privacy regulations (e.g., GDPR, HIPAA).
- Familiarity with version control systems like Git.

If this opportunity aligns with your expertise and career goals, we encourage you to apply and be a part of HiLabs, a company dedicated to transforming the healthcare industry through innovation and technology.
