eMedEvents - Python Developer - Web Scraping

eMedEvents - Global Marketplace for CME/CE

3 - 5 years

0 Lacs

Hyderabad, Telangana, India

Posted: 2 months ago | Platform: Foundit

Skills Required

html5lib, workflow orchestrators, regular expressions, pdfplumber, lxml, task schedulers, PyMuPDF, Playwright, PDF parsing, pdfminer.six, BeautifulSoup

Work Mode

On-site

Job Type

Full Time

Job Description

Python Developer - Web Scraping & Data Processing

Experience:

3+ Years

Employment Type:

Full-time

Job Overview

We are seeking a skilled and detail-oriented Python Developer with 3+ years of hands-on experience in web scraping, document parsing (PDF, HTML, XML), and structured data extraction. You will be a vital part of a core team focused on aggregating biomedical content from diverse sources, including grant repositories, scientific journals, conference abstracts, treatment guidelines, and clinical trial databases. This role demands strong technical proficiency in various parsing and scraping libraries, along with solid data processing and integration skills.

Key Responsibilities

Develop scalable Python scripts to effectively scrape and parse biomedical data from a wide range of web sources, including websites, pre-print servers, citation indexes, scientific journals, and treatment guidelines.
Build robust modules specifically for splitting multi-record documents (such as PDFs, HTML, and other formats) into individual, manageable content units.
Implement NLP-based field extraction pipelines utilizing libraries like spaCy, NLTK, or advanced regex for precise metadata tagging.
Design and automate complex data acquisition workflows using schedulers and orchestrators like cron, Celery, or Apache Airflow for periodic scraping and content updates.
Store parsed and processed data efficiently in both relational (PostgreSQL) and NoSQL (MongoDB) databases, ensuring optimal schema design for performance and scalability.
Ensure robust logging, comprehensive exception handling, and rigorous content quality validation across all data processing and scraping workflows.
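
As an illustration of the document-splitting and field-extraction responsibilities above, here is a minimal sketch using only Python's standard `re` module. The sample text, the "Abstract <number>" delimiter, and the `Title`/`Authors` labels are assumptions made up for this example; real sources would need their own patterns, and the posting does not prescribe this approach.

```python
import re

# Hypothetical sample: two conference abstracts concatenated in one text dump.
SAMPLE = """\
Abstract 101
Title: Biomarkers in early sepsis
Authors: A. Rao, B. Chen
Background text for the first record.

Abstract 102
Title: Imaging outcomes after stroke
Authors: C. Diaz
Background text for the second record.
"""

# Assumed record delimiter: a line consisting of "Abstract <number>".
RECORD_START = re.compile(r"^Abstract[ \t]+(\d+)[ \t]*$", re.MULTILINE)
# Assumed field layout: "Label: value" on a single line.
FIELD = re.compile(r"^(Title|Authors):[ \t]*(.+)$", re.MULTILINE)


def split_records(text):
    """Split a multi-record document into (record_id, body) pairs."""
    matches = list(RECORD_START.finditer(text))
    records = []
    for i, m in enumerate(matches):
        # Each record runs until the next delimiter, or to the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        records.append((m.group(1), text[m.end():end].strip()))
    return records


def extract_fields(record_id, body):
    """Pull the labelled fields out of one record."""
    fields = {"id": record_id}
    for label, value in FIELD.findall(body):
        fields[label.lower()] = value.strip()
    return fields


records = [extract_fields(rid, body) for rid, body in split_records(SAMPLE)]
```

Each entry in `records` is a dictionary keyed by `id`, `title`, and `authors`, which could then be validated and written to a database in a later step.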

Required Skills And Qualifications

3+ years of hands-on experience in Python, particularly focused on data extraction, transformation, and loading (ETL).
Strong command of web scraping libraries, including: BeautifulSoup, Scrapy, Selenium, and Playwright.
Proficiency in PDF parsing libraries, such as: PyMuPDF, pdfminer.six, and pdfplumber.
Experience with HTML/XML parsers (lxml, html5lib) and XPath queries.
Familiarity with regular expressions, NLP concepts, and advanced field extraction techniques.
Working knowledge of SQL and/or NoSQL databases (MySQL, PostgreSQL, MongoDB).
Understanding of API integration (RESTful APIs) for interacting with structured data sources.
Experience with task schedulers and workflow orchestrators (cron, Apache Airflow, Celery).
Proficiency in version control using Git/GitHub and comfort working in collaborative development environments.

Good To Have

Exposure to biomedical or healthcare data parsing (scientific abstracts, clinical trials data, drug labels).
Familiarity with cloud environments like AWS (specifically Lambda and S3, for data processing and storage).
Experience with data validation frameworks and building robust QA rules for data quality.
Understanding of ontologies and taxonomies (UMLS, MeSH) for structured content tagging.

(ref:hirist.tech)
