Senior Backend Data & Integration Engineer

4 - 8 years

0 Lacs

Posted: 13 hours ago | Platform: Shine

Work Mode

On-site

Job Type

Full Time

Job Description

As a Data Engineer, your primary objective will be to build data pipelines for crawling, parsing, deduplication, and embeddings while connecting external systems and interfaces.

Your responsibilities will include:

- Developing crawling and fetching pipelines with an API-first approach, utilizing Playwright/requests as necessary.
- Parsing and normalizing job postings & CVs, implementing deduplication/delta logic such as seen hash and repost heuristics.
- Implementing embeddings and enabling similarity search, managing Azure OpenAI and vector persistence in pgvector.
- Integrating with systems like HR4YOU (API/webhooks/CSV import), SerpAPI, BA job board, and email/SMTP.
- Implementing batch/stream processing using Azure Functions/container jobs, including retry/backoff mechanisms and dead-letter queues.
- Setting up telemetry processes to monitor data quality metrics like freshness, duplicate rate, coverage, and cost per 1,000 items.
- Collaborating with the Frontend team for exports in CSV/Excel format, presigned URLs, and admin configuration.

Qualifications required for this role include:

- 4+ years of backend/data engineering experience.
- Proficiency in Python (FastAPI, pydantic, httpx/requests, Playwright/Selenium) and solid TypeScript knowledge for smaller services/SDKs.
- Experience with Azure services like Functions/Container Apps or AKS jobs, Storage/Blob, Key Vault, and Monitor/Log Analytics.
- Familiarity with messaging systems such as Service Bus/Queues, idempotence & exactly-once semantics, and a pragmatic approach to problem-solving.
- Strong grasp of databases including PostgreSQL, pgvector, query design, and performance tuning.
- Knowledge of clean ETL/ELT patterns, testability with pytest, and observability using OpenTelemetry.

Nice-to-have qualifications may include:

- Experience with NLP/IE tools like spaCy/regex/rapidfuzz, and document parsing using pdfminer/textract.
- Understanding of license/ToS-compliant data retrieval, captcha/anti-bot strategies while ensuring legal compliance.
- Working knowledge of API-first approach, clean code practices, trunk-based development, and mandatory code reviews.
- Proficiency with tools/stacks like GitHub, GitHub Actions/Azure DevOps, Docker, pnpm/Turborepo (Monorepo), Jira/Linear, and Notion/Confluence.
- Willingness to participate in on-call support on a rotating basis, following the "you build it, you run it" philosophy.
