Senior Data Engineer - Data Processing & Feature Engineering
Location: Coimbatore
Experience Level: 6+ years
About the Role
We are seeking exceptional Senior Data Engineers to build the data foundation powering Velogent AI's autonomous agents. You will design and implement large-scale data ingestion, processing, and feature engineering systems that transform unstructured enterprise data (invoices, documents, transactions, RFQs) into structured, high-quality datasets. Your work enables agentic AI systems to make accurate, compliance-aware decisions while meeting the data quality, lineage, and auditability standards required by regulated industries.
Core Responsibilities
- Design and architect end-to-end data pipelines processing large volumes of unstructured enterprise data (documents, PDFs, transaction records, email, etc.)
- Build sophisticated data ingestion frameworks supporting multiple data sources and formats with automated validation and quality checks (a brief pipeline sketch follows this list)
- Implement large-scale data processing solutions using distributed computing frameworks that handle terabytes of data efficiently
- Develop advanced feature engineering pipelines extracting meaningful signals from unstructured data (document classification, entity extraction, semantic tagging)
- Design data warehousing architecture supporting both operational (near real-time) and analytical queries for agentic AI reasoning
- Build robust data quality frameworks ensuring high data accuracy critical for agent decision-making and regulatory compliance
- Implement data governance patterns including lineage tracking, metadata management, and audit trails for regulated environments
- Optimize data pipeline performance, reliability, and cost through intelligent partitioning, caching, and resource optimization
- Lead data security implementation protecting sensitive information (PII, financial data, healthcare records) with encryption and access controls
- Collaborate with AI engineers to understand data requirements and optimize data for model training and inference
- Establish best practices for data documentation, SLA management, and operational excellence
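To give a flavor of the ingestion-and-validation work described above, here is a minimal sketch using PySpark. The bucket paths, column names, and validation rules are hypothetical stand-ins, not a production schema.

```python
# Minimal ingestion sketch with automated quality gates (PySpark).
# Paths and column names below are hypothetical examples.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("invoice-ingest").getOrCreate()

raw = spark.read.json("s3://example-bucket/raw/invoices/")

# Quality gates: required key present, amount present and non-negative.
is_valid = (
    F.col("invoice_id").isNotNull()
    & F.col("amount").isNotNull()
    & (F.col("amount") >= 0)
)

valid = raw.filter(is_valid).withColumn("ingest_date", F.current_date())
rejected = raw.filter(~is_valid)  # quarantined for inspection and audit

rejected.write.mode("append").parquet("s3://example-bucket/quarantine/invoices/")
valid.write.mode("append").partitionBy("ingest_date").parquet(
    "s3://example-bucket/clean/invoices/"
)
```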
Must-Have Qualifications
- Unstructured Data Expertise: Production experience ingesting and processing large volumes of unstructured data (documents, PDFs, images, text, logs)
- Large-Scale Data Processing: Advanced expertise with distributed data processing frameworks (Apache Spark, Flink, or cloud-native alternatives like AWS Glue)
- Feature Engineering: Deep knowledge of advanced feature engineering techniques for ML systems, including automated feature extraction and transformation
- Python Proficiency: Expert-level Python for data processing, ETL pipeline development, and data science workflows
- NLP/Text Processing: Strong background in NLP and text analysis techniques for document understanding, entity extraction, and semantic processing
- Data Architecture: Experience designing data warehouses, data lakes, or lakehouse architectures supporting both batch and real-time processing
- ETL/ELT Pipeline Design: Proven expertise building production-grade ETL/ELT pipelines with error handling, retry logic, and monitoring (a retry-pattern sketch follows this list)
- Cloud Data Platforms: Advanced experience with AWS data services (S3, Athena, Glue, RDS, DynamoDB) or equivalent cloud platforms
- Data Quality & Governance: Understanding of data quality frameworks, metadata management, and data governance practices
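To illustrate the retry logic called out above, here is a minimal, dependency-free sketch of exponential backoff with jitter; `extract_batch` and its source path are hypothetical stand-ins.

```python
# Retry-with-exponential-backoff sketch for a flaky extract step.
import logging
import random
import time

log = logging.getLogger("etl")

def with_retries(fn, max_attempts=3, base_delay=1.0):
    """Run fn(), retrying on failure with exponential backoff plus jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                log.exception("giving up after %d attempts", attempt)
                raise
            delay = base_delay * 2 ** (attempt - 1) + random.uniform(0, 0.5)
            log.warning("attempt %d failed; retrying in %.1fs", attempt, delay)
            time.sleep(delay)

# Usage (extract_batch is a hypothetical extract function):
# rows = with_retries(lambda: extract_batch("s3://example-bucket/raw/"))
```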
Nice-to-Have Qualifications
- Experience with document parsing and layout analysis libraries (unstructured.io, PyPDF, etc.); a brief extraction sketch follows this list
- Knowledge of information extraction pipelines and vector databases for semantic search
- Familiarity with Apache Kafka or other event streaming platforms for real-time data processing
- Experience with dbt (data build tool) or similar data transformation frameworks
- Understanding of data privacy and compliance frameworks (GDPR, HIPAA, SOC2)
- Experience optimizing costs in cloud data platforms through intelligent resource allocation
- Background in building recommendation systems or ranking systems using feature engineering
- Knowledge of graph databases and knowledge graphs for relationship extraction
- Familiarity with computer vision techniques for document analysis and processing
- Published work or open-source contributions in NLP, document processing, or data engineering
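For the document parsing item above, a tiny sketch of text extraction with pypdf; the file name is a hypothetical example.

```python
# Text extraction from a PDF with pypdf; the file path is illustrative.
from pypdf import PdfReader

reader = PdfReader("example_invoice.pdf")
# extract_text() can return None for image-only pages, hence the fallback.
text = "\n".join(page.extract_text() or "" for page in reader.pages)
print(text[:500])
```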
What You'll Work With
- Large-scale document processing pipelines handling millions of invoices, contracts, and business documents
- Apache Spark and distributed computing frameworks for ETL
- AWS data services (S3, Glue, Athena, RDS) for data infrastructure
- Advanced NLP and text processing libraries (spaCy, transformers, LangChain); an entity extraction sketch appears after this list
- Vector databases and semantic search infrastructure
- Data quality and monitoring frameworks
- Cloud data warehouses and data lakes on AWS
- Compliance and governance frameworks for regulated industries
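As a taste of the NLP tooling listed above, a minimal entity extraction sketch with spaCy; the model name and invoice text are illustrative, and it assumes the `en_core_web_sm` model is installed.

```python
# Entity extraction with spaCy, of the kind used in document understanding.
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Invoice 4711 from Acme Corp, dated 12 March 2024, totals $1,250.00.")

# Each entity is a text span with a predicted label (ORG, DATE, MONEY, ...).
for ent in doc.ents:
    print(ent.text, ent.label_)
```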