Senior Big Data Developer

7 - 12 years

25.0 - 35.0 Lacs P.A.

Kolkata

Posted: 2 months ago | Platform: Naukri

Skills Required

Airflow, Apache Spark, Python, SQL, ETL Pipelines, Parquet, EDA, DuckDB

Work Mode

Hybrid

Job Type

Full Time

Job Description

About the Role

We are seeking a Senior Python/Data Engineer to design, develop, and optimize large-scale data pipelines, transformation workflows, and analytics-ready datasets. This role requires expertise in Python, Apache Airflow, Apache Spark, SQL, and DuckDB, along with strong experience in data quality, data processing, and automation.

As a Senior Data Engineer, you will play a key role in building scalable, high-performance data engineering solutions, ensuring data integrity, and supporting real-time and batch data workflows. You will work closely with Data Scientists, Analysts, DevOps, and Engineering teams to build efficient, cost-effective, and reliable data architectures.

Key Responsibilities

Design, build, and maintain scalable ETL/ELT data pipelines using Apache Airflow, Spark, and SQL.
Develop Python-based data engineering solutions to automate data ingestion, transformation, and validation.
Implement data transformation and quality checks for structured and unstructured datasets.
Work with DuckDB and other in-memory databases to enable fast exploratory data analysis (EDA).
Optimize data storage and retrieval using Parquet, Apache Iceberg, and S3-based data lakes.
Develop SQL-based analytics workflows and optimize performance for querying large datasets.
Implement data lineage, governance, and metadata management for enterprise-scale data solutions.
Ensure high availability, fault tolerance, and security of data pipelines.
Collaborate with Data Science, AI/ML, and Business Intelligence teams to enable real-time and batch analytics.
Work with cloud platforms (AWS, Azure, GCP) for data pipeline deployment and scaling.
Write clean, efficient, and maintainable code following software engineering best practices.

Required Skills & Qualifications

7+ years of experience in data engineering, big data processing, and backend development.
Expertise in Python for data processing and automation.
Strong experience with Apache Airflow for workflow orchestration.
Hands-on experience with Apache Spark for big data transformations.
Proficiency in SQL (PostgreSQL, DuckDB, Snowflake, etc.) for analytics and ETL workflows.
Experience with data transformation, data validation, and quality assurance frameworks.
Hands-on experience with DuckDB, Apache Arrow, or Vaex for in-memory data processing.
Knowledge of data lake architectures (S3, Parquet, Iceberg) and cloud data storage.
Familiarity with distributed computing, parallel processing, and optimized query execution.
Experience working with CI/CD, DevOps, containerization (Docker, Kubernetes), and cloud environments.
Strong problem-solving and debugging skills.
Excellent written and verbal communication skills.

Preferred Skills (Nice to Have)

Experience programming on the Java/JEE platform is highly desired.
Experience with data streaming technologies (Kafka, Flink, Kinesis).
Familiarity with NoSQL databases (MongoDB, DynamoDB).
Exposure to AI/ML data pipelines and feature engineering.
Knowledge of data security, compliance (SOC 2 Type 2, GDPR, HIPAA), and governance best practices.
Experience in building metadata-driven data pipelines for self-service analytics.

IT Services and IT Consulting
Kolkata, West Bengal
