Lead Analyst - Lead Big Data Developer - Python, PySpark & SQL

8 - 13 years

10 - 15 Lacs

Posted: 1 day ago | Platform: Naukri


Work Mode: Work from Office

Job Type: Full Time

Job Description

Your future duties and responsibilities:
  • Lead the design and development of scalable, efficient, and reliable data pipelines using PySpark, Python, and SQL
  • Collaborate with data architects, analysts, and business stakeholders to understand data requirements and translate them into technical solutions
  • Optimize data workflows for performance, scalability, and cost efficiency in big data environments (e.g., Databricks, EMR, GCP Dataproc, or similar)
  • Implement data ingestion, transformation, and aggregation processes from multiple structured and unstructured sources
  • Ensure data quality, integrity, and consistency through validation, testing, and monitoring frameworks
  • Work with cloud-based data platforms (AWS, Azure, or GCP) and leverage tools like S3, Delta Lake, or Snowflake
  • Design and enforce best practices for coding, version control, and CI/CD within the data engineering team
  • Provide technical leadership and mentorship to junior and mid-level developers
  • Collaborate with DevOps and DataOps teams for deployment and operationalization of data solutions
  • Stay updated with the latest technologies and trends in the big data ecosystem
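The ingest-transform-aggregate pattern in the duties above can be sketched in plain Python (a toy illustration only; a production pipeline would use the PySpark DataFrame API, and the data and column names here are invented):

```python
import csv
import io
from collections import defaultdict

# Hypothetical raw feed of order events (column names are invented).
RAW = """region,amount
APAC,120.50
EMEA,80.00
APAC,39.50
EMEA,20.00
"""

def ingest(text):
    """Ingest: parse structured rows from a raw source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: cast types and drop malformed records (a simple validation step)."""
    out = []
    for r in rows:
        try:
            out.append({"region": r["region"], "amount": float(r["amount"])})
        except (KeyError, ValueError):
            continue  # skip rows that fail validation
    return out

def aggregate(rows):
    """Aggregate: total amount per region."""
    totals = defaultdict(float)
    for r in rows:
        totals[r["region"]] += r["amount"]
    return dict(totals)

totals = aggregate(transform(ingest(RAW)))
print(totals)  # {'APAC': 160.0, 'EMEA': 100.0}
```

In PySpark the same three stages map naturally onto `spark.read`, `DataFrame` transformations, and a `groupBy(...).agg(...)`.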
Required qualifications to be successful in this role:
Required Skills & Experience:
  • 8+ years of experience in data engineering or big data development, with at least 3 years in a lead or senior role
  • Strong proficiency in Python for data processing, scripting, and automation
  • Advanced hands-on experience with PySpark (RDD, DataFrame, and Spark SQL APIs)
  • Deep expertise in SQL (query optimization, analytical functions, performance tuning)
  • Strong understanding of distributed data processing and data lake architectures
  • Experience working with the Hadoop ecosystem (Hive, HDFS, Spark, Kafka, etc.)
  • Hands-on experience with cloud platforms (AWS, Azure, or GCP) and data orchestration tools (Airflow, ADF, etc.)
  • Solid understanding of data modeling, ETL design, and performance optimization
  • Experience with version control (Git) and CI/CD pipelines for data projects
  • Excellent communication and leadership skills, with the ability to guide cross-functional teams
Preferred Qualifications:
  • Experience with Delta Lake / Apache Iceberg / Hudi
  • Knowledge of containerization and orchestration (Docker, Kubernetes)
  • Exposure to machine learning pipelines or data science integration
  • Certification in AWS Big Data / GCP Data Engineer / Azure Data Engineer is a plus
Education: Bachelor's or Master's degree in Computer Science, Information Technology, or a related field.
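The analytical-function skills listed above can be illustrated with a running-total window query (a minimal sketch using sqlite3 purely so the example is self-contained; the table and columns are invented, and the same SQL runs on Spark SQL and most warehouses):

```python
import sqlite3

# Hypothetical sales table; sqlite3 stands in for Spark SQL / Snowflake here.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE sales (region TEXT, month TEXT, amount REAL);
INSERT INTO sales VALUES
  ('APAC', '2024-01', 100), ('APAC', '2024-02', 150),
  ('EMEA', '2024-01', 200), ('EMEA', '2024-02', 50);
""")

# Analytical (window) function: running total of sales per region.
rows = con.execute("""
    SELECT region, month, amount,
           SUM(amount) OVER (PARTITION BY region ORDER BY month) AS running
    FROM sales
    ORDER BY region, month
""").fetchall()

for r in rows:
    print(r)
```

`PARTITION BY` restarts the accumulation per region, and the `ORDER BY` inside the window defines the running frame, so each row carries its region's cumulative total.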
Skills:
  • English
  • Python
  • SQL
  • Analytical Thinking

CGI

Information Technology and Consulting

Montreal
