Senior Data Engineer

5 - 9 years

0 Lacs

Posted: 5 days ago | Platform: Shine

Work Mode

On-site

Job Type

Full Time

Job Description

As a Senior Data Engineer at our company, you will play a crucial role in the architecture, design, development, and maintenance of our data platforms. Your primary focus will be on utilizing Python and PySpark for data processing and transformation, contributing to the overall data strategy, and driving data-driven decision-making throughout the organization.

**Key Responsibilities:**

- **Data Architecture & Design:** Design, develop, and optimize data architectures, pipelines, and data models to support various business needs such as analytics, reporting, and machine learning.
- **ETL/ELT Development (Python/PySpark Focus):** Build, test, and deploy highly scalable and efficient ETL/ELT processes using Python and PySpark to ingest, transform, and load data from diverse sources into data warehouses and data lakes. Develop and optimize complex data transformations using PySpark.
- **Data Quality & Governance:** Implement best practices for data quality, data governance, and data security to ensure the integrity, reliability, and privacy of our data assets.
- **Performance Optimization:** Monitor, troubleshoot, and optimize data pipeline performance, ensuring data availability and timely delivery, particularly for PySpark jobs.
- **Infrastructure Management:** Collaborate with DevOps and MLOps teams to manage and optimize data infrastructure, including cloud resources (AWS, Azure, GCP), databases, and data processing frameworks, ensuring efficient operation of PySpark clusters.
- **Mentorship & Leadership:** Provide technical guidance, mentorship, and code reviews to junior data engineers, particularly in Python and PySpark best practices, fostering a culture of excellence and continuous improvement.
- **Collaboration:** Work closely with data scientists, analysts, product managers, and other stakeholders to understand data requirements and deliver solutions that meet business objectives.
- **Innovation:** Research and evaluate new data technologies, tools, and methodologies to enhance our data capabilities and stay ahead of industry trends.
- **Documentation:** Create and maintain comprehensive documentation for data pipelines, data models, and data infrastructure.

**Qualifications:**

- **Education:** Bachelor's or Master's degree in Computer Science, Software Engineering, Data Science, or a related quantitative field.
- **Experience:** 5+ years of professional experience in data engineering, with a strong emphasis on building and maintaining large-scale data systems.
- **Technical Skills:** Expert proficiency in Python is mandatory. Strong SQL mastery is essential. Familiarity with Scala or Java is a plus. In-depth knowledge and hands-on experience with Apache Spark (PySpark) for data processing. Hands-on experience with at least one major cloud provider (AWS, Azure, GCP) and their data services. Proficient with Git and CI/CD pipelines.
- **Soft Skills:** Excellent problem-solving and analytical abilities. Strong communication and interpersonal skills. Ability to work effectively in a fast-paced, agile environment. Proactive and self-motivated with a strong sense of ownership.

This role offers an exciting opportunity to work with cutting-edge technologies, collaborate with cross-functional teams, and drive impactful data initiatives within the organization.
