Data Engineer - Python/PySpark

4 years


Hyderabad, Telangana, India

Posted: 4 days ago | Platform: LinkedIn


Skills Required

data, python, pyspark, engineering, sql, flexibility, integration, etl, processing, design, query, extract, writing, integrity, collaboration, troubleshooting, optimization, resolve, documentation, automation, database management, aws, gcp, azure, collaborative, debugging, communication

Work Mode

Remote

Job Type

Full Time

Job Description

We are seeking an experienced Data Engineer to join our team on a contractual basis for 3 months (extendable). The ideal candidate will have 4+ years of hands-on experience in data engineering, advanced proficiency in SQL, and 3+ years of experience with Python and PySpark. This role requires flexibility to work the 1 PM to 10 PM (timezone) shift.

Responsibilities

Data Integration and ETL: Develop, maintain, and optimize data pipelines and ETL processes to ensure efficient data processing and integration from multiple sources.
Advanced SQL: Design and optimize complex SQL queries to extract, transform, and load data efficiently; write optimized queries for large datasets and ensure high performance.
Python and PySpark: Use Python for data manipulation, cleaning, and processing; work extensively with PySpark on large-scale data processing and distributed computing tasks.
Data Modeling and Warehousing: Assist in designing, building, and maintaining data models and data warehouses, ensuring data integrity and performance.
Collaboration: Work closely with data scientists, analysts, and other engineers to understand data needs and optimize data workflows.
Troubleshooting and Optimization: Diagnose, troubleshoot, and resolve data-related issues and performance bottlenecks.
Documentation: Create and maintain documentation for data pipelines, processes, and data models.

Requirements

4+ years of experience in data engineering or a related field.
Advanced SQL knowledge with experience writing optimized queries for large databases.
3+ years of experience with Python for data processing and automation.
3+ years of hands-on experience with PySpark for distributed data processing.
Strong understanding of data modeling and database management.
Familiarity with cloud platforms (AWS, GCP, or Azure) is a plus.
Good understanding of data pipelines, ETL tools, and big data ecosystems.
Ability to work in a collaborative, fast-paced environment.
Strong problem-solving and debugging skills.
Excellent communication skills and the ability to work in a remote team.

(ref:hirist.tech)
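For illustration only, here is a minimal sketch of the kind of PySpark ETL pipeline the responsibilities above describe. The paths, column names, and business logic are hypothetical, not taken from this posting:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("orders_etl").getOrCreate()

    # Extract: raw JSON events landed by an upstream source (path is illustrative).
    raw = spark.read.json("s3://example-bucket/raw/orders/")

    # Transform: drop malformed rows, normalize types, deduplicate on order_id.
    clean = (
        raw.filter(F.col("order_id").isNotNull())
           .withColumn("amount", F.col("amount").cast("double"))
           .withColumn("order_date", F.to_date("order_ts"))
           .dropDuplicates(["order_id"])
    )

    # Aggregate: daily revenue per customer for the warehouse layer.
    daily = (
        clean.groupBy("customer_id", "order_date")
             .agg(F.sum("amount").alias("daily_revenue"),
                  F.count("*").alias("order_count"))
    )

    # Load: write partitioned Parquet for downstream SQL consumers.
    daily.write.mode("overwrite").partitionBy("order_date").parquet(
        "s3://example-bucket/warehouse/daily_revenue/")

    spark.stop()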
