Posted: 1 day ago
Platform:
On-site
Full Time
Role: Data Engineer
Key Skill: PySpark, Cloudera Data Platform, Big Data Hadoop, Hive, Kafka

Responsibilities
Data Pipeline Development: Design, develop, and maintain highly scalable and optimized ETL pipelines using PySpark on the Cloudera Data Platform, ensuring data integrity and accuracy.
Data Ingestion: Implement and manage data ingestion processes from a variety of sources (e.g., relational databases, APIs, file systems) to the data lake or data warehouse on CDP.
Data Transformation and Processing: Use PySpark to process, cleanse, and transform large datasets into meaningful formats that support analytical needs and business requirements.
Performance Optimization: Conduct performance tuning of PySpark code and Cloudera components, optimizing resource utilization and reducing runtime of ETL processes.
Data Quality and Validation: Implement data quality checks, monitoring, and validation routines to ensure data accuracy and reliability throughout the pipeline.
Automation and Orchestration: Automate data workflows using tools like Apache Oozie, Airflow, or similar orchestration tools within the Cloudera ecosystem.

Technical Skills
3+ years of experience as a Data Engineer, with a strong focus on PySpark and the Cloudera Data Platform.
PySpark: Advanced proficiency in PySpark, including working with RDDs, DataFrames, and optimization techniques.
Cloudera Data Platform: Strong experience with Cloudera Data Platform (CDP) components, including Cloudera Manager, Hive, Impala, HDFS, and HBase.
Data Warehousing: Knowledge of data warehousing concepts, ETL best practices, and experience with SQL-based tools (e.g., Hive, Impala).
Big Data Technologies: Familiarity with Hadoop, Kafka, and other distributed computing tools.
Orchestration and Scheduling: Experience with Apache Oozie, Airflow, or similar orchestration frameworks.
Scripting and Automation: Strong scripting skills in Linux.
Virtusa
Bengaluru / Bangalore, Karnataka, India
5.0 - 8.0 Lacs P.A.