The ETL Developer will be responsible for designing, implementing, and optimizing distributed data processing jobs that handle large-scale data in the Hadoop Distributed File System (HDFS) using Apache Spark and Python. This role requires a deep understanding of data engineering principles, proficiency in Python, and hands-on experience with the Spark and Hadoop ecosystems. The developer will collaborate with data engineers, analysts, and business stakeholders to process and transform data and to drive insights and data-driven decisions.
Responsibilities:
- Data Processing and Transformation:
Design and implement Spark applications to process and transform large datasets in HDFS. Develop ETL pipelines in Spark using Python for data ingestion, cleaning, aggregation, and transformation.
- Performance Optimization:
Optimize Spark jobs for efficiency, reducing run time and resource usage. Fine-tune memory management, caching, and partitioning strategies for optimal performance.
- Data Engineering with Hadoop and Spark:
Load data from different sources into HDFS, ensuring data accuracy and integrity. Integrate Spark applications with Hadoop frameworks such as Hive and Sqoop.
- Testing and Debugging:
Troubleshoot and debug Spark job failures, and monitor job logs and the Spark UI to identify issues.
Qualifications:
- 2-5 years of relevant experience
- Experience programming and debugging business applications
- Working knowledge of industry practice and standards
- Comprehensive knowledge of specific business area for application development
- Working knowledge of programming languages
- Consistently demonstrates clear and concise written and verbal communication
- Expertise in handling complex, large-scale data warehouse environments
- Hands-on experience writing complex SQL queries, exporting and importing large amounts of data using utilities
Education:
- Bachelor's degree in a quantitative field (such as Engineering, Computer Science) or equivalent experience
This job description provides a high-level review of the types of work performed. Other job-related duties may be assigned as required.
Job Family Group:
Technology
Job Family:
Applications Development
Time Type:
Full time
Most Relevant Skills
Please see the requirements listed above.
Other Relevant Skills
For complementary skills, please see above and/or contact the recruiter.
Citi is an equal opportunity employer, and qualified candidates will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other characteristic protected by law.
If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi.
View Citi’s EEO Policy Statement and the Know Your Rights poster.