Job Description
As a Data Engineer, you will design, develop, and implement data pipelines using StreamSets Data Collector. Your role involves ingesting, transforming, and delivering data from diverse sources to target systems. You will write and maintain efficient, reusable pipelines while adhering to coding standards and best practices, and develop custom processors and stages within StreamSets to address unique data integration challenges. Ensuring data accuracy and consistency is crucial: you will implement data validation and quality checks within StreamSets pipelines, optimize pipeline performance for high-volume data processing, and automate deployment and monitoring using CI/CD tools.

In quality assurance and testing, you will develop comprehensive test plans and test cases to validate pipeline functionality and data integrity, and conduct thorough testing, debugging, and troubleshooting of pipelines to identify and resolve issues. You will also standardize quality assurance procedures for StreamSets development and perform performance testing and tuning to ensure pipelines run optimally.

For problem-solving and support, you will research and analyze complex software-related issues to provide effective solutions, and resolve production issues related to StreamSets pipelines in a timely manner. Providing technical support and guidance to team members on StreamSets development, and monitoring pipeline logs and metrics to identify and resolve issues, are also key tasks.

Strategic alignment and collaboration are essential aspects of the role: you will understand and align with departmental, segment, and organizational strategies and objectives, and collaborate with data engineers, data analysts, and stakeholders to deliver effective data solutions.
Documenting pipeline designs and configurations, participating in code reviews, and contributing to the development of data integration best practices and standards are also part of your responsibilities.

To qualify for this role, you should have a Bachelor's Degree in Computer Science, Information Technology, or a related field, along with a minimum of 3-5 years of hands-on experience in systems analysis or application programming development with a focus on data integration. Essential qualifications include proven experience developing and deploying StreamSets Data Collector pipelines; a strong understanding of data integration concepts and best practices; proficiency in SQL and experience with relational databases; familiarity with various data formats (JSON, XML, CSV, Avro, Parquet); experience with cloud platforms (AWS, Azure, GCP) and cloud-based data services; and experience with version control systems (Git). Strong analytical and problem-solving skills, excellent communication and collaboration abilities, and the capacity to work independently are also necessary for this role.