Work from Office
Full Time
Role & responsibilities:
- Develop and Maintain Data Pipelines: Design, develop, and manage scalable ETL pipelines to process large datasets using PySpark, Databricks, and other big data technologies.
- Data Integration and Transformation: Work with structured and unstructured data sources to build efficient data workflows and integrate them into a central data warehouse.
- Collaborate with Data Scientists & Analysts: Work closely with the data science and business intelligence teams to ensure the right data is available for advanced analytics, machine learning, and reporting.
- Optimize Performance: Optimize and tune data pipelines and ETL processes to improve throughput and reduce latency, ensuring timely delivery of high-quality data.
- Automation and Monitoring: Implement automated workflows and monitoring tools to keep data pipelines running smoothly and to address issues proactively.
- Ensure Data Quality: Build and maintain validation mechanisms to ensure the accuracy and consistency of the data.
- Data Storage and Access: Work with data storage solutions (e.g., Azure, AWS, Google Cloud) to ensure effective data storage and fast access for downstream users.
- Documentation and Reporting: Maintain proper documentation for all data processes and architectures to ease understanding and the onboarding of new team members.

Skills and Qualifications:
- Experience: 5+ years as a Data Engineer or in a similar role, with hands-on experience designing, building, and maintaining ETL pipelines.
- Technologies:
  - Proficient in PySpark for large-scale data processing.
  - Strong programming experience in Python, particularly for data engineering tasks.
  - Experience working with Databricks for big data processing and collaboration.
  - Hands-on experience with data storage solutions (e.g., AWS S3, Azure Data Lake, or Google Cloud Storage).
  - Solid understanding of ETL concepts, tools, and best practices.
  - Familiarity with SQL for querying and manipulating data in relational databases.
  - Experience with data orchestration tools such as Apache Airflow or Luigi is a plus.
- Data Modeling & Warehousing:
  - Experience with data warehousing concepts and technologies (e.g., Redshift, Snowflake, or BigQuery).
  - Knowledge of data modeling, data transformations, and dimensional modeling.
- Soft Skills:
  - Strong analytical and problem-solving skills.
  - Excellent communication skills; able to explain complex data processes to non-technical stakeholders.
  - Ability to work in a fast-paced, collaborative environment and manage multiple priorities.

Preferred Qualifications:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- Certification or experience with cloud platforms such as AWS, Azure, or Google Cloud.
- Experience with Apache Kafka or other stream-processing technologies.
Atyeti
Chennai
4.0 - 7.0 Lacs P.A.