Posted: 5 hours ago
Work from Office | Full Time
Skill: Big Data
Need to Have: Python, PySpark, Trino, Hive
Good to Have: Snowflake, SQL, Airflow, OpenShift, Kubernetes
Location: Hyderabad, Pune, Bangalore

Job Description:
Develop and maintain data pipelines, ELT processes, and workflow orchestration using Apache Airflow, Python, and PySpark to ensure efficient and reliable delivery of data. Design and implement custom connectors to ingest diverse data sources into our platform, including structured and unstructured data from various document formats.
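As a rough illustration of the orchestration described above, here is a minimal sketch of a daily Airflow DAG that launches a PySpark ingestion step. The DAG id, schedule, source path, and target table are hypothetical assumptions, not details from the posting.

```python
# Hypothetical daily ingestion DAG: paths, table names, and schedule are
# illustrative assumptions, not taken from the posting.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def ingest_documents(**context):
    """Read raw documents with PySpark and write them to a Hive-compatible table."""
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("document_ingestion")
        .enableHiveSupport()
        .getOrCreate()
    )
    # Assumed source location and target table.
    raw = spark.read.json("s3a://example-bucket/raw/documents/")
    raw.write.mode("overwrite").saveAsTable("staging.documents")
    spark.stop()


with DAG(
    dag_id="document_ingestion_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    ingest = PythonOperator(
        task_id="ingest_documents",
        python_callable=ingest_documents,
    )
```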
Collaborate closely with cross-functional teams to gather requirements, understand data needs, and translate them into technical solutions. Implement DataOps principles and best practices to ensure robust data operations and efficient data delivery. Design and implement data CI/CD pipelines to enable automated and efficient data integration, transformation, and deployment processes.
Monitor and troubleshoot data pipelines, proactively identifying and resolving issues related to data ingestion, transformation, and loading. Conduct data validation and testing to ensure the accuracy, consistency, and compliance of data. Document data workflows, processes, and technical specifications to facilitate knowledge sharing and ensure data governance.
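The validation and testing step mentioned above could look roughly like the following PySpark check. The table name, key column, and rules are illustrative assumptions, not the employer's actual checks.

```python
# Illustrative validation step; table name and rules are assumptions.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("data_validation").enableHiveSupport().getOrCreate()

df = spark.table("staging.documents")

# Basic checks: row count, null keys, and duplicate primary keys.
row_count = df.count()
null_ids = df.filter(F.col("document_id").isNull()).count()
duplicate_ids = df.groupBy("document_id").count().filter(F.col("count") > 1).count()

errors = []
if row_count == 0:
    errors.append("table is empty")
if null_ids > 0:
    errors.append(f"{null_ids} rows have a null document_id")
if duplicate_ids > 0:
    errors.append(f"{duplicate_ids} document_id values are duplicated")

if errors:
    # Failing loudly lets the orchestrator (e.g. Airflow) mark the task as failed.
    raise ValueError("validation failed: " + "; ".join(errors))

spark.stop()
```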
Responsibilities:
- Proficiency in using Apache Airflow and Spark for data transformation, data integration, and data management.
- Experience implementing workflow orchestration using tools such as Apache Airflow or similar platforms.
- Demonstrated experience developing custom connectors for data ingestion from various sources.
- Strong understanding of SQL and database concepts, with the ability to write efficient queries and optimize performance.
- Strong programming skills, particularly in Python, with experience in web scraping and OCR-based data extraction (a hedged extraction sketch follows after this list).
- Experience implementing DataOps principles and practices, including data CI/CD pipelines.
- Familiarity with NLP techniques and libraries for analyzing text data and deriving insights for compliance and business intelligence.
- Familiarity with data visualization tools such as Apache Superset, and with dashboard development.
- Knowledge of data streaming and real-time data processing technologies (e.g., Apache Kafka).
- Strong understanding of software development principles and practices, including version control (e.g., Git) and code review processes.

Required Skills: Python, PySpark, SQL, Airflow, Trino, Hive, Snowflake, Agile Scrum
Optional Skills: Linux, OpenShift, Kubernetes, Superset
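As referenced in the responsibilities list, here is a minimal sketch of OCR-based document extraction, assuming pytesseract and a hypothetical scans/ directory; it is an illustration under those assumptions, not the employer's actual pipeline.

```python
# Hedged sketch of OCR-based extraction from scanned documents; file paths
# and the downstream handling are illustrative assumptions.
from pathlib import Path

import pytesseract  # requires the Tesseract binary to be installed
from PIL import Image


def extract_text(image_path: Path) -> dict:
    """Run OCR on a single scanned page and return a record for ingestion."""
    text = pytesseract.image_to_string(Image.open(image_path))
    return {"source_file": image_path.name, "text": text}


if __name__ == "__main__":
    records = [extract_text(p) for p in Path("scans").glob("*.png")]
    # In a real pipeline these records would be written to a staging layer
    # (e.g. via PySpark) rather than printed.
    for record in records:
        print(record["source_file"], len(record["text"]), "characters extracted")
```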
Company: Growel Softech Pvt. Ltd.