
3 PySpark Jobs

JobPe aggregates listings for easy access, but you apply directly on the original job portal.

5.0 - 8.0 years

7 - 17 Lacs

Noida, Hyderabad, Bengaluru

Work from Office

5 to 7 years of strong hands-on exposure in the following:

Responsibilities:
- Develop and maintain data ingestion pipelines for various data sources, including transactional databases, streaming big data, and batch data, using tools such as GCP, GitHub, Terraform, PySpark, Kafka, and ECS (a minimal sketch follows this listing).
- Set up and manage batch orchestration jobs using Apache Airflow, ensuring timely execution and reliability.
- Continuously monitor data pipelines to ensure operational efficiency, and address any anomalies or incidents promptly.
- Collaborate with the data governance team to ensure compliance with data governance guidelines, including data classification and quality, using monitoring tools such as Grafana and metrics from Prometheus.
- Document operational procedures, incident reports, and performance metrics to support continuous improvement efforts.

Qualifications:
- Bachelor's degree in IT or Information Systems.
- Proficiency in GitHub and GitHub Actions, as well as Terraform for infrastructure management.
- Experience with AWS Glue, Python, and PySpark for data processing.
- Familiarity with monitoring and visualization tools such as Grafana and Prometheus.
- A solid understanding of object-oriented programming and SOLID principles is a plus.

Desired Skills:
- Strong analytical and problem-solving abilities, particularly in an operational context.
- Able to communicate technical concepts effectively to non-technical stakeholders.
- A proactive team player with a hands-on mindset, ready to tackle operational challenges.
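
To make the first responsibility concrete, here is a minimal, illustrative PySpark sketch of a streaming ingestion job that reads events from Kafka and lands them as Parquet. The broker address, topic name, event schema, and output paths are hypothetical placeholders, not details from this posting.

```python
# Illustrative only: broker, topic, schema, and paths are assumed placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StringType, StructField, StructType, TimestampType

spark = SparkSession.builder.appName("orders-ingestion").getOrCreate()

# Schema of the incoming JSON events (illustrative).
schema = StructType([
    StructField("order_id", StringType()),
    StructField("status", StringType()),
    StructField("updated_at", TimestampType()),
])

# Read the Kafka topic as a streaming source.
raw = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
    .option("subscribe", "orders")                     # placeholder topic
    .load()
)

# Kafka delivers bytes; decode the value column and parse the JSON payload.
events = (
    raw.select(from_json(col("value").cast("string"), schema).alias("e"))
    .select("e.*")
)

# Land the parsed events as Parquet, with checkpointing for fault tolerance.
query = (
    events.writeStream.format("parquet")
    .option("path", "s3a://example-bucket/orders/")              # placeholder
    .option("checkpointLocation", "s3a://example-bucket/_chk/")  # placeholder
    .start()
)
query.awaitTermination()
```

A batch variant would swap readStream/writeStream for read/write; the Airflow orchestration mentioned above would then schedule that batch job.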

Posted 5 hours ago

6.0 - 9.0 years

5 - 8 Lacs

Hyderabad, Telangana, India

On-site

Roles & Responsibilities:
- Collaborate with the QA Manager to design and implement end-to-end test strategies for data validation, semantic layer testing, and GraphQL API validation.
- Perform manual validation of data pipelines, including source-to-target data mapping, transformation logic, and business rule verification.
- Develop and maintain automated data validation scripts using Python and PySpark for both real-time and batch pipelines (see the sketch after this listing).
- Contribute to the design and enhancement of reusable automation frameworks, with components for schema validation, data reconciliation, and anomaly detection.
- Validate semantic layers (e.g., Looker, dbt models) and GraphQL APIs, ensuring data consistency, compliance with contracts, and alignment with business expectations.
- Write and manage test plans, test cases, and test data for structured, semi-structured, and unstructured data.
- Track, manage, and report defects using tools like JIRA, ensuring thorough root cause analysis and timely resolution.
- Collaborate with Data Engineers, Product Managers, and DevOps teams to integrate tests into CI/CD pipelines and enable shift-left testing practices.
- Ensure comprehensive test coverage across the data lifecycle, including ingestion, transformation, delivery, and consumption.
- Participate in QA ceremonies (standups, planning, retrospectives) and continuously contribute to improving the QA process and culture.

Good to Have:
- Experience building or maintaining test data generators.
- Contributions to internal quality dashboards or data observability systems.
- Awareness of metadata-driven testing approaches and lineage-based validations.
- Experience with agile testing methodologies such as Scaled Agile.
- Familiarity with automated testing frameworks like Selenium, JUnit, TestNG, or PyTest.

Must-Have Skills:
- 6-9 years of experience in QA roles, with at least 3 years of strong exposure to data pipeline testing and ETL validation.
- Strong in SQL, Python, and optionally PySpark; comfortable writing complex queries and validation scripts.
- Practical experience with manual validation of data pipelines and source-to-target testing.
- Experience validating GraphQL APIs, semantic layers (Looker, dbt, etc.), and schema/data contract compliance.
- Familiarity with data integration tools and platforms such as Databricks, AWS Glue, Redshift, Athena, or BigQuery.
- Strong understanding of QA methodologies, test planning, test case design, defect tracking, bug lifecycle management, and QA documentation.
- Experience working in Agile/Scrum environments with standard QA processes.
- Knowledge of test case and defect management tools (e.g., JIRA, TestRail, Zephyr).
- Deep hands-on expertise in SQL, Python, and PySpark for testing and automating validation.
- Proven experience in manual and automated testing of batch and real-time data pipelines.
- Familiarity with data processing and analytics stacks: Databricks, Spark, AWS (Glue, S3, Athena, Redshift).
- Ability to troubleshoot data issues independently and collaborate with engineering on root cause analysis.
- Experience integrating automated tests into CI/CD pipelines (e.g., Jenkins, GitHub Actions).
- Experience validating data in formats such as JSON, CSV, Parquet, and Avro.
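
As an illustration of the automated validation work described above, here is a minimal PyTest/PySpark sketch covering row-count reconciliation, a business-rule check, and a schema-contract check. The table and column names are hypothetical placeholders; real checks would come from the source-to-target mapping documents.

```python
# Illustrative only: table and column names are assumed placeholders.
import pytest
from pyspark.sql import SparkSession


@pytest.fixture(scope="session")
def spark():
    return SparkSession.builder.appName("pipeline-validation").getOrCreate()


def test_row_counts_match(spark):
    # Source-to-target reconciliation: no rows dropped or duplicated.
    src = spark.table("staging.orders")     # placeholder source table
    tgt = spark.table("analytics.orders")   # placeholder target table
    assert src.count() == tgt.count()


def test_no_null_keys(spark):
    # Business-rule check: the key must never be null after transformation.
    tgt = spark.table("analytics.orders")
    assert tgt.filter(tgt.order_id.isNull()).count() == 0


def test_schema_contract(spark):
    # Schema/data-contract check: target exposes the agreed columns.
    expected = {"order_id", "status", "updated_at"}
    assert expected.issubset(set(spark.table("analytics.orders").columns))
```

Scripts like these are what typically get wired into the CI/CD pipelines (Jenkins, GitHub Actions) mentioned above to enable shift-left testing.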

Posted 1 week ago

2.0 - 7.0 years

40 - 45 Lacs

Chandigarh, Bengaluru, Remote

Work from Office

As the Data Engineer, you will play a pivotal role in shaping our data infrastructure and executing our strategy. You will ideate alongside engineering, data, and our clients to deploy data products with an innovative and meaningful impact for clients. You will design, build, and maintain scalable data pipelines and workflows on AWS. Additionally, your expertise in AI and machine learning will enhance our ability to deliver smarter, more predictive solutions.

Key Responsibilities:
- Collaborate with other engineers and customers to brainstorm and develop impactful data products tailored to our clients.
- Leverage AI and machine learning techniques to integrate intelligent features into our offerings.
- Develop and optimize end-to-end data pipelines on AWS (see the sketch after this listing).
- Follow best practices in software architecture and development.
- Implement effective cost management and performance optimization strategies.
- Develop and maintain systems using Python, SQL, PySpark, and Django for front-end development.
- Work directly with clients and end users to address their data needs.
- Utilize databases and tools including, but not limited to, Postgres, Redshift, Airflow, and MongoDB to support our data ecosystem.
- Leverage AI frameworks and libraries to integrate advanced analytics into our solutions.

Qualifications:
- Minimum of 3 years of experience in data engineering, software development, or related roles.
- Proven track record in designing and deploying AWS cloud infrastructure solutions.
- At least 2 years in data analysis and mining techniques to aid descriptive and diagnostic insights.
- Extensive hands-on experience with Postgres, Redshift, Airflow, MongoDB, and real-time data workflows.

Technical Skills:
- Expertise in Python, SQL, and PySpark.
- Strong background in software architecture and scalable development practices.
- Experience with Tableau, Metabase, or similar visualization tools.
- Working knowledge of AI frameworks and libraries is a plus.

Leadership & Communication:
- Demonstrates ownership and accountability for delivery, with a strong commitment to quality.
- Excellent communication skills with a history of effective client and end-user engagement.

Startup & Fintech Mindset:
- Adaptability and agility to thrive in a fast-paced, early-stage startup environment.
- Passion for fintech innovation and a strong desire to make a meaningful impact on the future of finance.
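
As one illustration of the orchestration side of this role, here is a minimal Airflow DAG sketch that schedules a daily PySpark batch job. The DAG id, schedule, and script path are hypothetical; a real AWS deployment might use an EMR or Glue operator instead of a plain spark-submit.

```python
# Illustrative only: DAG id, schedule, and job path are assumed placeholders.
# Uses the Airflow 2.4+ "schedule" argument (older versions: schedule_interval).
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="daily_orders_pipeline",   # placeholder name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    # Submit the PySpark batch job; an EMR/Glue operator could replace this.
    run_pipeline = BashOperator(
        task_id="run_pyspark_job",
        bash_command="spark-submit /opt/jobs/orders_pipeline.py",  # placeholder
    )
```

Downstream tasks (data quality checks, warehouse loads into Redshift, and so on) would chain off run_pipeline with standard task dependencies.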

Posted 3 months ago
