0.0 - 3.0 years
3 - 5 Lacs
Hyderabad
Work from Office
What you will do
In this vital role, we are seeking an Associate Data Engineer to design, build, and maintain scalable data solutions that drive business insights. You will work with large datasets, cloud platforms (AWS preferred), and big data technologies to develop ETL pipelines, ensure data quality, and support data governance initiatives.
- Develop and maintain data pipelines, ETL/ELT processes, and data integration solutions.
- Design and implement data models, data dictionaries, and documentation for accuracy and consistency.
- Ensure data security, privacy, and governance standard processes.
- Use Databricks, Apache Spark (PySpark, SparkSQL), AWS, and Redshift for scalable data processing.
- Collaborate with cross-functional teams to understand data needs and deliver actionable insights.
- Optimize data pipeline performance and explore new tools for efficiency.
- Follow best practices in coding, testing, and infrastructure-as-code (CI/CD, version control, automated testing).
What we expect of you
We are all different, yet we all use our unique contributions to serve patients.
- Strong problem-solving, critical thinking, and communication skills.
- Ability to collaborate effectively in a team setting.
- Proficiency in SQL, data analysis tools, and data visualization.
- Hands-on experience with big data technologies (Databricks, Apache Spark, AWS, Redshift).
- Experience with ETL tools, workflow orchestration, and performance tuning for big data.
Basic Qualifications: Bachelor's degree and 0 to 3 years of experience, OR diploma and 4 to 7 years of experience, in Computer Science, IT, or a related field.
Preferred Qualifications:
- Knowledge of data modeling, warehousing, and graph databases.
- Experience with Python, SageMaker, and cloud data platforms.
- AWS Certified Data Engineer or Databricks certification preferred.
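To make the pipeline work above concrete, here is a minimal PySpark ETL sketch of the kind this role describes, assuming a Databricks/Delta-enabled environment; the bucket, table, and column names are hypothetical placeholders, not details from the posting.

```python
# Minimal PySpark ETL sketch: ingest raw CSV, clean it, write a Delta table.
# Paths, table names, and columns are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-etl").getOrCreate()

# Extract: read raw files from cloud storage (e.g., S3)
raw = spark.read.option("header", True).csv("s3://raw-bucket/orders/")

# Transform: enforce types, drop duplicates, add a load timestamp
clean = (
    raw.withColumn("order_amount", F.col("order_amount").cast("double"))
       .dropDuplicates(["order_id"])
       .filter(F.col("order_id").isNotNull())
       .withColumn("load_ts", F.current_timestamp())
)

# Load: write to a governed Delta table for downstream analytics
clean.write.format("delta").mode("overwrite").saveAsTable("analytics.orders_clean")
```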
Posted 1 month ago
3.0 - 8.0 years
5 - 10 Lacs
Hyderabad
Work from Office
Role Description:
We are looking for a highly motivated, expert Senior Data Engineer who can own the design and development of complex data pipelines, solutions, and frameworks. The ideal candidate will design, develop, and optimize data pipelines, data integration frameworks, and metadata-driven architectures that enable seamless data access and analytics. This role requires deep expertise in big data processing, distributed computing, data modeling, and governance frameworks to support self-service analytics, AI-driven insights, and enterprise-wide data management.
Roles & Responsibilities:
- Design, develop, and maintain scalable ETL/ELT pipelines to support structured, semi-structured, and unstructured data processing across the Enterprise Data Fabric.
- Implement real-time and batch data processing solutions, integrating data from multiple sources into a unified, governed data fabric architecture.
- Optimize big data processing frameworks using Apache Spark, Hadoop, or similar distributed computing technologies to ensure high availability and cost efficiency.
- Work with metadata management and data lineage tracking tools to enable enterprise-wide data discovery and governance.
- Ensure data security, compliance, and role-based access control (RBAC) across data environments.
- Optimize query performance, indexing strategies, partitioning, and caching for large-scale datasets (see the sketch at the end of this posting).
- Develop CI/CD pipelines for automated data pipeline deployments, version control, and monitoring.
- Implement data virtualization techniques to provide seamless access to data across multiple storage systems.
- Collaborate with cross-functional teams, including data architects, business analysts, and DevOps teams, to align data engineering strategies with enterprise goals.
- Stay up to date with emerging data technologies and best practices, ensuring continuous improvement of Enterprise Data Fabric architectures.
Must-Have Skills:
- Hands-on experience with data engineering technologies such as Databricks, PySpark, SparkSQL, Apache Spark, AWS, Python, SQL, and Scaled Agile methodologies.
- Proficiency in workflow orchestration and performance tuning for big data processing.
- Strong understanding of AWS services.
- Experience with Data Fabric, Data Mesh, or similar enterprise-wide data architectures.
- Ability to quickly learn, adapt, and apply new technologies.
- Strong problem-solving and analytical skills.
- Excellent communication and teamwork skills.
- Experience with Scaled Agile Framework (SAFe), Agile delivery practices, and DevOps practices.
Good-to-Have Skills:
- Deep expertise in the biotech and pharma industries.
- Experience writing APIs to make data available to consumers.
- Experience with SQL/NoSQL databases and vector databases for large language models.
- Experience with data modeling and performance tuning for both OLAP and OLTP databases.
- Experience with software engineering best practices, including version control (Git, Subversion, etc.), CI/CD (Jenkins, Maven, etc.), automated unit testing, and DevOps.
Education and Professional Certifications:
- Master's degree and 3 to 4+ years of Computer Science, IT, or related field experience, OR Bachelor's degree and 5 to 8+ years of Computer Science, IT, or related field experience.
- AWS Certified Data Engineer preferred.
- Databricks certification preferred.
- Scaled Agile SAFe certification preferred.
Soft Skills:
- Excellent analytical and troubleshooting skills.
- Strong verbal and written communication skills.
- Ability to work effectively with global, virtual teams.
- High degree of initiative and self-motivation.
- Ability to manage multiple priorities successfully.
- Team-oriented, with a focus on achieving team goals.
- Ability to learn quickly; organized and detail-oriented.
- Strong presentation and public speaking skills.
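As an illustration of the partitioning and caching optimizations listed above, here is a hedged PySpark sketch; the table names, column names, and partition count are assumptions for illustration only.

```python
# Sketch: partitioning and caching to speed up repeated large-scale queries.
# Table and column names are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("perf-tuning").getOrCreate()

events = spark.read.table("raw.events")

# Repartition by a high-cardinality join key to reduce shuffle skew,
# then cache because several downstream aggregations reuse the same data.
events = events.repartition(200, "customer_id").cache()
events.count()  # materialize the cache

# Write partitioned by date so queries filtering on event_date prune files.
(events.write.format("delta")
       .partitionBy("event_date")
       .mode("overwrite")
       .saveAsTable("curated.events"))
```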
Posted 1 month ago
0.0 - 2.0 years
2 - 4 Lacs
Hyderabad
Work from Office
Role Description:
We are looking for an Associate Data Engineer with deep expertise in writing data pipelines to build scalable, high-performance data solutions. The ideal candidate will be responsible for developing, optimizing, and maintaining complex data pipelines, integration frameworks, and metadata-driven architectures that enable seamless data access and analytics. This role requires a deep understanding of big data processing, distributed computing, data modeling, and governance frameworks to support self-service analytics, AI-driven insights, and enterprise-wide data management.
Roles & Responsibilities:
- Own the development of complex ETL/ELT data pipelines to process large-scale datasets.
- Contribute to the design, development, and implementation of data pipelines, ETL/ELT processes, and data integration solutions.
- Ensure data integrity, accuracy, and consistency through rigorous quality checks and monitoring.
- Explore and implement new tools and technologies to enhance the ETL platform and pipeline performance.
- Proactively identify and implement opportunities to automate tasks and develop reusable frameworks.
- Understand the biotech/pharma domain and build highly efficient data pipelines to migrate and deploy complex data across systems.
- Work in an Agile and Scaled Agile (SAFe) environment, collaborating with cross-functional teams, product owners, and Scrum Masters to deliver incremental value.
- Use JIRA, Confluence, and Agile DevOps tools to manage sprints, backlogs, and user stories.
- Support continuous improvement, test automation, and DevOps practices in the data engineering lifecycle.
- Collaborate and communicate effectively with product teams and cross-functional teams to understand business requirements and translate them into technical solutions.
Must-Have Skills:
- Experience in data engineering with a focus on Databricks, AWS, Python, SQL, and Scaled Agile methodologies.
- Strong understanding of data processing and transformation with big data frameworks (Databricks, Apache Spark, Delta Lake, and distributed computing concepts).
- Strong, demonstrable understanding of AWS services.
- Ability to quickly learn, adapt, and apply new technologies.
- Strong problem-solving and analytical skills.
- Excellent communication and teamwork skills.
- Experience with Scaled Agile Framework (SAFe), Agile delivery, and DevOps practices.
Good-to-Have Skills:
- Data engineering experience in the biotechnology or pharma industry.
- Exposure to APIs and full-stack development.
- Experience with SQL/NoSQL databases and vector databases for large language models.
- Experience with data modeling and performance tuning for both OLAP and OLTP databases.
- Experience with software engineering best practices, including version control (Git, Subversion, etc.), CI/CD (Jenkins, Maven, etc.), automated unit testing, and DevOps.
Education and Professional Certifications:
- Bachelor's degree and 2 to 5+ years of Computer Science, IT, or related field experience, OR Master's degree and 1 to 4+ years of Computer Science, IT, or related field experience.
- AWS Certified Data Engineer preferred.
- Databricks certification preferred.
- Scaled Agile SAFe certification preferred.
Soft Skills:
- Excellent analytical and troubleshooting skills.
- Strong verbal and written communication skills.
- Ability to work effectively with global, virtual teams.
- High degree of initiative and self-motivation.
- Ability to manage multiple priorities successfully.
- Team-oriented, with a focus on achieving team goals.
- Ability to learn quickly; organized and detail-oriented.
- Strong presentation and public speaking skills.
Posted 1 month ago
9.0 - 14.0 years
11 - 16 Lacs
Hyderabad
Work from Office
Role Description:
We are seeking a seasoned Solution Architect to drive the architecture, development, and implementation of data solutions for Amgen functional groups. The ideal candidate is able to work on large-scale data analytics initiatives and engage with Business, Program Management, Data Engineering, and Analytics Engineering teams, championing the enterprise data analytics strategy, data architecture blueprints, and architectural guidelines. As a Solution Architect, you will play a crucial role in designing, building, and optimizing data solutions for Amgen functional groups such as R&D, Operations, and GCO.
Roles & Responsibilities:
- Implement and manage large-scale data analytics solutions for Amgen functional groups that align with the Amgen data strategy.
- Collaborate with Business, Program Management, Data Engineering, and Analytics Engineering teams to deliver data solutions.
- Own the design, development, optimization, delivery, and support of data solutions on AWS and Databricks architecture.
- Leverage cloud platforms (AWS preferred) to build scalable and efficient data solutions.
- Provide expert guidance and mentorship to team members, fostering a culture of innovation and best practices.
- Be passionate and hands-on; quickly experiment with new data-related technologies.
- Define guidelines, standards, strategies, security policies, and change management policies to support the Enterprise Data Platform.
- Collaborate and align with EARB, Cloud Infrastructure, Security, and other technology leaders on Enterprise Data Architecture changes.
- Work with different project and application groups to drive growth of the Enterprise Data Platform using effective written/verbal communication, and lead demos at roadmap sessions.
- Manage the Enterprise Data Platform on the AWS environment to ensure service delivery is cost-effective and business SLAs around uptime, performance, and capacity are met.
- Ensure scalability, reliability, and performance of data platforms by implementing best practices for architecture, cloud resource optimization, and system tuning.
- Collaborate with RunOps engineers to continuously increase our ability to push changes into production with as little manual overhead and as much speed as possible.
- Maintain knowledge of market trends and developments in data integration, data management, and analytics software/tools.
- Work as part of a team in a SAFe Agile/Scrum model.
Basic Qualifications and Experience:
- Master's degree with 6-8 years of experience in Computer Science, IT, or a related field, OR Bachelor's degree with 9-12 years of experience in Computer Science, IT, or a related field.
Functional Skills:
Must-Have Skills:
- 7+ years of hands-on experience in data integration, data management, and the BI technology stack.
- Strong experience with one or more data management tools such as AWS data lake, Snowflake, or Azure Data Fabric.
- Expert-level proficiency with Databricks and experience optimizing data pipelines and workflows in Databricks environments.
- Strong experience with Python, PySpark, and SQL for building scalable data workflows and pipelines.
- Experience with Apache Spark, Delta Lake, and other relevant technologies for large-scale data processing.
- Familiarity with BI tools, including Tableau and Power BI.
- Demonstrated ability to enhance cost-efficiency, scalability, and performance of data solutions.
- Strong analytical and problem-solving skills to address complex data solutions.
Good-to-Have Skills:
- Experience in life sciences, tech, or consultative solution architecture roles.
- Experience working with agile development methodologies such as Scaled Agile.
Professional Certifications:
- AWS Certified Data Engineer preferred.
- Databricks certification preferred.
Soft Skills:
- Excellent analytical and troubleshooting skills.
- Strong verbal and written communication skills.
- Ability to work effectively with global, virtual teams.
- High degree of initiative and self-motivation.
- Ability to manage multiple priorities successfully.
- Team-oriented, with a focus on achieving team goals.
- Strong presentation and public speaking skills.
Posted 1 month ago
5.0 - 8.0 years
7 - 11 Lacs
Bengaluru
Work from Office
Role Purpose:
The purpose of the role is to support process delivery by ensuring the daily performance of the Production Specialists, resolving technical escalations, and developing technical capability within the Production Specialists.
Do:
- Oversee and support the process by reviewing daily transactions on performance parameters.
- Review the performance dashboard and the scores for the team.
- Support the team in improving performance parameters by providing technical support and process guidance.
- Record, track, and document all queries received, problem-solving steps taken, and total successful and unsuccessful resolutions.
- Ensure standard processes and procedures are followed to resolve all client queries.
- Resolve client queries as per the SLAs defined in the contract.
- Develop an understanding of the process/product for the team members to facilitate better client interaction and troubleshooting.
- Document and analyze call logs to spot the most frequent trends and prevent future problems.
- Identify red flags and escalate serious client issues to the team leader in cases of untimely resolution.
- Ensure all product information and disclosures are given to clients before and after the call/email requests.
- Avoid legal challenges by monitoring compliance with service agreements.
Handle technical escalations through effective diagnosis and troubleshooting of client queries:
- Manage and resolve technical roadblocks/escalations as per SLA and quality requirements.
- If unable to resolve an issue, escalate it to TA & SES in a timely manner.
- Provide product support and resolution to clients by performing question diagnosis while guiding users through step-by-step solutions.
- Troubleshoot all client queries in a user-friendly, courteous, and professional manner.
- Offer alternative solutions to clients (where appropriate) with the objective of retaining customers' and clients' business.
- Organize ideas and effectively communicate oral messages appropriate to listeners and situations.
- Follow up and make scheduled callbacks to customers to record feedback and ensure compliance with contract SLAs.
Build people capability to ensure operational excellence and maintain superior customer service levels for the existing account/client:
- Mentor and guide Production Specialists on improving technical knowledge.
- Collate trainings to be conducted as triage to bridge the skill gaps identified through interviews with the Production Specialists.
- Develop and conduct trainings (triages) within products for Production Specialists as per target.
- Inform the client about the triages being conducted.
- Undertake product trainings to stay current with product features, changes, and updates.
- Enroll in product-specific and any other trainings per client requirements/recommendations.
- Identify and document the most common problems and recommend appropriate resolutions to the team.
- Update job knowledge by participating in self-learning opportunities and maintaining personal networks.
Deliver:
No. | Performance Parameter | Measure
1 | Process | No. of cases resolved per day; compliance to process and quality standards; meeting process-level SLAs; Pulse score; customer feedback; NSAT/ESAT
2 | Team Management | Productivity, efficiency, absenteeism
3 | Capability Development | Triages completed; Technical Test performance
Mandatory Skills: Apache Spark.
Posted 1 month ago
10.0 - 15.0 years
12 - 18 Lacs
Maharashtra
Work from Office
Staff Software Engineers are the technology leaders of our highest-impact projects. Your high energy is contagious, you actively collaborate with others across the engineering organization, and you seek to learn as much as you like to teach. You personify the notion of constant improvement as you work with your team and the larger engineering group to build software that delivers on our mission. You use your extraordinary technical competence to ensure a high bar for excellence while you mentor other engineers on their own path toward craftsmanship. You are most likely T-shaped, with broad knowledge across many technologies plus strong skills in a specific area. Staff Software Engineers embrace the opportunity to represent HMH in industry groups and open-source communities.
Area of Responsibility:
You will be working on the HMH Assessment Platform, part of the HMH Educational Online/Digital Learning Platform. The Assessment team builds a highly scalable and available platform. The platform is built using a microservices architecture: Java microservices on the backend, a React JavaScript UI frontend, REST APIs, a Postgres database, AWS cloud technologies, AWS Kafka, Kubernetes or Mesos orchestration, DataDog for logging/monitoring/alerting, Concourse CI or Jenkins, Maven, etc.
Responsibilities:
- Be the technical lead for feature development in a team of 5-10 engineers, influencing the technical direction of the overall engineering organization.
- Decompose business objectives into valuable, incrementally releasable user features, accurately estimating the effort to complete each.
- Contribute code to feature development efforts, demonstrating efficient design, delivery, and testing patterns and techniques.
- Strive for high-quality outcomes; continuously look for ways to improve team productivity and product reliability, performance, and security.
- Develop the talents and abilities of peers and colleagues.
- Create a memorable legacy as you progress toward your personal and professional objectives.
- Foster your personal and professional development, continually seeking assignments that challenge you.
Skills & Experience:
Successful candidates must demonstrate an appropriate combination of:
- 10+ years of experience as a software engineer.
- 3+ years of experience as a staff or lead software engineer.
- Bachelor's degree in computer science or a STEM field.
- A portfolio of thought leadership and individual technical accomplishments.
- Full understanding of Agile software development methodologies and practices.
- Strong communication skills, both verbal and written.
- Extensive experience working with technologies and concepts such as:
  - Behavior-driven or test-driven development
  - JVM-based languages such as Java and Scala
  - Development frameworks such as Spring Boot
  - Asynchronous programming concepts, including event processing
  - Database technologies such as SQL, Postgres/MySQL, AWS Aurora DBs, Redshift, Liquibase or Flyway
  - NoSQL technologies such as Redis, MongoDB, and Cassandra
  - Streaming technologies such as Apache Kafka, Apache Spark, or Amazon Kinesis
  - Unit-testing frameworks such as JUnit
  - Performance-testing frameworks such as Gatling
  - Architectural concepts such as microservices and separation of concerns
  - Expert knowledge of class-based, object-oriented programming and design patterns
  - Development tools such as GitHub, Jira, Jenkins, Concourse, and Maven
  - Cloud technologies such as AWS and Azure
  - Data center operating technologies such as Kubernetes, Apache Mesos, Apache Aurora, and Terraform, and container services such as Docker and Kubernetes
  - Monitoring and operational data analysis practices and tools such as DataDog, Splunk, and ELK
Posted 1 month ago
6.0 - 11.0 years
22 - 30 Lacs
Hyderabad
Work from Office
Qualifications - External
Required Qualifications:
- Bachelor's degree in Computer Information Systems or another technology-related field.
- 14+ years of software application development and documentation experience, with a focus on quality, performance, scalability, and resilience, using front-end technologies and databases.
- 8+ years of experience in Apache Spark and Azure with the .NET/Java framework and related technologies such as ASP.NET, C#, VB.NET, .NET Core, Angular/React, Java, J2EE, Spring Boot, microservices, etc. (preferred).
- 8+ years of experience with a solid understanding of object-oriented programming (OOP) principles and design patterns.
- 8+ years of experience with software technologies commonly used in .NET/Java development, such as SQL Server, MySQL, Spring Boot, and microservices.
- 6+ years of knowledge of web development technologies like HTML, CSS, JavaScript, and front-end frameworks.
- 4+ years of experience in CI/CD, using TFS, Azure DevOps, or Git tools for data solution management and delivery.
- 2+ years of familiarity with cloud (preferably Azure) services and solutions, including infrastructure as a service (IaaS), platform as a service (PaaS), and software as a service (SaaS).
- Experience in software products engineering using the .NET framework and related technologies such as ASP.NET, C#, VB.NET, .NET Core, etc.
- Solid understanding of agile software development methodology (Scrum) and industry best practices.
Posted 1 month ago
6.0 - 10.0 years
8 - 12 Lacs
Pune, Gurugram, Bengaluru
Work from Office
Contractual hiring. Hiring manager profile: linkedin.com/in/yashsharma1608. Payroll of: https://www.nyxtech.in/
1. AZURE DATA ENGINEER WITH FABRIC
The Role: Lead Data Engineer (payroll client: Brillio)
About the Role:
- Experience: 6 to 8 years
- Location: Bangalore, Hyderabad, Pune, Chennai, Gurgaon (Hyderabad preferred)
- Notice: 15 / 30 days
- Budget: 15 LPA
- Azure Fabric experience is mandatory
Skills: Azure OneLake, data pipelines, Apache Spark, ETL, Data Factory, Azure Fabric, SQL, Python/Scala.
Key Responsibilities:
- Data Pipeline Development: Lead the design, development, and deployment of data pipelines using Azure OneLake, Azure Data Factory, and Apache Spark, ensuring efficient, scalable, and secure data movement across systems.
- ETL Architecture: Architect and implement ETL (extract, transform, load) workflows, optimizing the process for data ingestion, transformation, and storage in the cloud.
- Data Integration: Build and manage data integration solutions that connect multiple data sources (structured and unstructured) into a cohesive data ecosystem. Use SQL, Python, Scala, and R to manipulate and process large datasets.
- Azure OneLake Expertise: Leverage Azure OneLake and Azure Synapse Analytics to design and implement scalable data storage and analytics solutions that support big data processing and analysis.
- Collaboration with Teams: Work closely with data scientists, data analysts, and BI engineers to ensure the data infrastructure supports analytical needs and is optimized for performance and accuracy.
- Performance Optimization: Monitor, troubleshoot, and optimize data pipeline performance to ensure high availability, fast processing, and minimal downtime.
- Data Governance & Security: Implement best practices for data governance, data security, and compliance within the Azure ecosystem, ensuring data privacy and protection.
- Leadership & Mentorship: Lead and mentor a team of data engineers, promoting a collaborative and high-performance team culture. Oversee code reviews, design decisions, and the implementation of new technologies.
- Automation & Monitoring: Automate data engineering workflows, job scheduling, and monitoring to ensure smooth operations. Use tools like Azure DevOps, Airflow, and other relevant platforms for automation and orchestration.
- Documentation & Best Practices: Document data pipeline architecture, data models, and ETL processes, and contribute to the establishment of engineering best practices, standards, and guidelines.
- Innovation: Stay current with industry trends and emerging technologies in data engineering, cloud computing, and big data analytics, driving innovation within the team.
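For illustration, here is a minimal sketch of landing a Spark DataFrame into a Fabric OneLake lakehouse of the kind this role works with; the abfss URI format, workspace name, lakehouse name, and source path are assumptions, not details from the posting.

```python
# Sketch: writing a Spark DataFrame to a Microsoft Fabric OneLake lakehouse.
# The OneLake path format and all names below are assumed for illustration.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("onelake-load").getOrCreate()

# Hypothetical source: raw sales files in an Azure storage landing zone
df = spark.read.option("header", True).csv(
    "abfss://landing@storageacct.dfs.core.windows.net/sales/"
)

# OneLake exposes lakehouse storage through an abfss-style path (assumed format)
onelake_path = (
    "abfss://MyWorkspace@onelake.dfs.fabric.microsoft.com/"
    "SalesLakehouse.Lakehouse/Tables/sales"
)
df.write.format("delta").mode("append").save(onelake_path)
```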
Posted 1 month ago
8.0 - 10.0 years
10 - 12 Lacs
Hyderabad
Work from Office
ABOUT THE ROLE
Role Description:
We are seeking a highly skilled and experienced, hands-on Test Automation Engineering Manager with deep expertise in Data Quality (DQ), Data Integration (DIF), and Data Governance. In this role, you will design and implement automated frameworks that ensure data accuracy, metadata consistency, and compliance throughout the data pipeline, leveraging technologies like Databricks, AWS, and cloud-native tools. You will have a major focus on data cataloging and governance, ensuring that data assets are well-documented, auditable, and secure across the enterprise.
In this role, you will be responsible for the end-to-end design and development of a test automation framework, working collaboratively with the team. As the delivery owner for test automation, your primary focus will be on building and automating comprehensive validation frameworks for data cataloging, data classification, and metadata tracking, while ensuring alignment with internal governance standards. You will also work closely with data engineers, product teams, and data governance leads to enforce data quality and governance policies. Your efforts will play a key role in driving data integrity, consistency, and trust across the organization. The role is highly technical and hands-on, with a strong focus on automation, metadata validation, and ensuring data governance practices are seamlessly integrated into development pipelines.
Roles & Responsibilities:
Data Quality & Integration Frameworks
- Design and implement Data Quality (DQ) frameworks that validate schema compliance, transformations, completeness, null checks, duplicates, threshold rules, and referential integrity (see the sketch after this section).
- Build Data Integration Frameworks (DIF) that validate end-to-end data pipelines across ingestion, processing, storage, and consumption layers.
- Automate data validations in Databricks/Spark pipelines, integrated with AWS services like S3, Glue, Athena, and Lake Formation.
- Develop modular, reusable validation components using PySpark, SQL, and Python, with orchestration via CI/CD pipelines.
Data Cataloging & Governance
- Integrate automated validations with the AWS Glue Data Catalog to ensure metadata consistency, schema versioning, and lineage tracking.
- Implement checks to verify that data assets are properly cataloged, discoverable, and compliant with internal governance standards.
- Validate and enforce data classification, tagging, and access controls, ensuring alignment with data governance frameworks (e.g., PII/PHI tagging, role-based access policies).
- Collaborate with governance teams to automate policy enforcement and compliance checks for audit and regulatory needs.
Visualization & UI Testing
- Automate validation of data visualizations in tools like Tableau, Power BI, Looker, or custom React dashboards.
- Ensure charts, KPIs, filters, and dynamic views correctly reflect backend data using UI automation (Selenium with Python) and backend validation logic.
- Conduct API testing (via Postman or Python test suites) to ensure accurate data delivery to visualization layers.
Technical Skills and Tools
- Hands-on experience with data automation tools like Databricks and AWS is essential, as the manager will be instrumental in building and managing data pipelines.
- Leverage automated testing frameworks and containerization tools to streamline processes and improve efficiency.
- Experience in UI and API functional validation using tools such as Selenium with Python and Postman, ensuring comprehensive testing coverage.
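To ground the DQ framework responsibilities above, here is a minimal PySpark sketch of reusable null, duplicate, and threshold checks; the table name, key column, and threshold value are hypothetical.

```python
# Sketch: reusable PySpark data-quality checks (nulls, duplicates, thresholds).
# Table and column names are hypothetical placeholders.
from pyspark.sql import SparkSession, DataFrame
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("dq-checks").getOrCreate()

def dq_report(df: DataFrame, key: str, amount_col: str, max_amount: float) -> dict:
    """Return counts for the common DQ rules named in the role description."""
    total = df.count()
    return {
        "row_count": total,
        "null_keys": df.filter(F.col(key).isNull()).count(),
        "duplicate_keys": total - df.dropDuplicates([key]).count(),
        "threshold_violations": df.filter(F.col(amount_col) > max_amount).count(),
    }

orders = spark.read.table("curated.orders")
report = dq_report(orders, key="order_id", amount_col="amount", max_amount=1e6)

# Fail the pipeline run if any hard rule is violated
assert report["null_keys"] == 0 and report["duplicate_keys"] == 0, report
```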
Technical Leadership, Strategy & Team Collaboration
- Define and drive the overall QA and testing strategy for UI and search-related components, with a focus on scalability, reliability, and performance, while establishing alerting and reporting mechanisms for test failures, data anomalies, and governance violations.
- Contribute to system architecture and design discussions, bringing a strong quality and testability lens early into the development lifecycle.
- Lead test automation initiatives by implementing best practices and scalable frameworks, embedding test suites into CI/CD pipelines to enable automated, continuous validation of data workflows, catalog changes, and visualization updates.
- Mentor and guide QA engineers, fostering a collaborative, growth-oriented culture focused on continuous learning and technical excellence.
- Collaborate cross-functionally with product managers, developers, and DevOps to align quality efforts with business goals and release timelines.
- Conduct code reviews, test plan reviews, and pair-testing sessions to ensure team-level consistency and high-quality standards.
Good-to-Have Skills:
- Experience with data governance tools such as Apache Atlas, Collibra, or Alation.
- Understanding of DataOps methodologies and practices.
- Familiarity with monitoring/observability tools such as Datadog, Prometheus, or CloudWatch.
- Experience building or maintaining test data generators.
- Contributions to internal quality dashboards or data observability systems.
- Awareness of metadata-driven testing approaches and lineage-based validations.
- Experience working with agile testing methodologies such as Scaled Agile.
- Familiarity with automated testing frameworks like Selenium, JUnit, TestNG, or PyTest.
Must-Have Skills:
- Strong hands-on experience with Data Quality (DQ) framework design and automation.
- Expertise in PySpark, Python, and SQL for data validations.
- Solid understanding of ETL/ELT pipeline testing in Databricks or Apache Spark environments.
- Experience validating structured and semi-structured data formats (e.g., Parquet, JSON, Avro).
- Deep familiarity with AWS data services: S3, Glue, Athena, Lake Formation, Data Catalog.
- Integration of test automation with the AWS Glue Data Catalog or similar catalog tools.
- UI automation using Selenium with Python for dashboard and web interface validation.
- API testing using Postman, Python, or custom API test scripts.
- Hands-on testing of BI tools such as Tableau, Power BI, Looker, or custom visualization layers.
- CI/CD test integration with tools like Jenkins, GitHub Actions, or GitLab CI.
- Familiarity with containerized environments (e.g., Docker, AWS ECS/EKS).
- Knowledge of data classification, access control validation, and PII/PHI tagging.
- Understanding of data governance standards (e.g., GDPR, HIPAA, CCPA).
- Understanding of data structures: knowledge of various data structures and their applications, and the ability to analyze data and identify inconsistencies.
- Proven hands-on experience in test automation and data automation using Databricks and AWS.
- Strong knowledge of Data Integrity Framework (DIF) and Data Quality (DQ) principles.
- Familiarity with automated testing frameworks like Selenium, JUnit, TestNG, or PyTest.
- Strong understanding of data transformation techniques and logic.
Education and Professional Certifications:
- Bachelor's degree in computer science and engineering preferred; other engineering fields considered. Master's degree and 6+ years of experience, OR Bachelor's degree and 8+ years of experience.
Soft Skills:
- Excellent analytical and troubleshooting skills.
- Strong verbal and written communication skills.
- Ability to work effectively with global, virtual teams.
- High degree of initiative and self-motivation.
- Ability to manage multiple priorities successfully.
- Team-oriented, with a focus on achieving team goals.
- Strong presentation and public speaking skills.
Posted 1 month ago
12.0 - 17.0 years
14 - 19 Lacs
Hyderabad
Work from Office
We are seeking a highly skilled, hands-on, and technically proficient Test Automation Engineering Manager with strong experience in data quality, data integration, and a specific focus on semantic layer validation. This role combines technical ownership of automated data testing solutions with team leadership responsibilities, ensuring that the data infrastructure across platforms remains accurate, reliable, and high-performing. As a leader in the QA and data engineering space, you will be responsible for building robust automated testing frameworks, validating GraphQL-based data layers, and driving the team's technical growth. Your work will ensure that all data flows, transformations, and API interactions meet enterprise-grade quality standards across the data lifecycle.
You will be responsible for the end-to-end design and development of test automation frameworks, working collaboratively with your team. As the delivery owner for test automation, your primary responsibilities will include building and automating comprehensive validation frameworks for semantic layer testing, GraphQL API validation, and schema compliance, ensuring alignment with data quality, performance, and integration reliability standards. You will also work closely with data engineers, product teams, and platform architects to validate data contracts and integration logic, supporting the integrity and trustworthiness of enterprise data solutions. This is a highly technical and hands-on role, with a strong emphasis on automation, data workflow validation, and the seamless integration of testing practices into CI/CD pipelines.
Roles & Responsibilities:
- Design and implement robust data validation frameworks focused on the semantic layer, ensuring accurate data models, schema compliance, and contract adherence across services and platforms.
- Build and automate end-to-end data pipeline validations across ingestion, transformation, and consumption layers using Databricks, Apache Spark, and AWS services such as S3, Glue, Athena, and Lake Formation.
- Lead test automation initiatives by developing scalable, modular test frameworks and embedding them into CI/CD pipelines for continuous validation of semantic models, API integrations, and data workflows.
- Validate GraphQL APIs by testing query/mutation structures, schema compliance, and end-to-end integration accuracy using tools like Postman, Python, and custom test suites (see the sketch below).
- Oversee UI and visualization testing for tools like Tableau, Power BI, and custom front-end dashboards, ensuring consistency with backend data through Selenium with Python and backend validations.
- Define and drive the overall QA strategy with emphasis on performance, reliability, and semantic data accuracy, while setting up alerting and reporting mechanisms for test failures, schema issues, and data contract violations.
- Collaborate closely with product managers, data engineers, developers, and DevOps teams to align quality assurance initiatives with business goals and agile release cycles.
- Actively contribute to architecture and design discussions, ensuring quality and testability are embedded from the earliest stages of development.
- Mentor and manage QA engineers, fostering a collaborative environment focused on technical excellence, knowledge sharing, and continuous professional growth.
Must-Have Skills:
- Team leadership experience.
- Strong experience (6+ years) in DataOps/data testing.
- 7 to 12 years of overall experience in test automation.
- Strong experience in designing and implementing test automation frameworks integrated with CI/CD pipelines.
- Expertise in validating data pipelines at the syntactic layer, including schema checks, null/duplicate handling, and transformation validation.
- Hands-on experience with Databricks, Apache Spark, and AWS services (S3, Glue, Athena, Lake Formation).
- Proficiency in Python, PySpark, and SQL for writing validation scripts and automation logic.
- Solid understanding of GraphQL APIs, including schema validation and query/mutation testing.
- Experience with API testing tools like Postman and Python-based test frameworks.
- Proficiency in UI and visualization testing using Selenium with Python, especially for tools like Tableau, Power BI, or custom dashboards.
- Familiarity with CI/CD tools such as Jenkins, GitHub Actions, or GitLab CI for test orchestration.
- Ability to implement alerting and reporting for test failures, anomalies, and validation issues.
- Strong background in defining QA strategies and leading test automation initiatives in data-centric environments.
- Excellent collaboration and communication skills, with the ability to work closely with cross-functional teams in Agile settings.
- Ability to mentor and manage QA engineers, fostering a collaborative environment focused on technical excellence, knowledge sharing, and continuous professional growth.
Good-to-Have Skills:
- Experience with data governance tools such as Apache Atlas, Collibra, or Alation.
- Understanding of DataOps methodologies and practices.
- Contributions to internal quality dashboards or data observability systems.
- Awareness of metadata-driven testing approaches and lineage-based validations.
- Experience working with agile testing methodologies such as Scaled Agile.
- Familiarity with automated testing frameworks like Selenium, JUnit, TestNG, or PyTest.
Education and Professional Certifications:
- Bachelor's/Master's degree in computer science and engineering preferred.
Soft Skills:
- Excellent analytical and troubleshooting skills.
- Strong verbal and written communication skills.
- Ability to work effectively with global, virtual teams.
- High degree of initiative and self-motivation.
- Ability to manage multiple priorities successfully.
- Team-oriented, with a focus on achieving team goals.
- Strong presentation and public speaking skills.
EQUAL OPPORTUNITY STATEMENT
We provide reasonable accommodations for individuals with disabilities during the application and interview process, for job functions, and in employment benefits. Contact us to request an accommodation.
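As a sketch of the GraphQL validation work described above, here is a minimal Python test using requests; the endpoint, query, field names, and status values are hypothetical assumptions, not part of the posting.

```python
# Sketch: validating a GraphQL query response against an expected contract.
# The endpoint URL, query, and field names are hypothetical.
import requests

GRAPHQL_URL = "https://example.internal/api/graphql"  # assumed endpoint
QUERY = """
query GetStudy($id: ID!) {
  study(id: $id) { id title status }
}
"""

def test_study_query_contract():
    resp = requests.post(
        GRAPHQL_URL,
        json={"query": QUERY, "variables": {"id": "123"}},
        timeout=30,
    )
    assert resp.status_code == 200
    body = resp.json()
    # GraphQL reports errors in-band; fail fast if any are present
    assert "errors" not in body, body.get("errors")
    study = body["data"]["study"]
    # Contract check: required fields exist with the expected types/values
    assert isinstance(study["id"], str)
    assert isinstance(study["title"], str)
    assert study["status"] in {"ACTIVE", "COMPLETED", "DRAFT"}
```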
Posted 1 month ago
4.0 - 6.0 years
6 - 8 Lacs
Hyderabad
Work from Office
ABOUT THE ROLE
Role Description:
We are seeking a highly experienced and hands-on Test Automation Engineering Manager with strong leadership skills and deep expertise in data integration, data quality, and automated data validation across real-time and batch pipelines. In this strategic role, you will lead the design, development, and implementation of scalable test automation frameworks that validate data ingestion, transformation, and delivery from diverse sources into AWS-based analytics platforms, leveraging technologies like Databricks, PySpark, and cloud-native services.
As a lead, you will drive the overall testing strategy, lead a team of test engineers, and collaborate cross-functionally with data engineering, platform, and product teams. Your focus will be on delivering high-confidence, production-grade data pipelines with built-in validation layers that support enterprise analytics, ML models, and reporting platforms. The role is highly technical and hands-on, with a strong focus on automation, metadata validation, and ensuring data governance practices are seamlessly integrated into development pipelines.
Roles & Responsibilities:
- Define and drive the test automation strategy for data pipelines, ensuring alignment with enterprise data platform goals.
- Lead and mentor a team of data QA/test engineers, providing technical direction, career development, and performance feedback.
- Own delivery of automated data validation frameworks across real-time and batch data pipelines using Databricks and AWS services.
- Collaborate with data engineering, platform, and product teams to embed data quality checks and testability into pipeline design.
- Design and implement scalable validation frameworks for data ingestion, transformation, and consumption layers.
- Automate validations for multiple data formats, including JSON, CSV, Parquet, and other structured/semi-structured file types, during ingestion and transformation (see the sketch below).
- Automate data testing workflows for pipelines built on Databricks/Spark, integrated with AWS services like S3, Glue, Athena, and Redshift.
- Establish reusable test components for schema validation, null checks, deduplication, threshold rules, and transformation logic.
- Integrate validation processes with CI/CD pipelines, enabling automated and event-driven testing across the development lifecycle.
- Drive the selection and adoption of tools/frameworks that improve automation, scalability, and test efficiency.
- Oversee testing of data visualizations in Tableau, Power BI, or custom dashboards, ensuring backend accuracy via UI and data-layer validations.
- Ensure accuracy of API-driven data services, managing functional and regression testing via Postman, Python, or other automation tools.
- Track test coverage, quality metrics, and defect trends, providing regular reporting to leadership and ensuring continuous improvement.
- Establish alerting and reporting mechanisms for test failures, data anomalies, and governance violations.
- Contribute to system architecture and design discussions, bringing a strong quality and testability lens early into the development lifecycle.
- Lead test automation initiatives by implementing best practices and scalable frameworks, embedding test suites into CI/CD pipelines to enable automated, continuous validation of data workflows, catalog changes, and visualization updates.
- Mentor and guide QA engineers, fostering a collaborative, growth-oriented culture focused on continuous learning and technical excellence.
- Collaborate cross-functionally with product managers, developers, and DevOps to align quality efforts with business goals and release timelines.
- Conduct code reviews, test plan reviews, and pair-testing sessions to ensure team-level consistency and high-quality standards.
Must-Have Skills:
- Hands-on experience with Databricks and Apache Spark for building and validating scalable data pipelines.
- Strong expertise in AWS services, including S3, Glue, Athena, Redshift, and Lake Formation.
- Proficiency in Python, PySpark, and SQL for developing test automation and validation logic.
- Experience validating data in various file formats such as JSON, CSV, Parquet, and Avro.
- In-depth understanding of data integration workflows, including batch and real-time (streaming) pipelines.
- Strong ability to define and automate data quality checks: schema validation, null checks, duplicates, thresholds, and transformation validation.
- Experience designing modular, reusable automation frameworks for large-scale data validation.
- Skill in integrating tests with CI/CD tools like GitHub Actions, Jenkins, or Azure DevOps.
- Familiarity with orchestration tools such as Apache Airflow, Databricks Jobs, or AWS Step Functions.
- Hands-on experience with API testing using Postman, pytest, or custom automation scripts.
- Proven track record of leading and mentoring QA/test engineering teams.
- Ability to define and own the test automation strategy and roadmap for data platforms.
- Strong collaboration skills to work with engineering, product, and data teams.
- Excellent communication skills for presenting test results, quality metrics, and project health to leadership.
- Contributions to internal quality dashboards or data observability systems.
- Awareness of metadata-driven testing approaches and lineage-based validations.
- Experience working with agile testing methodologies such as Scaled Agile.
- Familiarity with automated testing frameworks like Selenium, JUnit, TestNG, or PyTest.
Good-to-Have Skills:
- Experience with data governance tools such as Apache Atlas, Collibra, or Alation.
- Understanding of DataOps methodologies and practices.
- Familiarity with monitoring/observability tools such as Datadog, Prometheus, or CloudWatch.
- Experience building or maintaining test data generators.
Education and Professional Certifications:
- Bachelor's/Master's degree in computer science and engineering preferred.
Soft Skills:
- Excellent analytical and troubleshooting skills.
- Strong verbal and written communication skills.
- Ability to work effectively with global, virtual teams.
- High degree of initiative and self-motivation.
- Ability to manage multiple priorities successfully.
- Team-oriented, with a focus on achieving team goals.
- Strong presentation and public speaking skills.
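To illustrate the file-format and schema validations above, here is a hedged pytest sketch for Parquet ingestion checks of the kind that might run in a CI pipeline; the S3 paths and the contract schema are assumptions for illustration.

```python
# Sketch: pytest-style schema and null-key validation for ingested Parquet files.
# Paths and the expected schema are hypothetical placeholders.
import pytest
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

EXPECTED = StructType([
    StructField("order_id", StringType(), False),
    StructField("amount", DoubleType(), True),
])

@pytest.fixture(scope="session")
def spark():
    return SparkSession.builder.appName("ingest-tests").getOrCreate()

def test_parquet_schema_matches_contract(spark):
    df = spark.read.parquet("s3://landing/orders/")
    # Compare field names and types against the agreed data contract
    actual = {(f.name, f.dataType.simpleString()) for f in df.schema.fields}
    expected = {(f.name, f.dataType.simpleString()) for f in EXPECTED.fields}
    assert expected <= actual, f"missing/changed fields: {expected - actual}"

def test_no_null_keys(spark):
    df = spark.read.parquet("s3://landing/orders/")
    assert df.filter(df.order_id.isNull()).count() == 0
```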
Posted 1 month ago
1.0 - 3.0 years
3 - 5 Lacs
Hyderabad
Work from Office
What you will do
In this vital role you will be responsible for designing, building, maintaining, analyzing, and interpreting data to provide actionable insights that drive business decisions. This role involves working with large datasets, developing reports, supporting and performing data governance initiatives, and visualizing data to ensure data is accessible, reliable, and efficiently managed. The ideal candidate has deep technical skills, experience with big data technologies, and a deep understanding of data architecture and ETL processes.
Roles & Responsibilities:
- Design, develop, and maintain data solutions for data generation, collection, and processing.
- Be a crucial team member that assists in the design and development of the data pipeline.
- Build data pipelines and ensure data quality by implementing ETL processes to migrate and deploy data across systems.
- Contribute to the design, development, and implementation of data pipelines, ETL/ELT processes, and data integration solutions.
- Take ownership of data pipeline projects from inception to deployment; manage scope, timelines, and risks.
- Collaborate with cross-functional teams to understand data requirements and design solutions that meet business needs.
- Develop and maintain data models, data dictionaries, and other documentation to ensure data accuracy and consistency.
- Implement data security and privacy measures to protect sensitive data.
- Leverage cloud platforms (AWS preferred) to build scalable and efficient data solutions.
- Collaborate and communicate effectively with product teams.
- Collaborate with data architects, business SMEs, and data scientists to design and develop end-to-end data pipelines that meet fast-paced business needs across geographic regions.
- Identify and resolve complex data-related challenges.
- Adhere to best practices for coding, testing, and designing reusable code/components.
- Explore new tools and technologies that will help improve ETL platform performance.
- Participate in sprint planning meetings and provide estimates on technical implementation.
Basic Qualifications:
- Master's degree and 1 to 3 years of Computer Science, IT, or related field experience, OR Bachelor's degree and 3 to 5 years of Computer Science, IT, or related field experience, OR Diploma and 7 to 9 years of Computer Science, IT, or related field experience.
Preferred Qualifications:
Must-Have Skills:
- Hands-on experience with big data technologies and platforms such as Databricks and Apache Spark (PySpark, SparkSQL), plus workflow orchestration and performance tuning on big data processing.
- Proficiency in data analysis tools (e.g., SQL) and experience with data visualization tools.
- Excellent problem-solving skills and the ability to work with large, complex datasets.
- Solid understanding of data governance frameworks, tools, and best practices; knowledge of data protection regulations and compliance requirements.
Good-to-Have Skills:
- Experience with ETL tools such as Apache Spark and various Python packages related to data processing and machine learning model development.
- Good understanding of data modeling, data warehousing, and data integration concepts.
- Knowledge of Python/R, Databricks, SageMaker, and cloud data platforms.
Professional Certifications:
- Certified Data Engineer / Data Analyst (preferably on Databricks or cloud environments).
Soft Skills:
- Excellent critical-thinking and problem-solving skills.
- Good communication and collaboration skills.
- Demonstrated awareness of how to function in a team setting.
- Demonstrated presentation skills.
Posted 1 month ago
9.0 - 12.0 years
0 - 3 Lacs
Hyderabad
Work from Office
About the Role:
Grade Level (for internal use): 11
The Team: Our team is responsible for the design, architecture, and development of our client-facing applications using a variety of tools that are regularly updated as new technologies emerge. You will have the opportunity every day to work with people from a wide variety of backgrounds and will be able to develop a close team dynamic with coworkers from around the globe.
The Impact: The work you do will be used every single day; it's the essential code you'll write that provides the data and analytics required for crucial, daily decisions in the capital and commodities markets.
What's in it for you:
- Build a career with a global company.
- Work on code that fuels the global financial markets.
- Grow and improve your skills by working on enterprise-level products and new technologies.
Responsibilities:
- Solve problems; analyze and isolate issues.
- Provide technical guidance and mentoring to the team, and help them adopt change as new processes are introduced.
- Champion best practices and serve as a subject matter authority.
- Develop solutions to support key business needs.
- Engineer components and common services based on standard development models, languages, and tools.
- Produce system design documents and lead technical walkthroughs.
- Produce high-quality code.
- Collaborate effectively with technical and non-technical partners.
- As a team member, continuously improve the architecture.
Basic Qualifications:
- 9-12 years of experience designing/building data-intensive solutions using distributed computing.
- Proven experience implementing and maintaining enterprise search solutions in large-scale environments.
- Experience working with business stakeholders and users, providing research direction and solution design, and writing robust, maintainable architectures and APIs.
- Experience developing and deploying search solutions in a public cloud such as AWS.
- Proficient programming skills in high-level languages: Java, Scala, Python.
- Solid knowledge of at least one machine learning research framework.
- Familiarity with containerization, scripting, cloud platforms, and CI/CD.
- 5+ years' experience with Python, Java, Kubernetes, and data and workflow orchestration tools.
- 4+ years' experience with Elasticsearch, SQL, NoSQL, Apache Spark, Flink, Databricks, and MLflow.
- Prior experience operationalizing data-driven pipelines for large-scale batch and stream processing analytics solutions.
- Good to have: experience contributing to GitHub and open-source initiatives or research projects, and/or participation in Kaggle competitions.
- Ability to quickly, efficiently, and effectively define and prototype solutions with continual iteration within aggressive product deadlines.
- Strong communication and documentation skills for both technical and non-technical audiences.
Preferred Qualifications:
- Search Technologies: query and indexing content for Apache Solr, Elasticsearch, etc.
- Proficiency in search query languages (e.g., Lucene Query Syntax) and experience with data indexing and retrieval.
- Experience with machine learning models and NLP techniques for search relevance and ranking.
- Familiarity with vector search techniques and embedding models (e.g., BERT, Word2Vec).
- Experience with relevance tuning using A/B testing frameworks.
- Big Data Technologies: Apache Spark, Spark SQL, Hadoop, Hive, Airflow.
- Data Science Search Technologies: personalization and recommendation models, Learning to Rank (LTR).
- Preferred Languages: Python, Java.
- Database Technologies: MS SQL Server platform; stored procedure programming experience using Transact-SQL.
- Ability to lead, train, and mentor.
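As a small illustration of the indexing and retrieval work described above, here is a sketch using the official Elasticsearch Python client; the host, index name, and document fields are hypothetical.

```python
# Sketch: indexing and querying documents with the elasticsearch Python client.
# Host, index name, and document fields are hypothetical placeholders.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Index a document
es.index(index="filings", id="1", document={
    "company": "Acme Corp",
    "title": "Q3 earnings report",
    "published": "2024-10-01",
})

# Full-text query with a date filter, mirroring a typical relevance use case
resp = es.search(index="filings", query={
    "bool": {
        "must": {"match": {"title": "earnings"}},
        "filter": {"range": {"published": {"gte": "2024-01-01"}}},
    },
})
for hit in resp["hits"]["hits"]:
    print(hit["_score"], hit["_source"]["title"])
```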
Posted 1 month ago
6.0 - 11.0 years
8 - 13 Lacs
Pune, Chennai, Bengaluru
Work from Office
As a Backend Python Developer, you will be responsible for designing, developing, and deploying scalable, secure APIs and backend services. You will work closely with cross-functional teams to implement best practices in coding, testing, and deployment. The role also involves ensuring high-quality, modular code delivery while leveraging cloud technologies like AWS, Docker, and Kubernetes. You'll be expected to maintain code quality, collaborate in an Agile environment, and optimize backend systems for performance and scalability.
Experience: 6+ years
Location: Chennai, Bangalore, Pune, Noida
Primary Skills: Core Python, Linux, SQL
Secondary Skills: REST API, Cloud, NoSQL
Requirements:
- 5+ years of solid experience as a backend Python developer.
- Experience with Python frameworks (e.g., Django, Flask, Bottle).
- Good experience in both Python 2 and Python 3.
- Strong knowledge of data structures and algorithms, OOP, threads, and parallel processing.
- Experience building secure, complex, and scalable APIs, from design through deployment (see the sketch below).
- Able to write clean, modular code, with a solid understanding of writing and delivering testable, quality code.
- Knowledge of SDLC best practices, including coding standards, code reviews, source control management, build processes, testing, and operations. Experience with Git, Jira, and Agile methodology.
- Familiarity with Amazon Web Services (AWS) and REST APIs.
- Experience with Docker and Kubernetes is a big plus.
- Experience with SQL.
Nice to Have:
- Experience with streaming data and complex event processing systems.
- Experience working with NoSQL technologies like Redis, MongoDB, and Cassandra.
- Working knowledge of AWS, Kafka, Apache Spark, and Elasticsearch. Java knowledge is a plus.
Job Location: Chennai / Bangalore / Pune (hybrid model; two days per week working from the office)
QUALIFICATIONS:
- Bachelor's degree in Computer Science or any related field (B.Tech, BE, BCA, etc.).
- 6 to 8 years of experience in the software industry.
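To illustrate the API-building requirement above, here is a minimal Flask sketch of a testable REST endpoint with basic input validation; the routes and in-memory data are hypothetical, and a production service would add authentication and a real datastore.

```python
# Minimal Flask REST API sketch with basic input validation.
# Routes and data are hypothetical placeholders.
from flask import Flask, abort, jsonify, request

app = Flask(__name__)
ITEMS = {1: {"id": 1, "name": "alpha"}}  # stand-in for a real datastore

@app.route("/api/v1/items/<int:item_id>", methods=["GET"])
def get_item(item_id):
    item = ITEMS.get(item_id)
    if item is None:
        abort(404)
    return jsonify(item)

@app.route("/api/v1/items", methods=["POST"])
def create_item():
    payload = request.get_json(silent=True)
    if not payload or "name" not in payload:
        abort(400)  # validate input before touching state
    new_id = max(ITEMS) + 1
    ITEMS[new_id] = {"id": new_id, "name": payload["name"]}
    return jsonify(ITEMS[new_id]), 201

if __name__ == "__main__":
    app.run(debug=False)
```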
Posted 1 month ago
15.0 - 24.0 years
40 - 90 Lacs
Bengaluru
Hybrid
Key Skills: Scala, AWS, AWS Cloud, Apache Spark, Architect, SparkSQL, Spark, Spring Boot, Java
Roles and Responsibilities:
- Technically lead the team and project to meet deadlines.
- Lead efforts with team members to come up with software solutions.
- Optimize and maintain existing software.
- Recommend tech upgrades to company leaders.
- Build scalable, efficient, and high-performance pipelines and workflows capable of processing large amounts of batch and real-time data.
- Perform multidisciplinary work supporting real-time streams, ETL pipelines, data warehouses, and reporting services.
- Design and develop microservices and data applications that interact with other microservices.
- Use big data technologies such as Kafka, Data Lake on AWS S3, EMR, Spark, and related technologies to ingest, store, aggregate, transform, move, and query data.
- Follow coding best practices: unit testing, design/code reviews, code coverage, documentation, etc.
- Perform performance analysis and capacity planning for every release.
- Work effectively as part of an Agile team.
- Bring new and innovative solutions to resolve challenging software issues as they develop throughout the product lifecycle.
Skills Required:
- Excellence in software design skills.
- Strong knowledge of design patterns, including performance optimization considerations.
- Proficient in writing high-quality, well-structured code in Java and Scala.
- Excellence in a test-driven development approach and debugging software.
- Proficient in writing clear, concise, and organized documentation.
- Knowledge of Amazon cloud computing infrastructure (Aurora MySQL, DynamoDB, EMR, Lambda, Step Functions, and S3).
- Ability to excel in a team environment.
- Strong communication skills and the ability to discuss a solution with team members of varying technical sophistication.
- Ability to perform thoughtful and detailed code reviews, both for peers and junior developers.
- Familiarity with software engineering and project management tools.
- Follows security protocols and best data governance practices.
- Able to construct KPIs and use metrics for process improvements.
Minimum Qualifications:
- 12+ years' experience designing and developing enterprise-level software solutions.
- 5 years' experience developing Scala/Java applications and microservices using Spring Boot.
- 10 years' experience with large-volume data processing and big data tools such as Apache Spark, Scala, and Hadoop technologies.
- 5 years' experience with SQL and relational databases.
- 2 years' experience working with Agile/Scrum methodology.
Education: Bachelor's degree in a related field.
Posted 1 month ago
8.0 - 12.0 years
15 - 20 Lacs
Pune
Work from Office
We are looking for a highly experienced Lead Data Engineer / Data Architect to lead the design, development, and implementation of scalable data pipelines, data lakehouse, and data warehousing solutions. The ideal candidate will provide technical leadership to a team of data engineers, drive architectural decisions, and ensure best practices in data engineering. This role is critical in enabling data-driven decision-making and modernizing our data infrastructure.
Key Responsibilities:
Act as a technical leader responsible for guiding the design, development, and implementation of data pipelines, data lakehouse, and data warehousing solutions.
Lead a team of data engineers, ensuring adherence to best practices and standards.
Drive the successful delivery of high-quality, scalable, and reliable data solutions.
Play a key role in shaping data architecture, adopting modern data technologies, and enabling data-driven decision-making across the team.
Provide technical vision, guidance, and mentorship to the team.
Lead technical design discussions, perform code reviews, and contribute to architectural decisions.
Posted 1 month ago
5.0 - 7.0 years
18 - 20 Lacs
Hyderabad, Bengaluru
Hybrid
Type: Contract-to-Hire (C2H)
Job Summary: We are looking for a skilled PySpark Developer with at least 4 years of hands-on experience (mandatory) in building scalable data pipelines and processing large datasets. The ideal candidate will have deep expertise in Apache Spark and Python, and experience working with modern data engineering tools in cloud environments such as AWS.
Key Skills & Responsibilities:
Strong expertise in PySpark and Apache Spark for batch and real-time data processing.
Experience in designing and implementing ETL pipelines, including data ingestion, transformation, and validation.
Proficiency in Python for scripting, automation, and building reusable components.
Hands-on experience with scheduling tools like Airflow or Control-M to orchestrate workflows.
Familiarity with the AWS ecosystem, especially S3 and related file system operations.
Strong understanding of Unix/Linux environments and shell scripting.
Experience with Hadoop, Hive, and platforms like Cloudera or Hortonworks.
Ability to handle CDC (Change Data Capture) operations on large datasets.
Experience in performance tuning, optimizing Spark jobs, and troubleshooting.
Strong knowledge of data modeling, data validation, and writing unit test cases.
Exposure to real-time and batch integration with downstream/upstream systems.
Working knowledge of Jupyter Notebook, Zeppelin, or PyCharm for development and debugging.
Understanding of Agile methodologies, with experience in CI/CD tools (e.g., Jenkins, Git).
Preferred Skills:
Experience in building or integrating APIs for data provisioning.
Exposure to ETL or reporting tools such as Informatica, Tableau, Jasper, or QlikView.
Familiarity with AI/ML model development using PySpark in cloud environments.
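As an illustration of the ingest-transform-validate pattern at the heart of this role, here is a minimal PySpark batch sketch; the S3 buckets, column names, and validation rules are hypothetical.

from pyspark.sql import SparkSession
from pyspark.sql.functions import col, to_date

spark = SparkSession.builder.appName("daily-orders-etl").getOrCreate()

# Ingest: raw CSV landed in S3 by an upstream system
raw = spark.read.option("header", True).csv("s3a://raw-zone/orders/2024-01-01/")

# Transform: cast types and derive a partition column
orders = (raw.withColumn("amount", col("amount").cast("double"))
             .withColumn("order_date", to_date(col("order_ts"))))

# Validate: fail the batch if mandatory fields are null
bad = orders.filter(col("order_id").isNull() | col("amount").isNull()).count()
if bad > 0:
    raise ValueError(f"{bad} rows failed validation; aborting load")

# Load: write partitioned Parquet to the curated zone
(orders.write.mode("overwrite")
       .partitionBy("order_date")
       .parquet("s3a://curated-zone/orders/"))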
Posted 1 month ago
8.0 - 10.0 years
15 - 30 Lacs
Bengaluru
Work from Office
Role & Responsibilities:
Technical Skills:
1. Core Databricks Platform: Databricks workspace, clusters, jobs, notebooks, Unity Catalog
2. Big Data Technologies: Apache Spark (PySpark/Scala), Delta Lake, Apache Kafka
3. Programming Languages: Python (advanced), SQL (advanced), Scala (preferred)
4. Cloud Platforms: Azure (preferred) or AWS with Databricks integration
5. Data Pipeline Tools: Apache Airflow, Azure Data Factory, or similar orchestration tools
6. Version Control & CI/CD: Git, Azure DevOps, Jenkins, or GitHub Actions
7. Data Formats & Storage: Parquet, JSON, Avro, Azure Data Lake, S3
8. Monitoring & Observability: Databricks monitoring, custom metrics, alerting systems
Leadership & Soft Skills:
1. Strong leadership and people management capabilities
2. Excellent communication skills with the ability to explain complex technical concepts
3. Experience with Agile/Scrum methodologies
4. Problem-solving mindset with attention to detail
5. Ability to work in fast-paced, dynamic environments
6. 8+ years of overall experience in data engineering, software engineering, or related technical roles
7. 4+ years of hands-on experience with Databricks/big data platforms and Apache Spark
8. 2+ years of team leadership or technical mentoring experience
Preferred Qualifications:
1. Databricks certifications (Certified Data Engineer Associate/Professional)
2. Experience with MLOps and machine learning pipeline deployment
3. Knowledge of data mesh or data fabric architectures
4. Experience with streaming data processing using Spark Structured Streaming
5. Background in financial services, healthcare, or retail domains
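To give a concrete flavour of the Delta Lake work listed above, here is a minimal sketch of a CDC-style upsert using the Delta Lake MERGE API; the table paths and join key are hypothetical.

from pyspark.sql import SparkSession
from delta.tables import DeltaTable

spark = SparkSession.builder.getOrCreate()  # on Databricks, `spark` is provided

# Hypothetical change set landed by an upstream CDC feed
updates = spark.read.parquet("/mnt/landing/customers/")

# Upsert into the target Delta table: update matched keys, insert new ones
target = DeltaTable.forPath(spark, "/mnt/delta/customers")
(target.alias("t")
       .merge(updates.alias("s"), "t.customer_id = s.customer_id")
       .whenMatchedUpdateAll()
       .whenNotMatchedInsertAll()
       .execute())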
Posted 2 months ago
5.0 - 10.0 years
15 Lacs
Noida, Chennai, Bengaluru
Work from Office
Responsibilities:
Lead the design, development, and implementation of big data solutions using Apache Spark and Databricks.
Architect and optimize data pipelines and workflows to process large volumes of data efficiently.
Utilize Databricks features such as Delta Lake, Databricks SQL, and Databricks Workflows to enhance data processing and analytics capabilities.
Collaborate with data engineers, data scientists, and business stakeholders to understand data requirements and deliver high-quality data solutions.
Implement best practices for data engineering, including data quality, data governance, and data security.
Monitor and troubleshoot performance issues in Spark jobs and Databricks clusters.
Mentor and guide junior engineers in the team, promoting a culture of continuous learning and improvement.
Stay up to date with the latest advancements in Spark and Databricks technologies and incorporate them into the team's practices.
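As a small illustration of the Spark and Databricks optimization work mentioned above, here is a hedged sketch of two routine tuning moves, a broadcast join and Delta file compaction; the table names are hypothetical, and OPTIMIZE/ZORDER assumes a Databricks or Delta Lake environment.

from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.getOrCreate()

facts = spark.read.table("sales.transactions")  # large fact table (hypothetical)
dims = spark.read.table("sales.stores")         # small dimension table (hypothetical)

# Broadcast the small side to avoid shuffling the large table
enriched = facts.join(broadcast(dims), "store_id")
enriched.write.mode("overwrite").saveAsTable("sales.transactions_enriched")

# Compact small files and co-locate rows on a frequent filter column
spark.sql("OPTIMIZE sales.transactions_enriched ZORDER BY (store_id)")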
Posted 2 months ago
4.0 - 8.0 years
6 - 10 Lacs
Mumbai, Delhi / NCR, Bengaluru
Work from Office
We specialize in delivering high-quality, human-curated data and AI-first scaled operations services. Based in San Francisco and Hyderabad, we are a fast-moving team on a mission to build AI for Good, driving innovation and societal impact.
Role Overview: We are looking for a Data Scientist to build intelligent, data-driven solutions for our client that enable impactful decisions. This role requires contributions across the data science lifecycle, from data wrangling and exploratory analysis to building and deploying machine learning models. Whether you're just getting started or have years of experience, we're looking for individuals who are curious, analytical, and driven to make a difference with data.
Responsibilities:
Design, develop, and deploy machine learning models and analytical solutions.
Conduct exploratory data analysis and feature engineering.
Own or contribute to the end-to-end data science pipeline: data cleaning, modeling, validation, and deployment.
Collaborate with cross-functional teams (engineering, product, business) to define problems and deliver measurable impact.
Translate business challenges into data science problems and communicate findings clearly.
Implement A/B tests, statistical tests, and experimentation strategies.
Support model monitoring, versioning, and continuous improvement in production environments.
Evaluate new tools, frameworks, and best practices to improve model accuracy and scalability.
Required Skills:
Strong programming skills in Python, including libraries such as pandas, NumPy, scikit-learn, matplotlib, and seaborn.
Proficiency in SQL; comfortable querying large, complex datasets.
Sound understanding of statistics, machine learning algorithms, and data modeling.
Experience building end-to-end ML pipelines.
Exposure to or hands-on experience with model deployment tools like FastAPI, Flask, and MLflow.
Experience with data visualization and insight communication.
Familiarity with version control tools (e.g., Git) and collaborative workflows.
Ability to write clean, modular code and document processes clearly.
Nice to Have:
Experience with deep learning frameworks like TensorFlow or PyTorch.
Familiarity with data engineering tools like Apache Spark, Kafka, Airflow, and dbt.
Exposure to MLOps practices and managing models in production environments.
Working knowledge of cloud platforms like AWS, GCP, or Azure (e.g., SageMaker, BigQuery, Vertex AI).
Experience designing and interpreting A/B tests or causal inference models.
Prior experience in high-growth startups or cross-functional leadership roles.
Educational Qualifications: Bachelor's or Master's degree in Computer Science, Data Science, Mathematics, Engineering, or a related field.
Location: Mumbai, Delhi / NCR, Bengaluru, Kolkata, Chennai, Hyderabad, Ahmedabad, Pune, India
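To make the modelling lifecycle above concrete, here is a minimal sketch of an end-to-end train-and-evaluate loop, assuming scikit-learn; the synthetic dataset stands in for a real business dataset.

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# Synthetic stand-in data: five hypothetical features, one binary target
rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 5))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=1000) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# A pipeline keeps preprocessing and the model together for clean deployment
model = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression()),
])
model.fit(X_train, y_train)

auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"holdout AUC: {auc:.3f}")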
Posted 2 months ago
5.0 - 8.0 years
9 - 14 Lacs
Bengaluru
Work from Office
Role Purpose: The purpose of the role is to support process delivery by ensuring daily performance of the Production Specialists, resolving technical escalations, and developing technical capability within the Production Specialists.
Do:
Oversee and support the process by reviewing daily transactions on performance parameters.
Review the performance dashboard and the scores for the team.
Support the team in improving performance parameters by providing technical support and process guidance.
Record, track, and document all queries received, problem-solving steps taken, and total successful and unsuccessful resolutions.
Ensure standard processes and procedures are followed to resolve all client queries.
Resolve client queries as per the SLAs defined in the contract.
Develop understanding of the process/product for the team members to facilitate better client interaction and troubleshooting.
Document and analyze call logs to spot the most frequently occurring trends and prevent future problems.
Identify red flags and escalate serious client issues to the team leader in cases of untimely resolution.
Ensure all product information and disclosures are given to clients before and after the call/email requests.
Avoid legal challenges by monitoring compliance with service agreements.
Handle technical escalations through effective diagnosis and troubleshooting of client queries.
Manage and resolve technical roadblocks/escalations as per SLA and quality requirements; if unable to resolve the issues, escalate them to TA & SES in a timely manner.
Provide product support and resolution to clients by performing a question diagnosis while guiding users through step-by-step solutions.
Troubleshoot all client queries in a user-friendly, courteous, and professional manner.
Offer alternative solutions to clients (where appropriate) with the objective of retaining customers' and clients' business.
Organize ideas and effectively communicate oral messages appropriate to listeners and situations.
Follow up and make scheduled callbacks to customers to record feedback and ensure compliance with contract SLAs.
Build people capability to ensure operational excellence and maintain superior customer service levels for the existing account/client.
Mentor and guide Production Specialists on improving technical knowledge.
Collate trainings to be conducted as triage to bridge the skill gaps identified through interviews with the Production Specialists.
Develop and conduct trainings (triages) within products for Production Specialists as per target, and inform the client about the triages being conducted.
Undertake product trainings to stay current with product features, changes, and updates.
Enroll in product-specific and any other trainings per client requirements/recommendations.
Identify and document the most common problems and recommend appropriate resolutions to the team.
Update job knowledge by participating in self-learning opportunities and maintaining personal networks.
Deliver:
No. | Performance Parameter | Measure
1 | Process | No. of cases resolved per day, compliance to process and quality standards, meeting process-level SLAs, Pulse score, customer feedback, NSAT/ESAT
2 | Team Management | Productivity, efficiency, absenteeism
3 | Capability Development | Triages completed, Technical Test performance
Mandatory Skills: Apache Spark
Experience: 5-8 years
Posted 2 months ago
5.0 - 8.0 years
7 - 10 Lacs
Hyderabad
Work from Office
Role Description: We are looking for a highly motivated, expert Data Engineer who can own the design and development of complex data pipelines, solutions, and frameworks. The ideal candidate will design, develop, and maintain data pipelines, data integration frameworks, and metadata-driven architectures that enable seamless data access and analytics. This role calls for deep expertise in big data processing, distributed computing, data modeling, and governance frameworks to support self-service analytics, AI-driven insights, and enterprise-wide data management.
Roles & Responsibilities:
Design, develop, and maintain complex ETL/ELT data pipelines in Databricks using PySpark, Scala, and SQL to process large-scale datasets.
Understand the biotech/pharma or related domains and build highly efficient data pipelines to migrate and deploy complex data across systems.
Design and implement solutions to enable unified data access, governance, and interoperability across hybrid cloud environments.
Ingest and transform structured and unstructured data from databases (PostgreSQL, MySQL, SQL Server, MongoDB, etc.), APIs, logs, event streams, images, PDFs, and third-party platforms.
Ensure data integrity, accuracy, and consistency through rigorous quality checks and monitoring.
Expert in data quality, data validation, and verification frameworks.
Innovate, explore, and implement new tools and technologies to enhance efficient data processing.
Proactively identify and implement opportunities to automate tasks and develop reusable frameworks.
Work in an Agile and Scaled Agile (SAFe) environment, collaborating with cross-functional teams, product owners, and Scrum Masters to deliver incremental value.
Use JIRA, Confluence, and Agile DevOps tools to manage sprints, backlogs, and user stories.
Support continuous improvement, test automation, and DevOps practices in the data engineering lifecycle.
Collaborate and communicate effectively with product teams and other cross-functional teams to understand business requirements and translate them into technical solutions.
Must-Have Skills:
Hands-on experience in data engineering technologies such as Databricks, PySpark, SparkSQL, Apache Spark, AWS, Python, SQL, and Scaled Agile methodologies.
Proficiency in workflow orchestration and performance tuning for big data processing.
Strong understanding of AWS services.
Ability to quickly learn, adapt, and apply new technologies.
Strong problem-solving and analytical skills.
Excellent communication and teamwork skills.
Experience with Scaled Agile Framework (SAFe), Agile delivery practices, and DevOps practices.
Good-to-Have Skills:
Data engineering experience in the biotechnology or pharma industry.
Experience in writing APIs to make data available to consumers.
Experience with SQL/NoSQL databases and vector databases for large language models.
Experience with data modeling and performance tuning for both OLAP and OLTP databases.
Experience with software engineering best practices, including but not limited to version control (Git, Subversion, etc.), CI/CD (Jenkins, Maven, etc.), automated unit testing, and DevOps.
Education and Professional Certifications:
Minimum 5 to 8 years of Computer Science, IT, or related field experience.
AWS Certified Data Engineer preferred.
Databricks certification preferred.
Scaled Agile SAFe certification preferred.
Soft Skills:
Excellent analytical and troubleshooting skills.
Strong verbal and written communication skills.
Ability to work effectively with global, virtual teams.
High degree of initiative and self-motivation.
Ability to manage multiple priorities successfully.
Team-oriented, with a focus on achieving team goals.
Ability to learn quickly; organized and detail-oriented.
Strong presentation and public speaking skills.
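As a flavour of the data quality and validation frameworks this role emphasizes, here is a hedged PySpark sketch of a reusable quality gate; the table name, column names, and rule set are hypothetical.

from pyspark.sql import SparkSession, DataFrame
from pyspark.sql.functions import col

spark = SparkSession.builder.getOrCreate()

def run_quality_checks(df: DataFrame, not_null: list, unique_key: str) -> list:
    """Return human-readable failures; an empty list means the batch passes."""
    failures = []
    for c in not_null:
        n = df.filter(col(c).isNull()).count()
        if n:
            failures.append(f"{n} null values in required column '{c}'")
    dupes = df.count() - df.dropDuplicates([unique_key]).count()
    if dupes:
        failures.append(f"{dupes} duplicate values of key '{unique_key}'")
    return failures

patients = spark.read.table("clinical.patients")  # hypothetical table
problems = run_quality_checks(patients, ["patient_id", "site_id"], "patient_id")
if problems:
    raise ValueError("Data quality gate failed: " + "; ".join(problems))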
Posted 2 months ago
4.0 - 8.0 years
5 - 9 Lacs
Hyderabad, Bengaluru
Work from Office
What's in it for you?
Pay above market standards.
The role is contract-based, with project timelines from 2 to 12 months, or freelancing.
Be a part of an elite community of professionals who can solve complex AI challenges.
Work location could be: remote (highly likely), onsite at a client location, or Deccan AI's office in Hyderabad or Bangalore.
Responsibilities:
Design and architect enterprise-scale data platforms, integrating diverse data sources and tools.
Develop real-time and batch data pipelines to support analytics and machine learning.
Define and enforce data governance strategies to ensure security, integrity, and compliance, while optimizing data pipelines for high performance, scalability, and cost efficiency in cloud environments.
Implement solutions for real-time streaming data (Kafka, AWS Kinesis, Apache Flink) and adopt DevOps/DataOps best practices.
Required Skills:
Strong experience in designing scalable, distributed data systems and programming (Python, Scala, Java), with expertise in Apache Spark, Hadoop, Flink, Kafka, and cloud platforms (AWS, Azure, GCP).
Proficient in data modeling, governance, warehousing (Snowflake, Redshift, BigQuery), and security/compliance standards (GDPR, HIPAA).
Hands-on experience with CI/CD (Terraform, CloudFormation, Airflow, Kubernetes) and data infrastructure optimization (Prometheus, Grafana).
Nice to Have:
Experience with graph databases, machine learning pipeline integration, real-time analytics, and IoT solutions.
Contributions to open-source data engineering communities.
What are the next steps? Register on our Soul AI website.
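To illustrate the orchestration side of the pipelines described above, here is a minimal Airflow DAG sketch, assuming Airflow 2.4+; the DAG id, schedule, and placeholder shell commands are hypothetical.

from datetime import datetime
from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="daily_events_pipeline",   # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    # Placeholder commands; real tasks would submit Spark jobs, run dbt, etc.
    ingest = BashOperator(task_id="ingest", bash_command="echo ingest raw data")
    transform = BashOperator(task_id="transform", bash_command="echo run Spark job")
    publish = BashOperator(task_id="publish", bash_command="echo load warehouse")

    ingest >> transform >> publish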
Posted 2 months ago
4.0 - 8.0 years
13 - 17 Lacs
Hyderabad, Bengaluru
Work from Office
Responsibilities:
Design and architect enterprise-scale data platforms, integrating diverse data sources and tools.
Develop real-time and batch data pipelines to support analytics and machine learning.
Define and enforce data governance strategies to ensure security, integrity, and compliance, while optimizing data pipelines for high performance, scalability, and cost efficiency in cloud environments.
Implement solutions for real-time streaming data (Kafka, AWS Kinesis, Apache Flink) and adopt DevOps/DataOps best practices.
Required Skills:
Strong experience in designing scalable, distributed data systems and programming (Python, Scala, Java), with expertise in Apache Spark, Hadoop, Flink, Kafka, and cloud platforms (AWS, Azure, GCP).
Proficient in data modeling, governance, warehousing (Snowflake, Redshift, BigQuery), and security/compliance standards (GDPR, HIPAA).
Hands-on experience with CI/CD (Terraform, CloudFormation, Airflow, Kubernetes) and data infrastructure optimization (Prometheus, Grafana).
Nice to Have:
Experience with graph databases, machine learning pipeline integration, real-time analytics, and IoT solutions.
Contributions to open-source data engineering communities.
Posted 2 months ago
4.0 - 8.0 years
6 - 10 Lacs
Mumbai, Delhi / NCR, Bengaluru
Work from Office
We specialize in delivering high-quality, human-curated data and AI-first scaled operations services. Based in San Francisco and Hyderabad, we are a fast-moving team on a mission to build AI for Good, driving innovation and societal impact.
Role Overview: We are seeking a Data Engineer / Data Architect who will be responsible for designing, building, and maintaining scalable data infrastructure and systems for a client. You'll play a key role in enabling efficient data flow, storage, transformation, and access across our organization or client ecosystems. Whether you're just beginning or already an expert, we value strong technical skills, curiosity, and the ability to translate complex requirements into reliable data pipelines.
Responsibilities:
Design and implement scalable, robust, and secure data pipelines.
Build ETL/ELT frameworks to collect, clean, and transform structured and unstructured data.
Collaborate with data scientists, analysts, and backend engineers to enable seamless data access and model integration.
Maintain data integrity, schema design, lineage, and quality monitoring.
Optimize performance and ensure reliability of data workflows in production environments.
Design and manage data warehousing and lakehouse architecture.
Set up and manage infrastructure using IaC (Infrastructure as Code) when applicable.
Required Skills:
Strong programming skills in Python, SQL, and shell scripting.
Hands-on experience with ETL tools and orchestration frameworks (e.g., Airflow, Luigi, dbt).
Proficiency in relational databases (e.g., PostgreSQL, MySQL) and NoSQL databases (e.g., MongoDB, Redis).
Experience with big data technologies: Apache Spark, Kafka, Hive, Hadoop, etc.
Deep understanding of data modeling, schema design, and data warehousing concepts.
Proficiency with cloud platforms (AWS/GCP/Azure) and services like Redshift, BigQuery, S3, Dataflow, or Databricks.
Knowledge of DevOps and CI/CD tools relevant to data infrastructure.
Nice to Have:
Experience working in real-time streaming environments.
Familiarity with containerization and Kubernetes.
Exposure to MLOps and collaboration with ML teams.
Experience with security protocols, data governance, and compliance frameworks.
Educational Qualifications: Bachelor's or Master's in Computer Science, Data Engineering, Information Systems, or a related technical field.
Location: Mumbai, Delhi / NCR, Bengaluru, Kolkata, Chennai, Hyderabad, Ahmedabad, Pune, India
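As a small illustration of the ETL/ELT work described above, here is a hedged sketch using pandas and SQLAlchemy against PostgreSQL; the connection string, schemas, and table names are hypothetical.

import pandas as pd
from sqlalchemy import create_engine

# Hypothetical connection to an analytics database
engine = create_engine("postgresql://user:pass@localhost:5432/analytics")

# Extract: pull raw events from a landing table
raw = pd.read_sql("SELECT * FROM landing.events", engine)

# Transform: coerce types and drop malformed rows
raw["event_ts"] = pd.to_datetime(raw["event_ts"], errors="coerce")
clean = raw.dropna(subset=["event_ts", "user_id"])

# Load: publish the curated table for downstream consumers
clean.to_sql("events_clean", engine, schema="curated",
             if_exists="replace", index=False)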
Posted 2 months ago