
491 Data Pipeline Jobs - Page 17

Set up a Job Alert
JobPe aggregates listings for easy access; you apply directly on the original job portal.

5.0 - 8.0 years

27 - 42 Lacs

Bengaluru

Work from Office

Job Summary

NetApp is a cloud-led, data-centric software company that helps organizations put data to work in applications that elevate their business. We help organizations unlock the best of cloud technology. As a member of Solutions Integration Engineering, you will work cross-functionally to define and create engineered solutions/products that accelerate field adoption. We work closely with ISVs and the startup ecosystem in the virtualization, cloud, and AI/ML domains to build solutions that matter for customers. You will work closely with the product owner and product lead on the company's current and future strategies in these domains.

Job Requirements
• Deliver features, including participating in the full software development lifecycle.
• Deliver reliable, innovative solutions and products.
• Participate in product design, development, verification, troubleshooting, and delivery of a system or major subsystems, including authoring project specifications.
• Work closely with cross-functional teams, including business stakeholders, to innovate and unlock new use cases for our customers.
• Write unit and automated integration tests and project documentation.

Technical Skills
• Understanding of the software development lifecycle.
• Proficiency in full-stack development: Python, the container ecosystem, cloud, and modern ML frameworks.
• Knowledge of data storage and artificial intelligence concepts, including server/storage architecture, batch/stream processing, data warehousing, data lakes, distributed filesystems, OLTP/OLAP databases, data pipelining tools, model inferencing, and RAG workflows.
• Exposure to data pipelines, integrations, and Unix-based operating system kernels and development environments, e.g., Linux or FreeBSD.
• A strong understanding of basic to complex concepts related to computer architecture, data structures, and new programming paradigms.
• Demonstrated creative and systematic approach to problem solving.
• Excellent written and verbal communication skills.

Education
• Minimum 5 years of experience, hands-on with coding.
• B.E./B.Tech or M.S. in Computer Science or a related technical field.

Posted 2 months ago

Apply

5.0 - 10.0 years

8 - 14 Lacs

Kolkata

Work from Office

Key Responsibilities : - Design and develop scalable PySpark pipelines to ingest, parse, and process XML datasets with extreme hierarchical complexity. - Implement efficient XPath expressions, recursive parsing techniques, and custom schema definitions to extract data from nested XML structures. - Optimize Spark jobs through partitioning, caching, and parallel processing to handle terabytes of XML data efficiently. - Transform raw hierarchical XML data into structured DataFrames for analytics, machine learning, and reporting use cases. - Collaborate with data architects and analysts to define data models for nested XML schemas. - Troubleshoot performance bottlenecks and ensure reliability in distributed environments (e.g., AWS, Databricks, Hadoop). - Document parsing logic, data lineage, and optimization strategies for maintainability. Qualifications : - 5+ years of hands-on experience with PySpark and Spark XML libraries (e.g., `spark-xml`) in production environments. - Proven track record of parsing XML data with 20+ levels of nesting using recursive methods and schema inference. - Expertise in XPath, XQuery, and DataFrame transformations (e.g., `explode`, `struct`, `selectExpr`) for hierarchical data. - Strong understanding of Spark optimization techniques: partitioning strategies, broadcast variables, and memory management. - Experience with distributed computing frameworks (e.g., Hadoop, YARN) and cloud platforms (AWS, Azure, GCP). - Familiarity with big data file formats (Parquet, Avro) and orchestration tools (Airflow, Luigi). - Bachelor's degree in Computer Science, Data Engineering, or a related field. Preferred Skills : - Experience with schema evolution and versioning for nested XML/JSON datasets. - Knowledge of Scala or Java for extending Spark XML libraries. - Exposure to Databricks, Delta Lake, or similar platforms. - Certifications in AWS/Azure big data technologies.
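For illustration only, the sketch below shows the kind of nested-XML flattening this posting describes, using PySpark with the spark-xml package and `explode`. The file paths, row tag, and field names are assumptions, not details from the posting, and a production pipeline would add custom schemas and recursive handling for very deep nesting.

```python
# Illustrative sketch: flattening a nested XML feed with PySpark + spark-xml.
# Paths, rowTag, and field names are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, explode

spark = (
    SparkSession.builder
    .appName("nested-xml-flatten")
    # spark-xml must be available on the cluster, e.g. via
    # --packages com.databricks:spark-xml_2.12:<version>
    .getOrCreate()
)

# Read XML records; using "order" as the row tag is an assumption.
raw = (
    spark.read.format("xml")
    .option("rowTag", "order")
    .load("s3://example-bucket/raw/orders/*.xml")
)

# Explode a nested repeating element (e.g. <items><item>...</item></items>)
# into one row per child, then project selected attributes into flat columns.
flat = (
    raw.select(
        col("_id").alias("order_id"),  # XML attributes surface with a leading underscore by default
        explode(col("items.item")).alias("item"),
    )
    .select(
        "order_id",
        col("item.sku").alias("sku"),
        col("item.quantity").cast("int").alias("quantity"),
    )
)

# Write the flattened result for downstream analytics.
flat.write.mode("overwrite").parquet("s3://example-bucket/curated/items/")
```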

Posted 2 months ago

Apply

5.0 - 10.0 years

8 - 14 Lacs

Noida

Work from Office

Key Responsibilities : - Design and develop scalable PySpark pipelines to ingest, parse, and process XML datasets with extreme hierarchical complexity. - Implement efficient XPath expressions, recursive parsing techniques, and custom schema definitions to extract data from nested XML structures. - Optimize Spark jobs through partitioning, caching, and parallel processing to handle terabytes of XML data efficiently. - Transform raw hierarchical XML data into structured DataFrames for analytics, machine learning, and reporting use cases. - Collaborate with data architects and analysts to define data models for nested XML schemas. - Troubleshoot performance bottlenecks and ensure reliability in distributed environments (e.g., AWS, Databricks, Hadoop). - Document parsing logic, data lineage, and optimization strategies for maintainability. Qualifications : - 5+ years of hands-on experience with PySpark and Spark XML libraries (e.g., `spark-xml`) in production environments. - Proven track record of parsing XML data with 20+ levels of nesting using recursive methods and schema inference. - Expertise in XPath, XQuery, and DataFrame transformations (e.g., `explode`, `struct`, `selectExpr`) for hierarchical data. - Strong understanding of Spark optimization techniques: partitioning strategies, broadcast variables, and memory management. - Experience with distributed computing frameworks (e.g., Hadoop, YARN) and cloud platforms (AWS, Azure, GCP). - Familiarity with big data file formats (Parquet, Avro) and orchestration tools (Airflow, Luigi). - Bachelor's degree in Computer Science, Data Engineering, or a related field. Preferred Skills : - Experience with schema evolution and versioning for nested XML/JSON datasets. - Knowledge of Scala or Java for extending Spark XML libraries. - Exposure to Databricks, Delta Lake, or similar platforms. - Certifications in AWS/Azure big data technologies.

Posted 2 months ago

Apply

1.0 - 5.0 years

3 - 7 Lacs

Faridabad

Work from Office

We are seeking a Data Science & Machine Learning Engineer with strong Python skills, experience in training and deploying machine learning models, and expertise in SQL, cloud platforms (e.g., AWS), and data pipeline development.

Posted 2 months ago

Apply

5.0 - 10.0 years

8 - 14 Lacs

Ahmedabad

Work from Office

Key Responsibilities : - Design and develop scalable PySpark pipelines to ingest, parse, and process XML datasets with extreme hierarchical complexity. - Implement efficient XPath expressions, recursive parsing techniques, and custom schema definitions to extract data from nested XML structures. - Optimize Spark jobs through partitioning, caching, and parallel processing to handle terabytes of XML data efficiently. - Transform raw hierarchical XML data into structured DataFrames for analytics, machine learning, and reporting use cases. - Collaborate with data architects and analysts to define data models for nested XML schemas. - Troubleshoot performance bottlenecks and ensure reliability in distributed environments (e.g., AWS, Databricks, Hadoop). - Document parsing logic, data lineage, and optimization strategies for maintainability. Qualifications : - 5+ years of hands-on experience with PySpark and Spark XML libraries (e.g., `spark-xml`) in production environments. - Proven track record of parsing XML data with 20+ levels of nesting using recursive methods and schema inference. - Expertise in XPath, XQuery, and DataFrame transformations (e.g., `explode`, `struct`, `selectExpr`) for hierarchical data. - Strong understanding of Spark optimization techniques: partitioning strategies, broadcast variables, and memory management. - Experience with distributed computing frameworks (e.g., Hadoop, YARN) and cloud platforms (AWS, Azure, GCP). - Familiarity with big data file formats (Parquet, Avro) and orchestration tools (Airflow, Luigi). - Bachelor's degree in Computer Science, Data Engineering, or a related field. Preferred Skills : - Experience with schema evolution and versioning for nested XML/JSON datasets. - Knowledge of Scala or Java for extending Spark XML libraries. - Exposure to Databricks, Delta Lake, or similar platforms. - Certifications in AWS/Azure big data technologies.

Posted 2 months ago

Apply

3.0 - 4.0 years

15 - 22 Lacs

Bengaluru

Hybrid

Velotio Technologies is a product engineering company working with innovative startups and enterprises. We are a certified Great Place to Work and recognized as one of the best companies to work for in India. We have provided full-stack product development for 110+ startups across the globe, building products in the cloud-native, data engineering, B2B SaaS, IoT, and machine learning space. Our team of 400+ elite software engineers solves hard technical problems while transforming customer ideas into successful products.

Requirements
Design, develop, and maintain robust and scalable data pipelines that ingest, transform, and load data from various sources into a data warehouse. Collaborate with business stakeholders to understand data requirements and translate them into technical solutions. Implement data quality checks and monitoring to ensure data accuracy and integrity. Optimize data pipelines for performance and efficiency. Troubleshoot and resolve data pipeline issues. Stay up to date with emerging technologies and trends in data engineering.

Qualifications
Bachelor's or Master's degree in Computer Science, Engineering, or a related field. 2+ years of experience in data engineering or a similar role. Strong proficiency in SQL and at least one programming language (e.g., Python, Java). Experience with data pipeline tools and frameworks. Experience with cloud-based data warehousing solutions (Snowflake). Experience with AWS Kinesis, SNS, and SQS. Excellent problem-solving and analytical skills. Strong communication and interpersonal skills.

Desired Skills & Experience
Data pipeline architecture, data warehousing, ETL (Extract, Transform, Load), data modeling, SQL, Python or Java or Go, cloud computing, business intelligence.

Our Culture
We have an autonomous and empowered work culture encouraging individuals to take ownership and grow quickly. Flat hierarchy with fast decision-making and a startup-oriented get-things-done culture. A strong, fun, and positive environment with regular celebrations of our success. We pride ourselves on creating an inclusive, diverse, and authentic environment.

At Velotio, we embrace diversity. Inclusion is a priority for us, and we are eager to foster an environment where everyone feels valued. We welcome applications regardless of ethnicity or cultural background, age, gender, nationality, religion, disability, or sexual orientation.
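As a purely illustrative sketch of the streaming ingestion mentioned above (AWS Kinesis feeding a warehouse), the snippet below polls a Kinesis stream with boto3. The stream name, region, and JSON payload format are assumptions; real pipelines would typically use the KCL, enhanced fan-out, or Kinesis Data Firehose rather than a hand-rolled loop.

```python
# Minimal sketch, not production code: polling a Kinesis stream with boto3
# and batching records for a downstream warehouse load.
import json
import boto3

kinesis = boto3.client("kinesis", region_name="us-east-1")
STREAM = "example-clickstream"  # placeholder stream name

def read_batch(limit: int = 100) -> list[dict]:
    """Read up to `limit` records from the first shard of the stream."""
    shard_id = kinesis.list_shards(StreamName=STREAM)["Shards"][0]["ShardId"]
    iterator = kinesis.get_shard_iterator(
        StreamName=STREAM,
        ShardId=shard_id,
        ShardIteratorType="TRIM_HORIZON",
    )["ShardIterator"]
    response = kinesis.get_records(ShardIterator=iterator, Limit=limit)
    # Each record's Data field is bytes; we assume JSON-encoded events here.
    return [json.loads(record["Data"]) for record in response["Records"]]

if __name__ == "__main__":
    events = read_batch()
    print(f"fetched {len(events)} events")  # hand off to a warehouse loader here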

Posted 2 months ago

Apply

5.0 - 10.0 years

8 - 14 Lacs

Pune

Work from Office

Key Responsibilities : - Design and develop scalable PySpark pipelines to ingest, parse, and process XML datasets with extreme hierarchical complexity. - Implement efficient XPath expressions, recursive parsing techniques, and custom schema definitions to extract data from nested XML structures. - Optimize Spark jobs through partitioning, caching, and parallel processing to handle terabytes of XML data efficiently. - Transform raw hierarchical XML data into structured DataFrames for analytics, machine learning, and reporting use cases. - Collaborate with data architects and analysts to define data models for nested XML schemas. - Troubleshoot performance bottlenecks and ensure reliability in distributed environments (e.g., AWS, Databricks, Hadoop). - Document parsing logic, data lineage, and optimization strategies for maintainability. Qualifications : - 5+ years of hands-on experience with PySpark and Spark XML libraries (e.g., `spark-xml`) in production environments. - Proven track record of parsing XML data with 20+ levels of nesting using recursive methods and schema inference. - Expertise in XPath, XQuery, and DataFrame transformations (e.g., `explode`, `struct`, `selectExpr`) for hierarchical data. - Strong understanding of Spark optimization techniques: partitioning strategies, broadcast variables, and memory management. - Experience with distributed computing frameworks (e.g., Hadoop, YARN) and cloud platforms (AWS, Azure, GCP). - Familiarity with big data file formats (Parquet, Avro) and orchestration tools (Airflow, Luigi). - Bachelor's degree in Computer Science, Data Engineering, or a related field. Preferred Skills : - Experience with schema evolution and versioning for nested XML/JSON datasets. - Knowledge of Scala or Java for extending Spark XML libraries. - Exposure to Databricks, Delta Lake, or similar platforms. - Certifications in AWS/Azure big data technologies.

Posted 2 months ago

Apply

2.0 - 7.0 years

4 - 8 Lacs

Noida

Work from Office

About the Role: We are seeking a highly skilled and motivated Backend Developer with 2 to 5 years of experience to design and implement a high-performance, secure, and scalable server-side architecture for our trading terminal. In this role, you will develop systems capable of processing large volumes of real-time financial data, ensuring low latency and exceptional reliability for mission-critical applications. Your expertise will be central to empowering data-driven trading experiences for our users. Key Responsibilities: Service Architecture & Development: Design, develop, and maintain high-performance backend services, RESTful APIs, and microservices. Architect systems that efficiently process and analyze large-scale real-time market data. Develop robust, modular, and scalable server-side logic to support complex trading transactions. Data Management & Integration: Build and optimize data pipelines connecting external data providers, databases, and client applications. Integrate real-time data feeds using protocols such as WebSockets to enable seamless, live data updates. Collaborate with frontend teams to ensure data consistency, reliability, and performance across the platform. Performance & Security: Optimize system performance with a focus on low latency, high throughput, and resource efficiency. Implement strong security measures including authentication, encryption, and secure API practices to protect sensitive financial data. Monitor system performance, troubleshoot, and resolve issues to ensure uninterrupted service during peak market conditions. Collaboration & Agile Development: Work closely with multi-disciplinary teams (frontend developers, product managers, and QA engineers) in an Agile setting. Participate actively in code reviews, design discussions, and strategy meetings to drive continuous improvement. Leverage CI/CD practices to implement automated testing, integration, and deployment pipelines for frequent yet stable releases. Innovation & Continuous Improvement: Stay updated on backend technologies, cloud services, container orchestration, and microservices architecture. Propose and experiment with new tools and techniques to improve system efficiency and scalability. Document best practices and contribute to a knowledge-sharing culture within the team. Required Qualifications: Experience: A minimum of 2 to 5 years in backend development with a demonstrable record of building robust web applications, APIs, or microservices. Technical Expertise: Proficiency in server-side programming languages such as Node.js, Python, Django Solid experience with both SQL (e.g., PostgreSQL, MySQL) and NoSQL (e.g., MongoDB, Redis) databases. Hands-on experience with cloud platforms (AWS, Azure, or Google Cloud Platform) and containerization tools (Docker, Kubernetes). Familiarity with real-time communication protocols (WebSockets, MQTT) and API design. Development Practices: Strong background in RESTful API development, microservices design, and automated testing methodologies. Experience with version control systems (Git) and CI/CD pipelines. A deep commitment to writing clean, maintainable, and well-documented code. Preferred Qualifications: Prior experience building backend solutions for financial or trading platforms. Familiarity with transaction processing systems and high-frequency trading requirements. Excellent problem-solving skills and strong collaboration capabilities in a fast-paced environment. 
What We Offer: An engaging, innovative work environment focused on cutting-edge financial technology. A competitive compensation package and comprehensive benefits. Opportunities for professional growth, continuous learning, and career advancement. A chance to make a significant impact by shaping next-generation trading infrastructure. Application Process: Interested candidates should submit a detailed resume outlining your relevant experience and technical expertise, links to GitHub repositories or portfolios that highlight your backend projects, and a cover letter describing your approach to scalable system design, your passion for financial technology, and how your skills align with our vision. You can share your CV on WhatsApp at 8115677271 (no calls).
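For illustration of the real-time data delivery this posting mentions (WebSocket feeds to trading clients), here is a minimal sketch using FastAPI. It is not the employer's stack; the endpoint path and the stubbed quote generator are assumptions, and a real backend would add authentication, backpressure handling, and monitoring.

```python
# Illustrative sketch: streaming price ticks to clients over a WebSocket with FastAPI.
import asyncio
import random
from fastapi import FastAPI, WebSocket, WebSocketDisconnect

app = FastAPI()

async def next_tick(symbol: str) -> dict:
    """Stub quote generator standing in for a real market-data feed."""
    await asyncio.sleep(0.5)
    return {"symbol": symbol, "price": round(100 + random.uniform(-1, 1), 2)}

@app.websocket("/ws/prices/{symbol}")
async def price_stream(websocket: WebSocket, symbol: str):
    await websocket.accept()
    try:
        while True:
            await websocket.send_json(await next_tick(symbol))
    except WebSocketDisconnect:
        # Client disconnected; nothing to clean up in this toy example.
        pass

# Run with: uvicorn app:app --reload   (module name is a placeholder)
```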

Posted 2 months ago

Apply

1.0 - 4.0 years

10 - 14 Lacs

Pune

Work from Office

Overview
Design, develop, and maintain data pipelines and ETL/ELT processes using PySpark/Databricks. Optimize performance for large datasets through techniques such as partitioning, indexing, and Spark optimization. Collaborate with cross-functional teams to resolve technical issues and gather requirements.

Responsibilities
Ensure data quality and integrity through data validation and cleansing processes. Analyze existing SQL queries, functions, and stored procedures for performance improvements. Develop database routines such as procedures, functions, and views. Participate in data migration projects and understand technologies like Delta Lake/warehouse. Debug and solve complex problems in data pipelines and processes.

Qualifications
Bachelor's degree in Computer Science, Engineering, or a related field. Strong understanding of distributed data processing platforms like Databricks and BigQuery. Proficiency in Python, PySpark, and SQL programming languages. Experience with performance optimization for large datasets. Strong debugging and problem-solving skills. Fundamental knowledge of cloud services, preferably Azure or GCP. Excellent communication and teamwork skills. Nice to Have: Experience in data migration projects. Understanding of technologies like Delta Lake/warehouse.

What we offer you
Transparent compensation schemes and comprehensive employee benefits, tailored to your location, ensuring your financial security, health, and overall wellbeing. Flexible working arrangements, advanced technology, and collaborative workspaces. A culture of high performance and innovation where we experiment with new ideas and take responsibility for achieving results. A global network of talented colleagues who inspire, support, and share their expertise to innovate and deliver for our clients. A Global Orientation program to kickstart your journey, followed by access to our Learning@MSCI platform, LinkedIn Learning Pro, and tailored learning opportunities for ongoing skills development. Multi-directional career paths that offer professional growth and development through new challenges, internal mobility, and expanded roles. We actively nurture an environment that builds a sense of inclusion, belonging, and connection, including eight Employee Resource Groups: All Abilities, Asian Support Network, Black Leadership Network, Climate Action Network, Hola! MSCI, Pride & Allies, Women in Tech, and Women's Leadership Forum.

At MSCI we are passionate about what we do, and we are inspired by our purpose: to power better investment decisions. You'll be part of an industry-leading network of creative, curious, and entrepreneurial pioneers. This is a space where you can challenge yourself, set new standards, and perform beyond expectations for yourself, our clients, and our industry. MSCI is a leading provider of critical decision support tools and services for the global investment community. With over 50 years of expertise in research, data, and technology, we power better investment decisions by enabling clients to understand and analyze key drivers of risk and return and confidently build more effective portfolios. We create industry-leading research-enhanced solutions that clients use to gain insight into and improve transparency across the investment process. MSCI Inc. is an equal opportunity employer. It is the policy of the firm to ensure equal employment opportunity without discrimination or harassment on the basis of race, color, religion, creed, age, sex, gender, gender identity, sexual orientation, national origin, citizenship, disability, marital and civil partnership/union status, pregnancy (including unlawful discrimination on the basis of a legally protected parental leave), veteran status, or any other characteristic protected by law.

MSCI is also committed to working with and providing reasonable accommodations to individuals with disabilities. If you are an individual with a disability and would like to request a reasonable accommodation for any part of the application process, please email Disability.Assistance@msci.com and indicate the specifics of the assistance needed. Please note, this e-mail is intended only for individuals who are requesting a reasonable workplace accommodation; it is not intended for other inquiries.

To all recruitment agencies: MSCI does not accept unsolicited CVs/resumes. Please do not forward CVs/resumes to any MSCI employee, location, or website. MSCI is not responsible for any fees related to unsolicited CVs/resumes.

Note on recruitment scams: We are aware of recruitment scams where fraudsters impersonating MSCI personnel may try to elicit personal information from job seekers. Read our full note on careers.msci.com
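The role above emphasizes Spark optimization for large datasets; the following minimal sketch (with assumed table and column names) illustrates three of the techniques it names: repartitioning on a join key, caching a reused DataFrame, and broadcasting a small dimension table to avoid a shuffle.

```python
# Sketch of common PySpark tuning patterns; paths and columns are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast, col

spark = SparkSession.builder.appName("pipeline-tuning-sketch").getOrCreate()

events = spark.read.parquet("/mnt/example/events")      # large fact table (placeholder path)
accounts = spark.read.parquet("/mnt/example/accounts")  # small dimension table (placeholder path)

# Repartition the large table on the join key so downstream joins and
# aggregations shuffle evenly, then cache it because it feeds several outputs.
events = events.repartition(200, "account_id").cache()

# Broadcast the small table so the join happens map-side, without shuffling events.
enriched = events.join(broadcast(accounts), on="account_id", how="left")

daily = (
    enriched.where(col("event_type") == "trade")
    .groupBy("event_date", "region")
    .count()
)

daily.write.mode("overwrite").partitionBy("event_date").parquet("/mnt/example/daily_trades")
```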

Posted 2 months ago

Apply

7.0 - 10.0 years

2 - 6 Lacs

Pune

Work from Office

Responsibilities : - Design, develop, and deploy data pipelines using Databricks, including data ingestion, transformation, and loading (ETL) processes. - Develop and maintain high-quality, scalable, and maintainable Databricks notebooks using Python. - Work with Delta Lake and other advanced features. - Leverage Unity Catalog for data governance, access control, and data discovery. - Develop and optimize data pipelines for performance and cost-effectiveness. - Integrate with various data sources, including but not limited to databases and cloud storage (Azure Blob Storage, ADLS, Synapse), and APIs. - Experience working with Parquet files for data storage and processing. - Experience with data integration from Azure Data Factory, Azure Data Lake, and other relevant Azure services. - Perform data quality checks and validation to ensure data accuracy and integrity. - Troubleshoot and resolve data pipeline issues effectively. - Collaborate with data analysts, business analysts, and business stakeholders to understand their data needs and translate them into technical solutions. - Participate in code reviews and contribute to best practices within the team.

Posted 2 months ago

Apply

6.0 - 9.0 years

27 - 42 Lacs

Chennai

Work from Office

Description (External)
Role: AIML Data Scientist
Location: Kochi
Mode of Interview: In Person
Date: 14th June 2025 (Saturday)

Job Description:
1. Be a hands-on problem solver with a consultative approach who can apply Machine Learning and Deep Learning algorithms to solve business challenges:
a. Use knowledge of a wide variety of AI/ML techniques and algorithms to find which combinations of these techniques can best solve the problem.
b. Improve model accuracy to deliver greater business impact.
c. Estimate the business impact of deploying the model.
2. Work with the domain/customer teams to understand the business context and data dictionaries, and apply the relevant Deep Learning solution for the given business challenge.
3. Work with tools and scripts for pre-processing data and feature engineering for model development (Python / R / SQL / cloud data pipelines).
4. Design, develop, and deploy Deep Learning models using TensorFlow / PyTorch.
5. Experience in using Deep Learning models with text, speech, image, and video data:
a. Design and develop NLP models for text classification, custom entity recognition, relationship extraction, text summarization, topic modeling, reasoning over knowledge graphs, and semantic search using NLP tools like spaCy and open-source TensorFlow, PyTorch, etc.
b. Design and develop image recognition and video analysis models using Deep Learning algorithms and open-source tools like OpenCV.
c. Knowledge of state-of-the-art Deep Learning algorithms.
6. Optimize and tune Deep Learning models for the best possible accuracy.
7. Use visualization tools/modules (e.g., Power BI / Tableau) to explore and analyze outcomes and for model validation.
8. Work with application teams on deploying models on the cloud as a service or on-premises:
a. Deployment of models in a test/control framework for tracking.
b. Build CI/CD pipelines for ML model deployment.
9. Integrate AI & ML models with other applications using REST APIs and other connector technologies.
10. Constantly upskill and stay current with the latest techniques and best practices. Write white papers and create demonstrable assets to summarize the AIML work and its impact.

Technology/Subject Matter Expertise
Sufficient expertise in machine learning and mathematical and statistical sciences. Use of versioning and collaboration tools like Git / GitHub. Good understanding of the landscape of AI solutions: cloud, GPU-based compute, data security and privacy, API gateways, microservices-based architecture, big data ingestion, storage and processing, CUDA programming. Develop prototype-level ideas into a solution that can scale to industrial-grade strength. Ability to quantify and estimate the impact of ML models.

Soft Skills Profile
Curiosity to think in fresh and unique ways with the intent of breaking new ground. Must have the ability to share, explain, and "sell" their thoughts, processes, ideas, and opinions, even outside their own span of control. Ability to think ahead and anticipate what is needed to solve the problem. Ability to communicate key messages effectively and articulate strong opinions in large forums.

Desirable Experience
Keen contributor to open-source communities and communities like Kaggle. Ability to process huge amounts of data using PySpark/Hadoop. Development and application of Reinforcement Learning. Knowledge of optimization/genetic algorithms. Operationalizing Deep Learning models for a customer and understanding the nuances of scaling such models in real scenarios. Understanding of stream data processing, RPA, edge computing, AR/VR, etc. Appreciation of digital ethics and data privacy will be important. Experience with AI and cognitive services platforms such as Azure ML, IBM Watson, AWS SageMaker, or Google Cloud is a big plus. Experience with platforms such as DataRobot, CognitiveScale, or H2O.ai is a big plus.
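As a small illustration of the NLP tasks named above (entity recognition feeding downstream extraction and summarization), the snippet below runs pretrained named-entity recognition with spaCy. It is a toy example, not the client's solution; the sample sentence and model name are assumptions, and it requires downloading the en_core_web_sm model first.

```python
# Tiny NER illustration with spaCy.
# Prerequisite: python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")

def extract_entities(text: str) -> list[tuple[str, str]]:
    """Return (entity text, entity label) pairs found by the pretrained model."""
    doc = nlp(text)
    return [(ent.text, ent.label_) for ent in doc.ents]

if __name__ == "__main__":
    sample = "Acme Corp signed a $2 million contract with Globex in Chennai last March."
    for text, label in extract_entities(sample):
        print(f"{label:10} {text}")
    # In the role described above, this kind of output would feed custom entity
    # models, relationship extraction, or summarization built with PyTorch/TensorFlow.
```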

Posted 2 months ago

Apply

5.0 - 10.0 years

20 - 27 Lacs

Hyderabad

Work from Office

Position: Experienced Data Engineer

We are seeking a skilled and experienced Data Engineer to join our fast-paced and innovative Data Science team. This role involves building and maintaining data pipelines across multiple cloud-based data platforms.

Requirements: A minimum of 5 years of total experience, with at least 3-4 years specifically in data engineering on a cloud platform.

Key Skills & Experience: Proficiency with AWS services such as Glue, Redshift, S3, Lambda, RDS, Amazon Aurora, DynamoDB, EMR, Athena, Data Pipeline, and Batch jobs. Strong expertise in SQL and Python; DBT and Snowflake; OpenSearch, Apache NiFi, and Apache Kafka. In-depth knowledge of ETL data patterns and Spark-based ETL pipelines. Advanced skills in infrastructure provisioning using Terraform and other Infrastructure-as-Code (IaC) tools. Hands-on experience with cloud-native delivery models, including PaaS, IaaS, and SaaS. Proficiency in Kubernetes, container orchestration, and CI/CD pipelines. Familiarity with GitHub Actions, GitLab, and other leading DevOps and CI/CD solutions. Experience with orchestration tools such as Apache Airflow and serverless/FaaS services. Exposure to NoSQL databases is a plus.
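For illustration of the orchestration experience this posting asks for, here is a minimal Apache Airflow DAG sketch (assuming Airflow 2.4+ for the `schedule` argument). The DAG id and task bodies are placeholders; in practice the tasks would trigger Glue jobs, Spark, or DBT runs as described above.

```python
# Sketch of a daily orchestration DAG in Apache Airflow; task bodies are stubs.
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def extract(**_):
    print("pull raw files from S3 (stub)")

def transform(**_):
    print("run Spark/DBT transformations (stub)")

def load(**_):
    print("load curated tables into the warehouse (stub)")

with DAG(
    dag_id="example_daily_pipeline",  # placeholder DAG id
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_load = PythonOperator(task_id="load", python_callable=load)

    t_extract >> t_transform >> t_load
```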

Posted 2 months ago

Apply

8.0 - 11.0 years

35 - 37 Lacs

Kolkata, Ahmedabad, Bengaluru

Work from Office

Dear Candidate,

We are hiring a Data Platform Engineer to build scalable infrastructure for data ingestion, processing, and analysis.

Key Responsibilities: Architect distributed data systems. Enable data discoverability and quality. Develop data tooling and platform APIs.

Required Skills & Qualifications: Experience with Spark, Kafka, and Delta Lake. Proficiency in Python, Scala, or Java. Familiarity with cloud-based data platforms.

Soft Skills: Strong troubleshooting and problem-solving skills. Ability to work independently and in a team. Excellent communication and documentation skills.

Note: If interested, please share your updated resume and your preferred time for a discussion. If shortlisted, our HR team will contact you.

Kandi Srinivasa Reddy, Delivery Manager, Integra Technologies

Posted 2 months ago

Apply

4.0 - 9.0 years

7 - 17 Lacs

Pune

Hybrid

Looking for a Data Engineer (Python + Linux Developer) with a strong background in Python, data engineering, data modeling, SQL Server, and the SDLC. Must have: hands-on experience with Linux, Python, PySpark, Airflow, and MySQL. Hybrid work.

Posted 2 months ago

Apply

7.0 - 12.0 years

8 - 12 Lacs

Hyderabad

Work from Office

Job Summary: We are looking for an experienced and highly skilled Senior Python Developer with strong hands-on expertise in Snowflake to join our growing data engineering team. The ideal candidate will have a solid background in building scalable data pipelines, data modeling, and integrating Python-based solutions with Snowflake.

Roles and Responsibilities: Design, develop, and maintain scalable and efficient data pipelines using Python and Snowflake. Collaborate with data architects and analysts to understand data requirements and translate them into technical solutions. Write complex SQL queries and stored procedures in Snowflake. Optimize Snowflake performance using best practices for data modeling, partitioning, and caching. Develop and deploy Python-based ETL/ELT processes. Integrate Snowflake with other data sources, APIs, or BI tools. Implement and maintain CI/CD pipelines for data solutions. Ensure data quality, governance, and security standards are maintained.

Required Skills and Qualifications: Strong programming skills in Python with a focus on data processing and automation. Hands-on experience with Snowflake, including SnowSQL, Snowpipe, data sharing, and performance tuning. Proficiency in SQL and working with large, complex datasets. Experience in designing and implementing ETL/ELT pipelines. Strong understanding of data warehousing concepts and data modeling (star/snowflake schema). Familiarity with cloud platforms such as AWS, Azure, or GCP. Experience with version control (e.g., Git) and CI/CD tools. Excellent problem-solving skills and attention to detail.

Preferred Qualifications: Experience with Apache Airflow, DBT, or other workflow orchestration tools. Knowledge of data security and compliance standards. Experience integrating Snowflake with BI tools (Tableau, Power BI, etc.). Certification in Snowflake or relevant cloud platforms is a plus.
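As a hedged illustration of the Python-to-Snowflake integration this role involves, the sketch below uses the official snowflake-connector-python package to run a parameterized query. The account locator, credentials, and table name are placeholders; production code would use key-pair or OAuth authentication and a secrets manager rather than literals.

```python
# Sketch only: querying Snowflake from Python with snowflake-connector-python.
import snowflake.connector

def load_daily_rowcount(run_date: str) -> int:
    conn = snowflake.connector.connect(
        account="xy12345",       # placeholder account locator
        user="ETL_SERVICE",
        password="***",          # never hard-code real credentials
        warehouse="TRANSFORM_WH",
        database="ANALYTICS",
        schema="STAGING",
    )
    try:
        cur = conn.cursor()
        # Parameter binding avoids SQL injection and keeps queries reusable.
        cur.execute(
            "SELECT COUNT(*) FROM ORDERS WHERE ORDER_DATE = %s",
            (run_date,),
        )
        return cur.fetchone()[0]
    finally:
        conn.close()

if __name__ == "__main__":
    print(load_daily_rowcount("2024-01-01"))
```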

Posted 2 months ago

Apply

5.0 - 9.0 years

12 - 17 Lacs

Noida, Pune, Bengaluru

Work from Office

Role: Dataiku Developer
Location: Bangalore, Pune, Noida, Chennai
Experience: 5-9 years

Role & Responsibilities
Report Development: Design and develop complex Tableau reports for scalability, manageability, and reusability. Work through iterative review cycles to deliver results that meet or exceed user expectations. Ability to interpret technical or dashboard structure and translate complex requirements into technical specifications. Ensure reports meet business requirements and performance standards.
Data Modelling: Design and implement data pipelines using Dataiku to support reporting needs. Develop data sources, including SQL queries, stored procedures, and database views. Optimize data models for performance and scalability. Ensure data integrity and accuracy in the reporting environment. Provide recommendations and best practices for reporting and data modelling.
Testing and Validation: Validate report outputs against business requirements and data sources. Troubleshoot and resolve issues related to reports and data models.

Preferred Candidate Profile: Strong experience as a Dataiku or Tableau developer with strong technical knowledge of data modelling using SQL. Experience building Tableau dashboards and creating Dataiku data pipeline flows. Experience with ad hoc data analysis in SQL. Experience managing business intelligence reporting using Tableau and data engineering on Dataiku. B.Tech/BCA/BE and above candidates are eligible.

Mandatory Skills: Dataiku Developer, SQL

Interested candidates, share your updated CV at reshmi.das@sdnaglobal.com with the subject line "Applying for Dataiku Developer". Mention your total experience, CTC, expected CTC, notice period, and location. Note: Please read the JD carefully before applying.

Posted 2 months ago

Apply

8.0 - 10.0 years

11 - 18 Lacs

Pune

Work from Office

Role Responsibilities : - Design and implement data pipelines using MS Fabric. - Develop data models to support business intelligence and analytics. - Manage and optimize ETL processes for data extraction, transformation, and loading. - Collaborate with cross-functional teams to gather and define data requirements. - Ensure data quality and integrity in all data processes. - Implement best practices for data management, storage, and processing. - Conduct performance tuning for data storage and retrieval for enhanced efficiency. - Generate and maintain documentation for data architecture and data flow. - Participate in troubleshooting data-related issues and implement solutions. - Monitor and optimize cloud-based solutions for scalability and resource efficiency. - Evaluate emerging technologies and tools for potential incorporation in projects. - Assist in designing data governance frameworks and policies. - Provide technical guidance and support to junior data engineers. - Participate in code reviews and ensure adherence to coding standards. - Stay updated with industry trends and best practices in data engineering. Qualifications : - 8+ years of experience in data engineering roles. - Strong expertise in MS Fabric and related technologies. - Proficiency in SQL and relational database management systems. - Experience with data warehousing solutions and data modeling. - Hands-on experience in ETL tools and processes. - Knowledge of cloud computing platforms (Azure, AWS, GCP). - Familiarity with Python or similar programming languages. - Ability to communicate complex concepts clearly to non-technical stakeholders. - Experience in implementing data quality measures and data governance. - Strong problem-solving skills and attention to detail. - Ability to work independently in a remote environment. - Experience with data visualization tools is a plus. - Excellent analytical and organizational skills. - Bachelor's degree in Computer Science, Engineering, or related field. - Experience in Agile methodologies and project management.
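The responsibilities above stress data quality and integrity in pipelines; as an illustrative, platform-agnostic sketch, the PySpark snippet below applies a few simple quality checks of the kind that would run in a Microsoft Fabric, Databricks, or plain Spark notebook. The lakehouse path, column names, and rules are assumptions for illustration.

```python
# Generic PySpark data-quality check; paths and rules are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("dq-checks").getOrCreate()
orders = spark.read.parquet("Files/curated/orders")  # placeholder lakehouse path

checks = {
    "null_order_id": orders.filter(col("order_id").isNull()).count(),
    "negative_amount": orders.filter(col("amount") < 0).count(),
    "duplicate_order_id": orders.count() - orders.dropDuplicates(["order_id"]).count(),
}

failed = {name: n for name, n in checks.items() if n > 0}
if failed:
    # In a real pipeline this would write to a monitoring table or raise an alert.
    raise ValueError(f"Data quality checks failed: {failed}")
print("All data quality checks passed.")
```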

Posted 2 months ago

Apply

8.0 - 10.0 years

11 - 18 Lacs

Mumbai

Work from Office

Role Responsibilities : - Design and implement data pipelines using MS Fabric. - Develop data models to support business intelligence and analytics. - Manage and optimize ETL processes for data extraction, transformation, and loading. - Collaborate with cross-functional teams to gather and define data requirements. - Ensure data quality and integrity in all data processes. - Implement best practices for data management, storage, and processing. - Conduct performance tuning for data storage and retrieval for enhanced efficiency. - Generate and maintain documentation for data architecture and data flow. - Participate in troubleshooting data-related issues and implement solutions. - Monitor and optimize cloud-based solutions for scalability and resource efficiency. - Evaluate emerging technologies and tools for potential incorporation in projects. - Assist in designing data governance frameworks and policies. - Provide technical guidance and support to junior data engineers. - Participate in code reviews and ensure adherence to coding standards. - Stay updated with industry trends and best practices in data engineering. Qualifications : - 8+ years of experience in data engineering roles. - Strong expertise in MS Fabric and related technologies. - Proficiency in SQL and relational database management systems. - Experience with data warehousing solutions and data modeling. - Hands-on experience in ETL tools and processes. - Knowledge of cloud computing platforms (Azure, AWS, GCP). - Familiarity with Python or similar programming languages. - Ability to communicate complex concepts clearly to non-technical stakeholders. - Experience in implementing data quality measures and data governance. - Strong problem-solving skills and attention to detail. - Ability to work independently in a remote environment. - Experience with data visualization tools is a plus. - Excellent analytical and organizational skills. - Bachelor's degree in Computer Science, Engineering, or related field. - Experience in Agile methodologies and project management.

Posted 2 months ago

Apply

8.0 - 10.0 years

11 - 18 Lacs

Jaipur

Work from Office

Role Responsibilities : - Design and implement data pipelines using MS Fabric. - Develop data models to support business intelligence and analytics. - Manage and optimize ETL processes for data extraction, transformation, and loading. - Collaborate with cross-functional teams to gather and define data requirements. - Ensure data quality and integrity in all data processes. - Implement best practices for data management, storage, and processing. - Conduct performance tuning for data storage and retrieval for enhanced efficiency. - Generate and maintain documentation for data architecture and data flow. - Participate in troubleshooting data-related issues and implement solutions. - Monitor and optimize cloud-based solutions for scalability and resource efficiency. - Evaluate emerging technologies and tools for potential incorporation in projects. - Assist in designing data governance frameworks and policies. - Provide technical guidance and support to junior data engineers. - Participate in code reviews and ensure adherence to coding standards. - Stay updated with industry trends and best practices in data engineering. Qualifications : - 8+ years of experience in data engineering roles. - Strong expertise in MS Fabric and related technologies. - Proficiency in SQL and relational database management systems. - Experience with data warehousing solutions and data modeling. - Hands-on experience in ETL tools and processes. - Knowledge of cloud computing platforms (Azure, AWS, GCP). - Familiarity with Python or similar programming languages. - Ability to communicate complex concepts clearly to non-technical stakeholders. - Experience in implementing data quality measures and data governance. - Strong problem-solving skills and attention to detail. - Ability to work independently in a remote environment. - Experience with data visualization tools is a plus. - Excellent analytical and organizational skills. - Bachelor's degree in Computer Science, Engineering, or related field. - Experience in Agile methodologies and project management.

Posted 2 months ago

Apply

5.0 - 7.0 years

9 - 13 Lacs

Bengaluru

Work from Office

At Johnson & Johnson, we believe health is everything. Our strength in healthcare innovation empowers us to build a world where complex diseases are prevented, treated, and cured, where treatments are smarter and less invasive, and solutions are personal. Through our expertise in Innovative Medicine and MedTech, we are uniquely positioned to innovate across the full spectrum of healthcare solutions today to deliver the breakthroughs of tomorrow, and profoundly impact health for humanity. Learn more at

Job Function: Data Analytics & Computational Sciences
Job Sub Function: Data Engineering
Job Category: Scientific/Technology
All Job Posting Locations: Bangalore, Karnataka, India

Job Description:

Position Summary
Johnson & Johnson MedTech is seeking a Sr Eng, Data Engineering, for the Digital Surgery Platform (DSP) in Bangalore, India. Johnson & Johnson (J&J) stands as the world's leading manufacturer of healthcare products and a service provider in the pharmaceutical and medical device sectors. At Johnson & Johnson MedTech's Digital Surgery Platform, we are breaking new ground in the future of healthcare by harnessing the power of people and technology, transitioning to a digital-first MedTech enterprise. With a focus on innovation and an ambitious strategic vision, we are integrating robotic-assisted surgery platforms, connected medical devices, surgical instruments, medical imaging, surgical efficiency solutions, and OR workflow into the next-generation MedTech platform. This initiative will also foster new surgical insights, improve supply chain innovation, use cloud infrastructure, incorporate cybersecurity, collaborate with hospital EMRs, and elevate our digital solutions. We are a diverse and growing team that nurtures creativity, a deep understanding of data processing techniques, and the use of sophisticated analytics technologies to deliver results.

Overview
As a Sr Eng, Data Engineering, for the J&J MedTech Digital Surgery Platform (DSP), you will play a pivotal role in building the modern cloud data platform by demonstrating your in-depth technical expertise and interpersonal skills. In this role, you will focus on accelerating digital product development as part of the multifunctional and fast-paced DSP data platform team and will contribute to the digital transformation through innovative data solutions. One of the key success criteria for this role is to ensure the quality of DSP software solutions, demonstrate the ability to collaborate effectively with the core infrastructure and other engineering teams, and work closely with the DSP security and technical quality partners.

Responsibilities
Work with platform data engineering, core platform, security, and technical quality teams to design, implement, and deploy data engineering solutions. Develop pipelines for ingestion, transformation, orchestration, and consumption of various types of data. Design and deploy data layering pipelines that use modern Spark-based data processing technologies such as Databricks and Delta Live Tables (DLT). Integrate data engineering solutions with Azure data governance components, including but not limited to Purview and Databricks Unity Catalog. Implement and support security monitoring solutions within the Azure Databricks ecosystem. Design, implement, and support data monitoring solutions in data analytical workspaces. Configure and deploy Databricks analytical workspaces in Azure with IaC (Terraform, Databricks API) and J&J DevOps automation tools within the JPM/Xena framework. Implement automated CI/CD processes for data processing pipelines. Support DataOps for the distributed DSP data architecture. Function as a data engineering SME within the data platform. Manage authoring and execution of automated test scripts. Build effective partnerships with DSP architecture, core infrastructure, and other domains to design and deploy data engineering solutions. Work closely with the DSP Product Managers to understand business needs, translate them into system requirements, and demonstrate an in-depth understanding of use cases for building prototypes and solutions for data processing pipelines. Operate on SAFe Agile DevOps principles and methodology in building quality DSP technical solutions. Author and implement automated test scripts as mandated by DSP quality requirements.

Qualifications
Required: Bachelor's degree or equivalent experience in software, computer science, or data engineering. 8+ years of overall IT experience. 5-7 years of experience in cloud computing and data systems. Advanced Python programming skills. Expert level in Azure Databricks Spark technology and data engineering (Python), including Delta Live Tables (DLT). Experience in the design and implementation of secure Azure data solutions. In-depth knowledge of the data architecture: infrastructure, network components, and data processing. Proficiency in building data pipelines in Azure Databricks. Proficiency in configuration and administration of Azure Databricks workspaces and Databricks Unity Catalog. Deep understanding of the principles of the modern data lakehouse. Deep understanding of Azure system capabilities and data services, and the ability to implement security controls. Proficiency with enterprise DevOps tools including Bitbucket, Jenkins, and Artifactory. Experience with DataOps. Experience with quality software systems. Deep understanding of and experience in SAFe Agile. Understanding of the SDLC.

Preferred: Master's degree or equivalent. Proven healthcare experience. Azure Databricks certification. Ability to analyze use cases, translate them into system requirements, and make data-driven decisions. DevOps automation tools with the JPM/Xena framework. Expertise in automated testing. Experience in AI and ML. Excellent verbal and written communication skills. Ability to travel domestically up to 10% required.

Johnson & Johnson is an Affirmative Action and Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, age, national origin, or protected veteran status and will not be discriminated against on the basis of disability.
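For illustration of the Delta Live Tables work this role describes, here is a minimal DLT sketch. It assumes it runs inside a Databricks DLT pipeline, where the `dlt` module and the `spark` session are provided by the runtime; the storage path, table names, and the expectation rule are placeholders, not details from the posting.

```python
# Minimal Delta Live Tables sketch; runs only inside a Databricks DLT pipeline.
import dlt
from pyspark.sql.functions import col, current_timestamp

@dlt.table(comment="Raw device telemetry landed from cloud storage.")
def raw_telemetry():
    # Auto Loader incrementally ingests new files from a landing zone (placeholder path).
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("abfss://landing@exampleaccount.dfs.core.windows.net/telemetry/")
    )

@dlt.table(comment="Cleaned telemetry with a basic quality expectation applied.")
@dlt.expect_or_drop("valid_device_id", "device_id IS NOT NULL")
def clean_telemetry():
    return (
        dlt.read_stream("raw_telemetry")
        .withColumn("ingested_at", current_timestamp())
        .where(col("event_ts").isNotNull())
    )
```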

Posted 2 months ago

Apply

8.0 - 10.0 years

11 - 18 Lacs

Bengaluru

Work from Office

Company Overview : Zorba Consulting India is a leading consultancy firm focused on delivering innovative solutions and strategies to enhance business performance. With a commitment to excellence, we prioritize collaboration, integrity, and customer-centric values in our operations. Our mission is to empower organizations by transforming data into actionable insights and enabling data-driven decision-making. We are dedicated to fostering a culture of continuous improvement and supporting our team members' professional development. Role Responsibilities : - Design and implement data pipelines using MS Fabric. - Develop data models to support business intelligence and analytics. - Manage and optimize ETL processes for data extraction, transformation, and loading. - Collaborate with cross-functional teams to gather and define data requirements. - Ensure data quality and integrity in all data processes. - Implement best practices for data management, storage, and processing. - Conduct performance tuning for data storage and retrieval for enhanced efficiency. - Generate and maintain documentation for data architecture and data flow. - Participate in troubleshooting data-related issues and implement solutions. - Monitor and optimize cloud-based solutions for scalability and resource efficiency. - Evaluate emerging technologies and tools for potential incorporation in projects. - Assist in designing data governance frameworks and policies. - Provide technical guidance and support to junior data engineers. - Participate in code reviews and ensure adherence to coding standards. - Stay updated with industry trends and best practices in data engineering. Qualifications : - 8+ years of experience in data engineering roles. - Strong expertise in MS Fabric and related technologies. - Proficiency in SQL and relational database management systems. - Experience with data warehousing solutions and data modeling. - Hands-on experience in ETL tools and processes. - Knowledge of cloud computing platforms (Azure, AWS, GCP). - Familiarity with Python or similar programming languages. - Ability to communicate complex concepts clearly to non-technical stakeholders. - Experience in implementing data quality measures and data governance. - Strong problem-solving skills and attention to detail. - Ability to work independently in a remote environment. - Experience with data visualization tools is a plus. - Excellent analytical and organizational skills. - Bachelor's degree in Computer Science, Engineering, or related field. - Experience in Agile methodologies and project management.

Posted 2 months ago

Apply

8.0 - 10.0 years

11 - 18 Lacs

Ahmedabad

Work from Office

Company Overview : Zorba Consulting India is a leading consultancy firm focused on delivering innovative solutions and strategies to enhance business performance. With a commitment to excellence, we prioritize collaboration, integrity, and customer-centric values in our operations. Our mission is to empower organizations by transforming data into actionable insights and enabling data-driven decision-making. We are dedicated to fostering a culture of continuous improvement and supporting our team members' professional development. Role Responsibilities : - Design and implement data pipelines using MS Fabric. - Develop data models to support business intelligence and analytics. - Manage and optimize ETL processes for data extraction, transformation, and loading. - Collaborate with cross-functional teams to gather and define data requirements. - Ensure data quality and integrity in all data processes. - Implement best practices for data management, storage, and processing. - Conduct performance tuning for data storage and retrieval for enhanced efficiency. - Generate and maintain documentation for data architecture and data flow. - Participate in troubleshooting data-related issues and implement solutions. - Monitor and optimize cloud-based solutions for scalability and resource efficiency. - Evaluate emerging technologies and tools for potential incorporation in projects. - Assist in designing data governance frameworks and policies. - Provide technical guidance and support to junior data engineers. - Participate in code reviews and ensure adherence to coding standards. - Stay updated with industry trends and best practices in data engineering. Qualifications : - 8+ years of experience in data engineering roles. - Strong expertise in MS Fabric and related technologies. - Proficiency in SQL and relational database management systems. - Experience with data warehousing solutions and data modeling. - Hands-on experience in ETL tools and processes. - Knowledge of cloud computing platforms (Azure, AWS, GCP). - Familiarity with Python or similar programming languages. - Ability to communicate complex concepts clearly to non-technical stakeholders. - Experience in implementing data quality measures and data governance. - Strong problem-solving skills and attention to detail. - Ability to work independently in a remote environment. - Experience with data visualization tools is a plus. - Excellent analytical and organizational skills. - Bachelor's degree in Computer Science, Engineering, or related field. - Experience in Agile methodologies and project management.

Posted 2 months ago

Apply

8.0 - 10.0 years

11 - 18 Lacs

Chennai

Work from Office

Role Responsibilities : - Design and implement data pipelines using MS Fabric. - Develop data models to support business intelligence and analytics. - Manage and optimize ETL processes for data extraction, transformation, and loading. - Collaborate with cross-functional teams to gather and define data requirements. - Ensure data quality and integrity in all data processes. - Implement best practices for data management, storage, and processing. - Conduct performance tuning for data storage and retrieval for enhanced efficiency. - Generate and maintain documentation for data architecture and data flow. - Participate in troubleshooting data-related issues and implement solutions. - Monitor and optimize cloud-based solutions for scalability and resource efficiency. - Evaluate emerging technologies and tools for potential incorporation in projects. - Assist in designing data governance frameworks and policies. - Provide technical guidance and support to junior data engineers. - Participate in code reviews and ensure adherence to coding standards. - Stay updated with industry trends and best practices in data engineering. Qualifications : - 8+ years of experience in data engineering roles. - Strong expertise in MS Fabric and related technologies. - Proficiency in SQL and relational database management systems. - Experience with data warehousing solutions and data modeling. - Hands-on experience in ETL tools and processes. - Knowledge of cloud computing platforms (Azure, AWS, GCP). - Familiarity with Python or similar programming languages. - Ability to communicate complex concepts clearly to non-technical stakeholders. - Experience in implementing data quality measures and data governance. - Strong problem-solving skills and attention to detail. - Ability to work independently in a remote environment. - Experience with data visualization tools is a plus. - Excellent analytical and organizational skills. - Bachelor's degree in Computer Science, Engineering, or related field. - Experience in Agile methodologies and project management.

Posted 2 months ago

Apply

3.0 - 5.0 years

8 - 12 Lacs

Gurugram, Delhi

Work from Office

Role Description
This is a full-time hybrid role for an Apache NiFi Developer based in Gurugram, with some work-from-home flexibility. The Apache NiFi Developer will be responsible for designing, developing, and maintaining data workflows and pipelines. The role includes programming, implementing backend web development solutions, applying object-oriented programming (OOP) principles, and collaborating with team members to enhance software solutions.

Qualifications
Knowledge of Apache NiFi and experience in programming. Skills in back-end web development and software development. Experience building data pipelines. Strong understanding of Apache NiFi. Background in computer science. Excellent problem-solving and analytical skills. Ability to work in a hybrid environment. Experience in AI and blockchain is a plus. Bachelor's degree in Computer Science or a related field.

Posted 2 months ago

Apply

8.0 - 10.0 years

11 - 18 Lacs

Hyderabad

Work from Office

Role Responsibilities : - Design and implement data pipelines using MS Fabric. - Develop data models to support business intelligence and analytics. - Manage and optimize ETL processes for data extraction, transformation, and loading. - Collaborate with cross-functional teams to gather and define data requirements. - Ensure data quality and integrity in all data processes. - Implement best practices for data management, storage, and processing. - Conduct performance tuning for data storage and retrieval for enhanced efficiency. - Generate and maintain documentation for data architecture and data flow. - Participate in troubleshooting data-related issues and implement solutions. - Monitor and optimize cloud-based solutions for scalability and resource efficiency. - Evaluate emerging technologies and tools for potential incorporation in projects. - Assist in designing data governance frameworks and policies. - Provide technical guidance and support to junior data engineers. - Participate in code reviews and ensure adherence to coding standards. - Stay updated with industry trends and best practices in data engineering. Qualifications : - 8+ years of experience in data engineering roles. - Strong expertise in MS Fabric and related technologies. - Proficiency in SQL and relational database management systems. - Experience with data warehousing solutions and data modeling. - Hands-on experience in ETL tools and processes. - Knowledge of cloud computing platforms (Azure, AWS, GCP). - Familiarity with Python or similar programming languages. - Ability to communicate complex concepts clearly to non-technical stakeholders. - Experience in implementing data quality measures and data governance. - Strong problem-solving skills and attention to detail. - Ability to work independently in a remote environment. - Experience with data visualization tools is a plus. - Excellent analytical and organizational skills. - Bachelor's degree in Computer Science, Engineering, or related field. - Experience in Agile methodologies and project management.

Posted 2 months ago

Apply

Start Your Job Search Today

Browse through a variety of job opportunities tailored to your skills and preferences. Filter by location, experience, salary, and more to find your perfect fit.

Job Application AI Bot

Apply to 20+ Portals in one click

Download Now

Download the Mobile App

Instantly access job listings, apply easily, and track applications.

Featured Companies