
146 Spark Streaming Jobs - Page 2

JobPe aggregates listings so you can find them in one place, but you apply directly on the original job portal.

6.0 - 9.0 years

16 - 22 Lacs

Hyderabad, Chennai, Bengaluru

Hybrid

Required Skills & Experience
- 6-9 years of experience in Consulting, Data, and Analytics, with a strong background in Databricks solutions.
- Hands-on experience in designing and implementing big data solutions, including data pipelines for large and complex datasets.
- Expertise in Python, PySpark, and orchestration frameworks such as Airflow, Oozie, Luigi, etc.
- Strong understanding of the Databricks workspace and its components (workspace, compute, clusters, jobs, Unity Catalog, UC permissions).
- Ability to configure, schedule, and manage Databricks clusters and jobs, including performance optimization and monitoring.
- Experience with data modeling, SQL queries, joins, stored procedures, relational schemas, and schema design.
- Proficiency in working with structured, semi-structured, and unstructured data (CSV, Parquet, Delta Lake, Delta Tables, Delta Live Tables).
- Strong knowledge of Spark architecture and Spark Streaming, and the ability to analyze jobs in the Spark UI for monitoring and troubleshooting.
- Knowledge of using REST API endpoints for data consumption and integration with external sources.
- Proven experience in ETL / data warehouse transformation processes and building streaming/real-time data processing solutions.
- Experience with cloud-based data platforms such as AWS (preferred), GCP, or Azure, including managing data sources like AWS S3.
- Ability to design and implement end-to-end data ingestion pipelines, ensuring data quality and consistency.
- Experience with CI/CD pipeline configuration in Databricks.
- Strong logical structuring, problem-solving, verbal, written, and presentation skills.
- Ability to work as an individual contributor as well as lead a team in an Agile setup.
- Capable of delivering and presenting Proofs of Concept (POCs) to stakeholders.

Optional / Good to Have
- Experience in Databricks administration (user access, Spark tuning, DataFrame manipulation, etc.).
- Exposure to real-time data movement solutions with security and encryption protocols.
- Experience in application development or data warehousing across enterprise technologies.
- Life Sciences industry domain knowledge.
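In practice, the Spark Streaming and Delta Lake items above translate into pipelines like the one below. This is a minimal illustrative sketch, not part of the posting: it assumes a Databricks-style environment where the Delta format is available, and the paths, schema, and table name are hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType, TimestampType, DoubleType

spark = SparkSession.builder.appName("orders-ingest").getOrCreate()  # hypothetical job name

# File streams require an explicit schema.
schema = StructType([
    StructField("order_id", StringType()),
    StructField("amount", DoubleType()),
    StructField("event_time", TimestampType()),
])

raw = (spark.readStream
       .schema(schema)
       .json("/mnt/landing/orders"))          # hypothetical landing path

query = (raw.writeStream
         .format("delta")                     # assumes Delta Lake is available (e.g. on Databricks)
         .option("checkpointLocation", "/mnt/checkpoints/orders")  # hypothetical checkpoint path
         .outputMode("append")
         .toTable("bronze.orders"))           # hypothetical metastore / Unity Catalog table

query.awaitTermination()
```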

Posted 1 week ago

Apply

3.0 - 8.0 years

5 - 9 Lacs

Gurugram

Work from Office

Primary Responsibilities:
- Analyze business requirements and functional specifications.
- Determine the impact of changes on current system functionality.
- Interact with diverse business partners and technical workgroups.
- Be flexible to collaborate with the onshore business during US business hours.
- Be flexible to support project releases during US business hours.

Required Qualifications:
- Undergraduate degree or equivalent experience.
- 3+ years of working experience in Python, PySpark, and Scala.
- 3+ years of experience working with MS SQL Server and NoSQL databases such as Cassandra.
- Hands-on working experience in Azure Databricks.
- Solid healthcare domain knowledge.
- Exposure to DevOps methodology and creating CI/CD deployment pipelines.
- Exposure to Agile methodology, specifically using tools like Rally.
- Ability to understand the existing application codebase, perform impact analysis, and update the code when required based on business logic or for optimization.
- Proven excellent analytical and communication skills (both verbal and written).

Preferred Qualification:
- Experience with streaming applications (Kafka, Spark Streaming, etc.).

Posted 1 week ago

Apply

5.0 - 8.0 years

2 - 7 Lacs

Hyderabad, Chennai, Bengaluru

Work from Office

Job Title: Developer
Experience: 4-6 Years
Work Location: Chennai, TN / Bangalore, KA / Hyderabad, TS
Skills Required: Digital: Big Data and Hadoop Ecosystems; Digital: PySpark

Job Description:
- Work as a developer on Big Data, Hadoop, or data warehousing tools and cloud computing.
- Work on Hadoop, Hive SQL, Spark, and Big Data ecosystem tools.
- Experience working with teams in a complex organization involving multiple reporting lines.
- Strong functional and technical knowledge to deliver what is required, and familiarity with banking terminology.
- Strong DevOps and Agile development framework knowledge.
- Create Scala/Spark jobs for data transformation and aggregation.
- Experience with stream-processing systems such as Storm, Spark Streaming, or Flink.

Essential Skills:
- Working experience with Hadoop, Hive SQL, Spark, and Big Data ecosystem tools.
- Able to tweak queries and work on performance enhancement.
- Responsible for delivering code, setting up the environment and connectivity, and deploying the code to production after testing.
- Strong functional and technical knowledge to deliver what is required, and familiarity with banking terminology. Occasionally, the candidate may have to act as the primary contact and/or driver for small to medium-sized projects.
- Strong DevOps and Agile development framework knowledge.
- Good technical knowledge of cloud computing (AWS or Azure cloud services) is preferable.
- Strong conceptual and creative problem-solving skills, the ability to work with considerable ambiguity, and the ability to learn new and complex concepts quickly.
- Experience working with teams in a complex organization involving multiple reporting lines.
- Solid understanding of object-oriented programming and HDFS concepts.

Posted 1 week ago

Apply

6.0 - 10.0 years

0 Lacs

Noida, Uttar Pradesh

On-site

The Machine Learning Engineer position, based in GGN, requires a professional with 6-9 years of experience. The ideal candidate should have expertise in Spark, SQL, Python/Scala, AWS EMR, AWS S3, ML lifecycle management, and Machine Learning Operations (MLOps). Experience with Airflow or any other orchestrator is good to have, and experience with Kafka, Spark Streaming, Datadog, and Kubernetes is also valued for this role. If you meet these qualifications and are passionate about machine learning, this position could be an excellent fit for you.

Posted 1 week ago

Apply

7.0 - 12.0 years

10 - 15 Lacs

Itanagar

Remote

Contract Duration: 4 months (extendable based on performance)
Job Timings: India evening shift (till 11:30 PM IST)

We are looking for an experienced Databricks Tech Lead to join our team on a 4-month extendable contract. The ideal candidate will bring deep expertise in data engineering, big data platforms, and cloud-based data warehouse solutions, with the ability to work in a fast-paced remote environment.

Key Responsibilities
- Lead the design, optimization, and management of large-scale data pipelines using Databricks, Spark (PySpark), and AWS data services.
- Productionize and deploy big data platforms and applications across multi-cloud environments (AWS, Azure, GCP).
- Build and manage data warehouse solutions, schema evolution, and data versioning.
- Implement and manage workflow orchestration using Airflow or similar tools.
- Work with complex business use cases and transform them into scalable data models and architectures.
- Collaborate with cross-functional teams to deliver high-quality data engineering and analytics solutions.
- Mentor team members and provide technical leadership on Databricks and modern data technologies.

Required Skills & Experience
- 7+ years of experience in data warehousing, ETL, data modeling, and reporting.
- 5+ years of experience productionizing and deploying big data platforms.
- 3+ years of hands-on Databricks experience.
- Strong expertise with SQL, Python, Spark, and Airflow; AWS S3, Redshift, and Hive Data Catalog; Delta Lake, Parquet, and Avro; and streaming platforms (Spark Streaming, Kafka, Hive).
- Proven experience building enterprise data warehouses and implementing frameworks such as Databricks, Apache Spark, Delta Lake, Tableau, Hive Metastore, Kafka, Kubernetes, Docker, and CI/CD pipelines.
- Hands-on experience with machine learning frameworks (TensorFlow, Keras, PyTorch) and MLOps practices.
- Strong leadership, problem-solving, and communication skills.
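Workflow orchestration with Airflow, as called out above, typically looks something like the sketch below. It is illustrative only: the DAG id, schedule, and script paths are hypothetical, and the jobs are assumed to be submitted with spark-submit rather than any specific Databricks operator.

```python
from datetime import datetime
from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="daily_sales_pipeline",             # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # Submit the ingestion job for the logical run date.
    ingest = BashOperator(
        task_id="spark_ingest",
        bash_command=(
            "spark-submit --master yarn "
            "/opt/jobs/ingest_sales.py "        # hypothetical script path
            "--run-date {{ ds }}"
        ),
    )
    # Run data-quality checks only after ingestion succeeds.
    quality_check = BashOperator(
        task_id="data_quality_check",
        bash_command="spark-submit /opt/jobs/dq_checks.py --run-date {{ ds }}",
    )
    ingest >> quality_check
```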

Posted 1 week ago

Apply

6.0 - 10.0 years

0 Lacs

Noida, Uttar Pradesh

On-site

As a skilled professional with over 7 years of experience, you will be responsible for reviewing and understanding business requirements to ensure timely completion of development tasks, with rigorous testing to minimize defects. Collaborating with a software development team is crucial to implement best practices and enhance the performance of data applications, meeting client needs effectively.

In this role, you will collaborate with various teams within the company and engage with customers to understand, translate, define, and design innovative solutions for their business challenges. Your tasks will also involve researching new Big Data technologies to evaluate their maturity and alignment with business and technology strategies. Operating within a rapid and agile development process, you will focus on accelerating speed to market while upholding necessary controls.

Your qualifications should include a BE/B.Tech/MCA degree with a minimum of 6 years of IT experience, including 4 years of hands-on experience in design and development using the Hadoop technology stack and various programming languages. You are expected to be proficient across Hadoop, HDFS, MapReduce, Spark Streaming, Spark SQL, Spark ML, Kafka/Flume, Apache NiFi, Hortonworks Data Platform, Hive, Pig, Sqoop, NoSQL databases (HBase, Cassandra, Neo4j, MongoDB), visualization and reporting frameworks (D3.js, Zeppelin, Grafana, Kibana, Tableau, Pentaho), Scrapy for web crawling, Elasticsearch, Google Analytics data streaming, and data security protocols (Kerberos, OpenLDAP, Knox, Ranger).

Strong knowledge of the current technology landscape and industry trends, along with experience in Big Data integration with Metadata Management, Data Quality, and Master Data Management solutions and structured/unstructured data, is essential. Active participation in the community through articles, blogs, or speaking engagements at conferences will be highly valued in this role.

Posted 2 weeks ago

Apply

10.0 - 15.0 years

0 Lacs

Thiruvananthapuram, Kerala

On-site

We are seeking a highly experienced Azure PySpark Solution Architect to lead the design and implementation of scalable data solutions on Microsoft Azure in Trivandrum. The role leverages your expertise in Azure services, PySpark, and enterprise-grade solution architecture to drive efficient data processing and analytics workflows.

Your responsibilities will include designing and implementing end-to-end data solutions using Azure Data Services and PySpark; developing high-performance ETL pipelines with tools such as Azure Databricks, Azure Data Factory, and Synapse Analytics; and architecting scalable, secure, and cost-efficient cloud solutions aligned with business goals. You will collaborate with data engineers, data scientists, and stakeholders to define technical requirements, optimize big data processing, ensure data governance, security, and compliance, and provide technical leadership for Azure and PySpark-based data solutions.

To excel in this role, you must have expertise in Azure cloud services such as Azure Databricks, Data Factory, Synapse Analytics, and Azure Storage, along with strong hands-on experience in PySpark for data processing and transformation. A deep understanding of solution architecture, proficiency in SQL, NoSQL databases, and data modeling within Azure ecosystems, and knowledge of CI/CD pipelines, DevOps practices, and Infrastructure-as-Code tools are essential. Your problem-solving skills, communication abilities, and stakeholder management capabilities will be crucial in establishing best practices and optimizing large-scale data workloads. Preferred skills include experience with streaming technologies like Kafka, Event Hubs, or Spark Streaming.

Joining UST, a global digital transformation solutions provider with a track record of delivering real impact through innovation, technology, and purpose, will offer you the opportunity to work alongside top companies worldwide. UST's deep domain expertise, future-proof philosophy, and commitment to innovation and agility ensure that you will be part of a team that builds for boundless impact, touching billions of lives in the process.

Posted 2 weeks ago

Apply

8.0 - 10.0 years

4 - 9 Lacs

Pune, Chennai, Bengaluru

Work from Office

Position: Spark-Scala
Work Location: Bangalore, Chennai, Hyderabad, Pune, Chandigarh
Type: C2H (converts to full-time employment (FTE) within 90 days)
Experience: 8-10 years
Mandatory skills: Spark & Scala
Desired skills: Spark Streaming, Hadoop, Hive, SQL, Sqoop, Impala
Joining: Immediate, or 15 to 30 days

Posted 2 weeks ago

Apply

4.0 - 8.0 years

7 - 16 Lacs

Gurugram, Chennai, Bengaluru

Work from Office

Roles and Responsibilities
- Collaborate with cross-functional teams to gather requirements and deliver high-quality solutions.
- Develop scalable and efficient ETL processes for large datasets on cloud platforms like AWS or GCP.
- Ensure compliance with security standards and best practices in the development of data systems.
- Design, develop, test, and deploy data pipelines using Databricks, GCP, Microsoft Azure, Spark, SQL, and Python.

Desired Candidate Profile
- 3.5-7 years of experience as a Data Engineer with expertise in Big Data technologies such as PySpark (DataFrame and Spark SQL), Hadoop, and Hive.
- Good hands-on experience with Python and Bash scripts.
- Strong analytical, problem-solving, data analysis, and research skills.
- Demonstrable ability to think outside the box and not be dependent on readily available tools.
- Excellent communication, presentation, and interpersonal skills are a must.
- Hands-on experience with cloud-platform-provided Big Data technologies.
- Orchestration with Airflow, and experience with any job scheduler.
- Experience migrating workloads from on-premises to cloud, and cloud-to-cloud migrations.
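For context, an ETL pipeline of the kind this profile describes often boils down to a short PySpark job in the DataFrame and Spark SQL style. The sketch below is hedged and illustrative only; the bucket paths, column names, and filter logic are hypothetical, not taken from the posting.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders-etl").getOrCreate()

# Extract: read raw CSV files from an object store (hypothetical bucket).
orders = spark.read.option("header", True).csv("s3://raw-bucket/orders/")

# Transform: cast types and keep only completed orders (hypothetical columns).
orders = (orders
          .withColumn("amount", F.col("amount").cast("double"))
          .filter(F.col("order_status") == "COMPLETED"))

# Aggregate with Spark SQL.
orders.createOrReplaceTempView("orders")
daily_revenue = spark.sql("""
    SELECT order_date, SUM(amount) AS revenue, COUNT(*) AS order_count
    FROM orders
    GROUP BY order_date
""")

# Load: write partitioned Parquet to a curated zone (hypothetical path).
(daily_revenue.write
 .mode("overwrite")
 .partitionBy("order_date")
 .parquet("s3://curated-bucket/daily_revenue/"))
```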

Posted 2 weeks ago

Apply

5.0 - 10.0 years

20 - 32 Lacs

Pune, Gurugram

Hybrid

EPAM has a presence across 40+ countries globally, with 55,000+ professionals and numerous delivery centers. Key locations are North America, Eastern Europe, Central Europe, Western Europe, APAC, and the Middle East, with development centers in India (Hyderabad, Pune, and Bangalore).

Location: Gurgaon/Pune/Hyderabad/Bengaluru/Chennai
Work Mode: Hybrid (2-3 days in office per week)

Job Description:
- 5-14 years of experience in Big Data and data-related technologies
- Expert-level understanding of distributed computing principles
- Expert-level knowledge of and experience with Apache Spark
- Hands-on programming with Python
- Experience building stream-processing systems using technologies such as Apache Storm or Spark Streaming
- Experience integrating data from multiple sources such as RDBMS (SQL Server, Oracle), ERP systems, and files
- Good understanding of SQL queries, joins, stored procedures, and relational schemas
- Experience with NoSQL databases such as HBase, Cassandra, and MongoDB
- Knowledge of ETL techniques and frameworks
- Performance tuning of Spark jobs
- Experience with native cloud data services (AWS)
- Ability to lead a team efficiently
- Experience designing and implementing Big Data solutions
- Experience with Databricks

WE OFFER
- Opportunity to work on technical challenges that may have impact across geographies
- Vast opportunities for self-development: online university, global knowledge sharing, and learning opportunities through external certifications
- Opportunity to share your ideas on international platforms
- Sponsored tech talks and hackathons
- Possibility to relocate to any EPAM office for short- and long-term projects
- Focused individual development
- Benefit package: health and medical benefits, retirement benefits, paid time off, flexible benefits
- Forums to explore passions beyond work (CSR, photography, painting, sports, etc.)

Posted 2 weeks ago

Apply

8.0 - 11.0 years

15 - 25 Lacs

Hyderabad, Gurugram, Bengaluru

Work from Office

Hiring a Senior Data Engineer in Bangalore with 8+ years of experience in the skills below.

Must Have:
- Big Data technologies: Hadoop, MapReduce, Spark, Kafka, Flink
- Programming languages: Java / Scala / Python
- Cloud: Azure, AWS, Google Cloud
- Docker/Kubernetes

Required Candidate Profile:
- Strong communication skills
- Experience with relational SQL / NoSQL databases: Postgres and Cassandra
- Experience with the ELK stack
- Immediate joining is a plus
- Must be ready to work from office

Posted 2 weeks ago

Apply

2.0 - 6.0 years

0 Lacs

Karnataka

On-site

At PwC, individuals in managed services focus on providing outsourced solutions and supporting clients across various functions. By managing key processes and functions on behalf of organisations, they help streamline operations, reduce costs, and enhance efficiency. Skilled in project management, technology, and process optimization, they deliver high-quality services to clients. Those specializing in managed service management and strategy at PwC concentrate on transitioning and running services; managing delivery teams, programmes, commercials, performance, and delivery risk; and driving continuous improvement of managed services processes, tools, and services.

Your focus lies in building meaningful client connections, managing and inspiring others, and deepening technical expertise while navigating complex situations. Embracing ambiguity, you anticipate the needs of teams and clients to deliver quality service. You are encouraged to ask questions and view unclear paths as opportunities for growth. Upholding professional and technical standards, including PwC tax and audit guidance, the firm's code of conduct, and independence requirements, is essential.

As a Data Engineer (Offshore) ETL Associate working with Teradata, DataStage, AWS, Databricks, SQL, Delta Live Tables, Delta tables, Spark, Kafka, Spark Streaming, and MQ, you will be responsible for designing, implementing, and maintaining scalable data pipelines and systems to support data-driven initiatives. The ideal candidate will have a Bachelor's degree in computer science/IT or a relevant field, 2-5 years of experience, and proficiency in data technologies such as Teradata, DataStage, AWS, Databricks, and SQL.

Key responsibilities include designing, developing, and maintaining scalable ETL pipelines; leveraging AWS cloud services; utilizing Databricks for data processing; implementing Delta Live Tables and Delta tables; working with Apache Spark and Kafka; integrating Spark Streaming; ensuring data quality, integrity, and security; and documenting data processes and technical specifications.

Qualifications for this role include a Bachelor's degree in Computer Science or a related field, proven experience as a Data Engineer, proficiency in SQL and relational databases, hands-on experience with AWS services, and familiarity with Apache Spark, Kafka, and Spark Streaming. Preferred skills include experience in data warehousing, big data technologies, data governance, and data security best practices. Certification in AWS or Databricks is a plus.
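Delta Live Tables, mentioned in the responsibilities above, is usually expressed as decorated table definitions. The hedged sketch below only runs inside a Databricks DLT pipeline, where the dlt module and the spark session are provided; the source path, expectation, and column names are hypothetical, not taken from the posting.

```python
import dlt
from pyspark.sql.functions import col

@dlt.table(comment="Bronze: raw events streamed in as-is")
def bronze_events():
    # Auto Loader incrementally picks up new JSON files from the landing path (hypothetical).
    return (spark.readStream
            .format("cloudFiles")
            .option("cloudFiles.format", "json")
            .load("/mnt/raw/events"))

@dlt.table(comment="Silver: validated and typed events")
@dlt.expect_or_drop("valid_event", "event_id IS NOT NULL")   # drop rows failing the expectation
def silver_events():
    return (dlt.read_stream("bronze_events")
            .withColumn("amount", col("amount").cast("double")))
```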

Posted 3 weeks ago

Apply

5.0 - 8.0 years

3 - 7 Lacs

Hyderabad

Work from Office

Long Description
- Experience and expertise in at least one of the following languages: Java, Scala, Python.
- Experience and expertise in Spark architecture.
- Experience in the range of 6-10+ years.
- Good problem-solving and analytical skills.
- Ability to comprehend the business requirement and translate it into technical requirements.
- Good communication and collaboration skills with the team and across vendors.
- Familiarity with the development life cycle, including CI/CD pipelines.
- Proven experience in, and interest in, supporting existing strategic applications.
- Familiarity with working in an agile methodology.

Mandatory Skills: Scala programming. Experience: 5-8 years.

Posted 3 weeks ago

Apply

8.0 - 11.0 years

6 - 16 Lacs

Hyderabad, Chennai, Bengaluru

Work from Office

We are seeking an experienced Software Engineer with deep expertise in Scala programming and Big Data technologies to design, develop, and maintain large-scale distributed data processing systems. The ideal candidate will be a hands-on developer with a strong understanding of data pipelines, the Spark ecosystem, and related technologies, capable of delivering clean, efficient, and scalable code in an Agile environment.

Key Responsibilities
- Develop and maintain scalable, efficient, and robust data processing pipelines using Scala and Apache Spark (Spark Core, Spark SQL, Spark Streaming).
- Write clean, maintainable, and well-documented Scala code following industry best practices and coding standards.
- Design and implement batch and real-time data processing workflows handling large volumes of data.
- Work closely with cross-functional teams to understand business requirements and translate them into technical solutions that meet quality standards.
- Utilize Hadoop ecosystem components such as HDFS, Hive, Sqoop, Impala, and related tools to support data storage and retrieval needs.
- Develop and optimize ETL processes and data warehousing solutions leveraging Big Data technologies.
- Apply deep knowledge of data structures and algorithms to ensure efficient data processing and system performance.
- Conduct unit testing, code reviews, and performance tuning of data processing jobs.
- Automate application job scheduling and execution using UNIX shell scripting (advantageous).
- Participate actively in Agile development processes, including daily standups, sprint planning, reviews, and retrospectives.
- Collaborate effectively with upstream and downstream teams to identify, troubleshoot, and resolve data pipeline issues.
- Stay current with emerging technologies, frameworks, and industry trends to continuously improve the architecture and implementation of data solutions.
- Support production environments by handling incidents, root cause analysis, and continuous improvements.

Required Skills & Experience
- Minimum 8 years of professional software development experience with a strong emphasis on Scala programming.
- Extensive experience designing and building distributed data processing pipelines using Apache Spark (Spark Core, Spark SQL, Spark Streaming).
- Strong understanding of Hadoop ecosystem technologies including HDFS, Hive, Sqoop, Impala, and related tools.
- Proficient in SQL and NoSQL databases with sound knowledge of database concepts and operations.
- Familiarity with data warehousing concepts and ETL methodologies.
- Solid foundation in data structures, algorithms, and object-oriented programming.
- Experience in UNIX/Linux shell scripting to manage and schedule data jobs (preferred).
- Proven track record of working in Agile software development environments.
- Excellent problem-solving skills, with the ability to analyze complex issues and provide efficient solutions.
- Strong verbal and written communication skills, with experience working in diverse, global delivery teams.
- Ability to manage multiple tasks, collaborate across teams, and adapt to changing priorities.

Desired Qualifications
- Bachelor's or Master's degree in Computer Science, Engineering, or a related technical field.
- Previous experience working in a global delivery or distributed team environment.
- Certification or formal training in Big Data technologies or Scala programming is a plus.

Posted 3 weeks ago

Apply

10.0 - 20.0 years

35 - 60 Lacs

Mumbai, Navi Mumbai

Work from Office

We Are Hiring: Databricks Data Architect | Navi Mumbai (Onsite)

Are you passionate about designing scalable, enterprise-grade data platforms? Join Celebal Technologies and work on cutting-edge Azure Databricks solutions in the manufacturing and energy sectors.

Role: Databricks Data Architect
Experience: 10+ years
Location: Navi Mumbai (Onsite)
Notice Period: Immediate to 30 days preferred

About the Role
We are looking for an experienced Databricks Data Architect with strong expertise in Azure Databricks, data modeling, and big data solutions. You'll be responsible for architecting scalable, cloud-native data platforms, enabling real-time and batch processing for advanced analytics and AI-driven insights.

Key Skills
- Azure Databricks | Apache Spark | PySpark
- Delta Lake | Data Lakes | Data Warehouses | Lakehouse Architectures
- Kafka / Event Hub | Streaming & Batch Data Processing
- Data Modeling | ETL / ELT Pipelines | Metadata Management
- Data Governance | Security & Compliance
- Manufacturing / Energy Domain Experience (Preferred)

Why Join Us?
- Work on innovative big data and cloud-native solutions
- Exposure to manufacturing and energy sector projects
- Collaborative, growth-oriented work environment
- Be part of a fast-growing leader in data engineering and AI solutions

Interested? Send your resume to Latha.kolla@celebaltech.com or call 8197451451.

Posted 3 weeks ago

Apply

3.0 - 7.0 years

30 - 45 Lacs

Hyderabad, Chennai

Hybrid

We are seeking a highly skilled Data Engineer with strong hands-on experience in Spark Streaming, PySpark, Python, and Scala. The ideal candidate will have solid working knowledge of Databricks and cloud platforms, preferably AWS (Azure is acceptable). The role involves building scalable data pipelines, processing real-time data, and supporting advanced analytics solutions. Strong problem-solving skills and the ability to work in a fast-paced environment are essential.

Key Skills:
- Spark Streaming
- PySpark, Python, Scala
- Databricks
- AWS (preferred) or Azure
- ETL / data pipeline development
- Big data processing

Experience Required: 3+ years in data engineering or a related field.
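As a rough illustration of the real-time processing this role centers on, the sketch below consumes a Kafka topic with PySpark Structured Streaming and appends the parsed events to a Delta path. It is not part of the posting: it assumes the spark-sql-kafka connector and Delta Lake are available on the cluster, and the broker, topic, schema, and paths are placeholders.

```python
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("clickstream").getOrCreate()

# Schema of the JSON payload carried in the Kafka message value (hypothetical).
schema = StructType([
    StructField("user_id", StringType()),
    StructField("page", StringType()),
    StructField("duration", DoubleType()),
])

events = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker:9092")   # placeholder broker
          .option("subscribe", "clickstream")                  # placeholder topic
          .option("startingOffsets", "latest")
          .load()
          .select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
          .select("e.*"))

(events.writeStream
 .format("delta")                                              # assumes Delta Lake is available
 .option("checkpointLocation", "/tmp/checkpoints/clickstream") # placeholder checkpoint path
 .outputMode("append")
 .start("/tmp/delta/clickstream")                              # placeholder output path
 .awaitTermination())
```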

Posted 3 weeks ago

Apply

3.0 - 7.0 years

20 - 25 Lacs

Pune

Work from Office

About the Role:
We are looking for a highly motivated Senior Software Engineer, Data Analytics, to join our fast-paced engineering team. The ideal candidate takes full ownership of their work, thrives in cross-functional collaboration, and is passionate about building scalable, fault-tolerant big data systems. In this role, you will design and develop high-performance data platforms, mentor junior engineers, and contribute to delivering impactful analytics solutions that drive strategic business decisions.

What You'll Do:
- Design, build, and optimize scalable and fault-tolerant Big Data pipelines for batch and streaming workloads.
- Develop real-time streaming applications using Apache Spark Streaming or Flink.
- Work with Snowflake, Hadoop, Kafka, and Spark for large-scale data processing and analytics.
- Implement workflow orchestration using tools like Apache Airflow, Oozie, or Luigi.
- Develop backend services and REST APIs to serve analytics and data products.
- Collaborate with product managers, stakeholders, and cross-functional teams to deliver data-driven solutions.
- Ensure data quality, governance, and security across the data ecosystem.
- Guide and mentor junior engineers, providing technical leadership and best-practice recommendations.
- Perform code reviews, performance tuning, and troubleshooting of distributed system issues.
- Drive innovation by evaluating and implementing new tools, frameworks, and approaches for data engineering.

We'd Love for You to Have:
- 4-7 years of experience in Big Data and analytics engineering.
- Strong programming skills in Java, Scala, or Python.
- Hands-on experience with Apache Spark, Hadoop, Kafka, and distributed data systems.
- Proficiency in SQL and experience with Snowflake (preferred) or other cloud data warehouses.
- Practical experience with workflow orchestration tools such as Airflow, Oozie, or Luigi.
- Strong foundation in data structures, algorithms, and distributed system design.
- Familiarity with cloud platforms (AWS, GCP, Azure) and related data services.
- Experience with containerization and orchestration (Docker, Kubernetes).
- Exposure to data observability, monitoring tools, and AI/ML integration with data pipelines.
- Experience in mentoring and guiding team members.
- Proven track record of working on cross-team collaboration projects.
- Strong problem-solving skills with the ability to take ownership and deliver end-to-end solutions.

Qualifications
Should have a bachelor's degree in engineering (CS/IT) or an equivalent degree from a well-known institute/university.
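To make the streaming responsibilities above concrete, here is a hedged sketch of a watermarked, windowed aggregation in Structured Streaming. The built-in rate source stands in for a real event stream, and the column names and window sizes are illustrative assumptions rather than anything from the posting.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("stream-metrics").getOrCreate()

# The rate source emits synthetic rows, which is handy for demonstrating the API end to end.
events = (spark.readStream.format("rate").option("rowsPerSecond", 100).load()
          .withColumn("event_time", F.col("timestamp"))
          .withColumn("user_id", (F.col("value") % 50).cast("string")))

# Count events per user per one-minute window, tolerating data up to 2 minutes late.
per_minute = (events
              .withWatermark("event_time", "2 minutes")
              .groupBy(F.window("event_time", "1 minute"), "user_id")
              .count())

query = (per_minute.writeStream
         .outputMode("update")       # emit updated counts as late data arrives
         .format("console")
         .option("truncate", False)
         .start())
query.awaitTermination()
```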

Posted 3 weeks ago

Apply

8.0 - 12.0 years

30 - 35 Lacs

Pune

Work from Office

About the Role
Zywave is seeking a Technical Lead, Data Engineering (TurboRater RQR: CPQ Rating) with expertise in Snowflake, SQL Server, and modern data architecture principles, including Medallion Architecture and Data Mesh. This role will play a critical part in the TurboRater RQR: CPQ Rating initiative, leading the design and implementation of scalable, secure, and high-performance data pipelines that power rating and CPQ (Configure, Price, Quote) capabilities. The ideal candidate will combine deep technical expertise with insurance domain knowledge to drive innovation and deliver business impact.

Key Responsibilities
- Lead end-to-end design and development of data pipelines supporting TurboRater RQR: CPQ Rating.
- Architect and implement Medallion Architecture (Bronze, Silver, Gold layers) for structured and semi-structured data.
- Drive adoption of Data Mesh principles, decentralizing ownership and promoting domain-oriented data products.
- Collaborate with business/product teams to align CPQ Rating requirements with scalable technical solutions.
- Ensure data quality, lineage, and governance across rating-related data assets.
- Optimize workflows for rating performance, scalability, and cost-efficiency.
- Mentor and guide engineers working on TurboRater initiatives.
- Stay updated with data and insurtech innovations relevant to CPQ and rating platforms.

Qualifications
- Bachelor's or Master's degree in Computer Science, Data Engineering, or a related field.
- 8+ years of experience in data engineering, with at least 3 years in technical leadership.
- Strong hands-on experience with Snowflake (data ingestion, transformation, performance tuning).
- Proficiency in SQL Server and T-SQL.
- Deep understanding of Medallion Architecture and Data Mesh principles.
- Experience with data orchestration tools (Airflow, dbt), cloud platforms (Azure/AWS), and CI/CD pipelines.
- Strong leadership and problem-solving skills.
- Knowledge of Python or Scala for data processing.
- Exposure to real-time data streaming (Kafka, Spark Streaming).

Mandatory Skills: Git, Snowflake, ELT tools, SQL Server, .NET; CPQ Rating / TurboRater exposure preferred.
Good to Have: prompt engineering, Kafka, Spark Streaming, AWS, dbt.
Domain Advantage: insurance domain knowledge and prior work on CPQ / rating platforms will be highly valued.
Work Mode: 5 days work from office.
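Bridging Spark Streaming and a SQL Server target, both named above, is often done with foreachBatch. The sketch below is a hedged illustration only: the rate source stands in for the real event feed, the JDBC URL, table, and credentials are placeholders, and the SQL Server JDBC driver is assumed to be on the classpath.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("rating-feed").getOrCreate()

def write_to_sqlserver(batch_df, batch_id):
    # Each micro-batch is written as an ordinary batch JDBC append.
    (batch_df.write
     .format("jdbc")
     .option("url", "jdbc:sqlserver://dbhost:1433;databaseName=rating")  # placeholder
     .option("dbtable", "dbo.rating_events")                              # placeholder table
     .option("user", "etl_user")                                          # placeholder credentials
     .option("password", "***")
     .mode("append")
     .save())

# Synthetic stream standing in for the real rating event source.
stream = spark.readStream.format("rate").option("rowsPerSecond", 10).load()

(stream.writeStream
 .foreachBatch(write_to_sqlserver)
 .option("checkpointLocation", "/tmp/checkpoints/rating")  # placeholder checkpoint path
 .start()
 .awaitTermination())
```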

Posted 3 weeks ago

Apply

5.0 - 10.0 years

5 - 10 Lacs

Bengaluru

Work from Office

The Team: The Data Engineering team is responsible for architecting, building, and maintaining our evolving data infrastructure, as well as curating and governing the data assets created on our platform. We work closely with various stakeholders to acquire, process, and refine vast datasets, focusing on creating scalable and optimized data pipelines. Our team possesses broad expertise in critical data domains, technology stacks, and architectural patterns. We foster knowledge sharing and collaboration, resulting in a unified strategy and seamless data management.

The Impact: This role is the foundation of the products delivered. The data onboarded is the base for the company, as it feeds into the products and platforms and is essential for supporting our advanced analytics and machine learning initiatives.

What's in it for you:
- Be part of a successful team that delivers top-priority projects contributing directly to the company's strategy.
- Drive the testing initiatives, including supporting automation strategy, performance, and security testing. This is the place to enhance your testing skills while adding value to the business.
- As an experienced member of the team, you will have the opportunity to own and drive a project end to end and collaborate with developers, business analysts, and product managers who are experts in their domain, which can help you build multiple skill sets.

Responsibilities:
- Design, develop, and maintain scalable and efficient data pipelines to process large volumes of data.
- Implement ETL processes to acquire, validate, and process incoming data from diverse sources.
- Collaborate with cross-functional teams, including data scientists, analysts, and software engineers, to understand data requirements and translate them into technical solutions.
- Implement data ingestion, transformation, and integration processes to ensure data quality, accuracy, and consistency.
- Optimize Spark jobs and data processing workflows for performance, scalability, and reliability.
- Troubleshoot and resolve issues related to data pipelines, data processing, and performance bottlenecks.
- Conduct code reviews and provide constructive feedback to junior team members to ensure code quality and adherence to best practices.
- Stay updated with the latest advancements in Spark and related technologies, and evaluate their potential for enhancing existing data engineering processes.
- Develop and maintain documentation, including technical specifications, data models, and system architecture diagrams.
- Stay abreast of emerging trends and technologies in the data engineering and big data space, and propose innovative solutions to enhance data processing capabilities.

What We're Looking For:
- 5+ years of experience in data engineering or a related field.
- Strong experience in Python programming, with expertise in building data-intensive applications.
- Proven hands-on experience with Apache Spark, including Spark Core, Spark SQL, Spark Streaming, and Spark MLlib.
- Solid understanding of distributed computing concepts, parallel processing, and cluster computing frameworks.
- Proficiency in data modeling, data warehousing, and ETL techniques.
- Experience with workflow management platforms, preferably Airflow.
- Familiarity with big data technologies such as Hadoop, Hive, or HBase.
- Strong knowledge of SQL and experience with relational databases.
- Hands-on experience with the AWS cloud data platform.
- Strong problem-solving and troubleshooting skills, with the ability to analyze complex data engineering issues and provide effective solutions.
- Excellent communication and collaboration skills, with the ability to work effectively in cross-functional teams.
- Nice to have: experience with Databricks.

Preferred Qualifications:
- Bachelor's degree in Information Technology, Computer Information Systems, Computer Engineering, Computer Science, or another technical discipline.

What's In It For You?
Our Purpose: Progress is not a self-starter. It requires a catalyst to be set in motion. Information, imagination, people, technology: the right combination can unlock possibility and change the world. At S&P Global we transform data into Essential Intelligence, pinpointing risks and opening possibilities.

Our People, Our Values: Integrity, Discovery, Partnership. At S&P Global, we focus on Powering Global Markets. We start with a foundation of integrity in all we do, bring a spirit of discovery to our work, and collaborate in close partnership with each other and our customers to achieve shared goals.

Benefits: We take care of you, so you can take care of business. We care about our people; that's why we provide everything you and your career need to thrive at S&P Global.
- Health & Wellness: health care coverage designed for the mind and body.
- Flexible Downtime: generous time off helps keep you energized for your time on.
- Continuous Learning: access a wealth of resources to grow your career and learn valuable new skills.
- Invest in Your Future: secure your financial future through competitive pay, retirement planning, a continuing education program with a company-matched student loan contribution, and financial wellness programs.
- Family-Friendly Perks: it's not just about you. S&P Global has perks for your partners and little ones, too, with some best-in-class benefits for families.
- Beyond the Basics: from retail discounts to referral incentive awards, small perks can make a big difference.

Posted 3 weeks ago

Apply

3.0 - 5.0 years

5 - 8 Lacs

pune

Hybrid

Job Summary:
Supports, develops, and maintains a data and analytics platform. Effectively and efficiently processes, stores, and makes data available to analysts and other consumers. Works with the business and IT teams to understand the requirements and best leverage the technologies to enable agile data delivery at scale.

Key Responsibilities:
- Implements and automates deployment of our distributed system for ingesting and transforming data from various types of sources (relational, event-based, unstructured).
- Implements methods to continuously monitor and troubleshoot data quality and data integrity issues.
- Implements data governance processes and methods for managing metadata, access, and retention of data for internal and external users.
- Develops reliable, efficient, scalable, and quality data pipelines with monitoring and alert mechanisms that combine a variety of sources using ETL/ELT tools or scripting languages.
- Develops physical data models and implements data storage architectures as per design guidelines.
- Analyzes complex data elements and systems, data flow, dependencies, and relationships in order to contribute to conceptual, physical, and logical data models.
- Participates in testing and troubleshooting of data pipelines.
- Develops and operates large-scale data storage and processing solutions using distributed and cloud-based platforms for storing data (e.g. data lakes, Hadoop, HBase, Cassandra, MongoDB, Accumulo, DynamoDB, and others).
- Uses agile development technologies, such as DevOps, Scrum, Kanban, and the continuous improvement cycle, for data-driven applications.

External Qualifications and Competencies

Competencies:
- System Requirements Engineering: Uses appropriate methods and tools to translate stakeholder needs into verifiable requirements to which designs are developed; establishes acceptance criteria for the system of interest through analysis, allocation, and negotiation; tracks the status of requirements throughout the system lifecycle; assesses the impact of changes to system requirements on project scope, schedule, and resources; creates and maintains information linkages to related artifacts.
- Collaborates: Building partnerships and working collaboratively with others to meet shared objectives.
- Communicates effectively: Developing and delivering multi-mode communications that convey a clear understanding of the unique needs of different audiences.
- Customer focus: Building strong customer relationships and delivering customer-centric solutions.
- Decision quality: Making good and timely decisions that keep the organization moving forward.
- Data Extraction: Performs data extract-transform-load (ETL) activities from a variety of sources and transforms them for consumption by various downstream applications and users using appropriate tools and technologies.
- Programming: Creates, writes, and tests computer code, test scripts, and build scripts using algorithmic analysis and design, industry standards and tools, version control, and build and test automation to meet business, technical, security, governance, and compliance requirements.
- Quality Assurance Metrics: Applies the science of measurement to assess whether a solution meets its intended outcomes using the IT Operating Model (ITOM), including the SDLC standards, tools, metrics, and key performance indicators, to deliver a quality product.
- Solution Documentation: Documents information and solutions based on knowledge gained as part of product development activities; communicates to stakeholders with the goal of enabling improved productivity and effective knowledge transfer to others who were not originally part of the initial learning.
- Solution Validation Testing: Validates a configuration item change or solution using the function's defined best practices, including the Systems Development Life Cycle (SDLC) standards, tools, and metrics, to ensure that it works as designed and meets customer requirements.
- Data Quality: Identifies, understands, and corrects flaws in data to support effective information governance across operational business processes and decision making.
- Problem Solving: Solves problems, and may mentor others on effective problem solving, using a systematic analysis process that leverages industry-standard methodologies to create problem traceability and protect the customer; determines the assignable cause; implements robust, data-based solutions; identifies the systemic root causes and ensures actions to prevent problem reoccurrence are implemented.
- Values differences: Recognizing the value that different perspectives and cultures bring to an organization.

Education, Licenses, Certifications:
College, university, or equivalent degree in a relevant technical discipline, or relevant equivalent experience, required. This position may require licensing for compliance with export controls or sanctions regulations.

Experience:
Relevant experience preferred, such as working in temporary student employment, an internship, a co-op, or other extracurricular team activities. Knowledge of the latest technologies in data engineering is highly preferred and includes:
- Exposure to Big Data open source: Spark, Scala/Java, MapReduce, Hive, HBase, and Kafka, or equivalent college coursework
- SQL query language
- Clustered compute cloud-based implementation experience
- Familiarity developing applications requiring large file movement for a cloud-based environment
- Exposure to Agile software development
- Exposure to building analytical solutions
- Exposure to IoT technology

Additional Responsibilities Unique to this Position
This is a hybrid role with 2 days of work from office in Pune.

Must-Have:
- 3 to 5 years of experience in data engineering, with expertise in Azure Databricks and Scala/Python.
- Proven track record of developing efficient pipelines.
- Hands-on experience with Spark (Scala/PySpark) and SQL.
- Strong understanding of Spark Streaming, Spark internals, and query optimization.
- Skilled in optimizing and troubleshooting batch/streaming data pipeline issues.
- Proficient in Azure cloud services (Azure Databricks, ADLS, Event Hub, Event Grid, etc.).
- Experienced in unit testing of ETL/ELT pipelines.
- Expertise with CI/CD tools for automating deployments.
- Knowledgeable in big data storage strategies (optimization and performance).
- Strong problem-solving skills.
- Good understanding of data models (SQL/NoSQL), including Delta Lake or Lakehouse.
- Exposure to Agile software development methodologies.
- Quick learner with adaptability to new technologies.

Work Schedule: Most of the work will be with stakeholders in the US, with an overlap of 2-3 hours during EST hours on a need basis.
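The unit-testing requirement above is commonly met with local-mode PySpark tests. Below is a hedged sketch, not taken from the posting: the transformation under test (dedupe_latest) and its column names are hypothetical, and it assumes pytest and PySpark are installed.

```python
import pytest
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.window import Window

def dedupe_latest(df):
    """Keep only the most recent record per device_id (hypothetical transform)."""
    w = Window.partitionBy("device_id").orderBy(F.col("event_time").desc())
    return (df.withColumn("rn", F.row_number().over(w))
              .filter(F.col("rn") == 1)
              .drop("rn"))

@pytest.fixture(scope="module")
def spark():
    # Local two-core session is enough for small unit tests.
    return SparkSession.builder.master("local[2]").appName("tests").getOrCreate()

def test_dedupe_keeps_latest(spark):
    df = spark.createDataFrame(
        [("d1", "2024-01-01 10:00:00", 1), ("d1", "2024-01-01 11:00:00", 2)],
        ["device_id", "event_time", "reading"],
    )
    out = dedupe_latest(df).collect()
    assert len(out) == 1
    assert out[0]["reading"] == 2   # the 11:00 record is the one that survives
```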

Posted 3 weeks ago

Apply

5.0 - 8.0 years

7 - 11 Lacs

Pune

Work from Office

Position Overview
We are seeking a skilled and experienced Senior PySpark Developer with expertise in Apache Spark, Spark batch processing, and Spark Streaming to join our dynamic team. The ideal candidate will design, develop, and maintain high-performance, scalable applications for processing large-scale data in batch and real-time environments.

Required Skills and Qualifications
Experience: 7+ years of professional experience in PySpark development.
Technical Skills:
- Strong proficiency in PySpark.
- Deep understanding of Apache Spark architecture, distributed computing concepts, and parallel processing.
- Proven experience building and optimizing complex ETL pipelines using PySpark.
- Experience with various Spark components.
- Expertise in Spark batch processing for large-scale data processing and analytics.
- Experience with Spark Streaming for real-time data processing and streaming pipelines.
- Familiarity with distributed computing concepts and big data frameworks.

Mandatory Skills: PySpark. Experience: 5-8 years.

Posted 3 weeks ago

Apply

5.0 - 11.0 years

0 - 0 Lacs

Chennai, Tamil Nadu

On-site

You will be working as an offshore Senior Developer in Chennai, with expertise in Databricks and a willingness to adapt to new technologies, collaborating effectively with the team. The position replaces an existing Senior Developer; the role is long-term and may be renewed annually.

As a team player, you must possess a minimum of 5 years of experience in IT development along with strong analytical and problem-solving skills. Your responsibilities will include designing solutions, conducting code reviews, and providing guidance to junior engineers. Proficiency in SQL, backend development, and experience in data-driven projects is essential. Your expertise should cover Python/PySpark, SQL, Scala, Spark/Spark Streaming, Databricks, the Big Data tool set, Linux, and Kafka. Moreover, you should have a proven track record of collaborating with development teams, project managers, and engineers. Excellent communication and teamwork skills are crucial for this role.

The ideal candidate should have 5 to 11 years of relevant experience, with a salary ranging from 6 Lac to 20 Lac P.A. The industry focus is IT Software - Application Programming / Maintenance. The desired qualifications for this position include B.C.A, B.Sc, B.E, B.Tech, M.C.A, M.Sc, or M.Tech. Key skills required for this role are SQL, Python, PySpark, Databricks, Scala, Spark Streaming, Big Data tools, Linux, and Kafka.

Posted 1 month ago

Apply

2.0 - 6.0 years

0 Lacs

Kolkata, West Bengal

On-site

At EY, you'll have the chance to build a career as unique as you are, with the global scale, support, inclusive culture, and technology to become the best version of you. And we're counting on your unique voice and perspective to help EY become even better, too. Join us and build an exceptional experience for yourself, and a better working world for all.

As a Staff - Data Engineer at EY, your responsibilities include designing and developing software components using tools such as PySpark, Sqoop, Flume, Azure Databricks, and more. You will perform detailed analysis and interact effectively with onshore/offshore team members, ensuring all deliverables conform to the highest quality standards and are executed in a timely manner. The role is deadline-oriented and may require working on a US time schedule. Additionally, you will identify areas of improvement, conduct performance tests, consult with the design team, ensure high performance of applications, and work well with development/product engineering teams.

To be successful in this role, you should have 2-4 years of experience in the BCM or WAM industry, preferably with exposure to US-based asset management or fund administration firms. You should have a strong understanding of data in the BCM/WAM space, including knowledge of KDEs such as funds, positions, transactions, trial balance, securities, investors, and more. Proficiency in programming languages like Python, hands-on experience with Big Data tools such as PySpark, Sqoop, Hive, and Hadoop clusters, and experience with cloud technologies like Azure Databricks are essential. Expertise in databases like Oracle and SQL Server, and exposure to Big Data, is a plus. Knowledge of data visualization tools and the ability to write programs for file/data validations, EDA, and data cleansing are also desired.

As an ideal candidate, you should be highly data-driven, capable of writing complex data transformation programs using PySpark and Python, and have experience in data integration and processing using Spark. Hands-on experience in creating real-time data streaming solutions using Spark Streaming and Flume, handling large datasets, and writing Spark jobs and Hive queries for data analysis are valuable assets. Experience working in an agile environment will be beneficial for this role.

Join EY in building a better working world, where diverse teams across assurance, consulting, law, strategy, tax, and transactions help clients grow, transform, and operate. EY aims to create long-term value for clients, people, and society, while building trust in the capital markets through data- and technology-enabled solutions worldwide.

Posted 1 month ago

Apply

4.0 - 8.0 years

0 Lacs

Karnataka

On-site

As a Senior AWS Data Engineer - Cloud Data Platform at Teamware Solutions, a division of Quantum Leap Consulting Pvt. Ltd, located in Bangalore, you will be responsible for end-to-end implementation of cloud data engineering solutions such as an enterprise data lake and data hub on AWS. Working onsite in an office environment five days a week, you will collaborate with the offshore manager and onsite business analyst to understand the requirements and deliver scalable, distributed, cloud-based enterprise data solutions.

You should have a strong background in AWS cloud technology, with 4-8 years of hands-on experience. Proficiency in architecting and delivering highly scalable solutions is a must, along with expertise in cloud data engineering solutions, Lambda or Kappa architectures, data management concepts, and data modeling. You should be proficient in AWS services such as EMR, Glue, S3, Redshift, and DynamoDB, and have experience with Big Data frameworks like Hadoop and Spark. Additionally, you must have hands-on experience with AWS compute and storage services, AWS streaming services, troubleshooting and performance tuning in the Spark framework, and knowledge of application DevOps tools such as Git and CI/CD frameworks. Familiarity with AWS CloudWatch, CloudTrail, Account Config, Config Rules, security, key management, and data migration processes, along with strong analytical skills, is required. Good communication and presentation skills are essential for this role.

Desired skills include experience in building stream-processing systems, Big Data ML toolkits, Python, offshore/onsite engagements, flow tools like Airflow, NiFi, or Luigi, and AWS services like Step Functions and Lambda. A professional background of BE/B.Tech/MCA/M.Sc/M.E/M.Tech/MBA is preferred, and an AWS Certified Data Engineer certification is recommended.

If you are interested in this position and meet the qualifications mentioned above, please send your resume to netra.s@twsol.com.

Posted 1 month ago

Apply

7.0 - 11.0 years

0 Lacs

Pune, Maharashtra

On-site

About the job:
At Citi, we're not just building technology, we're building the future of banking. Encompassing a broad range of specialties, roles, and cultures, our teams are creating innovations used across the globe. Citi is constantly growing and progressing through our technology, with a laser focus on evolving the ways of doing things. As one of the world's most global banks, we're changing how the world does business.

Shape your career with Citi. We're currently looking for a high-caliber professional to join our team as AVP - Data Engineer, based in Pune, India. Being part of our team means that we'll provide you with the resources to meet your unique needs, empower you to make healthy decisions, and manage your financial well-being to help plan for your future. For instance:
- We provide programs and services for your physical and mental well-being, including access to telehealth options, health advocates, confidential counseling, and more. Coverage varies by country.
- We empower our employees to manage their financial well-being and help them plan for the future.
- We provide access to an array of learning and development resources to help broaden and deepen your skills and knowledge as your career progresses.

In this role, you're expected to:

Responsibilities (Data Pipeline Development, Design & Automation):
- Design and implement efficient database structures to ensure optimal performance and support analytics.
- Design, implement, and optimize secure data pipelines to ingest, process, and store large volumes of structured and unstructured data from diverse sources, including vulnerability scans, security tools, and assessments.
- Work closely with stakeholders to provide clean, structured datasets that enable advanced analytics and insights into cybersecurity risks, trends, and remediation activities.

Technical Competencies:
- 7+ years of hands-on experience with Scala and Spark.
- 10+ years of experience designing and developing data pipelines for data ingestion or transformation using Spark with Scala.
- Good experience with Big Data technologies (HDFS, Hive, Apache Spark, Spark SQL, Spark Streaming, Spark job optimization, and Kafka).
- Good knowledge of and exposure to various file formats (JSON, Avro, Parquet).
- Knowledge of agile (Scrum) development methodology is a plus.
- Strong development and automation skills.
- The right attitude to participate and contribute through all phases of the development lifecycle.
- Secondary skill set: NoSQL, Starburst, Python.
- Optional: Java Spring, Kubernetes, Docker.

Competencies (soft skills):
- Strong communication skills.
- The candidate will be responsible for reporting to both business and technology senior management.
- Need to work with stakeholders and keep them updated on development, estimation, delivery, and issues.

If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi. View Citi's EEO Policy Statement and the Know Your Rights poster.

Posted 1 month ago

Apply