2.0 - 6.0 years
6 - 10 Lacs
pune
Work from Office
You bring systems design experience with the ability to architect and explain complex system interactions, data flows, common interfaces, and APIs. You bring a deep understanding of and experience with software development and programming languages such as Java/Kotlin and shell scripting. You have hands-on experience as a senior software developer with: Java/Kotlin, Spring, Spring Boot, WireMock, Docker, Terraform, GCP services (Kubernetes, CloudSQL, Pub/Sub, Storage, Logging, Dashboards), Oracle & Postgres, SQL, PgWeb, Git, GitHub & GitHub Actions, and GCP Professional Data Engineering certification.
Responsibilities:
- Data Pipeline Development: Design, implement, and optimize data pipelines on GCP using PySpark for efficient and scalable data processing.
- ETL Workflow Development: Build and maintain ETL workflows for extracting, transforming, and loading data into various GCP services.
- GCP Service Utilization: Leverage GCP services like BigQuery, Cloud Storage, Dataflow, and Dataproc for data storage, processing, and analysis.
- Data Transformation: Use PySpark for data manipulation, cleansing, enrichment, and validation (see the sketch below).
- Performance Optimization: Ensure the performance and scalability of data processing jobs on GCP.
- Collaboration: Work with data scientists, analysts, and other stakeholders to understand data requirements and translate them into technical solutions.
- Data Quality and Governance: Implement and maintain data quality standards, security measures, and compliance with data governance policies on GCP.
- Troubleshooting and Support: Diagnose and resolve issues related to data pipelines and infrastructure.
- Staying Updated: Keep abreast of the latest GCP services, PySpark features, and best practices in data engineering.
Required Skills:
- GCP Expertise: Strong understanding of GCP services like BigQuery, Cloud Storage, Dataflow, and Dataproc.
- PySpark Proficiency: Demonstrated experience in using PySpark for data processing, transformation, and analysis.
- Python Programming: Solid Python programming skills for data manipulation and scripting.
- Data Modeling and ETL: Experience with data modeling, ETL processes, and data warehousing concepts.
- SQL: Proficiency in SQL for querying and manipulating data in relational databases.
- Big Data Concepts: Understanding of big data principles and distributed computing concepts.
- Communication and Collaboration: Ability to effectively communicate technical solutions and collaborate with cross-functional teams.
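As a rough illustration of the PySpark pipeline work this posting describes (see the Data Transformation item above), here is a minimal sketch of a GCS-to-BigQuery cleansing job. It assumes a Dataproc cluster with the spark-bigquery connector available; the bucket, dataset, and column names are placeholders rather than details from the posting.

```python
# Minimal PySpark sketch: read raw CSVs from GCS, cleanse/enrich, write to BigQuery.
# Assumes Dataproc + spark-bigquery connector; all names below are placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders-cleansing").getOrCreate()

raw = spark.read.option("header", True).csv("gs://example-raw-bucket/orders/*.csv")

cleaned = (
    raw.dropDuplicates(["order_id"])                       # de-duplicate on a key
       .filter(F.col("order_amount").cast("double") > 0)   # basic validation
       .withColumn("order_date", F.to_date("order_date"))  # type normalisation
       .withColumn("ingest_ts", F.current_timestamp())     # enrichment column
)

(cleaned.write.format("bigquery")
    .option("table", "example_dataset.orders_clean")
    .option("temporaryGcsBucket", "example-tmp-bucket")
    .mode("append")
    .save())
```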
Posted 13 hours ago
4.0 - 8.0 years
12 - 22 Lacs
chennai
Work from Office
Role & responsibilities
Job Summary: As a GCP Data Engineer you will be responsible for developing, optimizing, and maintaining data pipelines and infrastructure. Your expertise in SQL and Python will be instrumental in managing and transforming data, while your familiarity with cloud technologies will be an asset as we explore opportunities to enhance data engineering processes.
Job Description:
- Building scalable data pipelines: Design, implement, and maintain end-to-end data pipelines to efficiently extract, transform, and load (ETL) data from diverse sources. Ensure data pipelines are reliable, scalable, and performance-oriented.
- SQL Expertise: Write and optimize complex SQL queries for data extraction, transformation, and reporting (a small example follows below). Collaborate with analysts and data scientists to provide structured data for analysis.
- Cloud Platform Experience: Utilize cloud services to enhance data processing and storage capabilities. Work towards the integration of tools into the data ecosystem.
- Documentation and Collaboration: Document data pipelines, procedures, and best practices to facilitate knowledge sharing. Collaborate closely with cross-functional teams to understand data requirements and deliver solutions.
Required skills:
- 4+ years of experience with SQL and Python; 4+ years with GCP BigQuery, Dataflow, GCS, and Dataproc.
- 4+ years of experience building out data pipelines from scratch in a highly distributed and fault-tolerant manner.
- Comfortable with a broad array of relational and non-relational databases.
- Proven track record of building applications in a data-focused role (cloud and traditional data warehouse).
- Experience with CloudSQL, Cloud Functions, Pub/Sub, Cloud Composer, etc.
- Inquisitive, proactive, and interested in learning new tools and techniques.
- Familiarity with big data and machine learning tools and platforms.
- Comfortable with open source technologies including Apache Spark, Hadoop, and Kafka.
- Strong oral, written, and interpersonal communication skills.
- Comfortable working in a dynamic environment where problems are not always well-defined.
We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, sex, gender, gender expression, sexual orientation, age, marital status, veteran status, or disability status.
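To make the "write and optimize complex SQL queries" expectation concrete, here is a small, hypothetical example run through the BigQuery Python client: a CTE feeding a window function that computes rolling 7-day spend per customer. The project, dataset, and table names are invented for illustration.

```python
# Hedged sketch: CTE + window function executed via the BigQuery Python client.
# Project/dataset/table names are illustrative only.
from google.cloud import bigquery

client = bigquery.Client()  # uses application-default credentials

query = """
WITH daily AS (
  SELECT customer_id,
         DATE(order_ts) AS order_date,
         SUM(amount)    AS daily_spend
  FROM `example-project.sales.orders`
  GROUP BY customer_id, order_date
)
SELECT customer_id,
       order_date,
       daily_spend,
       SUM(daily_spend) OVER (
         PARTITION BY customer_id
         ORDER BY order_date
         ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
       ) AS rolling_7d_spend
FROM daily
ORDER BY customer_id, order_date
"""

for row in client.query(query).result():
    print(row.customer_id, row.order_date, row.rolling_7d_spend)
```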
Posted 16 hours ago
7.0 - 12.0 years
7 - 17 Lacs
pune, chennai, bengaluru
Work from Office
• Hands-on experience in object-oriented programming using Python, PySpark, APIs, SQL, BigQuery, GCP • Building data pipelines for huge volumes of data • Dataflow, Dataproc, and BigQuery • Deep understanding of ETL concepts
Posted 18 hours ago
8.0 - 12.0 years
25 - 37 Lacs
pune, bengaluru
Work from Office
We're looking for an experienced GCP Technical Lead to architect, design, and lead the development of scalable cloud-based solutions. The ideal candidate should have strong expertise in Google Cloud Platform (GCP), data engineering, and modern cloud-native architectures, along with the ability to mentor a team of engineers.
Key Responsibilities
- Lead the design and development of GCP-based solutions (BigQuery, Dataflow, Composer, Pub/Sub, GKE, etc.).
- Define cloud architecture best practices and ensure adherence to security, scalability, and performance standards.
- Collaborate with stakeholders to understand requirements and translate them into technical designs & roadmaps.
- Lead and mentor a team of cloud/data engineers, providing guidance on technical challenges.
- Implement and optimize ETL/ELT pipelines, data lake, and data warehouse solutions on GCP.
- Drive DevOps/CI-CD practices using Cloud Build, Terraform, or similar tools.
- Ensure cost optimization, monitoring, and governance within GCP environments.
- Work with cross-functional teams on cloud migrations and modernization projects.
Required Skills & Qualifications
- Strong experience in GCP services: BigQuery, Dataflow, Pub/Sub, Cloud Storage, Composer, GKE, etc.
- Expertise in data engineering, ETL, and cloud-native development.
- Hands-on experience with Python, SQL, and shell scripting.
- Knowledge of Terraform, Kubernetes, and CI/CD pipelines.
- Familiarity with data security, IAM, and compliance on GCP.
- Proven experience in leading technical teams and delivering large-scale cloud solutions.
- Excellent problem-solving, communication, and leadership skills.
Preferred
- GCP Professional Cloud Architect / Data Engineer certification.
- Experience with machine learning pipelines (Vertex AI, AI Platform).
Posted 18 hours ago
14.0 - 20.0 years
0 Lacs
maharashtra
On-site
As a Senior Architect - Data & Cloud at our company, you will be responsible for architecting, designing, and implementing end-to-end data pipelines and data integration solutions for varied structured and unstructured data sources and targets. You will need more than 15 years of experience in technical, solutioning, and analytical roles, with 5+ years specifically in building and managing data lakes, data warehouses, data integration, data migration, and business intelligence/artificial intelligence solutions on cloud platforms like GCP, AWS, or Azure.
Key Responsibilities:
- Translate business requirements into functional and non-functional areas, defining boundaries in terms of availability, scalability, performance, security, and resilience.
- Architect and design scalable data warehouse solutions on cloud platforms like BigQuery or Redshift.
- Work with various data integration and ETL technologies on the cloud such as Spark, PySpark/Scala, Dataflow, Dataproc, EMR, etc.
- Deep knowledge of cloud and on-premise databases like Cloud SQL, Cloud Spanner, Bigtable, RDS, Aurora, DynamoDB, Oracle, Teradata, MySQL, DB2, SQL Server, etc.
- Exposure to NoSQL databases like MongoDB, CouchDB, Cassandra, graph databases, etc.
- Experience in using traditional ETL tools like Informatica, DataStage, OWB, Talend, etc.
- Collaborate with internal and external stakeholders to design optimized data analytics solutions.
- Mentor young talent within the team and contribute to building assets and accelerators.
Qualifications Required:
- 14-20 years of relevant experience in the field.
- Strong understanding of cloud solutions for IaaS, PaaS, SaaS, containers, and microservices architecture and design.
- Experience with BI reporting and dashboarding tools like Looker, Tableau, Power BI, SAP BO, Cognos, Superset, etc.
- Knowledge of security features and policies in cloud environments like GCP, AWS, or Azure.
- Ability to compare products and tools across technology stacks on Google, AWS, and Azure clouds.
In this role, you will lead multiple data engagements on GCP Cloud for data lakes, data engineering, data migration, data warehouse, and business intelligence. You will interface with multiple stakeholders within IT and business to understand data requirements and take complete responsibility for the successful delivery of projects. Additionally, you will have the opportunity to work in a high-growth startup environment, contribute to the digital transformation journey of customers, and collaborate with a diverse and proactive team of techies. Please note that flexible, remote working options are available to foster productivity and work-life balance.
Posted 2 days ago
3.0 - 7.0 years
0 Lacs
karnataka
On-site
You will have the opportunity to work at Capgemini, a company that empowers you to shape your career according to your preferences. You will be part of a collaborative community of colleagues worldwide, where you can reimagine what is achievable and contribute to unlocking the value of technology for leading organizations to build a more sustainable and inclusive world.
Your Role:
- A very good understanding of the current work and the tools and technologies being used.
- Comprehensive knowledge and clarity on BigQuery, ETL, GCS, Airflow/Composer, SQL, and Python.
- Experience with fact and dimension tables and SCD.
- Minimum 3 years of experience in GCP Data Engineering.
- Proficiency in Java, Python, or Spark on GCP, with programming experience in Python, Java, or PySpark, plus SQL.
- Hands-on experience with GCS (Cloud Storage), Composer (Airflow), and BigQuery.
- Ability to handle big data efficiently.
Your Profile:
- Strong data engineering experience using Java or Python programming languages or Spark on Google Cloud.
- Experience in pipeline development using Dataflow or Dataproc (Apache Beam, etc.).
- Familiarity with other GCP services or databases like Datastore, Bigtable, Spanner, Cloud Run, Cloud Functions, etc.
- Proven analytical skills and a problem-solving attitude.
- Excellent communication skills.
What you'll love about working here:
- You can shape your career with a range of career paths and internal opportunities within the Capgemini group.
- Access to comprehensive wellness benefits including health checks, telemedicine, insurance with top-ups, elder care, partner coverage, or new parent support via flexible work.
- Opportunity to learn on one of the industry's largest digital learning platforms, with access to 250,000+ courses and numerous certifications.
About Capgemini: Capgemini is a global business and technology transformation partner, helping organizations accelerate their dual transition to a digital and sustainable world while creating tangible impact for enterprises and society. With a diverse team of over 340,000 members in more than 50 countries, Capgemini leverages its over 55-year heritage to unlock the value of technology for clients across the entire breadth of their business needs. The company delivers end-to-end services and solutions, combining strengths from strategy and design to engineering, fueled by market-leading capabilities in AI, generative AI, cloud, and data, along with deep industry expertise and a strong partner ecosystem.
Posted 2 days ago
4.0 - 8.0 years
0 Lacs
karnataka
On-site
You have an exciting opportunity for the role of Senior Data Engineer - GCP, ETL, SQL, with mandatory skills in GCP, ETL, SQL, and data warehousing. With 4+ years of relevant experience required, the job location is remote.
Key Responsibilities:
- GCP: GCS, Pub/Sub, Dataflow or Dataproc, BigQuery, Airflow/Composer, Python (preferred)/Java
- ETL on GCP Cloud: build pipelines (Python/Java) plus scripting, best practices, challenges
- Knowledge of batch and streaming data ingestion; build end-to-end data pipelines on GCP
- Knowledge of databases (SQL, NoSQL), on-premise and on-cloud, SQL vs. NoSQL, types of NoSQL databases (at least 2 databases)
- Data warehouse concepts: beginner to intermediate level
You are expected to have excellent problem-solving and communication skills. Immediate joiners to 15 days are preferred. If you are looking for a global opportunity in the technology sector, you can register on the world's first & only Global Technology Job Portal at www.iitjobs.com. Additionally, you can download the iitjobs app on the Apple App Store and Google Play Store. Don't forget to refer and earn 50,000! Thank you and best regards, iitjobs, Inc.
Posted 3 days ago
3.0 - 6.0 years
5 - 8 Lacs
bengaluru
Work from Office
Why this job matters
We are searching for a proficient AI/ML engineer who can help us extract value from our data. The resource will be responsible for E2E processes including data collection, cleaning & pre-processing, training of the models, and deployment in all production and non-production environments.
What you'll be doing
Understanding business objectives and developing models that help to achieve them, along with metrics to track their progress. Analysing the ML algorithms that could be used to solve a given problem and ranking them by their success probability. Verifying data quality, and/or ensuring it via data cleaning. Supervising the data acquisition process if more data is needed. Defining validation strategies. Defining the pre-processing or feature engineering to be done on given data. Defining data augmentation pipelines. Training models and tuning their hyperparameters (see the sketch below). Analysing the errors of the model and designing strategies to overcome them. Performing statistical analysis and fine-tuning using test results. Training and retraining systems when necessary. Strong knowledge of the model deployment pipeline (MLOps) and knowledge of AWS/GCP deployment.
Skills Required
Proven experience (4 or more years) as a Machine Learning Engineer / Artificial Intelligence Engineer or similar role. Solving business problems using Machine Learning algorithms, Deep Learning/Neural Network algorithms, sequential model development, and time series data modelling. Experience with Computer Vision techniques, Convolutional Neural Networks (CNN), Generative AI, and Large Language Models (LLMs). Experience with deploying models using MLOps pipelines. Proficiency in handling both structured and unstructured data, including SQL, BigQuery, and Dataproc. Hands-on experience with API development using frameworks like Flask, Django, and FastAPI. Automating business and functional operations using AIOps. Experience with cloud platforms such as GCP and AWS, and tools like Qlik (added advantage). Understanding of data structures, data modelling, and software architecture. Expertise in visualizing and manipulating big datasets. Deep knowledge of math, probability, statistics, and algorithms. Proficiency with Python and basic libraries for machine learning such as scikit-learn and pandas. Knowledge of R or Java is a plus. Proficiency in TensorFlow or Keras and OpenCV is a plus. Excellent communication skills. Team player. Outstanding analytical and problem-solving skills. Familiarity with Linux environments. Low to medium familiarity with JIRA, Git, Nexus, Jenkins, etc. is a plus. Minimum educational qualification: BE/B.Tech or similar degree in a relevant field.
The skills you'll need
Troubleshooting, Agile Development, Database Design/Development, Debugging, Programming/Scripting, Microservices/Service-Oriented Architecture, Version Control, IT Security, Cloud Computing, Continuous Integration/Continuous Deployment, Automation & Orchestration, Software Testing, Application Development, Algorithm Design, Software Development Lifecycle, Decision Making, Growth Mindset, Inclusive Leadership
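As a minimal sketch of the "training models and tuning their hyperparameters" responsibility (not the team's actual pipeline), the snippet below tunes a random forest with scikit-learn's GridSearchCV on a bundled dataset; the model choice and parameter grid are illustrative assumptions.

```python
# Illustrative hyperparameter tuning sketch with scikit-learn; dataset, model,
# and grid are placeholders, not the hiring team's actual setup.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

param_grid = {"n_estimators": [100, 300], "max_depth": [None, 10, 20]}

search = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid,
    cv=5,            # 5-fold cross-validation
    scoring="f1",
    n_jobs=-1,
)
search.fit(X_train, y_train)

print("best params:", search.best_params_)
print(classification_report(y_test, search.best_estimator_.predict(X_test)))
```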
Posted 3 days ago
3.0 - 6.0 years
12 - 18 Lacs
hyderabad, chennai
Hybrid
Expert in MLOps and the model lifecycle, plus Python, PySpark, and GCP (BigQuery, Dataproc & Airflow), CI/CD. Call or WhatsApp (ANUJ - 8249759636) for more details.
Posted 3 days ago
6.0 - 11.0 years
6 - 15 Lacs
chennai
Hybrid
Position Description: Employees in this job function are responsible for designing, building, and maintaining data solutions, including data infrastructure, pipelines, etc., for collecting, storing, processing, and analyzing large volumes of data efficiently and accurately.
Key Responsibilities:
1) Collaborate with business and technology stakeholders to understand current and future data requirements
2) Design, build, and maintain reliable, efficient, and scalable data infrastructure for data collection, storage, transformation, and analysis
3) Plan, design, build, and maintain scalable data solutions including data pipelines, data models, and applications for efficient and reliable data workflows
4) Design, implement, and maintain existing and future data platforms like data warehouses, data lakes, data lakehouses, etc. for structured and unstructured data
5) Design and develop analytical tools, algorithms, and programs to support data engineering activities like writing scripts and automating tasks
6) Ensure optimum performance and identify improvement opportunities
Skills Required: Python, Dataproc, Dataflow, GCP Cloud Run, Agile Software Development, Dataform, Terraform, BigQuery, Data Fusion, GCP, Cloud SQL, Kafka
Experience Required: Bachelor's degree in Computer Science, Engineering, or a related technical field. 5+ years of SQL development experience. 5+ years of analytics/data product development experience required. 3+ years of Google Cloud experience with solutions designed and implemented at production scale. Experience working in GCP-native (or equivalent) services like BigQuery, Google Cloud Storage, Dataflow, Dataproc, etc. 2+ years of experience working with Airflow for scheduling and orchestration of data pipelines (see the DAG sketch below). 1+ years of experience working with Terraform to provision infrastructure as code. 2+ years of professional development experience in Python.
Experience Preferred: In-depth understanding of Google's product technology (or other cloud platforms) and underlying architectures. Experience with development ecosystems such as Tekton/Cloud Build and Git. Experience working with DBT/Dataform.
Education Required: Bachelor's degree
Additional Information: You will work on ingesting, transforming, and analyzing large datasets to support the Enterprise Securitization Solution. Experience with large-scale solutions and operationalization of data lakes, data warehouses, and analytics platforms on Google Cloud Platform or other cloud environments is a must. Work in a collaborative environment that leverages paired programming. Work on a small agile team to deliver curated data products. Work effectively with product owners, data champions, and other technical experts. Demonstrate technical knowledge and communication skills with the ability to advocate for well-designed solutions. Develop exceptional analytical data products using both streaming and batch ingestion patterns on Google Cloud Platform with solid data warehouse principles. Be the subject matter expert in Data Engineering with a focus on GCP-native services and other well-integrated third-party technologies.
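The Airflow experience called out above can be pictured with a minimal DAG sketch like the one below: a daily extract step followed by a load step. Task bodies are stubs, the DAG id and schedule are placeholders, and the `schedule` argument assumes Airflow 2.4 or newer.

```python
# Minimal Airflow DAG sketch: two stubbed tasks chained into a daily pipeline.
# All names are placeholders; real tasks would call GCS/BigQuery as needed.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract(**context):
    # e.g. land raw files from a source system into GCS (stubbed here)
    print("extracting for", context["ds"])


def load(**context):
    # e.g. load the landed files into a BigQuery staging table (stubbed here)
    print("loading for", context["ds"])


with DAG(
    dag_id="daily_data_feed",          # placeholder name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",                 # Airflow 2.4+ style schedule argument
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)

    extract_task >> load_task
```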
Posted 3 days ago
7.0 - 12.0 years
25 - 27 Lacs
hyderabad
Work from Office
3+ years experienced engineer who has worked in a GCP environment and its relevant tools/services (BigQuery, Dataproc, Dataflow, Cloud Storage, Terraform, Tekton, Cloud Run, Cloud Scheduler, Astronomer/Airflow, Pub/Sub, Kafka, Cloud Spanner streaming, etc.). 1-2+ years of strong experience in Python development (object-oriented/functional programming, Pandas, PySpark, etc.). 1-2+ years of strong experience in the SQL language (CTEs, window functions, aggregate functions, etc.).
Keywords: Dataproc, PySpark, Dataflow, Kafka, Cloud Storage, Terraform, OOPs, Cloud Spanner, Hadoop, Java, Hive, Spark, MapReduce, Big Data, GCP, AWS, JavaScript, MySQL, PostgreSQL, SQL Server, Oracle, Bigtable, Software Development, SQL, Python Development, Python, BigQuery, Pandas
Posted 3 days ago
14.0 - 20.0 years
0 Lacs
maharashtra
On-site
Role Overview: As a Principal Architect - Data & Cloud at Quantiphi, you will be responsible for leveraging your extensive experience in technical, solutioning, and analytical roles to architect and design end-to-end data pipelines and data integration solutions for structured and unstructured data sources and targets. You will play a crucial role in building and managing data lakes, data warehouses, data integration, and business intelligence/artificial intelligence solutions on cloud platforms like GCP, AWS, and Azure. Your expertise will be instrumental in designing scalable data warehouse solutions on BigQuery or Redshift and working with various data integration, storage, and pipeline tools on the cloud. Additionally, you will serve as a trusted technical advisor to customers, lead multiple data engagements on GCP Cloud, and contribute to the development of assets and accelerators.
Key Responsibilities:
- Possess more than 15 years of experience in technical, solutioning, and analytical roles
- Have 5+ years of experience in building and managing data lakes, data warehouses, data integration, and business intelligence/artificial intelligence solutions on cloud platforms like GCP, AWS, and Azure
- Ability to understand business requirements, translate them into functional and non-functional areas, and define boundaries in terms of availability, scalability, performance, security, and resilience
- Architect, design, and implement end-to-end data pipelines and data integration solutions for structured and unstructured data sources and targets
- Work with distributed computing and enterprise environments like Hadoop and cloud platforms
- Proficiency in various data integration and ETL technologies on the cloud such as Spark, PySpark/Scala, Dataflow, Dataproc, EMR, etc.
- Deep knowledge of cloud and on-premise databases like Cloud SQL, Cloud Spanner, Bigtable, RDS, Aurora, DynamoDB, Oracle, Teradata, MySQL, DB2, SQL Server, etc.
- Exposure to NoSQL databases like MongoDB, CouchDB, Cassandra, graph databases, etc.
- Design scalable data warehouse solutions on the cloud with tools like S3, Cloud Storage, Athena, Glue, Sqoop, Flume, Hive, Kafka, Pub/Sub, Kinesis, Dataflow, Dataproc, Airflow, Composer, Spark SQL, Presto, EMRFS, etc.
- Experience with machine learning frameworks like TensorFlow and PyTorch
- Understand cloud solutions for IaaS, PaaS, SaaS, containers, and microservices architecture and design
- Good understanding of BI reporting and dashboarding tools like Looker, Tableau, Power BI, SAP BO, Cognos, Superset, etc.
- Knowledge of security features and policies in cloud environments like GCP, AWS, and Azure
- Work on business transformation projects for moving on-premise data solutions to cloud platforms
- Serve as a trusted technical advisor to customers and provide solutions for complex cloud and data-related technical challenges
- Be a thought leader in architecture design and development of cloud data analytics solutions
- Liaise with internal and external stakeholders to design optimized data analytics solutions
- Collaborate with SMEs and Solutions Architects from leading cloud providers to present solutions to customers
- Support Quantiphi Sales and GTM teams from a technical perspective in building proposals and SOWs
- Lead discovery and design workshops with potential customers globally
- Design and deliver thought leadership webinars and tech talks with customers and partners
- Identify areas for productization and feature enhancement for Quantiphi's product assets
Qualifications Required:
- Bachelor's or Master's degree in Computer Science, Information Technology, or a related field
- 14-20 years of experience in technical, solutioning, and analytical roles
- Strong expertise in building and managing data lakes, data warehouses, data integration, and business intelligence/artificial intelligence solutions on cloud platforms like GCP, AWS, and Azure
- Proficiency in various data integration and ETL technologies on the cloud, and cloud and on-premise databases
- Experience with cloud solutions for IaaS, PaaS, SaaS, containers, and microservices architecture and design
- Knowledge of BI reporting and dashboarding tools and security features in cloud environments
Additional Company Details: While technology is the heart of Quantiphi's business, the company attributes its success to its global and diverse culture built on transparency, diversity, integrity, learning, and growth. Working at Quantiphi provides you with the opportunity to be part of a culture that encourages innovation, excellence, and personal growth, fostering a work environment where you can thrive both professionally and personally. Joining Quantiphi means being part of a dynamic team of tech enthusiasts dedicated to translating data into tangible business value for clients. Flexible remote working options are available to promote productivity and work-life balance.
Posted 4 days ago
5.0 - 9.0 years
0 Lacs
telangana
On-site
As a Data Engineer at our company, you will play a crucial role in managing and optimizing data processes. Your responsibilities will include:
- Designing and developing data pipelines using Python programming
- Leveraging GCP services such as Dataflow, Dataproc, BigQuery, Cloud Storage, and Cloud Functions
- Implementing data warehousing concepts and technologies
- Performing data modeling and ETL processes
- Ensuring data quality and adhering to data governance principles
To excel in this role, you should possess the following qualifications:
- Bachelor's degree in Computer Science, Engineering, or a related field
- 5-7 years of experience in data engineering
- Proficiency in Python programming
- Extensive experience with GCP services
- Familiarity with data warehousing and ETL processes
- Strong understanding of SQL and database technologies
- Experience in data quality and governance
- Excellent problem-solving and analytical skills
- Strong communication and collaboration abilities
- Ability to work independently and in a team environment
- Familiarity with version control systems like Git
If you are looking to join a dynamic team and work on cutting-edge data projects, this position is perfect for you.
Posted 4 days ago
4.0 - 10.0 years
0 Lacs
pune, maharashtra
On-site
As a Tech Lead in GCP Data Engineering, you will be responsible for providing technical leadership and strategic direction for GCP data engineering projects. Your key responsibilities will include:
- Architecting and delivering end-to-end data solutions by leveraging BigQuery and Dataproc.
- Defining best practices, coding standards, and governance models for efficient project execution.
- Collaborating with business stakeholders and solution architects to design scalable platforms.
- Managing and mentoring a team of data engineers to ensure delivery excellence.
- Driving innovation by evaluating new GCP services and data engineering tools.
To qualify for this role, you should meet the following requirements:
- Over 10 years of overall experience with a minimum of 4 years in GCP data engineering.
- Demonstrated expertise in BigQuery, PySpark (Dataproc), and cloud-native data pipelines.
- Strong leadership, team management, and client engagement skills.
- Sound understanding of data architecture, governance, and security principles.
- Exposure to Informatica and other ETL tools would be considered desirable.
If you are passionate about building scalable data platforms and working with cutting-edge GCP technologies, we would love to hear from you!
Posted 4 days ago
1.0 - 6.0 years
15 - 25 Lacs
bengaluru
Work from Office
We have developed API gateway aggregators using frameworks like Hystrix and Spring Cloud Gateway for circuit breaking and parallel processing. Our serving microservices handle more than 15K RPS on normal days, and during sale days this can go to 30K RPS. Being a consumer app, these systems have SLAs of ~10ms. Our distributed scheduler tracks more than 50 million shipments periodically from different partners and does async processing involving RDBMS. We use an in-house video streaming platform to support a wide variety of devices and networks.
What You'll Do
Design and implement scalable and fault-tolerant data pipelines (batch and streaming) using frameworks like Apache Spark, Flink, and Kafka (see the streaming sketch below). Lead the design and development of data platforms and reusable frameworks that serve multiple teams and use cases. Build and optimize data models and schemas to support large-scale operational and analytical workloads. Deeply understand Apache Spark internals and be capable of modifying or extending the open-source Spark codebase as needed. Develop streaming solutions using tools like Apache Flink and Spark Structured Streaming. Drive initiatives that abstract infrastructure complexity, enabling ML, analytics, and product teams to build faster on the platform. Champion a platform-building mindset focused on reusability, extensibility, and developer self-service. Ensure data quality, consistency, and governance through validation frameworks, observability tooling, and access controls. Optimize infrastructure for cost, latency, performance, and scalability in modern cloud-native environments. Mentor and guide junior engineers, contribute to architecture reviews, and uphold high engineering standards. Collaborate cross-functionally with product, ML, and data teams to align technical solutions with business needs.
What We're Looking For
5-8 years of professional experience in software/data engineering with a focus on distributed data systems. Strong programming skills in Java, Scala, or Python, and expertise in SQL. At least 2 years of hands-on experience with big data systems including Apache Kafka, Apache Spark/EMR/Dataproc, Hive, Delta Lake, Presto/Trino, Airflow, and data lineage tools (e.g., DataHub, Marquez, OpenLineage). Experience implementing and tuning Spark/Delta Lake/Presto at terabyte scale or beyond. Strong understanding of Apache Spark internals (Catalyst, Tungsten, shuffle, etc.) with experience customizing or contributing to open-source code. Familiarity and hands-on work with modern open-source and cloud-native data stack components such as: Apache Iceberg, Hudi, or Delta Lake; Trino/Presto, DuckDB, ClickHouse, Pinot, or Druid; Airflow, Dagster, or Prefect; DBT, Great Expectations, DataHub, or OpenMetadata; Kubernetes, Terraform, Docker. Strong analytical and problem-solving skills, with the ability to debug complex issues in large-scale systems. Exposure to data security, privacy, observability, and compliance frameworks is a plus.
Good to Have
Contributions to open-source projects in the big data ecosystem (e.g., Spark, Kafka, Hive, Airflow). Hands-on data modeling experience and exposure to end-to-end data pipeline development. Familiarity with OLAP data cubes and BI/reporting tools such as Tableau, Power BI, Superset, or Looker. Working knowledge of tools and technologies like the ELK Stack (Elasticsearch, Logstash, Kibana), Redis, and MySQL. Exposure to backend technologies including RxJava, Spring Boot, and microservices architecture.
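A minimal sketch of the kind of fault-tolerant streaming pipeline described above: Spark Structured Streaming reading shipment events from Kafka and writing to object storage with checkpointing. The broker, topic, schema, and paths are placeholders, and the cluster is assumed to have the Spark-Kafka connector on its classpath.

```python
# Hedged Spark Structured Streaming sketch: Kafka source -> JSON parse -> parquet sink.
# All connection details and paths are placeholders.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import DoubleType, StringType, StructField, StructType

spark = SparkSession.builder.appName("shipment-events").getOrCreate()

schema = StructType([
    StructField("shipment_id", StringType()),
    StructField("status", StringType()),
    StructField("lat", DoubleType()),
    StructField("lon", DoubleType()),
])

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker-1:9092")
    .option("subscribe", "shipment-events")
    .load()
    .select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
    .select("e.*")
)

query = (
    events.writeStream.format("parquet")
    .option("path", "gs://example-lake/shipments/")
    .option("checkpointLocation", "gs://example-lake/checkpoints/shipments/")
    .outputMode("append")
    .start()
)
query.awaitTermination()
```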
Posted 4 days ago
5.0 - 10.0 years
20 - 35 Lacs
bengaluru
Hybrid
EPAM has a presence across 40+ countries globally with 55,000+ professionals and numerous delivery centers. Key locations are North America, Eastern Europe, Central Europe, Western Europe, APAC, and the Middle East, with development centers in India (Hyderabad, Pune & Bangalore).
Location: Bengaluru. Work Mode: Hybrid (2-3 days office in a week). Full time.
Job Description:
5-14 years of experience in Big Data & related data technologies. Expert-level understanding of distributed computing principles. Expert-level knowledge and experience in Apache Spark. Hands-on programming with Python. Experience with building stream-processing systems using technologies such as Apache Storm or Spark Streaming. Experience with integration of data from multiple data sources such as RDBMS (SQL Server, Oracle), ERP, and files. Good understanding of SQL queries, joins, stored procedures, and relational schemas. Experience with NoSQL databases such as HBase, Cassandra, and MongoDB. Knowledge of ETL techniques and frameworks. Performance tuning of Spark jobs. Experience with native cloud data services on GCP. Ability to lead a team efficiently. Experience with designing and implementing big data solutions. Practitioner of Agile methodology.
WE OFFER
Opportunity to work on technical challenges that may impact across geographies. Vast opportunities for self-development: online university, knowledge-sharing opportunities globally, learning opportunities through external certifications. Opportunity to share your ideas on international platforms. Sponsored Tech Talks & Hackathons. Possibility to relocate to any EPAM office for short- and long-term projects. Focused individual development. Benefit package: health benefits, medical benefits, retirement benefits, paid time off, flexible benefits. Forums to explore passions beyond work (CSR, photography, painting, sports, etc.).
Posted 4 days ago
3.0 - 8.0 years
6 - 10 Lacs
bengaluru
Work from Office
Choosing Capgemini means choosing a company where you will be empowered to shape your career in the way you'd like, where you'll be supported and inspired by a collaborative community of colleagues around the world, and where you'll be able to reimagine what's possible. Join us and help the world's leading organizations unlock the value of technology and build a more sustainable, more inclusive world.
Job Role
Very good understanding of the current work and the tools and technologies being used. Comprehensive knowledge and clarity on BigQuery, ETL, GCS, Airflow/Composer, SQL, and Python. Experience working with fact and dimension tables, SCD. Minimum 3 years of experience in GCP Data Engineering. Java/Python/Spark on GCP, with programming experience in any one language (Python, Java, or PySpark) plus SQL. GCS (Cloud Storage), Composer (Airflow), and BigQuery experience. Should have worked on handling big data.
Your Profile
Strong data engineering experience using Java or Python programming languages or Spark on Google Cloud. Pipeline development experience using Dataflow or Dataproc (Apache Beam, etc.). Any other GCP services or databases like Datastore, Bigtable, Spanner, Cloud Run, Cloud Functions, etc. Proven analytical skills and a problem-solving attitude. Excellent communication skills.
What you'll love about working here
You can shape your career with us. We offer a range of career paths and internal opportunities within the Capgemini group. You will also get personalized career guidance from our leaders. You will get comprehensive wellness benefits including health checks, telemedicine, insurance with top-ups, elder care, partner coverage or new parent support via flexible work. You will have the opportunity to learn on one of the industry's largest digital learning platforms, with access to 250,000+ courses and numerous certifications.
About Capgemini
Capgemini is a global business and technology transformation partner, helping organizations to accelerate their dual transition to a digital and sustainable world, while creating tangible impact for enterprises and society. It is a responsible and diverse group of 340,000 team members in more than 50 countries. With its strong over 55-year heritage, Capgemini is trusted by its clients to unlock the value of technology to address the entire breadth of their business needs. It delivers end-to-end services and solutions leveraging strengths from strategy and design to engineering, all fueled by its market-leading capabilities in AI, generative AI, cloud and data, combined with its deep industry expertise and partner ecosystem.
Posted 4 days ago
4.0 - 6.0 years
10 - 14 Lacs
bengaluru
Work from Office
Google Cloud Platform: GCS, Dataproc, BigQuery, Dataflow. Programming languages: Java, plus scripting languages like Python, Shell Script, SQL. 5+ years of experience in IT application delivery with proven experience in agile development methodologies. 1 to 2 years of experience in Google Cloud Platform (GCS, Dataproc, BigQuery, Composer, and data processing like Dataflow).
Mandatory Key Skills: Agile Development Methodology, BigQuery, Python, Dataproc, SQL, Dataflow, Java, Shell Scripting, Agile, Google, Gen, Data Processing, Spark, Hadoop, Hive, Sqoop, Big Data, Scala, AWS, JavaScript, Kafka, Airflow, HTML, MySQL, Linux, Microsoft Azure, Unix, Jenkins, GCP
Posted 4 days ago
8.0 - 13.0 years
13 - 23 Lacs
kolkata, hyderabad, pune
Work from Office
Primary skill: Google BigQuery. Experience: 8+ years. GCP Senior Data Engineer.
Posted 4 days ago
0.0 - 1.0 years
8 - 10 Lacs
hyderabad
Work from Office
Google Cloud Platform: GCS, Dataproc, BigQuery, Dataflow. Programming languages: Java, plus scripting languages like Python, Shell Script, SQL. 5+ years of experience in IT application delivery with proven experience in agile development methodologies. 1 to 2 years of experience in Google Cloud Platform (GCS, Dataproc, BigQuery, Composer, and data processing like Dataflow).
Mandatory Key Skills: SQL, Dataflow, Java, Shell Scripting, Agile, Google, Gen, Data Processing, Spark, Hadoop, Hive, Sqoop, Big Data, Scala, AWS, JavaScript, Kafka, Airflow, HTML, MySQL, Linux, Microsoft Azure, Unix, Jenkins, GCP, Agile Development Methodology, BigQuery, Python, Dataproc
Posted 4 days ago
1.0 - 3.0 years
9 - 12 Lacs
bengaluru
Work from Office
Google Cloud Platform: GCS, Dataproc, BigQuery, Dataflow. Programming languages: Java, plus scripting languages like Python, Shell Script, SQL. 5+ years of experience in IT application delivery with proven experience in agile development methodologies. 1 to 2 years of experience in Google Cloud Platform (GCS, Dataproc, BigQuery, Composer, and data processing like Dataflow).
Mandatory Key Skills: SQL, Dataflow, Java, Shell Scripting, Agile, Google, Gen, Data Processing, Spark, Hadoop, Hive, Sqoop, Big Data, Scala, AWS, JavaScript, Kafka, Airflow, HTML, MySQL, Linux, Microsoft Azure, Unix, Jenkins, GCP, Agile Development Methodology, BigQuery, Python, Dataproc
Posted 4 days ago
3.0 - 6.0 years
6 - 8 Lacs
noida
Work from Office
3+ years experienced engineer who has worked in a GCP environment and its relevant tools/services (BigQuery, Dataproc, Dataflow, Cloud Storage, Terraform, Tekton, Cloud Run, Cloud Scheduler, Astronomer/Airflow, Pub/Sub, Kafka, Cloud Spanner streaming, etc.). 1-2+ years of strong experience in Python development (object-oriented/functional programming, Pandas, PySpark, etc.). 1-2+ years of strong experience in the SQL language (CTEs, window functions, aggregate functions, etc.).
Keywords: Python Development, Python, BigQuery, Pandas, Dataproc, PySpark, Dataflow, Kafka, Cloud Storage, Terraform, OOPs, Cloud Spanner, Hadoop, Java, Hive, Spark, MapReduce, Big Data, GCP, AWS, JavaScript, MySQL, PostgreSQL, SQL Server, Oracle, Bigtable, Software Development, SQL
Posted 4 days ago
4.0 - 8.0 years
22 - 25 Lacs
hyderabad, chennai, bengaluru
Work from Office
3+ years experienced engineer who has worked in a GCP environment and its relevant tools/services (BigQuery, Dataproc, Dataflow, Cloud Storage, Terraform, Tekton, Cloud Run, Cloud Scheduler, Astronomer/Airflow, Pub/Sub, Kafka, Cloud Spanner streaming, etc.). 1-2+ years of strong experience in Python development (object-oriented/functional programming, Pandas, PySpark, etc.). 1-2+ years of strong experience in the SQL language (CTEs, window functions, aggregate functions, etc.).
Keywords: Dataproc, PySpark, Dataflow, Kafka, Cloud Storage, Terraform, OOPs, Cloud Spanner, Hadoop, Java, Hive, Spark, MapReduce, Big Data, GCP, AWS, JavaScript, MySQL, PostgreSQL, SQL Server, Oracle, Bigtable, Software Development, SQL, Python Development, Python, BigQuery, Pandas
Posted 4 days ago
3.0 - 7.0 years
0 Lacs
telangana
On-site
As an MLOps Engineer at our company, you will play a crucial role in building, deploying, and maintaining machine learning models in production using Google Cloud Platform (GCP). You will collaborate closely with data scientists and engineers to ensure the reliability, scalability, and performance of our ML systems. Your expertise in software engineering, data engineering, and machine learning will be utilized to automate ML pipelines, monitor model performance, and troubleshoot any issues that arise. This is an exciting opportunity to work on cutting-edge ML projects and have a significant impact on our business.
Key Responsibilities:
- Design, develop, and maintain scalable and reliable MLOps pipelines on GCP
- Automate the deployment and monitoring of machine learning models in production (see the serving sketch below)
- Collaborate with data scientists to productionize their models and experiments
- Implement CI/CD pipelines for machine learning models
- Monitor model performance and identify areas for improvement
- Troubleshoot and resolve issues related to ML infrastructure and deployments
- Optimize ML pipelines for performance and cost efficiency
- Develop and maintain documentation for MLOps processes and infrastructure
- Stay up-to-date with the latest MLOps tools and techniques
- Ensure compliance with security and data privacy regulations
Required Skills & Qualifications:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field
- 3+ years of experience in MLOps or a related role
- Strong proficiency in Python and PySpark
- Extensive experience with Google Cloud Platform (GCP) services, including BigQuery, Airflow, and Dataproc
- Experience with containerization technologies such as Docker and Kubernetes
- Experience with CI/CD pipelines and automation tools
- Solid understanding of machine learning concepts and algorithms
- Experience with model monitoring and alerting tools
- Excellent problem-solving and communication skills
- Ability to work independently and as part of a team
- Experience with infrastructure-as-code tools such as Terraform or CloudFormation is a plus
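One common building block behind the "automate the deployment and monitoring of models" item is a small prediction service that can be containerised with Docker and deployed to GKE or Cloud Run. The sketch below uses FastAPI; the model file, feature shape, and endpoint name are assumptions for illustration, not details from the posting.

```python
# Hedged model-serving sketch: wrap a pickled, scikit-learn-style model in a
# small HTTP prediction endpoint. Paths and schema are placeholders.
import pickle

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

with open("model.pkl", "rb") as f:   # assumed to be baked into the container image
    model = pickle.load(f)           # assumes the object exposes .predict()


class Features(BaseModel):
    values: list[float]              # flat feature vector (Python 3.9+ typing)


@app.post("/predict")
def predict(features: Features):
    prediction = model.predict([features.values])[0]
    return {"prediction": float(prediction)}
```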
Posted 5 days ago
10.0 - 14.0 years
0 Lacs
noida, uttar pradesh
On-site
Role Overview: You are required to work as a GCP Data Architect with a total experience of 12+ years. Your relevant experience for the engagement should be 10 years. Your primary responsibilities will include maintaining architecture principles, guidelines, and standards, data warehousing, programming in Python/Java, and working with Big Data, Data Analytics, and GCP services. You will be responsible for designing and implementing solutions in various technology domains related to Google Cloud Platform data components like BigQuery, Bigtable, CloudSQL, Dataproc, Dataflow, Data Fusion, etc.
Key Responsibilities:
- Maintain architecture principles, guidelines, and standards
- Work on data warehousing projects
- Program in Python and Java for various data-related tasks
- Utilize Big Data technologies for data processing and analysis
- Implement solutions using GCP services such as BigQuery, Bigtable, CloudSQL, Dataproc, Dataflow, Data Fusion, etc.
Qualifications Required:
- Strong experience in Big Data including data modeling, design, architecting, and solutioning
- Proficiency in programming languages like SQL, Python, and R/Scala
- Good Python skills with experience in data visualization tools such as Google Data Studio or Power BI
- Knowledge of A/B testing, statistics, Google Cloud Platform, Google BigQuery, Agile development, DevOps, data engineering, and ETL data processing
- Migration experience of a production Hadoop cluster to Google Cloud will be an added advantage
Additional Company Details: The company is looking for individuals who are experts in BigQuery, Dataproc, Data Fusion, Dataflow, Bigtable, Firestore, CloudSQL, Cloud Spanner, Google Cloud Storage, Cloud Composer, Cloud Interconnect, etc. Relevant certifications such as Google Professional Cloud Architect will be preferred.
Posted 5 days ago