7.0 years
8 - 9 Lacs
Thiruvananthapuram
On-site
7 - 9 Years | 4 Openings | Trivandrum

Role description

Role Proficiency: This role requires proficiency in developing data pipelines, including coding and testing for ingesting, wrangling, transforming, and joining data from various sources. The ideal candidate should be adept in ETL tools such as Informatica, Glue, Databricks, and Dataproc, with strong coding skills in Python, PySpark, and SQL. This position demands independence and proficiency across various data domains. Expertise in data warehousing solutions such as Snowflake, BigQuery, Lakehouse, and Delta Lake is essential, including the ability to calculate processing costs and address performance issues. A solid understanding of DevOps and infrastructure needs is also required.

Outcomes: Act creatively to develop pipelines/applications by selecting appropriate technical options, optimizing application development, maintenance, and performance through design patterns and reuse of proven solutions. Support the Project Manager in day-to-day project execution and account for the developmental activities of others. Interpret requirements, create optimal architecture, and design solutions in accordance with specifications. Document and communicate milestones/stages for end-to-end delivery. Code using best standards; debug and test solutions to ensure best-in-class quality. Tune the performance of code and align it with the appropriate infrastructure, understanding the cost implications of licenses and infrastructure. Create data schemas and models effectively. Develop and manage data storage solutions, including relational databases, NoSQL databases, Delta Lakes, and data lakes. Validate results with user representatives, integrating the overall solution. Influence and enhance customer satisfaction and employee engagement within project teams.

Measures of Outcomes: Adherence to engineering processes and standards. Adherence to schedule/timelines. Adherence to SLAs where applicable. Number of defects post delivery. Number of non-compliance issues. Reduction of recurrence of known defects. Quick turnaround of production bugs. Completion of applicable technical/domain certifications. Completion of all mandatory training requirements. Efficiency improvements in data pipelines (e.g., reduced resource consumption, faster run times). Average time to detect, respond to, and resolve pipeline failures or data issues. Number of data security incidents or compliance breaches.

Outputs Expected: Code: Develop data processing code with guidance, ensuring performance and scalability requirements are met. Define coding standards, templates, and checklists. Review code for team and peers. Documentation: Create/review templates, checklists, guidelines, and standards for design/process/development. Create/review deliverable documents including design documents, architecture documents, infra costing, business requirements, source-target mappings, test cases, and results. Configure: Define and govern the configuration management plan. Ensure compliance from the team. Test: Review/create unit test cases, scenarios, and execution. Review test plans and strategies created by the testing team. Provide clarifications to the testing team. Domain Relevance: Advise data engineers on the design and development of features and components, leveraging a deeper understanding of business needs. Learn more about the customer domain and identify opportunities to add value. Complete relevant domain certifications.
Manage Project: Support the Project Manager with project inputs. Provide inputs on project plans or sprints as needed. Manage the delivery of modules. Manage Defects: Perform defect root cause analysis (RCA) and mitigation. Identify defect trends and implement proactive measures to improve quality. Estimate: Create and provide input for effort and size estimation and plan resources for projects. Manage Knowledge: Consume and contribute to project-related documents, SharePoint libraries, and client universities. Review reusable documents created by the team. Release: Execute and monitor the release process. Design: Contribute to the creation of design (HLD, LLD, SAD)/architecture for applications, business components, and data models. Interface with Customer: Clarify requirements and provide guidance to the Development Team. Present design options to customers. Conduct product demos. Collaborate closely with customer architects to finalize designs. Manage Team: Set FAST goals and provide feedback. Understand team members' aspirations and provide guidance and opportunities. Ensure team members are upskilled. Engage the team in projects. Proactively identify attrition risks and collaborate with BSE on retention measures. Certifications: Obtain relevant domain and technology certifications.

Skill Examples: Proficiency in SQL, Python, or other programming languages used for data manipulation. Experience with ETL tools such as Apache Airflow, Talend, Informatica, AWS Glue, Dataproc, and Azure ADF. Hands-on experience with cloud platforms like AWS, Azure, or Google Cloud, particularly with data-related services (e.g., AWS Glue, BigQuery). Conduct tests on data pipelines and evaluate results against data quality and performance specifications. Experience in performance tuning. Experience in data warehouse design and cost improvements. Apply and optimize data models for efficient storage, retrieval, and processing of large datasets. Communicate and explain design/development aspects to customers. Estimate time and resource requirements for developing/debugging features/components. Participate in RFP responses and solutioning. Mentor team members and guide them in relevant upskilling and certification.

Knowledge Examples: Knowledge of various ETL services used by cloud providers, including Apache PySpark, AWS Glue, GCP Dataproc/Dataflow, Azure ADF, and ADLF. Proficient in SQL for analytics and windowing functions. Understanding of data schemas and models. Familiarity with domain-related data. Knowledge of data warehouse optimization techniques. Understanding of data security concepts. Awareness of patterns, frameworks, and automation practices.

Additional Comments: We are seeking a highly experienced Senior Data Engineer to design, develop, and optimize scalable data pipelines in a cloud-based environment. The ideal candidate will have deep expertise in PySpark, SQL, and Azure Databricks, plus experience with either AWS or GCP. A strong foundation in data warehousing, ELT/ETL processes, and dimensional modeling (Kimball/star schema) is essential for this role.

Must-Have Skills: 8+ years of hands-on experience in data engineering or big data development. Strong proficiency in PySpark and SQL for data transformation and pipeline development. Experience working in Azure Databricks or equivalent Spark-based cloud platforms. Practical knowledge of cloud data environments – Azure, AWS, or GCP. Solid understanding of data warehousing concepts, including Kimball methodology and star/snowflake schema design.
Proven experience designing and maintaining ETL/ELT pipelines in production. Familiarity with version control (e.g., Git), CI/CD practices, and data pipeline orchestration tools (e.g., Airflow, Azure Data Factory).

Skills: Azure Data Factory, Azure Databricks, PySpark, SQL

About UST: UST is a global digital transformation solutions provider. For more than 20 years, UST has worked side by side with the world’s best companies to make a real impact through transformation. Powered by technology, inspired by people and led by purpose, UST partners with their clients from design to operation. With deep domain expertise and a future-proof philosophy, UST embeds innovation and agility into their clients’ organizations. With over 30,000 employees in 30 countries, UST builds for boundless impact—touching billions of lives in the process.
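For candidates wondering what the day-to-day pipeline work above looks like in practice, here is a minimal, illustrative PySpark sketch of an ingest-wrangle-join step feeding a star-schema fact table. The paths, table names, and columns are hypothetical, and it assumes a Databricks/Delta Lake environment; a real pipeline would add validation, error handling, and incremental logic.

```python
# Illustrative only: a minimal PySpark batch transform of the kind this role describes.
# Paths, tables, and columns (sales_raw, dim_customer, fact_sales) are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("daily_sales_load").getOrCreate()

# Ingest raw source data (e.g., files landed by an upstream extract).
raw = spark.read.parquet("/mnt/landing/sales_raw/")

# Wrangle: deduplicate, standardize types, derive a date key for the star schema.
cleaned = (
    raw.dropDuplicates(["order_id"])
       .withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("date_key", F.date_format("order_ts", "yyyyMMdd").cast("int"))
)

# Join to a conformed dimension and append to the fact table as Delta, partitioned by date.
dim_customer = spark.read.format("delta").load("/mnt/curated/dim_customer")
fact_sales = (
    cleaned.join(dim_customer.select("customer_id", "customer_key"), "customer_id", "left")
           .select("date_key", "customer_key", "order_id", "quantity", "net_amount")
)
fact_sales.write.format("delta").mode("append").partitionBy("date_key").save("/mnt/curated/fact_sales")
```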
Posted 1 week ago
5.0 - 7.0 years
0 Lacs
Noida
On-site
5 - 7 Years | 2 Openings | Noida

Role description

Role Proficiency: This role requires proficiency in data pipeline development, including coding and testing data pipelines for ingesting, wrangling, transforming, and joining data from various sources. Must be skilled in ETL tools such as Informatica, Glue, Databricks, and Dataproc, with coding expertise in Python, PySpark, and SQL. Works independently and has a deep understanding of data warehousing solutions including Snowflake, BigQuery, Lakehouse, and Delta Lake. Capable of calculating costs and understanding performance issues related to data solutions.

Outcomes: Act creatively to develop pipelines and applications by selecting appropriate technical options, optimizing application development, maintenance, and performance using design patterns and reusing proven solutions. Interpret requirements to create optimal architecture and design, developing solutions in accordance with specifications. Document and communicate milestones/stages for end-to-end delivery. Code adhering to best coding standards; debug and test solutions to deliver best-in-class quality. Perform performance tuning of code and align it with the appropriate infrastructure to optimize efficiency. Validate results with user representatives, integrating the overall solution seamlessly. Develop and manage data storage solutions, including relational databases, NoSQL databases, and data lakes. Stay updated on the latest trends and best practices in data engineering, cloud technologies, and big data tools. Influence and improve customer satisfaction through effective data solutions.

Measures of Outcomes: Adherence to engineering processes and standards. Adherence to schedule/timelines. Adherence to SLAs where applicable. Number of defects post delivery. Number of non-compliance issues. Reduction of recurrence of known defects. Quick turnaround of production bugs. Completion of applicable technical/domain certifications. Completion of all mandatory training requirements. Efficiency improvements in data pipelines (e.g., reduced resource consumption, faster run times). Average time to detect, respond to, and resolve pipeline failures or data issues. Number of data security incidents or compliance breaches.

Outputs Expected: Code Development: Develop data processing code independently, ensuring it meets performance and scalability requirements. Define coding standards, templates, and checklists. Review code for team members and peers. Documentation: Create and review templates, checklists, guidelines, and standards for design processes and development. Create and review deliverable documents including design documents, architecture documents, infrastructure costing, business requirements, source-target mappings, test cases, and results. Configuration: Define and govern the configuration management plan. Ensure compliance within the team. Testing: Review and create unit test cases, scenarios, and execution plans. Review the test plan and test strategy developed by the testing team. Provide clarifications and support to the testing team as needed. Domain Relevance: Advise data engineers on the design and development of features and components, demonstrating a deeper understanding of business needs. Learn about customer domains to identify opportunities for value addition. Complete relevant domain certifications to enhance expertise. Project Management: Manage the delivery of modules effectively. Defect Management: Perform root cause analysis (RCA) and mitigation of defects. Identify defect trends and take proactive measures to improve quality.
Estimation: Create and provide input for effort and size estimation for projects. Knowledge Management: Consume and contribute to project-related documents, SharePoint libraries, and client universities. Review reusable documents created by the team. Release Management: Execute and monitor the release process to ensure smooth transitions. Design Contribution: Contribute to the creation of high-level design (HLD), low-level design (LLD), and system architecture for applications, business components, and data models. Customer Interface: Clarify requirements and provide guidance to the development team. Present design options to customers and conduct product demonstrations. Team Management: Set FAST goals and provide constructive feedback. Understand team members' aspirations and provide guidance and opportunities for growth. Ensure team engagement in projects and initiatives. Certifications: Obtain relevant domain and technology certifications to stay competitive and informed.

Skill Examples: Proficiency in SQL, Python, or other programming languages used for data manipulation. Experience with ETL tools such as Apache Airflow, Talend, Informatica, AWS Glue, Dataproc, and Azure ADF. Hands-on experience with cloud platforms like AWS, Azure, or Google Cloud, particularly with data-related services (e.g., AWS Glue, BigQuery). Conduct tests on data pipelines and evaluate results against data quality and performance specifications. Experience in performance tuning of data processes. Expertise in designing and optimizing data warehouses for cost efficiency. Ability to apply and optimize data models for efficient storage, retrieval, and processing of large datasets. Capacity to clearly explain and communicate design and development aspects to customers. Ability to estimate time and resource requirements for developing and debugging features or components.

Knowledge Examples: Knowledge of various ETL services offered by cloud providers, including Apache PySpark, AWS Glue, GCP Dataproc/Dataflow, Azure ADF, and ADLF. Proficiency in SQL for analytics, including windowing functions. Understanding of data schemas and models relevant to various business contexts. Familiarity with domain-related data and its implications. Expertise in data warehousing optimization techniques. Knowledge of data security concepts and best practices. Familiarity with design patterns and frameworks in data engineering.

Additional Comments: Skills: Cloud Platforms (AWS, MS Azure, GCP, etc.); Containerization and Orchestration (Docker, Kubernetes, etc.); API development; Data pipeline construction using languages like Python, PySpark, and SQL; Data Streaming (Kafka, Azure Event Hub, etc.); Data Parsing (Akka, MinIO, etc.); Database Management (SQL and NoSQL, including ClickHouse, PostgreSQL, etc.); Agile methodology and tooling (Git, Jenkins, or Azure DevOps, etc.); JS-based connectors/frameworks for frontend/backend; Collaboration and Communication Skills.

Skills: AWS Cloud, Azure Cloud, Docker, Kubernetes

About UST: UST is a global digital transformation solutions provider. For more than 20 years, UST has worked side by side with the world’s best companies to make a real impact through transformation. Powered by technology, inspired by people and led by purpose, UST partners with their clients from design to operation. With deep domain expertise and a future-proof philosophy, UST embeds innovation and agility into their clients’ organizations.
With over 30,000 employees in 30 countries, UST builds for boundless impact—touching billions of lives in the process.
Posted 1 week ago
4.0 - 10.0 years
0 Lacs
Gurugram, Haryana, India
On-site
About the Role: 4-10 years of experience in the implementation of high-end software products.

Responsibilities: Visualize and evangelize next-generation infrastructure in the Cloud platform/Big Data space (batch, near real-time, and real-time technologies). Passion for continuous learning, experimenting, applying, and contributing towards cutting-edge open-source technologies and software paradigms. Develop and implement an overall organizational data strategy that is in line with business processes; the strategy includes data model designs, database development standards, and implementation and management of data warehouses and data analytics systems. Expert-level proficiency in at least 4-5 GCP services. Experience with technical solutions based on industry standards using GCP IaaS, PaaS, and SaaS capabilities. Strong understanding of and experience in distributed computing frameworks. Experience working within a Linux computing environment and with command-line tools, including knowledge of shell/Python scripting for automating common tasks.

Required Skills (must have): Operating knowledge of cloud computing platforms (GCP, especially BigQuery, Dataflow, Dataproc, Storage, VMs, Networking, Pub/Sub, Cloud Functions, and Composer services). Expert-level proficiency in at least 4-5 GCP services. Strong understanding of and experience in distributed computing frameworks. Experience working within a Linux computing environment.

Preferred Skills: Awareness of columnar storage formats, e.g., Parquet, ORC. Experience with technical solutions based on industry standards using GCP IaaS, PaaS, and SaaS capabilities.
Posted 1 week ago
0 years
0 Lacs
Gurugram, Haryana, India
On-site
We are hiring GCP Data Engineers for our Gurgaon location. Candidates should have strong experience in Big Data, PySpark, and Python or Java, along with GCP services such as GCS, BigQuery, Dataflow, Dataproc, Pub/Sub, and Storage. If you have strong expertise in these areas and can join within 0-30 days, please share your resume at vaishali.tyagi@impetus.com.

Required Skill-Set: Able to effectively use GCP managed services, e.g., Dataproc, Dataflow, Pub/Sub, Cloud Functions, BigQuery, GCS (at least 4 of these services). Good to have knowledge of Cloud Composer, Cloud SQL, Bigtable, and Cloud Functions. Strong experience in Big Data technologies – Hadoop, Sqoop, Hive, and Spark – including DevOps. Good hands-on expertise in either Python or Java programming. Good understanding of GCP core services like Google Cloud Storage, Google Compute Engine, Cloud SQL, and Cloud IAM. Good to have knowledge of GCP services like App Engine, GKE, Cloud Run, Cloud Build, and Anthos. Ability to drive the deployment of customers’ workloads into GCP and provide guidance, a cloud adoption model, service integrations, appropriate recommendations to overcome blockers, and technical roadmaps for GCP cloud implementations. Experience with technical solutions based on industry standards using GCP IaaS, PaaS, and SaaS capabilities. Act as a subject-matter expert or developer around GCP and become a trusted advisor to multiple teams.
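As a concrete illustration of the GCS-to-BigQuery work referenced above, the following is a hedged sketch using the google-cloud-bigquery Python client. The project, bucket, dataset, and table names are assumptions, and schema autodetection is used only to keep the example short.

```python
# Illustrative only: loading a CSV from GCS into BigQuery with the Python client.
# Project, dataset, table, and bucket names are hypothetical.
from google.cloud import bigquery

client = bigquery.Client(project="my-analytics-project")

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,
    autodetect=True,  # infer the schema for a quick first load; prefer explicit schemas in production
    write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
)

load_job = client.load_table_from_uri(
    "gs://my-landing-bucket/orders/2024-01-01.csv",
    "my-analytics-project.raw.orders",
    job_config=job_config,
)
load_job.result()  # block until the load job completes

table = client.get_table("my-analytics-project.raw.orders")
print(f"Loaded table now has {table.num_rows} rows")
```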
Posted 1 week ago
1.0 - 5.0 years
0 Lacs
chennai, tamil nadu
On-site
You should have at least 3 years of hands-on experience in data modeling, ETL processes, developing reporting systems, and data engineering using tools such as ETL, Big Query, SQL, Python, or Alteryx. Additionally, you should possess advanced knowledge in SQL programming and database management. Moreover, you must have a minimum of 3 years of solid experience working with Business Intelligence reporting tools like Power BI, Qlik Sense, Looker, or Tableau, along with a good understanding of data warehousing concepts and best practices. Excellent problem-solving and analytical skills are essential for this role, as well as being detail-oriented with strong communication and collaboration skills. The ability to work both independently and as part of a team is crucial for success in this position. Preferred skills include experience with GCP cloud services such as BigQuery, Cloud Composer, Dataflow, CloudSQL, Looker, Looker ML, Data Studio, and GCP QlikSense. Strong SQL skills and proficiency in various BI/Reporting tools to build self-serve reports, analytic dashboards, and ad-hoc packages leveraging enterprise data warehouses are also desired. Moreover, having at least 1 year of experience in Python and Hive/Spark/Scala/JavaScript is preferred. Additionally, you should have a solid understanding of consuming data models, developing SQL, addressing data quality issues, proposing BI solution architecture, articulating best practices in end-user visualizations, and development delivery experience. Furthermore, it is important to have a good grasp of BI tools, architectures, and visualization solutions, coupled with an inquisitive and proactive approach to learning new tools and techniques. Strong oral, written, and interpersonal communication skills are necessary, and you should be comfortable working in a dynamic environment where problems are not always well-defined.,
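Since the role leans heavily on SQL analytics and windowing functions over BigQuery, here is an illustrative sketch of the kind of windowed query a self-serve report might sit on, executed through the BigQuery Python client. The project, dataset, and column names are hypothetical.

```python
# Illustrative only: a windowed analytics query of the sort BI dashboards are often built on.
from google.cloud import bigquery

client = bigquery.Client()  # uses default credentials and project

sql = """
SELECT
  customer_id,
  order_date,
  order_total,
  SUM(order_total) OVER (
    PARTITION BY customer_id
    ORDER BY order_date
    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
  ) AS running_total,
  RANK() OVER (PARTITION BY customer_id ORDER BY order_total DESC) AS order_rank
FROM `my-project.sales.orders`
WHERE order_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 90 DAY)
"""

for row in client.query(sql).result():
    print(row.customer_id, row.order_date, row.running_total, row.order_rank)
```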
Posted 1 week ago
6.0 - 10.0 years
0 Lacs
noida, uttar pradesh
On-site
The ideal candidate for the position will have the responsibility of designing, developing, and maintaining an optimal data pipeline architecture. You will be required to monitor incidents, perform root cause analysis, and implement appropriate actions to solve issues related to abnormal job execution and data corruption conditions. Additionally, you will automate jobs, notifications, and reports to improve efficiency. You should possess the ability to optimize existing queries, reverse engineer for data research and analysis, and calculate the impact of issues on the downstream side for effective communication. Supporting failures, data quality issues, and ensuring environment health will also be part of your role. Furthermore, you will maintain ingestion and pipeline runbooks, portfolio summaries, and DBAR, while enabling infrastructure changes, enhancements, and updates roadmap. Building the infrastructure for optimal extraction, transformation, and loading data from various sources using big data technologies, python, or web-based APIs will be essential. You will participate in code reviews with peers, have excellent communication skills for understanding and conveying requirements effectively. As a candidate, you are expected to have a Bachelor's degree in Engineering/Computer Science or a related quantitative field. Technical skills required include a minimum of 8 years of programming experience with python and SQL, experience with massively parallel processing systems like Spark or Hadoop, and a minimum of 6-7 years of hands-on experience with GCP, BigQuery, Dataflow, Data Warehousing, Data modeling, Apache Beam, and Cloud Storage. Proficiency in source code control systems (GIT) and CI/CD processes, involvement in designing, prototyping, and delivering software solutions within the big data ecosystem, and hands-on experience in generative AI models are also necessary. You should be able to perform code reviews to ensure code meets acceptance criteria, have experience with Agile development methodologies and tools, and work towards improving data governance and quality to enhance data reliability. EXL Analytics offers a dynamic and innovative environment where you will collaborate with experienced analytics consultants. You will gain insights into various business aspects, develop effective teamwork and time-management skills, and receive training in analytical tools and techniques. Our mentoring program provides guidance and coaching to every employee, fostering personal and professional growth. The opportunities for growth and development at EXL Analytics are limitless, setting the stage for a successful career within the company and beyond.,
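To make the GCP/Apache Beam expectations above concrete, here is a minimal, illustrative Beam pipeline that reads JSON files from Cloud Storage, parses and filters them, and appends to a BigQuery table. Bucket, project, and table names are assumptions, and the parsing logic is only a placeholder.

```python
# Illustrative only: a minimal Apache Beam batch pipeline; runs locally on the DirectRunner
# unless Dataflow options (--runner=DataflowRunner, --project, --region, ...) are supplied.
import json
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

def parse_event(line: str) -> dict:
    # Placeholder parsing; field names are hypothetical.
    record = json.loads(line)
    return {
        "event_id": record["id"],
        "user_id": record["user"],
        "amount": float(record.get("amount", 0)),
    }

options = PipelineOptions()

with beam.Pipeline(options=options) as p:
    (
        p
        | "ReadFromGCS" >> beam.io.ReadFromText("gs://my-landing-bucket/events/*.json")
        | "Parse" >> beam.Map(parse_event)
        | "FilterValid" >> beam.Filter(lambda r: r["amount"] > 0)
        | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
            "my-project:analytics.events",   # table is assumed to already exist
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,
        )
    )
```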
Posted 1 week ago
2.0 - 6.0 years
0 Lacs
karnataka
On-site
As a GCP Senior Data Engineer/Architect, you will play a crucial role in our team by designing, developing, and implementing robust and scalable data solutions on the Google Cloud Platform (GCP). Collaborating closely with Architects and Business Analysts, especially for our US clients, you will translate data requirements into effective technical solutions. Your responsibilities will include designing and implementing scalable data warehouse and data lake solutions, orchestrating complex data pipelines, leading cloud data lake implementation projects, participating in cloud migration projects, developing containerized applications, optimizing SQL queries, writing automation scripts in Python, and utilizing various GCP data services such as BigQuery, Bigtable, and Cloud SQL. Your expertise in data warehouse and data lake design and implementation, experience in data pipeline development and tuning, hands-on involvement in cloud migration and data lake projects, proficiency in Docker and GKE, strong SQL and Python scripting skills, and familiarity with GCP services like BigQuery, Cloud SQL, Dataflow, and Composer will be essential for this role. Additionally, knowledge of data governance principles, experience with dbt, and the ability to work effectively within a team and adapt to project needs are highly valued. Strong communication skills, the willingness to work in UK shift timings, and the openness to giving and receiving feedback are important traits that will contribute to your success in this role.,
Posted 1 week ago
5.0 - 13.0 years
0 Lacs
karnataka
On-site
Dexcom Corporation is a pioneer and global leader in continuous glucose monitoring (CGM), with a vision to forever change how diabetes is managed and to provide actionable insights for better health outcomes. With a history of 25 years in the industry, Dexcom is broadening its focus beyond diabetes to empower individuals to take control of their health. The company is dedicated to developing solutions for serious health conditions and aims to become a leading consumer health technology company. The software quality team at Dexcom is collaborative and innovative, focusing on ensuring the reliability and performance of CGM systems. The team's mission is to build quality into every stage of the development lifecycle through smart automation, rigorous testing, and a passion for improving lives. They are seeking individuals who are eager to grow their skills while contributing to life-changing technology. As a member of the team, your responsibilities will include participating in building quality into products by writing automated tests, contributing to software requirements and design specifications, designing, developing, executing, and maintaining automated and manual test scripts, creating verification and validation test plans, traceability matrices, and test reports, as well as recording and tracking issues using the bug tracking system. You will also analyze test failures, collaborate with development teams to investigate root causes, and contribute to the continuous improvement of the release process. To be successful in this role, you should have 13 years of hands-on experience in software development or software test development using Python or other object-oriented programming languages. Experience with SQL and NoSQL databases, automated test development for API testing, automated testing frameworks like Robot Framework, API testing, microservices, distributed systems in cloud environments, automated UI testing, cloud platforms like Google Cloud or AWS, containerization tools such as Docker and Kubernetes, and familiarity with FDA design control processes in the medical device industry are desired qualifications. Additionally, knowledge of GCP tools like Airflow, Dataflow, and BigQuery, distributed event streaming platforms like Kafka, performance testing, CI/CD experience, and Agile development and test development experience are valued. Effective collaboration across functions, self-starting abilities, and clear communication skills are also essential for success in this role. Please note that Dexcom does not accept unsolicited resumes or applications from agencies. Staffing and recruiting agencies must be authorized to submit profiles, applications, or resumes on specific requisitions. Dexcom is not responsible for any fees related to unsolicited resumes or applications.,
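For context on the automated API testing this role describes, below is a small, illustrative pytest sketch using the requests library. The endpoint, payload fields, and status codes are hypothetical and are not Dexcom's actual API.

```python
# Illustrative only: a pytest-style automated API check; the service, paths, and fields are assumed.
import requests

BASE_URL = "https://api.example.internal/v1"  # hypothetical test environment

def test_create_and_fetch_reading():
    payload = {"device_id": "dev-123", "glucose_mg_dl": 104, "timestamp": "2024-01-01T08:00:00Z"}

    # Create a record and assert the expected (assumed) status code.
    create = requests.post(f"{BASE_URL}/readings", json=payload, timeout=10)
    assert create.status_code == 201

    # Read it back and verify the round trip preserved the value.
    reading_id = create.json()["id"]
    fetch = requests.get(f"{BASE_URL}/readings/{reading_id}", timeout=10)
    assert fetch.status_code == 200
    assert fetch.json()["glucose_mg_dl"] == payload["glucose_mg_dl"]
```

In practice, tests like this would typically be wrapped in fixtures for authentication and test-data cleanup, and wired into the CI/CD pipeline mentioned above.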
Posted 1 week ago
8.0 - 13.0 years
0 Lacs
hyderabad, telangana
On-site
You are an experienced GCP Data Engineer with 8+ years of expertise in designing and implementing robust, scalable data architectures on Google Cloud Platform. Your role involves defining and leading the implementation of data architecture strategies using GCP services to meet business and technical requirements. As a visionary GCP Data Architect, you will be responsible for architecting and optimizing scalable data pipelines using Google Cloud Storage, BigQuery, Dataflow, Cloud Composer, Dataproc, and Pub/Sub. You will design solutions for large-scale batch processing and real-time streaming, leveraging tools like Dataproc for distributed data processing. Your responsibilities also include establishing and enforcing data governance, security frameworks, and best practices for data management. You will conduct architectural reviews and performance tuning for GCP-based data solutions, ensuring cost-efficiency and scalability. Collaborating with cross-functional teams, you will translate business needs into technical requirements and deliver innovative data solutions. The required skills for this role include strong expertise in GCP services such as Google Cloud Storage, BigQuery, Dataflow, Cloud Composer, Dataproc, and Pub/Sub. Proficiency in designing and implementing data processing frameworks for ETL/ELT, batch, and real-time workloads is essential. You should have an in-depth understanding of data modeling, data warehousing, and distributed data processing using tools like Dataproc and Spark. Hands-on experience with Python, SQL, and modern data engineering practices is required. Your knowledge of data governance, security, and compliance best practices on GCP will be crucial in this role. Strong problem-solving, leadership, and communication skills are necessary for guiding teams and engaging stakeholders effectively.,
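As a small illustration of the streaming entry point in the architectures described above, here is a hedged sketch of publishing events to Pub/Sub with the Python client; the project, topic, and message fields are assumptions.

```python
# Illustrative only: publishing events to a Pub/Sub topic, the ingress of a streaming design.
# Project, topic, and payload fields are hypothetical.
import json
from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("my-project", "order-events")

def publish_event(event: dict) -> None:
    data = json.dumps(event).encode("utf-8")
    # Attributes can carry routing metadata so consumers need not parse the payload.
    future = publisher.publish(topic_path, data, source="pos-system")
    future.result(timeout=30)  # block until the broker acknowledges

publish_event({"order_id": "A-1001", "amount": 42.50})
```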
Posted 1 week ago
3.0 years
0 Lacs
Chennai, Tamil Nadu, India
On-site
Job Description As a Data Engineer, you will leverage your technical expertise in data, analytics, cloud technologies, and analytic software tools to identify best designs, improve business processes, and generate measurable business outcomes. You will work with Data Engineering teams from within D&A, across the Pro Tech portfolio and additional Ford organizations such as GDI&A (Global Data Insight & Analytics), Enterprise Connectivity, Ford Customer Service Division, Ford Credit, etc. Develop EL/ELT/ETL pipelines to make data available in BigQuery analytical data store from disparate batch, streaming data sources for the Business Intelligence and Analytics teams. Work with on-prem data sources (Hadoop, SQL Server), understand the data model, business rules behind the data and build data pipelines (with GCP, Informatica) for one or more Ford Pro verticals. This data will be landed in GCP BigQuery. Build cloud-native services and APIs to support and expose data-driven solutions. Partner closely with our data scientists to ensure the right data is made available in a timely manner to deliver compelling and insightful solutions. Design, build and launch shared data services to be leveraged by the internal and external partner developer community. Building out scalable data pipelines and choosing the right tools for the right job. Manage, optimize and Monitor data pipelines. Provide extensive technical, strategic advice and guidance to key stakeholders around data transformation efforts. Understand how data is useful to the enterprise. Responsibilities Bachelors Degree 3+ years of experience with SQL and Python 2+ years of experience with GCP or AWS cloud services; Strong candidates with 5+ years in a traditional data warehouse environment (ETL pipelines with Informatica) will be considered 3+ years of experience building out data pipelines from scratch in a highly distributed and fault-tolerant manner. Comfortable with a broad array of relational and non-relational databases. Proven track record of building applications in a data-focused role (Cloud and Traditional Data Warehouse) Qualifications Experience with GCP cloud services including BigQuery, Cloud Composer, Dataflow, CloudSQL, GCS, Cloud Functions and Pub/Sub. Inquisitive, proactive, and interested in learning new tools and techniques. Familiarity with big data and machine learning tools and platforms. Comfortable with open source technologies including Apache Spark, Hadoop, Kafka. 1+ year experience with Hive, Spark, Scala, JavaScript. Strong oral, written and interpersonal communication skills Comfortable working in a dynamic environment where problems are not always well-defined. M.S. in a science-based program and/or quantitative discipline with a technical emphasis.
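To illustrate the orchestration side of the pipelines described above, here is a minimal, hypothetical Airflow DAG of the kind Cloud Composer would schedule to land files from GCS into BigQuery. Bucket, dataset, and table names are assumptions, and a real DAG would add sensors, alerting, and data-quality checks.

```python
# Illustrative only: a daily GCS-to-BigQuery load orchestrated by Airflow/Cloud Composer.
# Bucket, dataset, and table names are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.transfers.gcs_to_bigquery import GCSToBigQueryOperator

with DAG(
    dag_id="daily_orders_to_bigquery",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    load_orders = GCSToBigQueryOperator(
        task_id="load_orders",
        bucket="my-landing-bucket",
        source_objects=["orders/{{ ds }}/*.csv"],   # templated with the logical date
        destination_project_dataset_table="my-project.raw.orders",
        source_format="CSV",
        skip_leading_rows=1,
        write_disposition="WRITE_APPEND",
        autodetect=True,
    )
```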
Posted 1 week ago
2.0 years
0 Lacs
Chennai, Tamil Nadu, India
On-site
Job Description We are seeking a Senior GCP Software Engineer with Data Engineering knowledge to be part of the Data products Development team. This individual contributor role is pivotal in delivering Data products at speed for our Analytical business customers. The role involves handling real time repair order data, designing and modeling solutions, and developing solutions for real world business problems. Expectations are that the candidate will be able to interface with business customers, management, and is knowledgeable about the GCP platform. Looking for a candidate that is self-motivated to expand their role, proactively address challenges that arise, share knowledge with the team, and is not afraid to think outside the box to identify solutions. Your skills shall be utilized to analyse and transform large datasets, support Analytics in GCP, and ensure development process and standards are met to sustain the growing infrastructure and business needs. Responsibilities Work on a small agile team to deliver curated data products for the Product Organization. Work effectively with fellow data engineers, product owners, data champions and other technical experts. Demonstrate technical knowledge and communication skills with the ability to advocate for well-designed solutions. Develop exceptional analytical data products using both streaming and batch ingestion patterns on Google Cloud Platform with solid data warehouse principles. Be the Subject Matter Expert in Data Engineering with a focus on GCP native services and other well integrated third-party technologies. Architect and implement sophisticated ETL pipelines, ensuring efficient data integration into Big Query from diverse batch and streaming sources. Spearhead the development and maintenance of data ingestion and analytics pipelines using cutting-edge tools and technologies, including Python, SQL, and DBT/Data form. Ensure the highest standards of data quality and integrity across all data processes. Data workflow management using Astronomer and Terraform for cloud infrastructure, promoting best practices in Infrastructure as Code Rich experience in Application Support in GCP. Experienced in data mapping, impact analysis, root cause analysis, and document data lineage to support robust data governance. Develop comprehensive documentation for data engineering processes, promoting knowledge sharing and system maintainability. Utilize GCP monitoring tools to proactively address performance issues and ensure system resilience, while providing expert production support. Provide strategic guidance and mentorship to team members on data transformation initiatives, championing data utility within the enterprise. Ability to model data products to implement standardization and optimization of data products from inception Qualifications Experience working in API services (Kafka topics) and GCP native (or equivalent) services like Big Query, Google Cloud Storage, PubSub, Dataflow, Dataproc etc. Experience working with Airflow for scheduling and orchestration of data pipelines. Experience working with Terraform to provision Infrastructure as Code. 2 + years professional development experience in Java or Python. Bachelor’s degree in computer science or related scientific field. Experience in analysing complex data, organizing raw data, and integrating massive datasets from multiple data sources to build analytical domains and reusable data products. 
Experience in working with architects to evaluate and productionalize data pipelines for data ingestion, curation, and consumption. Experience in working with stakeholders to formulate business problems as technical data requirements, identify and implement technical solutions while ensuring key business drivers are captured in collaboration with product management.
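Since the role ingests real-time repair-order data from Kafka topics, the following is an illustrative consumer sketch using the confluent-kafka Python client. The broker address, topic, group id, and payload fields are assumptions, and the downstream hand-off is only indicated by a comment.

```python
# Illustrative only: consuming a Kafka topic with confluent-kafka; names are hypothetical.
import json
from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "broker.internal:9092",
    "group.id": "repair-order-curation",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["repair-orders"])

try:
    while True:
        msg = consumer.poll(timeout=1.0)
        if msg is None:
            continue
        if msg.error():
            print(f"Consumer error: {msg.error()}")
            continue
        order = json.loads(msg.value().decode("utf-8"))
        # Hand off to the curation step (e.g., stage into BigQuery for the silver layer).
        print(f"Received repair order {order.get('order_id')}")
finally:
    consumer.close()
```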
Posted 1 week ago
7.0 years
0 Lacs
India
Remote
Job Title: MS Fabric Solution Engineer lead and Architect role Experience: 7-10 Years Location: Remote Budget : 1.2 LPM for 7+Years(lead role) & 1.4 LPM for 8+Years(Architect) Shift : IST JD for MS Fabric Solution Engineer Key Responsibilities: ● Lead the technical design, architecture, and hands-on implementation of Microsoft Fabric PoCs. This includes translating business needs into effective data solutions, often applying Medallion Architecture principles within the Lakehouse.. ● Develop and optimize ELT/ETL pipelines for diverse data sources: o Static data (e.g., CIM XML, equipment models, Velocity Suite data). o Streaming data (e.g., measurements from grid devices, Event Hub and IoT Hub). ● Seamlessly integrate Fabric with internal systems (e.g., CRM, ERP) using RESTful APIs, data mirroring, Azure Integration Services, and CDC (Change Data Capture) mechanisms. ● Hands-on configuration and management of core Fabric components: OneLake, Lakehouse, Notebooks (PySpark/KQL), and Real-Time Analytics databases. ● Facilitate data access via GraphQL interfaces, Power BI Embedded, and Direct Lake connections, ensuring optimal performance for self-service BI and adhering to RLS/OLS. ● Work closely with Microsoft experts, SMEs, and stakeholders. ● Document architecture, PoC results, and provide recommendations for production readiness and data governance (e.g., Purview integration). ______________ Required Skills & Experience: ● 5–10 years of experience in Data Engineering / BI / Cloud Analytics, with at least 1–2 projects using Microsoft Fabric (or strong Power BI + Synapse background transitioning to Fabric). ● Proficient in: o OneLake, Data Factory, Lakehouse, Real-Time Intelligence, Dataflow Gen2 o Ingestion using CIM XML, CSV, APIs, SDKs o Power BI Embedded, GraphQL interfaces o Azure Notebooks / PySpark / Fabric SDK ● Experience with data modeling (asset registry, nomenclature alignment, schema mapping). ● Familiarity with real-time streaming (Kafka/Kinesis/IoT Hub) and data governance concepts. ● Strong problem-solving and debugging skills. ● Prior experience with PoC/Prototype-style projects with tight timelines. ______________ Good to Have: ● Knowledge of grid operations / energy asset management systems. ● Experience working on Microsoft-Azure joint engagements. ● Understanding of AI/ML workflow integration via Azure AI Foundry or similar. ● Relevant certifications: DP-600/700 or DP-203.
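As a brief illustration of the Medallion-style Lakehouse work described above, here is a hedged PySpark sketch of a bronze-to-silver promotion step of the sort a Fabric notebook might run; the table and column names are hypothetical.

```python
# Illustrative only: bronze-to-silver promotion in a Medallion layout; table names are hypothetical.
from pyspark.sql import SparkSession, functions as F

# In a Fabric notebook a SparkSession named `spark` is provided; getOrCreate() also works standalone.
spark = SparkSession.builder.getOrCreate()

# Bronze table of raw device measurements, assumed to be registered in the Lakehouse.
bronze = spark.read.table("bronze_grid_measurements")

silver = (
    bronze.filter(F.col("measurement_value").isNotNull())
          .withColumn("measured_at", F.to_timestamp("measured_at"))
          .dropDuplicates(["device_id", "measured_at"])
)

# Full overwrite keeps this sketch idempotent; a real pipeline would merge or partition-overwrite.
silver.write.mode("overwrite").format("delta").saveAsTable("silver_grid_measurements")
```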
Posted 1 week ago
0 years
0 Lacs
Noida, Uttar Pradesh, India
On-site
Role Description

Role Proficiency: This role requires proficiency in data pipeline development, including coding and testing data pipelines for ingesting, wrangling, transforming, and joining data from various sources. Must be skilled in ETL tools such as Informatica, Glue, Databricks, and Dataproc, with coding expertise in Python, PySpark, and SQL. Works independently and has a deep understanding of data warehousing solutions including Snowflake, BigQuery, Lakehouse, and Delta Lake. Capable of calculating costs and understanding performance issues related to data solutions.

Outcomes: Act creatively to develop pipelines and applications by selecting appropriate technical options, optimizing application development, maintenance, and performance using design patterns and reusing proven solutions. Interpret requirements to create optimal architecture and design, developing solutions in accordance with specifications. Document and communicate milestones/stages for end-to-end delivery. Code adhering to best coding standards; debug and test solutions to deliver best-in-class quality. Perform performance tuning of code and align it with the appropriate infrastructure to optimize efficiency. Validate results with user representatives, integrating the overall solution seamlessly. Develop and manage data storage solutions, including relational databases, NoSQL databases, and data lakes. Stay updated on the latest trends and best practices in data engineering, cloud technologies, and big data tools. Influence and improve customer satisfaction through effective data solutions.

Measures of Outcomes: Adherence to engineering processes and standards. Adherence to schedule/timelines. Adherence to SLAs where applicable. Number of defects post delivery. Number of non-compliance issues. Reduction of recurrence of known defects. Quick turnaround of production bugs. Completion of applicable technical/domain certifications. Completion of all mandatory training requirements. Efficiency improvements in data pipelines (e.g., reduced resource consumption, faster run times). Average time to detect, respond to, and resolve pipeline failures or data issues. Number of data security incidents or compliance breaches.

Outputs Expected: Code Development: Develop data processing code independently, ensuring it meets performance and scalability requirements. Define coding standards, templates, and checklists. Review code for team members and peers. Documentation: Create and review templates, checklists, guidelines, and standards for design processes and development. Create and review deliverable documents including design documents, architecture documents, infrastructure costing, business requirements, source-target mappings, test cases, and results. Configuration: Define and govern the configuration management plan. Ensure compliance within the team. Testing: Review and create unit test cases, scenarios, and execution plans. Review the test plan and test strategy developed by the testing team. Provide clarifications and support to the testing team as needed. Domain Relevance: Advise data engineers on the design and development of features and components, demonstrating a deeper understanding of business needs. Learn about customer domains to identify opportunities for value addition. Complete relevant domain certifications to enhance expertise. Project Management: Manage the delivery of modules effectively. Defect Management: Perform root cause analysis (RCA) and mitigation of defects. Identify defect trends and take proactive measures to improve quality.
Estimation: Create and provide input for effort and size estimation for projects. Knowledge Management: Consume and contribute to project-related documents, SharePoint libraries, and client universities. Review reusable documents created by the team. Release Management: Execute and monitor the release process to ensure smooth transitions. Design Contribution: Contribute to the creation of high-level design (HLD), low-level design (LLD), and system architecture for applications, business components, and data models. Customer Interface: Clarify requirements and provide guidance to the development team. Present design options to customers and conduct product demonstrations. Team Management: Set FAST goals and provide constructive feedback. Understand team members' aspirations and provide guidance and opportunities for growth. Ensure team engagement in projects and initiatives. Certifications: Obtain relevant domain and technology certifications to stay competitive and informed.

Skill Examples: Proficiency in SQL, Python, or other programming languages used for data manipulation. Experience with ETL tools such as Apache Airflow, Talend, Informatica, AWS Glue, Dataproc, and Azure ADF. Hands-on experience with cloud platforms like AWS, Azure, or Google Cloud, particularly with data-related services (e.g., AWS Glue, BigQuery). Conduct tests on data pipelines and evaluate results against data quality and performance specifications. Experience in performance tuning of data processes. Expertise in designing and optimizing data warehouses for cost efficiency. Ability to apply and optimize data models for efficient storage, retrieval, and processing of large datasets. Capacity to clearly explain and communicate design and development aspects to customers. Ability to estimate time and resource requirements for developing and debugging features or components.

Knowledge Examples: Knowledge of various ETL services offered by cloud providers, including Apache PySpark, AWS Glue, GCP Dataproc/Dataflow, Azure ADF, and ADLF. Proficiency in SQL for analytics, including windowing functions. Understanding of data schemas and models relevant to various business contexts. Familiarity with domain-related data and its implications. Expertise in data warehousing optimization techniques. Knowledge of data security concepts and best practices. Familiarity with design patterns and frameworks in data engineering.

Additional Comments: Skills: Cloud Platforms (AWS, MS Azure, GCP, etc.); Containerization and Orchestration (Docker, Kubernetes, etc.); API development; Data pipeline construction using languages like Python, PySpark, and SQL; Data Streaming (Kafka, Azure Event Hub, etc.); Data Parsing (Akka, MinIO, etc.); Database Management (SQL and NoSQL, including ClickHouse, PostgreSQL, etc.); Agile methodology and tooling (Git, Jenkins, or Azure DevOps, etc.); JS-based connectors/frameworks for frontend/backend; Collaboration and Communication Skills.

Skills: AWS Cloud, Azure Cloud, Docker, Kubernetes
Posted 1 week ago
7.0 years
0 Lacs
India
On-site
Job Title: MS Fabric Solution Engineer Architect Experience: 7-10 Years Shift : IST JD for MS Fabric Solution Engineer Key Responsibilities: ● Lead the technical design, architecture, and hands-on implementation of Microsoft Fabric PoCs. This includes translating business needs into effective data solutions, often applying Medallion Architecture principles within the Lakehouse.. ● Develop and optimize ELT/ETL pipelines for diverse data sources: o Static data (e.g., CIM XML, equipment models, Velocity Suite data). o Streaming data (e.g., measurements from grid devices, Event Hub and IoT Hub). ● Seamlessly integrate Fabric with internal systems (e.g., CRM, ERP) using RESTful APIs, data mirroring, Azure Integration Services, and CDC (Change Data Capture) mechanisms. ● Hands-on configuration and management of core Fabric components: OneLake, Lakehouse, Notebooks (PySpark/KQL), and Real-Time Analytics databases. ● Facilitate data access via GraphQL interfaces, Power BI Embedded, and Direct Lake connections, ensuring optimal performance for self-service BI and adhering to RLS/OLS. ● Work closely with Microsoft experts, SMEs, and stakeholders. ● Document architecture, PoC results, and provide recommendations for production readiness and data governance (e.g., Purview integration). ______________ Required Skills & Experience: ● 7-10 years of experience in Data Engineering / BI / Cloud Analytics, with at least 1–2 projects using Microsoft Fabric (or strong Power BI + Synapse background transitioning to Fabric). ● Proficient in: o OneLake, Data Factory, Lakehouse, Real-Time Intelligence, Dataflow Gen2 o Ingestion using CIM XML, CSV, APIs, SDKs o Power BI Embedded, GraphQL interfaces o Azure Notebooks / PySpark / Fabric SDK ● Experience with data modeling (asset registry, nomenclature alignment, schema mapping). ● Familiarity with real-time streaming (Kafka/Kinesis/IoT Hub) and data governance concepts. ● Strong problem-solving and debugging skills. ● Prior experience with PoC/Prototype-style projects with tight timelines. ______________ Good to Have: ● Knowledge of grid operations / energy asset management systems. ● Experience working on Microsoft-Azure joint engagements. ● Understanding of AI/ML workflow integration via Azure AI Foundry or similar. ● Relevant certifications: DP-600/700 or DP-203. If Intrested. Please submit your CV to Khushboo@Sourcebae.com or share it via WhatsApp at 8827565832 Stay updated with our latest job opportunities and company news by following us on LinkedIn: :https://www.linkedin.com/company/sourcebae
Posted 1 week ago
5.0 years
0 Lacs
Gurgaon
On-site
Manager EXL/M/1430835 ServicesGurgaon Posted On 23 Jul 2025 End Date 06 Sep 2025 Required Experience 5 - 10 Years Basic Section Number Of Positions 1 Band C1 Band Name Manager Cost Code D012515 Campus/Non Campus NON CAMPUS Employment Type Permanent Requisition Type New Max CTC 1800000.0000 - 3000000.0000 Complexity Level Not Applicable Work Type Hybrid – Working Partly From Home And Partly From Office Organisational Group Analytics Sub Group Retail Media & Hi-Tech Organization Services LOB Retail Media & Hi-Tech SBU Services Country India City Gurgaon Center EXL - Gurgaon Center 38 Skills Skill BIG DATA ETL JAVA SPARK Minimum Qualification ANY GRADUATE Certification No data available Job Description Job Title: Senior Data Engineer – Big Data, ETL & Java Experience Level: 5+ Years Employment Type: Full-time About the Role EXL is seeking a Senior Software Engineer with a strong foundation in Java, along with expertise in Big Data technologies and ETL development. In this role, you'll design and implement scalable, high-performance data and backend systems for clients in retail, media, and other data-driven industries. You’ll work across cloud platforms such as AWS and GCP to build end-to-end data and application pipelines. Key Responsibilities Design, develop, and maintain scalable data pipelines and ETL workflows using Apache Spark, Apache Airflow, and cloud platforms (AWS/GCP). Build and support Java-based backend components, services, or APIs as part of end-to-end data solutions. Work with large-scale datasets to support transformation, integration, and real-time analytics. Optimize Spark, SQL, and Java processes for performance, scalability, and reliability. Collaborate with cross-functional teams to understand business requirements and deliver robust solutions. Follow engineering best practices in coding, testing, version control, and deployment. Required Qualifications 5+ years of hands-on experience in software or data engineering. Proven experience in developing ETL pipelines using Java and Spark. Strong programming experience in Java (preferably with frameworks such as Spring or Spring Boot). Experience in Big Data tools including Apache Spark, Apache Airflow, and cloud services such as AWS EMR, Glue, S3, Lambda or GCP BigQuery, Dataflow, Cloud Functions. Proficiency in SQL and experience with performance tuning for large datasets. Familiarity with data modeling, warehousing, and distributed systems. Experience working in Agile development environments. Strong problem-solving skills and attention to detail. Excellent communication skills Preferred Qualifications Experience building and integrating RESTful APIs or microservices using Java. Exposure to data platforms like Snowflake, Databricks, or Kafka. Background in retail, merchandising, or media domains is a plus. Familiarity with CI/CD pipelines, DevOps tools, and cloud-based development workflows. Workflow Workflow Type L&S-DA-Consulting
Posted 1 week ago
0 years
0 Lacs
Kizhake Chālakudi
On-site
Company Description: We are a technology startup located in Chalakudy, Thrissur, Kerala, specializing in the design and development of advanced Battery Management Systems (BMS) for Electric Vehicles (EVs) and Energy Storage Systems (ESS). Our solutions are engineered for performance, safety, and scalability to support the evolving demands of the clean energy and e-mobility sectors.

Role Description: We are inviting applications for an Embedded Software Internship program at Shade Energy Pvt. Ltd., suitable for candidates interested in firmware programming, modelling and simulation, C programming, and dataflow programming.

Skills Preferred: Excellent basic C programming. Familiarity with C programming concepts (such as RTOS, state machines, OOP, and structures). Familiarity with dataflow programming tools like LabVIEW, MATLAB Simulink, etc.

Other Preferences: Candidates in Kerala preferred, especially within 100 km proximity. Ready to work in a competitive startup firm.

Job Type: Internship. Contract length: 6 months. Benefits: Commuter assistance. Application Question(s): Are you ready to relocate/commute to this location: Chalakudy, Thrissur, Kerala? Education: Bachelor's (Required). Work Location: In person
Posted 1 week ago
3.0 - 7.0 years
2 - 10 Lacs
India
Remote
Job Title: ETL Automation Tester (SQL, Python, Cloud) Location: [On-site / Remote / Hybrid – City, State or “Anywhere, USA”] Employment Type: [Full-time / Contract / C2C / Part Time ] NOTE : Candidate has to work US Night Shifts Job Summary: We are seeking a highly skilled ETL Automation Tester with expertise in SQL , Python scripting , and experience working with Cloud technologies such as Azure, AWS, or GCP . The ideal candidate will be responsible for designing and implementing automated testing solutions to ensure the accuracy, performance, and reliability of ETL pipelines and data integration processes. Key Responsibilities: Design and implement test strategies for ETL processes and data pipelines. Develop automated test scripts using Python and integrate them into CI/CD pipelines. Validate data transformations and data integrity across source, staging, and target systems. Write complex SQL queries for test data creation, validation, and result comparison. Perform cloud-based testing on platforms such as Azure Data Factory, AWS Glue, or GCP Dataflow/BigQuery. Collaborate with data engineers, analysts, and DevOps teams to ensure seamless data flow and test coverage. Log, track, and manage defects through tools like JIRA, Azure DevOps, or similar. Participate in performance and volume testing for large-scale datasets. Required Skills and Qualifications: 3–7 years of experience in ETL/data warehouse testing. Strong hands-on experience in SQL (joins, CTEs, window functions, aggregation). Proficient in Python for automation scripting and data manipulation. Solid understanding of ETL tools such as Informatica, Talend, SSIS, or custom Python-based ETL. Experience with at least one Cloud Platform : Azure : Data Factory, Synapse, Blob Storage AWS : Glue, Redshift, S3 GCP : Dataflow, BigQuery, Cloud Storage Familiarity with data validation , data quality , and data profiling techniques. Experience with CI/CD tools such as Jenkins, GitHub Actions, or Azure DevOps. Excellent problem-solving, communication, and documentation skills. Preferred Qualifications: Knowledge of Apache Airflow , PySpark , or Databricks . Experience with containerization (Docker) and orchestration tools (Kubernetes). ISTQB or similar testing certification. Familiarity with Agile methodologies and Scrum ceremonies . Job Types: Part-time, Contractual / Temporary, Freelance Contract length: 6 months Pay: ₹18,074.09 - ₹86,457.20 per month Expected hours: 40 per week Benefits: Work from home
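For illustration of the source-versus-target validation this role centres on, below is a hedged pytest/SQLAlchemy sketch that reconciles row counts and totals between a source database and BigQuery. Connection URLs, table names, and the tolerance are assumptions, and the BigQuery URL requires the sqlalchemy-bigquery dialect.

```python
# Illustrative only: source-vs-target reconciliation tests; connections and tables are hypothetical.
import pytest
from sqlalchemy import create_engine, text

SOURCE_URL = "postgresql://etl_test@source-db/sales"   # assumed test credentials
TARGET_URL = "bigquery://my-project/analytics"          # requires the sqlalchemy-bigquery package

@pytest.fixture(scope="module")
def engines():
    return create_engine(SOURCE_URL), create_engine(TARGET_URL)

def scalar(engine, sql):
    with engine.connect() as conn:
        return conn.execute(text(sql)).scalar()

def test_row_counts_match(engines):
    source, target = engines
    src = scalar(source, "SELECT COUNT(*) FROM orders WHERE order_date = CURRENT_DATE - 1")
    tgt = scalar(target, "SELECT COUNT(*) FROM orders "
                         "WHERE order_date = DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)")
    assert src == tgt

def test_amount_totals_match(engines):
    source, target = engines
    src = scalar(source, "SELECT SUM(net_amount) FROM orders WHERE order_date = CURRENT_DATE - 1")
    tgt = scalar(target, "SELECT SUM(net_amount) FROM orders "
                         "WHERE order_date = DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)")
    assert abs(float(src) - float(tgt)) < 0.01   # small tolerance for rounding differences
```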
Posted 1 week ago
0 years
0 Lacs
Chennai
On-site
Join our team focused on Google Cloud Data Messaging Services, leveraging technologies like Pub/Sub and Kafka to build scalable, decoupled, and resilient cloud-native applications. This position involves close collaboration with development teams, as well as product vendors, to implement and support the suite of Data Messaging Services offered within GCP and Confluent Kafka.
GCP Data Messaging Services provide powerful capabilities for handling streaming data and asynchronous communication. Key benefits include:
Enabling real-time data processing and event-driven architectures
Decoupling applications for improved resilience and scalability
Leveraging managed services like Cloud Pub/Sub and integrating with Kafka environments (Apache Kafka, Confluent Cloud)
Providing highly scalable and available infrastructure for data streams
Enhancing automation for messaging setup and management
Supporting Infrastructure as Code practices for messaging components
The Data Messaging Services Specialist plays a crucial role as the corporation migrates and onboards applications that rely on robust data streaming and asynchronous communication onto GCP Pub/Sub and Confluent Kafka. This position requires staying abreast of the continual evolution of cloud data technologies and understanding how GCP messaging services like Pub/Sub, alongside Kafka, integrate with other native services such as Cloud Run and Dataflow within the new Ford Standard app hosting environment to meet customer needs. This is an exciting opportunity to work on highly visible data streaming technologies that are becoming industry standards for real-time data processing.
Required skills and experience:
Highly motivated individual with strong technical skills and an understanding of emerging data streaming technologies, including Google Pub/Sub, Kafka, Tekton, and Terraform.
Experience with Apache Kafka or Confluent Cloud Kafka, including concepts like brokers, topics, partitions, producers, consumers, and consumer groups.
Working experience with CI/CD pipelines, including building continuous integration and deployment pipelines using Tekton or similar technologies for applications interacting with Pub/Sub or Kafka.
Understanding of GitOps and other DevOps processes and principles as applied to managing messaging infrastructure and application deployments.
Understanding of Google Identity and Access Management (IAM) concepts and the authentication/authorization options for securing access to Pub/Sub and Kafka.
Knowledge of a programming language commonly used for developing messaging producers and consumers (e.g., Java, Python, Go).
Experience with public cloud platforms (preferably GCP), with a focus on data messaging services.
Understanding of agile methodologies and concepts, or experience working in an agile environment.
Responsibilities:
Develop a solid understanding of Google Cloud Pub/Sub and Kafka (Apache Kafka and/or Confluent Cloud).
Gain experience in using Git/GitHub and CI/CD pipelines for deploying messaging-related clusters and infrastructure.
Collaborate with Business IT and business owners to prioritize improvement efforts related to data messaging patterns and infrastructure.
Work with team members to establish best practices for designing, implementing, and operating scalable and reliable data messaging solutions.
Identify opportunities for adopting new data streaming technologies and patterns to solve existing needs and anticipate future challenges.
Create and maintain Terraform modules and documentation for provisioning and managing Pub/Sub topics/subscriptions, Kafka clusters, and related networking configurations, often with a paired partner.
Develop automated processes to simplify the experience for application teams adopting Pub/Sub and Kafka client libraries and deployment patterns.
Improve continuous integration tooling by automating manual processes within the delivery pipeline for messaging applications and enhancing quality gates based on past learnings.
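To make the Pub/Sub producer/consumer pattern concrete, here is a minimal Python sketch using the google-cloud-pubsub client library. The project, topic, and subscription IDs are hypothetical placeholders, not values from this posting; the equivalent Kafka flow would use a producer/consumer client against brokers and topics instead.

```python
# Minimal Pub/Sub publish/subscribe sketch (hypothetical project/topic/subscription IDs).
# Requires: pip install google-cloud-pubsub, with GCP credentials configured locally.
from concurrent.futures import TimeoutError
from google.cloud import pubsub_v1

PROJECT_ID = "my-project"              # placeholder
TOPIC_ID = "orders-events"             # placeholder
SUBSCRIPTION_ID = "orders-events-sub"  # placeholder

def publish_message(payload: bytes) -> str:
    publisher = pubsub_v1.PublisherClient()
    topic_path = publisher.topic_path(PROJECT_ID, TOPIC_ID)
    future = publisher.publish(topic_path, data=payload)
    return future.result()  # blocks until the server returns a message ID

def consume_messages(timeout: float = 30.0) -> None:
    subscriber = pubsub_v1.SubscriberClient()
    subscription_path = subscriber.subscription_path(PROJECT_ID, SUBSCRIPTION_ID)

    def callback(message: pubsub_v1.subscriber.message.Message) -> None:
        print(f"Received: {message.data!r}")
        message.ack()  # acknowledge so the message is not redelivered

    streaming_pull = subscriber.subscribe(subscription_path, callback=callback)
    with subscriber:
        try:
            streaming_pull.result(timeout=timeout)
        except TimeoutError:
            streaming_pull.cancel()
            streaming_pull.result()

if __name__ == "__main__":
    print("Published message ID:", publish_message(b"hello from the messaging team"))
    consume_messages()
```

The topic and subscription themselves would normally be provisioned through the Terraform modules described above rather than created by application code.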
Posted 1 week ago
8.0 years
0 Lacs
Trivandrum, Kerala, India
On-site
Role Description Role Proficiency: This role requires proficiency in developing data pipelines including coding and testing for ingesting wrangling transforming and joining data from various sources. The ideal candidate should be adept in ETL tools like Informatica Glue Databricks and DataProc with strong coding skills in Python PySpark and SQL. This position demands independence and proficiency across various data domains. Expertise in data warehousing solutions such as Snowflake BigQuery Lakehouse and Delta Lake is essential including the ability to calculate processing costs and address performance issues. A solid understanding of DevOps and infrastructure needs is also required. Outcomes Act creatively to develop pipelines/applications by selecting appropriate technical options optimizing application development maintenance and performance through design patterns and reusing proven solutions. Support the Project Manager in day-to-day project execution and account for the developmental activities of others. Interpret requirements create optimal architecture and design solutions in accordance with specifications. Document and communicate milestones/stages for end-to-end delivery. Code using best standards debug and test solutions to ensure best-in-class quality. Tune performance of code and align it with the appropriate infrastructure understanding cost implications of licenses and infrastructure. Create data schemas and models effectively. Develop and manage data storage solutions including relational databases NoSQL databases Delta Lakes and data lakes. Validate results with user representatives integrating the overall solution. Influence and enhance customer satisfaction and employee engagement within project teams. Measures Of Outcomes TeamOne's Adherence to engineering processes and standards TeamOne's Adherence to schedule / timelines TeamOne's Adhere to SLAs where applicable TeamOne's # of defects post delivery TeamOne's # of non-compliance issues TeamOne's Reduction of reoccurrence of known defects TeamOne's Quickly turnaround production bugs Completion of applicable technical/domain certifications Completion of all mandatory training requirementst Efficiency improvements in data pipelines (e.g. reduced resource consumption faster run times). TeamOne's Average time to detect respond to and resolve pipeline failures or data issues. TeamOne's Number of data security incidents or compliance breaches. Code Outputs Expected: Develop data processing code with guidance ensuring performance and scalability requirements are met. Define coding standards templates and checklists. Review code for team and peers. Documentation Create/review templates checklists guidelines and standards for design/process/development. Create/review deliverable documents including design documents architecture documents infra costing business requirements source-target mappings test cases and results. Configure Define and govern the configuration management plan. Ensure compliance from the team. Test Review/create unit test cases scenarios and execution. Review test plans and strategies created by the testing team. Provide clarifications to the testing team. Domain Relevance Advise data engineers on the design and development of features and components leveraging a deeper understanding of business needs. Learn more about the customer domain and identify opportunities to add value. Complete relevant domain certifications. Manage Project Support the Project Manager with project inputs. 
Provide inputs on project plans or sprints as needed. Manage the delivery of modules.
Manage Defects: Perform defect root cause analysis (RCA) and mitigation. Identify defect trends and implement proactive measures to improve quality.
Estimate: Create and provide input for effort and size estimation and plan resources for projects.
Manage Knowledge: Consume and contribute to project-related documents, SharePoint libraries, and client universities. Review reusable documents created by the team.
Release: Execute and monitor the release process.
Design: Contribute to the creation of design (HLD, LLD, SAD)/architecture for applications, business components, and data models.
Interface With Customer: Clarify requirements and provide guidance to the Development Team. Present design options to customers. Conduct product demos. Collaborate closely with customer architects to finalize designs.
Manage Team: Set FAST goals and provide feedback. Understand team members' aspirations and provide guidance and opportunities. Ensure team members are upskilled. Engage the team in projects. Proactively identify attrition risks and collaborate with BSE on retention measures.
Certifications: Obtain relevant domain and technology certifications.
Skill Examples: Proficiency in SQL, Python, or other programming languages used for data manipulation. Experience with ETL tools such as Apache Airflow, Talend, Informatica, AWS Glue, Dataproc, and Azure ADF. Hands-on experience with cloud platforms like AWS, Azure, or Google Cloud, particularly with data-related services (e.g., AWS Glue, BigQuery). Conduct tests on data pipelines and evaluate results against data quality and performance specifications. Experience in performance tuning. Experience in data warehouse design and cost improvements. Apply and optimize data models for efficient storage, retrieval, and processing of large datasets. Communicate and explain design/development aspects to customers. Estimate time and resource requirements for developing/debugging features/components. Participate in RFP responses and solutioning. Mentor team members and guide them in relevant upskilling and certification.
Knowledge Examples: Knowledge of various ETL services used by cloud providers, including Apache PySpark, AWS Glue, GCP Dataproc/Dataflow, Azure ADF, and ADLS. Proficient in SQL for analytics and windowing functions. Understanding of data schemas and models. Familiarity with domain-related data. Knowledge of data warehouse optimization techniques. Understanding of data security concepts. Awareness of patterns, frameworks, and automation practices.
Additional Comments: We are seeking a highly experienced Senior Data Engineer to design, develop, and optimize scalable data pipelines in a cloud-based environment. The ideal candidate will have deep expertise in PySpark, SQL, and Azure Databricks, with experience on either AWS or GCP. A strong foundation in data warehousing, ELT/ETL processes, and dimensional modeling (Kimball/star schema) is essential for this role.
Must-Have Skills:
8+ years of hands-on experience in data engineering or big data development.
Strong proficiency in PySpark and SQL for data transformation and pipeline development.
Experience working in Azure Databricks or equivalent Spark-based cloud platforms.
Practical knowledge of cloud data environments – Azure, AWS, or GCP.
Solid understanding of data warehousing concepts, including Kimball methodology and star/snowflake schema design.
Proven experience designing and maintaining ETL/ELT pipelines in production.
Familiarity with version control (e.g., Git), CI/CD practices, and data pipeline orchestration tools (e.g., Airflow, Azure Data Factory).
Skills: Azure Data Factory, Azure Databricks, PySpark, SQL
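Since this posting centers on PySpark transformations and star-schema modeling, here is a minimal, hypothetical PySpark sketch that builds a small fact table by joining staged orders to a date dimension. The paths, column names, and aggregation are illustrative assumptions only, not part of the job description.

```python
# Minimal PySpark star-schema sketch (hypothetical paths, columns, and measures).
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("star_schema_sketch").getOrCreate()

# Staged source data (placeholder path); on Databricks this might be a Delta table instead.
orders = spark.read.parquet("/mnt/staging/orders")  # assumed columns: order_id, customer_id, order_ts, amount

# Build a simple date dimension from the distinct order dates.
dim_date = (
    orders
    .select(F.to_date("order_ts").alias("order_date"))
    .distinct()
    .withColumn("date_key", F.date_format("order_date", "yyyyMMdd").cast("int"))
    .withColumn("year", F.year("order_date"))
    .withColumn("month", F.month("order_date"))
)

# Fact table keyed by the surrogate date key, with simple aggregate measures.
fact_orders = (
    orders
    .withColumn("order_date", F.to_date("order_ts"))
    .join(dim_date, on="order_date", how="left")
    .groupBy("date_key", "customer_id")
    .agg(F.sum("amount").alias("total_amount"), F.count("order_id").alias("order_count"))
)

# Write the fact table partitioned by the date key (placeholder location).
fact_orders.write.mode("overwrite").partitionBy("date_key").parquet("/mnt/warehouse/fact_orders")
```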
Posted 1 week ago
3.0 years
0 Lacs
India
On-site
Job Summary: We are looking for a skilled Data Engineer with strong experience in Google Cloud Platform (GCP) and Apache Airflow to design, build, and maintain scalable data pipelines and infrastructure. The ideal candidate should have a strong foundation in data engineering best practices, ETL/ELT processes, and cloud-native tools to support data-driven decision-making.
Key Responsibilities:
Design and implement scalable, reliable data pipelines and data ingestion using Airflow on GCP.
Build ETL/ELT workflows to ingest, transform, and load structured and unstructured data.
Work with tools like Cloud Storage, Cloud Composer, and Pub/Sub.
Collaborate with data analysts, data scientists, and business stakeholders to understand data requirements.
Optimize data processing for performance and cost-efficiency.
Monitor, troubleshoot, and ensure the reliability of data pipelines and jobs.
Required Skills:
3+ years of experience as a Data Engineer.
Strong hands-on experience with GCP services (Cloud Composer, Dataflow, etc.).
Proficiency in Apache Airflow for workflow orchestration.
Solid programming skills in Python and SQL.
Experience with CI/CD and version control (Git).
Good understanding of data modeling, data governance, and pipeline performance tuning.
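As a rough sketch of the Airflow-on-GCP orchestration described above, here is a minimal, hypothetical DAG with an extract task followed by a load task. The DAG ID, schedule, and task bodies are illustrative placeholders; on GCP such a DAG would typically be deployed to a Cloud Composer environment.

```python
# Minimal Airflow DAG sketch (hypothetical DAG ID, schedule, and task bodies).
# Assumes Airflow 2.x; on GCP this would usually run inside Cloud Composer.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract(**context):
    # Placeholder: pull data from a source system or a GCS bucket.
    print("extracting data for", context["ds"])

def load(**context):
    # Placeholder: load transformed data into BigQuery or another target.
    print("loading data for", context["ds"])

with DAG(
    dag_id="daily_ingestion_sketch",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
    tags=["example"],
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)
    extract_task >> load_task
```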
Posted 1 week ago
7.0 - 10.0 years
1 - 6 Lacs
Chennai
Work from Office
Key Responsibilities:
Design and develop large-scale data pipelines using GCP services (BigQuery, Dataflow, Dataproc, Pub/Sub).
Implement batch and real-time ETL/ELT pipelines using Apache Beam and Spark.
Manage and optimize BigQuery queries, partitioning, clustering, and cost control.
Build distributed processing jobs on Dataproc (Hadoop/Spark) clusters.
Develop and maintain streaming data pipelines with Pub/Sub and Dataflow.
Work with Cloud Spanner to support highly available and globally scalable databases.
Integrate data from various sources, manage schema evolution, and ensure data quality.
Collaborate with data analysts, data scientists, and business teams to deliver scalable data solutions.
Follow CI/CD, DevOps, and infrastructure-as-code best practices using tools like Terraform or Cloud Build.
Monitor, debug, and tune data pipelines for performance and reliability.
Must-Have Skills:
GCP expertise: BigQuery, Dataflow, Dataproc, Cloud Spanner, Pub/Sub.
Strong SQL skills and performance optimization in BigQuery.
Solid experience in streaming (real-time) and batch processing.
Proficiency in Apache Beam, Apache Spark, or similar frameworks.
Python or Java for data processing logic.
Understanding of data architecture, pipeline design patterns, and distributed systems.
Experience with IAM roles, service accounts, and GCP security best practices.
Familiarity with monitoring tools – Stackdriver, Dataflow job metrics, BigQuery query plans.
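To illustrate the kind of streaming Dataflow pipeline this role calls for, here is a minimal, hypothetical Apache Beam (Python SDK) sketch that reads JSON events from Pub/Sub and appends them to a BigQuery table. The subscription, table, and schema are placeholder assumptions.

```python
# Minimal Apache Beam streaming sketch (hypothetical subscription, table, and schema).
# Requires: pip install "apache-beam[gcp]"; run on Dataflow with --runner=DataflowRunner.
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions, StandardOptions

SUBSCRIPTION = "projects/my-project/subscriptions/events-sub"  # placeholder
OUTPUT_TABLE = "my-project:analytics.events"                   # placeholder
TABLE_SCHEMA = "event_id:STRING,user_id:STRING,amount:FLOAT"   # placeholder schema

def parse_event(raw: bytes) -> dict:
    # Decode a JSON payload published to Pub/Sub into a BigQuery row dict.
    return json.loads(raw.decode("utf-8"))

def run() -> None:
    options = PipelineOptions()
    options.view_as(StandardOptions).streaming = True

    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "ReadFromPubSub" >> beam.io.ReadFromPubSub(subscription=SUBSCRIPTION)
            | "ParseJSON" >> beam.Map(parse_event)
            | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
                OUTPUT_TABLE,
                schema=TABLE_SCHEMA,
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
                create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
            )
        )

if __name__ == "__main__":
    run()
```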
Posted 1 week ago
0.0 years
0 Lacs
Kizhake Chalakudi, Kerala
On-site
Company Description: We are a technology startup located in Chalakudy, Thrissur, Kerala, specializing in the design and development of advanced Battery Management Systems (BMS) for Electric Vehicles (EVs) and Energy Storage Systems (ESS). Our solutions are engineered for performance, safety, and scalability to support the evolving demands of the clean energy and e-mobility sectors.
Role Description: We are inviting applications for an Embedded Software Internship program at Shade Energy Pvt. Ltd., suitable for candidates interested in firmware programming, modelling and simulation, C programming, and dataflow programming.
Skills Preferred:
Excellent grasp of basic C programming
Familiarity with C programming concepts (such as RTOS, state machines, OOP, and structures)
Familiarity with dataflow programming tools like LabVIEW, MATLAB Simulink, etc.
Other Preferences:
Candidates in Kerala preferred, especially those within roughly 100 km of the office
Ready to work in a competitive startup firm
Job Type: Internship
Contract length: 6 months
Benefits: Commuter assistance
Application Question(s): Are you ready to relocate/commute to this location: Chalakudy, Thrissur, Kerala?
Education: Bachelor's (Required)
Work Location: In person
Posted 1 week ago
3.0 - 7.0 years
0 Lacs
karnataka
On-site
As a Data Specialist, you will be responsible for utilizing your expertise in ETL fundamentals, SQL, BigQuery, Dataproc, Python, Data Catalog, data warehousing, and various other tools to contribute to the successful implementation of data projects. Your role will involve working with technologies such as Cloud Trace, Cloud Logging, Cloud Storage, and Data Fusion to build and maintain a modern data platform.
To excel in this position, you should possess a minimum of 5 years of experience in the data engineering field, with a focus on the GCP cloud data implementation suite, including BigQuery, Pub/Sub, Dataflow/Apache Beam, Airflow/Composer, and Cloud Storage. Your strong understanding of very large-scale data architecture and hands-on experience with data warehouses, data lakes, and analytics platforms will be crucial to the success of our projects.
Key Requirements:
Minimum 5 years of experience in data engineering
Hands-on experience with the GCP cloud data implementation suite
Strong expertise in GBQ (Google BigQuery) queries, Python, Apache Airflow, and SQL (BigQuery preferred)
Extensive hands-on experience with SQL and Python for working with data
If you are passionate about data and have a proven track record of delivering results in a fast-paced environment, we invite you to apply for this exciting opportunity to be a part of our dynamic team.
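As a small illustration of the BigQuery-plus-Python work this role describes, here is a minimal, hypothetical sketch using the google-cloud-bigquery client to run a parameterized query. The project, dataset, table, and columns are placeholders.

```python
# Minimal BigQuery query sketch (hypothetical project/dataset/table and columns).
# Requires: pip install google-cloud-bigquery, with GCP credentials configured locally.
from google.cloud import bigquery

def daily_order_totals(min_amount: float):
    client = bigquery.Client(project="my-project")  # placeholder project ID

    query = """
        SELECT DATE(order_ts) AS order_date, SUM(amount) AS total_amount
        FROM `my-project.analytics.orders`          -- placeholder table
        WHERE amount >= @min_amount
        GROUP BY order_date
        ORDER BY order_date
    """
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("min_amount", "FLOAT64", min_amount),
        ]
    )
    # Runs the query and iterates over the result rows.
    for row in client.query(query, job_config=job_config).result():
        yield row.order_date, row.total_amount

if __name__ == "__main__":
    for order_date, total in daily_order_totals(10.0):
        print(order_date, total)
```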
Posted 1 week ago
5.0 years
0 Lacs
Gurgaon, Haryana, India
On-site
Job Title: Senior Data Engineer – Big Data, ETL & Java
Experience Level: 5+ Years
Employment Type: Full-time
About The Role: EXL is seeking a Senior Software Engineer with a strong foundation in Java, along with expertise in Big Data technologies and ETL development. In this role, you'll design and implement scalable, high-performance data and backend systems for clients in retail, media, and other data-driven industries. You'll work across cloud platforms such as AWS and GCP to build end-to-end data and application pipelines.
Key Responsibilities:
Design, develop, and maintain scalable data pipelines and ETL workflows using Apache Spark, Apache Airflow, and cloud platforms (AWS/GCP).
Build and support Java-based backend components, services, or APIs as part of end-to-end data solutions.
Work with large-scale datasets to support transformation, integration, and real-time analytics.
Optimize Spark, SQL, and Java processes for performance, scalability, and reliability.
Collaborate with cross-functional teams to understand business requirements and deliver robust solutions.
Follow engineering best practices in coding, testing, version control, and deployment.
Required Qualifications:
5+ years of hands-on experience in software or data engineering.
Proven experience developing ETL pipelines using Java and Spark.
Strong programming experience in Java (preferably with frameworks such as Spring or Spring Boot).
Experience with Big Data tools including Apache Spark and Apache Airflow, and cloud services such as AWS EMR, Glue, S3, Lambda or GCP BigQuery, Dataflow, Cloud Functions.
Proficiency in SQL and experience with performance tuning for large datasets.
Familiarity with data modeling, warehousing, and distributed systems.
Experience working in Agile development environments.
Strong problem-solving skills and attention to detail.
Excellent communication skills.
Preferred Qualifications:
Experience building and integrating RESTful APIs or microservices using Java.
Exposure to data platforms like Snowflake, Databricks, or Kafka.
Background in retail, merchandising, or media domains is a plus.
Familiarity with CI/CD pipelines, DevOps tools, and cloud-based development workflows.
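Relating to the "optimize Spark and SQL processes" responsibility above, here is a minimal, hypothetical PySpark sketch showing a broadcast join, one common tactic for speeding up joins between a large dataset and a small lookup table. The paths and column names are illustrative assumptions; since this posting emphasizes Java, a Java or Scala Spark job would follow the same pattern with the equivalent API.

```python
# Minimal PySpark broadcast-join sketch (hypothetical paths and columns).
# Broadcasting the small lookup table avoids shuffling the large dataset.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("broadcast_join_sketch").getOrCreate()

events = spark.read.parquet("/data/events")        # large dataset (placeholder path)
countries = spark.read.parquet("/data/countries")  # small lookup table (placeholder path)

# Hint Spark to ship the small table to every executor instead of shuffling both sides.
enriched = events.join(F.broadcast(countries), on="country_code", how="left")

# A simple aggregate over the enriched data; enriched.explain() would show a BroadcastHashJoin.
summary = enriched.groupBy("country_name").agg(F.count("*").alias("event_count"))
summary.show()
```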
Posted 1 week ago