14.0 - 20.0 years
0 Lacs
Maharashtra
On-site
As a Senior Architect - Data & Cloud at our company, you will be responsible for architecting, designing, and implementing end-to-end data pipelines and data integration solutions for varied structured and unstructured data sources and targets. You will need more than 15 years of experience in Technical, Solutioning, and Analytical roles, with 5+ years specifically in building and managing Data Lakes, Data Warehouse, Data Integration, Data Migration, and Business Intelligence/Artificial Intelligence solutions on Cloud platforms like GCP, AWS, or Azure.

Key Responsibilities:
- Translate business requirements into functional and non-functional areas, defining boundaries in terms of Availability, Scalability, Performance, Security, and Resilience.
- Architect and design scalable data warehouse solutions on cloud platforms like BigQuery or Redshift.
- Work with various Data Integration and ETL technologies on Cloud such as Spark, PySpark/Scala, Dataflow, DataProc, EMR, etc.
- Apply deep knowledge of Cloud and On-Premise databases like Cloud SQL, Cloud Spanner, Bigtable, RDS, Aurora, DynamoDB, Oracle, Teradata, MySQL, DB2, SQL Server, etc.
- Bring exposure to NoSQL databases like MongoDB, CouchDB, Cassandra, graph databases, etc.
- Use traditional ETL tools like Informatica, DataStage, OWB, Talend, etc.
- Collaborate with internal and external stakeholders to design optimized data analytics solutions.
- Mentor young talent within the team and contribute to building assets and accelerators.

Qualifications Required:
- 14-20 years of relevant experience in the field.
- Strong understanding of Cloud solutions for IaaS, PaaS, SaaS, Containers, and Microservices Architecture and Design.
- Experience with BI Reporting and Dashboarding tools like Looker, Tableau, Power BI, SAP BO, Cognos, Superset, etc.
- Knowledge of security features and policies in Cloud environments like GCP, AWS, or Azure.
- Ability to compare products and tools across technology stacks on Google, AWS, and Azure Cloud.

In this role, you will lead multiple data engagements on GCP Cloud for data lakes, data engineering, data migration, data warehouse, and business intelligence. You will interface with multiple stakeholders within IT and business to understand data requirements and take complete responsibility for the successful delivery of projects. Additionally, you will have the opportunity to work in a high-growth startup environment, contribute to the digital transformation journey of customers, and collaborate with a diverse and proactive team of techies. Please note that flexible, remote working options are available to foster productivity and work-life balance.
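The pipeline work described above can be pictured with a short PySpark sketch. This is an illustrative example only, not the employer's actual stack: the bucket paths, column names, and join key are hypothetical placeholders.

```python
# Illustrative batch ETL sketch: ingest a structured and a semi-structured
# source, transform, and land an analytics-ready table. Paths are placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders_batch_etl").getOrCreate()

# Ingest a structured CSV source and a semi-structured JSON source.
orders = spark.read.option("header", True).csv("gs://example-bucket/raw/orders/")
events = spark.read.json("gs://example-bucket/raw/click_events/")

# Transform: type the columns, deduplicate, and join the two sources.
orders_clean = (
    orders
    .withColumn("order_ts", F.to_timestamp("order_ts"))
    .withColumn("amount", F.col("amount").cast("double"))
    .dropDuplicates(["order_id"])
)
enriched = orders_clean.join(
    events.select("order_id", "campaign_id"), on="order_id", how="left"
)

# Load: write a partitioned, analytics-ready table (Parquet here; a BigQuery or
# Redshift writer would replace this sink in a real engagement).
enriched.write.mode("overwrite").partitionBy("order_date").parquet(
    "gs://example-bucket/curated/orders/"
)
```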
Posted 1 day ago
14.0 - 20.0 years
0 Lacs
Maharashtra
On-site
Role Overview: As a Principal Architect - Data & Cloud at Quantiphi, you will be responsible for leveraging your extensive experience in technical, solutioning, and analytical roles to architect and design end-to-end data pipelines and data integration solutions for structured and unstructured data sources and targets. You will play a crucial role in building and managing data lakes, data warehouses, data integration, and business intelligence/artificial intelligence solutions on Cloud platforms like GCP, AWS, and Azure. Your expertise will be instrumental in designing scalable data warehouse solutions on BigQuery or Redshift and working with various data integration, storage, and pipeline tools on Cloud. Additionally, you will serve as a trusted technical advisor to customers, lead multiple data engagements on GCP Cloud, and contribute to the development of assets and accelerators.

Key Responsibilities:
- Possess more than 15 years of experience in technical, solutioning, and analytical roles
- Have 5+ years of experience in building and managing data lakes, data warehouses, data integration, and business intelligence/artificial intelligence solutions on Cloud platforms like GCP, AWS, and Azure
- Understand business requirements, translate them into functional and non-functional areas, and define boundaries in terms of availability, scalability, performance, security, and resilience
- Architect, design, and implement end-to-end data pipelines and data integration solutions for structured and unstructured data sources and targets
- Work with distributed computing and enterprise environments like Hadoop and Cloud platforms
- Use various data integration and ETL technologies on Cloud such as Spark, PySpark/Scala, Dataflow, DataProc, EMR, etc.
- Apply deep knowledge of Cloud and On-Premise databases like Cloud SQL, Cloud Spanner, Bigtable, RDS, Aurora, DynamoDB, Oracle, Teradata, MySQL, DB2, SQL Server, etc.
- Bring exposure to NoSQL databases like MongoDB, CouchDB, Cassandra, graph databases, etc.
- Design scalable data warehouse solutions on Cloud with tools like S3, Cloud Storage, Athena, Glue, Sqoop, Flume, Hive, Kafka, Pub/Sub, Kinesis, Dataflow, DataProc, Airflow, Composer, Spark SQL, Presto, EMRFS, etc.
- Work with Machine Learning frameworks like TensorFlow and PyTorch
- Understand Cloud solutions for IaaS, PaaS, SaaS, Containers, and Microservices Architecture and Design
- Demonstrate a good understanding of BI Reporting and Dashboarding tools like Looker, Tableau, Power BI, SAP BO, Cognos, Superset, etc.
- Apply knowledge of security features and policies in Cloud environments like GCP, AWS, and Azure
- Work on business transformation projects for moving On-Premise data solutions to Cloud platforms
- Serve as a trusted technical advisor to customers, providing solutions for complex Cloud and Data-related technical challenges
- Be a thought leader in architecture design and development of cloud data analytics solutions
- Liaise with internal and external stakeholders to design optimized data analytics solutions
- Collaborate with SMEs and Solutions Architects from leading cloud providers to present solutions to customers
- Support Quantiphi Sales and GTM teams from a technical perspective in building proposals and SOWs
- Lead discovery and design workshops with potential customers globally
- Design and deliver thought leadership webinars and tech talks with customers and partners
- Identify areas for productization and feature enhancement for Quantiphi's product assets

Qualifications Required:
- Bachelor's or Master's degree in Computer Science, Information Technology, or a related field
- 14-20 years of experience in technical, solutioning, and analytical roles
- Strong expertise in building and managing data lakes, data warehouses, data integration, and business intelligence/artificial intelligence solutions on Cloud platforms like GCP, AWS, and Azure
- Proficiency in various data integration and ETL technologies on Cloud, and in Cloud and On-Premise databases
- Experience with Cloud solutions for IaaS, PaaS, SaaS, Containers, and Microservices Architecture and Design
- Knowledge of BI Reporting and Dashboarding tools and security features in Cloud environments

Additional Company Details: While technology is the heart of Quantiphi's business, the company attributes its success to its global and diverse culture built on transparency, diversity, integrity, learning, and growth. Working at Quantiphi provides you with the opportunity to be part of a culture that encourages innovation, excellence, and personal growth, fostering a work environment where you can thrive both professionally and personally. Joining Quantiphi means being part of a dynamic team of tech enthusiasts dedicated to translating data into tangible business value for clients. Flexible remote working options are available to promote productivity and work-life balance.
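One recurring step in the on-premise-to-cloud moves described above is offloading a warehouse table with a parallel JDBC read. The sketch below is a hypothetical illustration: the Oracle connection string, credentials, table, and bucket are placeholders, and it assumes the Oracle JDBC driver is on the Spark classpath.

```python
# Illustrative warehouse offload: pull an on-premise table over JDBC in parallel
# and land it as cloud-native Parquet. All connection details are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("warehouse_offload").getOrCreate()

# Partition the read on a numeric key so a large fact table does not funnel
# through a single JDBC connection.
sales = (
    spark.read.format("jdbc")
    .option("url", "jdbc:oracle:thin:@//onprem-host:1521/ORCLPDB1")
    .option("dbtable", "SALES.FACT_ORDERS")
    .option("user", "etl_user")
    .option("password", "****")
    .option("partitionColumn", "ORDER_ID")
    .option("lowerBound", "1")
    .option("upperBound", "100000000")
    .option("numPartitions", "16")
    .load()
)

# Land the data in cloud object storage, partitioned for downstream analytics.
sales.write.mode("overwrite").partitionBy("ORDER_DATE").parquet(
    "s3://example-lake/bronze/fact_orders/"
)
```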
Posted 3 days ago
8.0 - 12.0 years
30 - 45 Lacs
Hyderabad, Chennai, Bengaluru
Work from Office
About Koantek: Koantek is a Databricks Pure-Play Elite Partner, helping enterprises modernize faster and unlock the full power of Data and AI. Backed by Databricks Ventures and honored as a six-time Databricks Partner of the Year, we enable global enterprises to modernize at speed, operationalize AI, and realize the full value of their data. Our deep expertise spans industries such as healthcare, financial services, retail, and SaaS, delivering end-to-end solutions from rapid prototyping to production-scale AI deployments. We deliver tailored solutions that enable businesses to leverage data for growth and innovation. Our team of experts utilizes deep industry knowledge combined with cutting-edge technologies, tools, and methodologies to drive impactful results. By partnering with clients across a diverse range of industries, from emerging startups to established enterprises, we help them uncover new opportunities and achieve a competitive advantage in the digital age.

About the Role: As a Solutions Architect at Koantek, you will collaborate with customers to design scalable data architectures utilizing Databricks technology and services. The Solutions Architect at Koantek builds secure, highly scalable big data solutions to achieve tangible, data-driven outcomes, all while keeping simplicity and operational effectiveness in mind. Leveraging your technical expertise and business acumen, you will navigate complex technology discussions, showcasing the value of the Databricks platform throughout the sales process. Working alongside Account Executives, you will engage with customers' technical leaders, including architects, engineers, and operations teams, aiming to become a trusted advisor who delivers concrete outcomes. This role collaborates with teammates, product teams, and cross-functional project teams to lead the adoption and integration of the Databricks Platform into the enterprise ecosystem and AWS/Azure/GCP architecture.

The impact you will have:
- Develop Account Strategies: Work with Sales and other essential partners to develop strategies for your assigned accounts to grow their usage of the Databricks platform.
- Establish Architecture Standards: Establish the Databricks Lakehouse architecture as the standard data architecture for customers through excellent technical account planning.
- Demonstrate Value: Build and present reference architectures and demo applications to help prospects understand how Databricks can be used to achieve their goals and land new use cases.
- Capture Technical Wins: Consult on big data architectures, data engineering pipelines, and data science/machine learning projects to prove out Databricks technology for strategic customer projects. Validate integrations with cloud services and other third-party applications.
- Promote Open-Source Projects: Become an expert in and promote Databricks-inspired open-source projects (Spark, Delta Lake, MLflow) across developer communities through meetups, conferences, and webinars.

Technical Expertise:
- Experience translating a customer's business needs to technology solutions, including establishing buy-in with essential customer stakeholders at all levels of the business.
- Experience designing, architecting, and presenting data systems for customers and managing the delivery of production solutions of those data architectures.
- Projects delivered with hands-on development experience on Databricks.
- Expert-level knowledge of data frameworks, data lakes, and open-source projects such as Apache Spark, MLflow, and Delta Lake.
- Expert-level hands-on coding experience in Spark/Scala, Python, or PySpark.
- In-depth understanding of Spark architecture, including Spark Core, Spark SQL, DataFrames, Spark Streaming, RDD caching, and Spark MLlib.
- Experience with IoT/event-driven/microservices in the cloud.
- Deep experience with distributed computing with Spark, including knowledge of the Spark runtime.
- Experience with private and public cloud architectures, pros/cons, and migration considerations.
- Extensive hands-on experience implementing data migration and data processing using AWS/Azure/GCP services.
- Familiarity with CI/CD for production deployments.
- Familiarity with optimization for performance and scalability.
- Completed data engineering professional certification and required classes.
- SQL Proficiency: Fluent in SQL and database technology.

Educational Background: Degree in a quantitative discipline (Computer Science, Applied Mathematics, Operations Research). Relevant certifications (e.g., Databricks certifications, AWS/Azure/GCP AI/ML certifications) are a plus.

Workplace Flexibility: This is a hybrid role with remote flexibility. On-site presence at customer locations may be required based on project and business needs. Candidates should be willing and able to travel for short or medium-term assignments when necessary.
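Since the posting leans heavily on Delta Lake and the Databricks Lakehouse, a small upsert sketch may help picture the day-to-day work. It is illustrative only: paths and column names are hypothetical, and it assumes an environment (such as a Databricks cluster) where the Delta Lake libraries are available.

```python
# Illustrative incremental upsert into a Delta Lake table using the open-source
# Delta APIs. Paths and columns are placeholders.
from pyspark.sql import SparkSession
from delta.tables import DeltaTable

spark = SparkSession.builder.appName("customer_upsert").getOrCreate()

# New batch of records arriving from an upstream system.
updates = spark.read.json("/mnt/raw/customers/2024-06-01/")

target = DeltaTable.forPath(spark, "/mnt/lakehouse/silver/customers")

# MERGE keeps the table current without rewriting unchanged rows, and Delta's
# transaction log makes the operation atomic and time-travelable.
(
    target.alias("t")
    .merge(updates.alias("s"), "t.customer_id = s.customer_id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)
```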
Posted 4 days ago
4.0 - 7.0 years
6 - 9 Lacs
Bengaluru
Work from Office
What this job involves: JLL, an international real estate management company, is seeking a Data Engineer to join our JLL Technologies team. We are seeking candidates who are self-starters, able to work in a diverse and fast-paced environment as part of our Enterprise Data team. The candidate will be responsible for designing and developing data solutions that are strategic for the business, using the latest technologies: Azure Databricks, Python, PySpark, Spark SQL, Azure Functions, Delta Lake, and Azure DevOps CI/CD.

Responsibilities:
- Design, architect, and develop solutions leveraging cloud big data technology to ingest, process, and analyze large, disparate data sets to exceed business requirements.
- Design and develop data management and data persistence solutions for application use cases leveraging relational and non-relational databases, enhancing our data processing capabilities.
- Develop POCs to influence platform architects, product managers, and software engineers to validate solution proposals and migrate.
- Develop data lake solutions to store structured and unstructured data from internal and external sources, and provide technical guidance to help migrate colleagues to the modern technology platform.
- Contribute and adhere to CI/CD processes and development best practices, and strengthen the discipline in the Data Engineering organization.
- Develop systems that ingest, cleanse, and normalize diverse datasets, develop data pipelines from various internal and external sources, and build structure for previously unstructured data.
- Using PySpark and Spark SQL, extract, manipulate, and transform data from various sources, such as databases, data lakes, APIs, and files, to prepare it for analysis and modeling.
- Build and optimize ETL workflows using Azure Databricks and PySpark, including developing efficient data processing pipelines, data validation, error handling, and performance tuning.
- Perform unit testing, system integration testing, and regression testing, and assist with user acceptance testing.
- Articulate business requirements into a technical solution that can be designed and engineered.
- Consult with the business to develop documentation and communication materials to ensure accurate usage and interpretation of JLL data.
- Implement data security best practices, including data encryption, access controls, and compliance with data protection regulations. Ensure data privacy, confidentiality, and integrity throughout the data engineering processes.
- Perform data analysis required to troubleshoot data-related issues and assist in their resolution.

Experience & Education:
- Minimum of 4 years of experience as a data developer using Python, PySpark, Spark SQL, SQL Server, and ETL concepts.
- Bachelor's degree in Information Science, Computer Science, Mathematics, Statistics, or a quantitative discipline in science, business, or social science.
- Experience with the Azure cloud platform, Databricks, and Azure Storage.
- Effective written and verbal communication skills, including technical writing.
- Excellent technical, analytical, and organizational skills.

Technical Skills & Competencies:
- Experience handling unstructured and semi-structured data, working in a data lake environment, leveraging data streaming, and developing data pipelines driven by events/queues.
- Hands-on experience and knowledge of real-time/near-real-time processing, and ready to code.
- Hands-on experience in PySpark, Databricks, and Spark SQL.
- Knowledge of JSON, Parquet, and other file formats, and the ability to work effectively with them.
- Knowledge of NoSQL databases like HBase, MongoDB, Cosmos DB, etc.
- Preferred: cloud experience on Azure or AWS with Python/Spark, Spark Streaming, Azure SQL Server, Cosmos DB/Mongo DB, Azure Event Hubs, Azure Data Lake Storage, Azure Search, etc.
- Team player; reliable, self-motivated, and self-disciplined individual capable of executing multiple projects simultaneously within a fast-paced environment while working with cross-functional teams.
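As a rough picture of the PySpark/Spark SQL transformation and validation work listed above, here is a minimal, hypothetical sketch; the storage account, container paths, and field names are placeholders rather than JLL's actual pipeline.

```python
# Illustrative only: ingest semi-structured JSON, validate it, and write
# analytics-ready Parquet, routing bad rows to a quarantine location.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("lease_events_etl").getOrCreate()

raw = spark.read.json("abfss://raw@examplelake.dfs.core.windows.net/lease_events/")

# Flatten the nested payload and derive typed columns.
flat = raw.select(
    F.col("event_id"),
    F.col("payload.property_id").alias("property_id"),
    F.to_timestamp("payload.signed_at").alias("signed_at"),
    F.col("payload.monthly_rent").cast("double").alias("monthly_rent"),
)

# Simple validation: send rows failing basic checks to quarantine instead of
# silently dropping them.
valid = flat.filter(F.col("property_id").isNotNull() & (F.col("monthly_rent") > 0))
invalid = flat.subtract(valid)

valid.write.mode("append").parquet(
    "abfss://curated@examplelake.dfs.core.windows.net/lease_events/"
)
invalid.write.mode("append").parquet(
    "abfss://quarantine@examplelake.dfs.core.windows.net/lease_events/"
)
```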
Posted 5 days ago
4.0 - 7.0 years
6 - 9 Lacs
Thane
Work from Office
What this job involves: JLL, an international real estate management company, is seeking a Data Engineer to join our JLL Technologies team. We are seeking candidates who are self-starters, able to work in a diverse and fast-paced environment as part of our Enterprise Data team. The candidate will be responsible for designing and developing data solutions that are strategic for the business, using the latest technologies: Azure Databricks, Python, PySpark, Spark SQL, Azure Functions, Delta Lake, and Azure DevOps CI/CD.

Responsibilities:
- Design, architect, and develop solutions leveraging cloud big data technology to ingest, process, and analyze large, disparate data sets to exceed business requirements.
- Design and develop data management and data persistence solutions for application use cases leveraging relational and non-relational databases, enhancing our data processing capabilities.
- Develop POCs to influence platform architects, product managers, and software engineers to validate solution proposals and migrate.
- Develop data lake solutions to store structured and unstructured data from internal and external sources, and provide technical guidance to help migrate colleagues to the modern technology platform.
- Contribute and adhere to CI/CD processes and development best practices, and strengthen the discipline in the Data Engineering organization.
- Develop systems that ingest, cleanse, and normalize diverse datasets, develop data pipelines from various internal and external sources, and build structure for previously unstructured data.
- Using PySpark and Spark SQL, extract, manipulate, and transform data from various sources, such as databases, data lakes, APIs, and files, to prepare it for analysis and modeling.
- Build and optimize ETL workflows using Azure Databricks and PySpark, including developing efficient data processing pipelines, data validation, error handling, and performance tuning.
- Perform unit testing, system integration testing, and regression testing, and assist with user acceptance testing.
- Articulate business requirements into a technical solution that can be designed and engineered.
- Consult with the business to develop documentation and communication materials to ensure accurate usage and interpretation of JLL data.
- Implement data security best practices, including data encryption, access controls, and compliance with data protection regulations. Ensure data privacy, confidentiality, and integrity throughout the data engineering processes.
- Perform data analysis required to troubleshoot data-related issues and assist in their resolution.

Experience & Education:
- Minimum of 4 years of experience as a data developer using Python, PySpark, Spark SQL, SQL Server, and ETL concepts.
- Bachelor's degree in Information Science, Computer Science, Mathematics, Statistics, or a quantitative discipline in science, business, or social science.
- Experience with the Azure cloud platform, Databricks, and Azure Storage.
- Effective written and verbal communication skills, including technical writing.
- Excellent technical, analytical, and organizational skills.

Technical Skills & Competencies:
- Experience handling unstructured and semi-structured data, working in a data lake environment, leveraging data streaming, and developing data pipelines driven by events/queues.
- Hands-on experience and knowledge of real-time/near-real-time processing, and ready to code.
- Hands-on experience in PySpark, Databricks, and Spark SQL.
- Knowledge of JSON, Parquet, and other file formats, and the ability to work effectively with them.
- Knowledge of NoSQL databases like HBase, MongoDB, Cosmos DB, etc.
- Preferred: cloud experience on Azure or AWS with Python/Spark, Spark Streaming, Azure SQL Server, Cosmos DB/Mongo DB, Azure Event Hubs, Azure Data Lake Storage, Azure Search, etc.
- Team player; reliable, self-motivated, and self-disciplined individual capable of executing multiple projects simultaneously within a fast-paced environment while working with cross-functional teams.
Posted 5 days ago
5.0 - 9.0 years
0 Lacs
Karnataka
On-site
As a Senior Python Developer specializing in Python and Spark programming, you will be responsible for leveraging your thorough and hands-on knowledge to write efficient, reusable, and reliable Python code. You should possess a minimum of 5 years of experience in Python programming and have a strong proficiency in both Python and Spark programming. Your key responsibilities will include implementing Spark Core, Spark SQL, and Spark Streaming, working with Spark within the Hadoop ecosystem, and designing and implementing low-latency, high-availability, and performant applications. Additionally, you will lead and guide a team of junior Python developers, collaborate with other team members and stakeholders, and contribute to performance tuning, improvement, balancing, usability, and automation throughout the application development process. To excel in this role, you must have experience with data manipulation and analysis using Pandas, as well as knowledge of Polars for efficient data processing. Strong problem-solving skills, attention to detail, and the ability to work collaboratively in a team environment are essential qualities for success in this position. This is a full-time position based in either Chennai or Bangalore. If you are passionate about Python and Spark programming and are eager to take on a challenging role that involves leading a team and driving the development of high-performance applications, we encourage you to apply.
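To illustrate the Pandas and Polars skills called out above, here is a small, hypothetical comparison of the same daily aggregation in both libraries; the file and column names are placeholders, and the Polars lazy API shown assumes a recent Polars release.

```python
# Illustrative comparison of an eager Pandas aggregation and the equivalent
# lazy Polars query. Input data is hypothetical.
import pandas as pd
import polars as pl

# Pandas: eager, in-memory aggregation.
orders_pd = pd.read_csv("orders.csv", parse_dates=["order_date"])
daily_pd = (
    orders_pd.groupby(orders_pd["order_date"].dt.date)["amount"]
    .sum()
    .reset_index(name="daily_amount")
)

# Polars: a lazy scan lets the engine push filters down and parallelize the
# work, which is where the "efficient data processing" benefit typically comes from.
daily_pl = (
    pl.scan_csv("orders.csv", try_parse_dates=True)
    .filter(pl.col("amount") > 0)
    .group_by(pl.col("order_date").dt.date().alias("order_day"))
    .agg(pl.col("amount").sum().alias("daily_amount"))
    .collect()
)

print(daily_pd.head())
print(daily_pl.head())
```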
Posted 5 days ago
5.0 - 9.0 years
0 Lacs
Haryana
On-site
At Capgemini Invent, we believe that difference drives change. As inventive transformation consultants, we blend our strategic, creative, and scientific capabilities to collaborate closely with clients in delivering cutting-edge solutions tailored to address the challenges of today and tomorrow. Our approach is informed and validated by science and data, superpowered by creativity and design, and all underpinned by purpose-driven technology.

Your role will involve proficiency in various technologies such as MS Fabric, Azure Data Factory, Azure Synapse Analytics, Azure Databricks, Lakehouses, OneLake, Data Pipelines, Real-Time Analytics, Power BI integration, and semantic models. You will be responsible for integrating Fabric capabilities to ensure seamless data flow, governance, and collaboration across teams. A strong understanding of Delta Lake, Parquet, and distributed data systems is essential. Additionally, strong programming skills in Python, PySpark, Scala, or Spark SQL/T-SQL for data transformations are required.

In terms of your profile, we are looking for individuals with strong experience in the implementation and management of a Lakehouse using Databricks and the Azure tech stack (ADLS Gen2, ADF, Azure SQL). Proficiency in data integration techniques, ETL processes, and data pipeline architectures is crucial. An understanding of machine learning algorithms and AI/ML frameworks (such as TensorFlow and PyTorch) and Power BI will be an added advantage. MS Fabric and PySpark proficiency are must-have skills for this role.

Working with us, you will appreciate the significance of flexible work arrangements that support remote work or flexible work hours, allowing you to maintain a healthy work-life balance. Our commitment to your career growth is at the heart of our mission, offering an array of career growth programs and diverse professions to help you explore a world of opportunities. You will have the opportunity to equip yourself with valuable certifications in the latest technologies like Generative AI.

Capgemini is a global business and technology transformation partner that accelerates organizations' dual transition to a digital and sustainable world, creating tangible impact for enterprises and society. With a diverse team of over 340,000 members in more than 50 countries, Capgemini's strong heritage of over 55 years is built on trust from clients to unlock technology's value in addressing their entire business needs. The company delivers end-to-end services and solutions spanning strategy, design, and engineering, driven by market-leading capabilities in AI, generative AI, cloud, and data, complemented by deep industry expertise and a strong partner ecosystem.
Posted 5 days ago
2.0 - 7.0 years
7 - 17 Lacs
Bengaluru
Work from Office
About this role: Wells Fargo is seeking an Analytics Consultant. In this role, you will:
- Consult with business line and enterprise functions on less complex research
- Use functional knowledge to assist in non-model quantitative tools that support strategic decision making
- Perform analysis of findings and trends using statistical analysis and document the process
- Present recommendations to increase revenue, reduce expense, and maximize operational efficiency, quality, and compliance
- Identify and define business requirements and translate data and business needs into research and recommendations to improve efficiency
- Participate in all group technology efforts, including design and implementation of database structures, analytics software, storage, and processing
- Develop customized reports and ad hoc analyses to make recommendations and provide guidance to less experienced staff
- Understand compliance and risk management requirements for the supported area
- Ensure adherence to data management or data governance regulations and policies
- Participate in company initiatives or processes to assist in meeting risk and capital objectives and other strategic goals
- Collaborate and consult with more experienced consultants and with partners in technology and other business groups

Required Qualifications:
- 2+ years of Analytics experience, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, education

Desired Qualifications:
- Hands-on proficiency in Business Intelligence (BI), particularly Microsoft Power BI, MS Fabric, Power Automate, and Power Platform.
- Hands-on proficiency in any or all of the programming languages used for analytics and data science, such as SAS, Python, PySpark, Spark SQL, or Scala.
- Strong hands-on knowledge of SQL and experience with database management systems (e.g., Teradata, PostgreSQL, MySQL, or NoSQL databases).
- Familiarity with data warehousing and big data technologies (e.g., Hadoop, Spark, Snowflake, Redshift).
- Experience with ELT/ETL tools and data integration techniques.
- Experience optimizing code for performance and cost.
- Comfortable using code and agile process management tools like GitHub and JIRA.
- Exposure to developing solutions for high-volume, low-latency applications, and the ability to operate in a fast-paced, highly collaborative environment.
- Ability to provide production support for data assets and products as required.
- Knowledge of data modelling and data warehousing best practices.
- Understanding of data governance, data quality, and data security principles.
- Strong problem-solving and communication skills.
- Ability to work in a collaborative team environment.
- Knowledge of cloud platforms (e.g., Azure and/or Google Cloud) is a plus.

Job Expectations:
- Design, develop, and maintain ETL (Extract, Transform, Load) processes and data pipelines to move and transform data from various sources into a centralized data repository.
- Design, implement, and optimize data warehouses and data lakes to ensure scalability, performance, and data consistency.
- Create and manage data models to support business requirements, ensuring data accuracy, integrity, and accessibility.
- Integrate data from diverse sources, including databases, APIs, third-party services, and streaming data, and ensure data quality and consistency.
- Cleanse, transform, and enrich raw data to make it suitable for analysis and reporting.
- Implement and enforce data security measures to protect sensitive information and ensure compliance with data privacy regulations (e.g., GDPR, HIPAA).
- Independently build, operate, maintain, enhance, publish, and sunset BI products (owning the end-to-end life cycle) for enterprise stakeholders, along with up-to-date maintenance of all required documentation and artefacts such as SOPs, previous versions, and secondary quality reviews, in various BI tools such as Tableau, Power BI, etc.
- Continuously monitor and optimize data pipelines and databases for improved performance and efficiency.
- Develop and implement automated testing procedures to validate data quality and pipeline reliability.
- Maintain thorough documentation of data processes, schemas, and data lineage to support data governance efforts.
- Collaborate with the wider team, such as data scientists, analysts, software engineers, and other stakeholders, to understand their data requirements and provide data solutions that meet their needs.
- Utilize version control systems to manage code and configurations related to data pipelines.
- Diagnose and resolve data-related issues and provide technical support as needed.

Working Hours: 1:30 PM-10:30 PM India Time
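The automated data-quality testing mentioned above can be as simple as assertion-style checks that run as a pipeline step. The sketch below is illustrative only, not Wells Fargo's framework; the table, columns, and rules are hypothetical.

```python
# Illustrative data-quality checks written as plain PySpark assertions that can
# run as a pipeline step or inside a unit test.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("dq_checks").getOrCreate()
accounts = spark.read.parquet("/data/curated/accounts/")

failures = []

# Completeness: the primary key must never be null.
null_keys = accounts.filter(F.col("account_id").isNull()).count()
if null_keys > 0:
    failures.append(f"{null_keys} rows with null account_id")

# Uniqueness: one row per account.
dupes = accounts.groupBy("account_id").count().filter("count > 1").count()
if dupes > 0:
    failures.append(f"{dupes} duplicate account_id values")

# Validity: savings balances should not be negative.
bad_balance = accounts.filter(
    (F.col("account_type") == "SAVINGS") & (F.col("balance") < 0)
).count()
if bad_balance > 0:
    failures.append(f"{bad_balance} savings accounts with negative balance")

# Fail the pipeline step loudly instead of letting bad data flow downstream.
if failures:
    raise ValueError("Data quality checks failed: " + "; ".join(failures))
```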
Posted 6 days ago
5.0 - 9.0 years
0 Lacs
Karnataka
On-site
You are a highly skilled and experienced Data Modeler who will be joining our data engineering team. Your expertise lies in designing scalable and efficient data models for cloud platforms, with a focus on Oracle Data Warehousing and Databricks Lakehouse architecture. Your role will be crucial in our strategic transition from an on-prem Oracle data warehouse to a modern cloud-based Databricks platform. Your responsibilities will include designing and implementing conceptual, logical, and physical data models to meet business requirements and analytics needs. You will lead the migration of data models from the Oracle Data Warehouse to Databricks on the AWS or Azure cloud, reverse-engineer complex Oracle schemas, and collaborate with data architects and engineers to define optimal data structures in Databricks. Furthermore, you will optimize data models for performance, scalability, and cost-efficiency in a cloud-native environment, develop and maintain dimensional models using star and snowflake schemas, and ensure data governance standards are met through metadata management, data lineage, and documentation practices. Your input will be valuable in data architecture reviews and best practices in modeling and data pipeline integration. To be successful in this role, you should have at least 5 years of hands-on experience in data modeling, including conceptual, logical, and physical design. You must have proven experience in migrating large-scale Oracle DWH environments to Databricks Lakehouse or similar platforms, expertise in Oracle database schemas, PL/SQL, and performance tuning, as well as proficiency in Databricks, Delta Lake, Spark SQL, and DataFrame APIs. Deep knowledge of dimensional modeling techniques, familiarity with metadata management tools, and strong analytical and communication skills are essential. You should also be able to work collaboratively in Agile teams and effectively communicate data model designs to technical and non-technical stakeholders.
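As a toy illustration of the dimensional modeling described above, the sketch below creates a minimal star schema (one dimension plus a fact table) as Delta tables through Spark SQL; the table and column names are hypothetical, not a client model, and a Delta-enabled environment such as Databricks is assumed.

```python
# Hypothetical star-schema DDL issued through PySpark.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("star_schema_demo").getOrCreate()

spark.sql("""
    CREATE TABLE IF NOT EXISTS dim_customer (
        customer_key BIGINT,        -- surrogate key generated during ETL
        customer_id STRING,         -- natural key from the source system
        customer_name STRING,
        region STRING,
        effective_from DATE,        -- type-2 slowly changing dimension columns
        effective_to DATE,
        is_current BOOLEAN
    ) USING DELTA
""")

spark.sql("""
    CREATE TABLE IF NOT EXISTS fact_orders (
        order_id STRING,
        customer_key BIGINT,        -- foreign key to dim_customer
        order_date DATE,
        order_amount DECIMAL(18, 2),
        quantity INT
    ) USING DELTA
    PARTITIONED BY (order_date)
""")
```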
Posted 6 days ago
4.0 - 10.0 years
0 Lacs
Haryana
On-site
As a Data Engineer at Capgemini, you will have the opportunity to design, build, and optimize modern data solutions using your expertise in SQL, Spark SQL, Databricks, Unity Catalog, and PySpark. With a focus on data warehousing concepts and best practices, you will contribute to the development of scalable and high-performance data platforms. You will bring 4-10 years of experience in Data Engineering/ETL Development, showcasing your strong skills in SQL, Spark SQL, and Databricks on Azure or AWS. Your proficiency in PySpark for data processing and knowledge of Unity Catalog for data governance will be key in ensuring the success of data solutions. Additionally, your understanding of data warehousing principles, dimensional modeling, and familiarity with Azure Data Services or AWS Data Services will be advantageous. Your role will involve collaborating with teams globally, demonstrating strong customer orientation, decision-making, problem-solving, communication, and presentation skills. You will have the opportunity to shape compelling solutions, interact with multicultural teams, and exhibit leadership qualities to achieve common goals through collaboration. Capgemini, a global business and technology transformation partner, values diversity and innovation in a sustainable world. With a legacy of over 55 years, Capgemini is known for delivering end-to-end services and solutions in AI, cloud, data, and more. Join us in unlocking the value of technology and contributing to a more inclusive world.
Posted 6 days ago
2.0 - 6.0 years
0 Lacs
Maharashtra
On-site
Unlock your potential with Dassault Systèmes, a global leader in Scientific Software Engineering, as a Big Data Engineer in Pune, Maharashtra!

Role Description & Responsibilities:
- Data Pipeline Development: Design, develop, and maintain robust ETL pipelines for batch and real-time data ingestion, processing, and transformation using Spark, Kafka, and Python.
- Data Architecture: Build and optimize scalable data architectures, including data lakes, data marts, and data warehouses, to support business intelligence, reporting, and machine learning.
- Data Governance: Ensure data reliability, integrity, and governance by enabling accurate, consistent, and trustworthy data for decision-making.
- Collaboration: Work closely with data analysts, data scientists, and business stakeholders to gather requirements, identify inefficiencies, and deliver scalable and impactful data solutions.
- Optimization: Develop efficient workflows to handle large-scale datasets, improving performance and minimizing downtime.
- Documentation: Create detailed documentation for data processes, pipelines, and architecture to support seamless collaboration and knowledge sharing.
- Innovation: Contribute to a thriving data engineering culture by introducing new tools, frameworks, and best practices to improve data processes across the organization.

Qualifications:
- Educational Background: Bachelor's degree in Computer Science, Engineering, or a related field.
- Professional Experience: 2-3 years of experience in data engineering, with expertise in designing and managing complex ETL pipelines.

Technical Skills:
- Proficiency in Python, PySpark, and Spark SQL for distributed and real-time data processing.
- Deep understanding of real-time streaming systems using Kafka.
- Experience with data lake and data warehousing technologies (Hadoop, HDFS, Hive, Iceberg, Apache Spark).
- Strong knowledge of relational and non-relational databases (SQL, NoSQL).
- Experience in cloud and on-premises environments for building and managing data pipelines.
- Experience with ETL tools like SAP BODS or similar platforms.
- Knowledge of reporting tools like SAP BO for designing dashboards and reports.
- Hands-on experience building end-to-end data frameworks and working with data lakes.

Analytical and Problem-Solving Skills: Ability to translate complex business requirements into scalable and efficient technical solutions.

Collaboration and Communication: Strong communication skills and the ability to work with cross-functional teams, including analysts, scientists, and stakeholders.

Location: Willingness to work from Pune (on-site).

What is in it for you:
- Work for one of the biggest software companies.
- Work in a culture of collaboration and innovation.
- Opportunities for personal development and career progression.
- Chance to collaborate with various internal users of Dassault Systèmes and also stakeholders of various internal and partner projects.

Inclusion Statement: As a game-changer in sustainable technology and innovation, Dassault Systèmes is striving to build more inclusive and diverse teams across the globe. We believe that our people are our number one asset and we want all employees to feel empowered to bring their whole selves to work every day. It is our goal that our people feel a sense of pride and a passion for belonging. As a company leading change, it's our responsibility to foster opportunities for all people to participate in a harmonized Workforce of the Future.
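As an illustrative sketch of the Spark-plus-Kafka ingestion named above (not Dassault Systèmes' pipeline), the snippet below reads a topic with Structured Streaming, parses the JSON payload, and lands it in the lake; the broker addresses, topic, schema, and paths are hypothetical, and the spark-sql-kafka connector is assumed to be on the classpath.

```python
# Illustrative Kafka -> Spark Structured Streaming -> lake ingestion.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("kafka_ingest").getOrCreate()

event_schema = StructType([
    StructField("sensor_id", StringType()),
    StructField("reading", DoubleType()),
    StructField("event_time", TimestampType()),
])

# Read the topic as a stream; Kafka delivers the payload as bytes in `value`.
raw = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker1:9092,broker2:9092")
    .option("subscribe", "sensor-readings")
    .option("startingOffsets", "latest")
    .load()
)

parsed = raw.select(
    F.from_json(F.col("value").cast("string"), event_schema).alias("e")
).select("e.*")

# Write the parsed events to the lake; the checkpoint lets the stream resume
# exactly where it left off after a restart.
query = (
    parsed.writeStream.format("parquet")
    .option("path", "/data/bronze/sensor_readings/")
    .option("checkpointLocation", "/checkpoints/sensor_readings/")
    .outputMode("append")
    .start()
)
query.awaitTermination()
```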
Posted 6 days ago
5.0 - 9.0 years
0 Lacs
Pune, Maharashtra
On-site
This position falls under the ICG TTS Operations Technology (OpsTech) Group, focusing on assisting in the implementation of a next-generation Digital Automation Platform and Imaging Workflow Technologies. The ideal candidate should have relevant experience in managing development teams within the distributed systems ecosystem and must exhibit strong teamwork skills. The candidate is expected to possess superior technical knowledge of current programming languages, technologies, and leading-edge development tools. The primary objective of this role is to contribute to applications, systems analysis, and programming activities.

As a Lead Spark Scala Engineer, the candidate should have hands-on knowledge of Spark, PySpark, Scala, Java, and RDBMS like MS SQL/Oracle. Familiarity with CI/CD tools such as LightSpeed and uDeploy is also required.

Key Responsibilities include:
- Development & Optimization: Develop, test, and deploy production-grade Spark applications in Scala, ensuring optimal performance, scalability, and resource utilization.
- Technical Leadership: Provide guidance to a team of data engineers, promoting a culture of technical excellence and collaboration.
- Code Review & Best Practices: Conduct thorough code reviews, establish coding standards, and enforce best practices for Spark Scala development, data governance, and data quality.
- Performance Tuning: Identify and resolve performance bottlenecks in Spark applications through advanced tuning techniques.
- Deep Spark Expertise: Profound understanding of Spark's architecture, execution model, and optimization techniques.
- Scala Proficiency: Expert-level proficiency in Scala programming, including functional programming paradigms and object-oriented design.
- Big Data Ecosystem: Strong hands-on experience with the broader Hadoop ecosystem and related big data technologies.
- Database Knowledge: Solid understanding of relational and NoSQL databases.
- Communication: Excellent communication, interpersonal, and leadership skills to convey complex technical concepts effectively.
- Problem-Solving: Exceptional analytical and problem-solving abilities with meticulous attention to detail.

Education Requirement:
- Bachelor's degree/University degree or equivalent experience

This position is a full-time role falling under the Technology Job Family Group and Applications Development Job Family. The most relevant skills are those mentioned in the requirements section; additional complementary skills can be found above or by contacting the recruiter.
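Although this role is Scala-centric, the tuning techniques it references translate directly across Spark APIs; the short PySpark sketch below illustrates a broadcast join, caching of a reused DataFrame, and repartitioning before a write. It is illustrative only, with hypothetical paths and columns.

```python
# Illustrative Spark tuning patterns: broadcast join, caching, repartitioning.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("tuning_demo").getOrCreate()

transactions = spark.read.parquet("/data/transactions/")   # large fact data
branches = spark.read.parquet("/data/branches/")           # small lookup table

# Broadcast the small table so the join avoids a full shuffle of the fact data.
enriched = transactions.join(F.broadcast(branches), on="branch_id", how="left")

# Cache a DataFrame that feeds several downstream aggregations, so it is
# computed once instead of once per action.
enriched.cache()

daily_totals = (
    enriched.groupBy("branch_id", "txn_date").agg(F.sum("amount").alias("total"))
)
top_branches = daily_totals.orderBy(F.desc("total")).limit(10)

# Repartition by the write key to avoid many tiny output files per partition.
daily_totals.repartition("txn_date").write.mode("overwrite").partitionBy(
    "txn_date"
).parquet("/data/marts/daily_totals/")

top_branches.show()
enriched.unpersist()
```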
Posted 1 week ago
8.0 - 12.0 years
0 Lacs
Maharashtra
On-site
As a Data Engineer, you will be responsible for building scalable data pipelines using PySpark. Your role will involve implementing complex business logic using the Spark SQL, DataFrame, and RDD APIs. You should have strong programming skills in Python, with a solid understanding of data structures, algorithms, and software engineering principles. Your expertise in designing, developing, and maintaining batch and streaming data pipelines will be crucial. You should be familiar with ETL/ELT processes and best practices for data transformation, data quality, and performance optimization. Knowledge of the modern data engineering ecosystem, including distributed data processing, storage systems, and workflow orchestration tools like Apache Airflow, dbt, and Delta Lake, is desirable. Experience with cloud data platforms, preferably AWS, is preferred. You should have hands-on experience with AWS services such as S3 for the data lake, Glue/EMR for Spark workloads, Lambda and Step Functions for orchestration, and Redshift or other cloud data warehouses. As an expert in the Spark APIs, you should be able to choose and apply the right API (DataFrame, Dataset, RDD) for efficient implementation of business logic at scale. This role offers a 12+ month contract with a likely long-term opportunity, following a hybrid work mode with an immediate to 15-day notice period. If you have a passion for data engineering and the skills mentioned above, we would like to hear from you.
Posted 1 week ago
2.0 - 6.0 years
0 Lacs
Maharashtra
On-site
As a Data Engineer II at Media.net, you will be responsible for designing, executing, and managing large and complex distributed data systems. Your role will involve monitoring performance, optimizing existing projects, and researching and integrating Big Data tools and frameworks as required to meet business and data requirements. You will play a key part in implementing scalable solutions, creating reusable components and data tools, and collaborating with teams across the company to integrate with the data platform efficiently. The team you will be a part of ensures that every web page view is seamlessly processed through high-scale services, handling a large volume of requests across 5 million unique topics. Leveraging cutting-edge Machine Learning and AI technologies on a large Hadoop cluster, you will work with a tech stack that includes Java, Elasticsearch/Solr, Kafka, Spark, Machine Learning, NLP, Deep Learning, Redis, and Big Data technologies such as Hadoop, HBase, and YARN. To excel in this role, you should have 2 to 4 years of experience in big data technologies like Apache Hadoop and relational databases (MS SQL Server/Oracle/MySQL/Postgres). Proficiency in programming languages such as Java, Python, or Scala is required, along with expertise in SQL (T-SQL/PL-SQL/Spark SQL/HiveQL) and Apache Spark. Hands-on knowledge of working with DataFrames, Datasets, RDDs, and the Spark SQL/PySpark/Scala APIs, plus a deep understanding of performance optimizations, will be essential. Additionally, you should have a good grasp of distributed storage (HDFS/S3), strong analytical and quantitative skills, and experience with data integration across multiple sources. Experience with message queues like Apache Kafka, MPP systems such as Redshift/Snowflake, and NoSQL storage like MongoDB would be considered advantageous for this role. If you are passionate about working with cutting-edge technologies, collaborating with global teams, and contributing to the growth of a leading ad tech company, we encourage you to apply for this challenging and rewarding opportunity.
Posted 1 week ago
7.0 - 11.0 years
0 Lacs
Maharashtra
On-site
As a Databricks AWS/Azure/GCP Architect at Koantek based in Mumbai, you will play a crucial role in building secure and highly scalable big data solutions that drive tangible, data-driven outcomes while emphasizing simplicity and operational efficiency. Collaborating with teammates, product teams, and cross-functional project teams, you will lead the adoption and integration of the Databricks Lakehouse Platform into the enterprise ecosystem and AWS/Azure/GCP architecture. Your responsibilities will include implementing securely architected big data solutions that are operationally reliable, performant, and aligned with strategic initiatives. Your expertise should include expert-level knowledge of data frameworks, data lakes, and open-source projects like Apache Spark, MLflow, and Delta Lake. You should possess hands-on coding experience in Spark/Scala, Python, or PySpark. An in-depth understanding of Spark architecture, including Spark Core, Spark SQL, DataFrames, Spark Streaming, RDD caching, and Spark MLlib, is essential for this role. Experience in IoT/event-driven/microservices in the cloud, familiarity with private and public cloud architectures, and extensive hands-on experience in implementing data migration and data processing using AWS/Azure/GCP services are key requirements. With over 9 years of consulting experience and a minimum of 7 years in data engineering, data platforms, and analytics, you should have a proven track record of delivering projects with hands-on development experience on Databricks. Knowledge of at least one cloud platform (AWS, Azure, or GCP) is mandatory, along with deep experience in distributed computing with Spark and familiarity with Spark runtime internals. Additionally, you should be familiar with CI/CD for production deployments and optimization for performance and scalability, and have completed the data engineering professional certification and required classes. If you are a results-driven professional with a passion for architecting cutting-edge big data solutions and have the desired skill set, we encourage you to apply for this exciting opportunity.
Posted 1 week ago
5.0 - 9.0 years
0 Lacs
Pune, Maharashtra
On-site
As a Senior Databricks Developer at Newscape Consulting, you will play a crucial role in our data engineering team, focusing on building scalable and efficient data pipelines using Databricks, Apache Spark, Delta Lake, and cloud-native services (Azure/AWS/GCP). Your responsibilities will include collaborating closely with data architects, data scientists, and business stakeholders to deliver high-performance, production-grade solutions that enhance user experience and productivity in the healthcare industry. Your key skills should include strong hands-on experience with Databricks, including Workspaces, Jobs, DLT, Repos, and Unity Catalog. Proficiency in PySpark, Spark SQL, and optionally Scala is essential. You should also have a solid understanding of Delta Lake, Lakehouse architecture, and the medallion architecture. Additionally, proficiency in at least one cloud platform such as Azure, AWS, or GCP is required. Experience in CI/CD for Databricks using tools like Azure DevOps or GitHub Actions, strong SQL skills, and familiarity with data warehousing concepts are essential for this role. Knowledge of data governance, lineage, and catalog tools like Unity Catalog or Purview will be beneficial. Familiarity with orchestration tools like Airflow, Azure Data Factory, or Databricks Workflows is also desired. This position is based in Pune, India, and is a full-time role with the option to work from the office. Strong communication, problem-solving, and stakeholder management skills are key attributes that we are looking for in the ideal candidate for this role.
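To picture the medallion (Bronze/Silver/Gold) layering mentioned above, here is a compact, hypothetical PySpark/Delta sketch; the mount paths, columns, and business rules are placeholders, not Newscape's implementation.

```python
# Illustrative medallion layering: raw landing, cleansed records, BI aggregates.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("medallion_demo").getOrCreate()

# Bronze: land raw claim files as-is, adding only ingestion metadata.
bronze = (
    spark.read.json("/mnt/raw/claims/")
    .withColumn("_ingested_at", F.current_timestamp())
)
bronze.write.format("delta").mode("append").save("/mnt/lake/bronze/claims")

# Silver: cleanse and conform the bronze data into well-typed records.
silver = (
    spark.read.format("delta").load("/mnt/lake/bronze/claims")
    .filter(F.col("claim_id").isNotNull())
    .withColumn("claim_amount", F.col("claim_amount").cast("double"))
    .withColumn("claim_month", F.date_format(F.to_date("service_date"), "yyyy-MM"))
    .dropDuplicates(["claim_id"])
)
silver.write.format("delta").mode("overwrite").save("/mnt/lake/silver/claims")

# Gold: business-level aggregates ready for BI and reporting.
gold = silver.groupBy("provider_id", "claim_month").agg(
    F.sum("claim_amount").alias("total_claimed"),
    F.count("claim_id").alias("claim_count"),
)
gold.write.format("delta").mode("overwrite").save("/mnt/lake/gold/claims_by_provider")
```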
Posted 1 week ago
3.0 - 7.0 years
0 Lacs
Pune, Maharashtra
On-site
We are searching for a proficient Python Developer to become a valuable member of our product development team. Your primary focus will involve automating data ingestion, processing, and validation workflows utilizing PySpark to create robust, scalable, and efficient data pipelines. In this role, you will collaborate closely with data engineers, analysts, and stakeholders to provide impactful data solutions within a dynamic work environment.

Key Responsibilities:
- Collaborate with data engineers and stakeholders to define requirements and deliver automated solutions.
- Design, develop, and maintain scalable and efficient PySpark-based automation for data ingestion, processing, and calculation.
- Automate the reading and integration of data from multiple sources, such as databases, APIs, flat files, and cloud storage.
- Implement and optimize ETL workflows for high-performance data pipelines.
- Ensure data quality by incorporating validation checks and handling exceptions in data processing pipelines.
- Troubleshoot and resolve issues in data pipelines to maintain operational efficiency.

Required Skills and Qualifications:
- 3+ years of strong proficiency in Python for data handling and validation.
- Strong experience with Python libraries like pandas, NumPy, and DuckDB.
- Familiarity with cloud platforms such as AWS, Azure, or GCP (e.g., S3, Databricks, or BigQuery).
- Experience with data pipeline orchestration tools such as Apache Airflow or similar.
- Proficiency in SQL for querying and manipulating data.
- Experience in handling structured, semi-structured, and unstructured data.
- Familiarity with CI/CD processes and version control tools such as Git.
- Knowledge of performance tuning and optimization techniques in PySpark.
- Strong analytical and problem-solving skills.

Nice to have:
- Knowledge of Spark architecture, including RDDs, DataFrames, and Spark SQL.
- Knowledge of keyword-driven automation frameworks.
- Quality Engineering background with prior experience building automation solutions for data-heavy applications.
- Familiarity with REST APIs and data integration techniques.
- Understanding of data governance, compliance, and security principles.

This position is open for immediate joiners only.
Posted 1 week ago
3.0 - 5.0 years
17 - 18 Lacs
Chennai
Work from Office
Key Responsibilities:
- Develop and optimize data pipelines in Databricks for transforming and processing data from various sources.
- Integrate data using Unity Catalog and external data sources (data lakes, APIs, etc.).
- Write Spark SQL and PySpark scripts for data transformations, optimizations, and creating views/procedures.
- Perform data analysis to identify quality issues, optimize pipelines, and enhance data processing for analytics.
- Collaborate on report generation and dashboard creation with front-end teams.
- Use GitLab for version control, CI/CD automation, and task management (Jira).

Required Skills and Qualifications:
- Master's or Bachelor's degree in Data Engineering or a related field.
- 3+ years of experience with Databricks and advanced SQL.
- Experience with ETL processes, views, and procedures.
- Strong experience with Databricks, Spark SQL, PySpark, and SQL.
- Expertise in creating and optimizing views and stored procedures in Databricks.
- Experience building ETL workflows and data models.
- Knowledge of cloud platforms (AWS, Azure) and version control tools (Git, GitLab).
- Experience with healthcare or clinical trial data.
- Familiarity with DevOps practices.
Posted 2 weeks ago
6.0 - 10.0 years
0 Lacs
Noida, Uttar Pradesh
On-site
As a skilled professional with over 7 years of experience, you will be responsible for reviewing and understanding business requirements to ensure timely completion of development tasks with rigorous testing to minimize defects. Collaborating with a software development team is crucial to implement best practices and enhance the performance of Data applications, meeting client needs effectively. In this role, you will collaborate with various teams within the company and engage with customers to comprehend, translate, define, and design innovative solutions for their business challenges. Your tasks will also involve researching new Big Data technologies to evaluate their maturity and alignment with business and technology strategies. Operating within a rapid and agile development process, you will focus on accelerating speed to market while upholding necessary controls. Your qualifications should include a BE/B.Tech/MCA degree with a minimum of 6 years of IT experience, including 4 years of hands-on experience in design and development using the Hadoop technology stack and various programming languages. Furthermore, you are expected to have proficiency in multiple areas such as Hadoop, HDFS, MapReduce, Spark Streaming, Spark SQL, Spark ML, Kafka/Flume, Apache NiFi, Hortonworks Data Platform, Hive, Pig, Sqoop, NoSQL databases (HBase, Cassandra, Neo4j, MongoDB), visualization and reporting frameworks (D3.js, Zeppelin, Grafana, Kibana, Tableau, Pentaho), Scrapy for web crawling, Elasticsearch, Google Analytics data streaming, and data security protocols (Kerberos, OpenLDAP, Knox, Ranger). A strong knowledge of the current technology landscape and industry trends, and experience in Big Data integration with Metadata Management, Data Quality, and Master Data Management solutions and structured/unstructured data, are essential. Your active participation in the community through articles, blogs, or speaking engagements at conferences will be highly valued in this role.
Posted 2 weeks ago
7.0 - 11.0 years
0 Lacs
Navi Mumbai, Maharashtra
On-site
The ideal candidate will be responsible for designing and implementing streaming data pipelines that integrate Kafka with Databricks using Structured Streaming. You will also be tasked with architecting and maintaining the Medallion Architecture, which consists of well-defined Bronze, Silver, and Gold layers. Additionally, you will need to implement efficient data ingestion processes using Databricks Autoloader for high-throughput data loads. You will work with large volumes of structured and unstructured data to ensure high availability and performance, applying performance tuning techniques like partitioning, caching, and cluster resource optimization. Collaboration with cross-functional teams, including data scientists, analysts, and business users, is essential to build robust data solutions. The role also involves establishing best practices for code versioning, deployment automation, and data governance. The required technical skills for this position include strong expertise in Azure Databricks and Spark Structured Streaming, along with at least 7 years of experience in Data Engineering. You should be familiar with processing modes (append, update, complete), output modes (append, complete, update), checkpointing, and state management. Experience with Kafka integration for real-time data pipelines, a deep understanding of the Medallion Architecture, proficiency with Databricks Autoloader and schema evolution, and familiarity with Unity Catalog and foreign catalogs are also necessary. Strong knowledge of Spark SQL, Delta Lake, and DataFrames, expertise in performance tuning, data management strategies, governance, access management, data modeling, data warehousing concepts, and Databricks as a platform, as well as a solid understanding of window functions, will be beneficial in this role.
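A brief, hypothetical sketch of the Databricks Autoloader pattern described above, with a schema location, schema evolution, and a checkpoint; it assumes a Databricks runtime (where `spark` is predefined and the cloudFiles source is available), and the paths and table name are placeholders.

```python
# Illustrative Autoloader (cloudFiles) ingestion into a bronze Delta table.
from pyspark.sql import functions as F

bronze_stream = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    # Autoloader tracks the inferred schema and can evolve it as new fields appear.
    .option("cloudFiles.schemaLocation", "/mnt/lake/_schemas/orders")
    .option("cloudFiles.schemaEvolutionMode", "addNewColumns")
    .load("/mnt/landing/orders/")
    .withColumn("_ingested_at", F.current_timestamp())
)

(
    bronze_stream.writeStream.format("delta")
    .option("checkpointLocation", "/mnt/lake/_checkpoints/orders_bronze")
    .outputMode("append")
    .trigger(availableNow=True)   # process the backlog, then stop
    .toTable("lakehouse.bronze.orders")
)
```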
Posted 2 weeks ago
5.0 - 9.0 years
0 Lacs
Karnataka
On-site
As an MS Fabric Data Engineer, your primary responsibility will be designing, implementing, and managing scalable data pipelines using the MS Fabric and Azure tech stack, including ADLS Gen2, ADF, and Azure SQL. You should have a strong background in data integration techniques, ETL processes, and data pipeline architectures. Additionally, you will need to be well-versed in data quality rules, principles, and implementation.

Your key result areas and activities will include:

1. Data Pipeline Development & Optimization:
- Design and implement data pipelines using MS Fabric.
- Manage and optimize ETL processes for data extraction, transformation, and loading.
- Conduct performance tuning for data storage and retrieval to enhance efficiency.

2. Data Quality, Governance & Documentation:
- Ensure data quality and integrity across all data processes.
- Assist in designing data governance frameworks and policies.
- Generate and maintain documentation for data architecture and data flows.

3. Cross-Functional Collaboration & Requirement Gathering:
- Collaborate with cross-functional teams to gather and define data requirements.
- Translate functional and non-functional requirements into system specifications.

4. Technical Leadership & Support:
- Provide technical guidance and support to junior data engineers.
- Participate in code reviews and ensure adherence to coding standards.
- Troubleshoot data-related issues and implement effective solutions.

In terms of technical experience, you must be proficient in MS Fabric, Azure Data Factory, and Azure Synapse Analytics. You should have deep knowledge of Fabric components such as Notebooks, Lakehouses, OneLake, Data Pipelines, and Real-Time Analytics, and be skilled in integrating Fabric capabilities for seamless data flow, governance, and cross-team collaboration. A strong grasp of Delta Lake, Parquet, distributed data systems, and various data formats (JSON, XML, CSV, Parquet) is essential. Experience in ETL/ELT processes, data warehousing, data modeling, and data quality frameworks is required. Proficiency in Python, PySpark, Scala, Spark SQL, and T-SQL for complex data transformations is a must-have.

It would be beneficial if you have familiarity with Azure cloud platforms and cloud data services, MS Purview, and open-source libraries like Deequ, PyDeequ, and Great Expectations for data quality implementation. Additionally, experience with developing data models to support business intelligence and analytics, Power BI dashboards, and Databricks is a plus.

To qualify for this position, you should hold a Bachelor's or Master's degree in Computer Science, Engineering, or a related field, along with at least 5 years of experience in MS Fabric/ADF/Synapse. You should also have experience with or knowledge of Agile software development methodologies and be able to consult, write, and present persuasively.
Posted 2 weeks ago
12.0 - 16.0 years
0 Lacs
hyderabad, telangana
On-site
Intellectt Inc. is seeking AI Product Engineers to join our team and contribute to our expanding AI-focused product portfolio. As an AI Product Engineer, you will bridge the gap between product development and advanced AI technologies. The position requires a deep understanding of AI models, APIs, and data pipelines, along with a strong focus on product design, user experience, and deployment strategies.
Key Responsibilities:
- Collaborate with product managers and AI teams to design and develop AI-driven products leveraging LLMs such as GPT-3.5, GPT-4, and Gemini.
- Translate business requirements into AI product features and develop functional prototypes.
- Implement LangChain-based workflows, including LangSmith and LangGraph, to enable intelligent app interactions.
- Integrate Retrieval-Augmented Generation (RAG) pipelines and vector databases like ChromaDB to enhance AI performance.
- Customize model outputs for specific product needs and use cases through prompt engineering.
- Develop REST APIs and microservices using Flask or FastAPI to facilitate model integration.
- Manage and process structured and unstructured data using SQL, PySpark, and Spark SQL.
- Collaborate with UI/UX teams to ensure seamless user interaction with AI components.
- Support product deployment and monitoring on Azure ML or AWS Bedrock.
Required Skills:
- Proficiency in Python, SQL, PySpark, and Spark SQL for programming and data processing.
- Understanding of Generative AI, LLMs, and prompt engineering.
- Experience with LangChain, LangGraph, and LangSmith for workflow development.
- Familiarity with vector databases and embedding techniques for AI applications.
- Exposure to cloud platforms such as Azure ML or AWS Bedrock.
- Experience developing REST APIs using Flask or FastAPI.
- Strong problem-solving skills and a product-oriented mindset.
Preferred Qualifications:
- Bachelor's degree in Computer Science, AI/ML, Data Science, or a related field.
- Internship or academic experience in AI product development or applied NLP.
- Knowledge of MLOps concepts and product lifecycle best practices.
- Basic understanding of UI/UX principles and user-centric design.
Join our team at Intellectt Inc. and be part of our mission to drive innovation through Artificial Intelligence and Digital Transformation services. Your expertise as an AI Product Engineer will contribute to building impactful, intelligent solutions that shape the future of technology.
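As one illustration of the Flask/FastAPI integration work described, here is a minimal FastAPI sketch; the endpoint name, request fields, and the `retrieve_and_generate` helper are hypothetical stand-ins for a real RAG chain (vector search over a store such as ChromaDB plus an LLM call), not anything specified by the posting:

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="llm-product-api")

class AskRequest(BaseModel):
    question: str
    top_k: int = 4          # how many retrieved chunks to pass to the model

class AskResponse(BaseModel):
    answer: str
    sources: list[str] = []

def retrieve_and_generate(question: str, top_k: int) -> AskResponse:
    # Hypothetical stand-in for the real RAG chain: vector search, prompt
    # assembly, then an LLM call. The actual chain is wired up separately.
    return AskResponse(answer=f"stub answer for: {question}", sources=[])

@app.post("/ask", response_model=AskResponse)
def ask(req: AskRequest) -> AskResponse:
    # Thin HTTP layer: pydantic handles validation, logic sits behind one call.
    return retrieve_and_generate(req.question, req.top_k)
```

Keeping the chain behind a single function makes it straightforward to swap the retrieval step or model provider without touching the HTTP layer.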
Posted 2 weeks ago
6.0 - 10.0 years
0 Lacs
noida, uttar pradesh
On-site
As a Data Engineer at our company, you will play a crucial role in our data team's growth and success. With 6 to 10 years of experience in data engineering and proficiency in SQL, Databricks, Spark SQL, PySpark, and BI tools such as Power BI or Tableau, you are well equipped to take on the responsibilities of this position.
Your primary responsibilities will include designing and developing scalable data pipelines using PySpark and Databricks, crafting efficient SQL and Spark SQL queries for data transformation and analysis, and collaborating closely with BI teams to support reporting through Power BI or Tableau. You will also optimize the performance of big data workflows, ensure data quality, and implement best practices for data integration, processing, and governance.
To excel in this role, you should hold a Bachelor's degree in Computer Science, Engineering, or a related field, along with 6 to 10 years of experience in data engineering or a similar capacity. A strong command of SQL, Spark SQL, PySpark, and Databricks, as well as familiarity with BI tools such as Power BI and/or Tableau, will be essential in delivering high-quality results. Your expertise in data warehousing, ETL/ELT concepts, and problem-solving will be invaluable as you collaborate with various stakeholders.
While not mandatory, experience with cloud data platforms (Azure, AWS, or GCP), knowledge of CI/CD pipelines and version-control tools like Git, and an understanding of data governance, security, and compliance standards would be advantageous. Exposure to data lake architectures and real-time streaming data pipelines will further enhance your capabilities in this role.
This is a full-time position based in Noida, Uttar Pradesh, requiring in-person work. In return, we offer health insurance benefits and a day shift schedule. If you meet the qualifications and possess the requisite experience, we encourage you to apply and become a valuable part of our dynamic data team.
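For a sense of the Spark SQL work involved, a small sketch that builds a partitioned Delta reporting table for Power BI or Tableau to read; the schema and table names are assumptions, not taken from the posting:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("daily-sales-report").getOrCreate()

# Curated source tables and the reporting schema are illustrative; the pattern
# is a partitioned Delta table that a BI tool can query directly.
spark.sql("""
    CREATE OR REPLACE TABLE reporting.daily_sales
    USING DELTA
    PARTITIONED BY (sale_date)
    AS
    SELECT s.sale_date,
           c.region,
           COUNT(*)      AS order_count,
           SUM(s.amount) AS total_amount
    FROM curated.sales s
    JOIN curated.customers c ON s.customer_id = c.customer_id
    GROUP BY s.sale_date, c.region
""")
```

Partitioning on the date column keeps dashboard queries that filter on recent days from scanning the full history.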
Posted 2 weeks ago
12.0 - 16.0 years
65 - 70 Lacs
ahmedabad, chennai, bengaluru
Work from Office
About the Role:
We are looking for a Principal Data Engineer to lead the design and delivery of scalable data solutions using Azure Data Factory and Azure Databricks. This is a consulting-focused role that requires strong technical expertise, stakeholder engagement, and architectural thinking. You will work closely with business, functional, and technical teams to define data strategies, design robust pipelines, and ensure smooth delivery in an Agile environment.
Responsibilities:
- Collaborate with business and technology stakeholders to gather and understand data needs.
- Translate functional requirements into scalable and maintainable data architecture.
- Design and implement robust data pipelines.
- Lead data modeling, transformation, and performance optimization efforts.
- Ensure data quality, validation, and consistency.
- Participate in Agile ceremonies, including sprint planning and backlog grooming.
- Support CI/CD automation for data pipelines and integration workflows.
- Mentor junior engineers and promote best practices in data engineering.
Must Have:
- 12+ years of IT experience, with at least 5 years in data architecture roles in modern metadata-driven and cloud-based technologies, bringing a software engineering mindset.
- Strong analytical and problem-solving skills; ability to identify data patterns and perform root cause analysis to resolve production issues.
- Excellent communication skills, with experience leading client-facing discussions.
- Strong hands-on experience with Azure Data Factory and Databricks, leveraging custom solutioning and design beyond drag-and-drop capabilities for big data workloads.
- Demonstrated proficiency in SQL, Python, and Spark.
- Experience with CI/CD pipelines, version control, and DevOps tools.
- Experience applying dimensional and Data Vault methodologies.
- Background in working with Agile methodologies and sprint-based delivery.
- Ability to produce clear and comprehensive technical documentation.
Nice to Have:
- Experience with Azure Synapse and Power BI.
- Experience with Microsoft Purview and/or Unity Catalog.
- Understanding of Data Lakehouse and Data Mesh concepts.
- Familiarity with enterprise data governance and quality frameworks.
- Manufacturing experience within the operations domain.
Location: Ahmedabad, Bengaluru, Chennai, Gurugram, Hyderabad, Mumbai, Pune
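As a sketch of the dimensional-modeling side of the role, a Delta Lake MERGE for a Type 1 dimension upsert on Databricks; the catalog, table, and column names are illustrative assumptions rather than anything specified above:

```python
from pyspark.sql import SparkSession
from delta.tables import DeltaTable

spark = SparkSession.builder.appName("dim-customer-upsert").getOrCreate()

# Assumes a staged extract and an existing customer dimension managed as a
# Delta table; names are placeholders.
updates = spark.table("staging.customer_extract")
dim = DeltaTable.forName(spark, "gold.dim_customer")

# Type 1 upsert: overwrite changed attributes, insert new business keys.
(
    dim.alias("d")
    .merge(updates.alias("u"), "d.customer_id = u.customer_id")
    .whenMatchedUpdate(set={
        "name": "u.name",
        "segment": "u.segment",
        "updated_at": "current_timestamp()",
    })
    .whenNotMatchedInsert(values={
        "customer_id": "u.customer_id",
        "name": "u.name",
        "segment": "u.segment",
        "updated_at": "current_timestamp()",
    })
    .execute()
)
```

A Data Vault or Type 2 design would instead close out the current record and insert a new version, but the MERGE construct is the same building block.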
Posted 2 weeks ago
8.0 - 12.0 years
0 Lacs
hyderabad, telangana
On-site
You have an exciting opportunity to join a dynamic and high-impact firm as a Python, PySpark, and SQL Developer with 8-12 years of relevant experience. In this role, you will work on development activities, collaborate with cross-functional teams, and design scalable data pipelines using Python and PySpark. You will also implement ETL processes, develop Power BI reports and dashboards, and optimize data pipelines for performance and reliability.
The ideal candidate should have 8+ years of experience in Spark, Scala, and PySpark for big data processing. Proficiency in Python programming for data manipulation and analysis is essential, along with knowledge of Python libraries such as Pandas and NumPy. Strong knowledge of SQL for querying databases and experience with database systems like Lakehouse, PostgreSQL, Teradata, and SQL Server are also required. In addition, candidates should have strong analytical and problem-solving skills, effective communication skills, and the ability to troubleshoot and resolve data-related issues.
Key Responsibilities:
- Work on development activities and lead activities.
- Coordinate with the Product Manager and Development Architect.
- Collaborate with other teams to understand data requirements and deliver solutions.
- Design, develop, and maintain scalable data pipelines using Python and PySpark.
- Utilize PySpark and Spark scripting for data processing and analysis.
- Implement ETL processes to ensure accurate data processing and storage.
- Develop and maintain Power BI reports and dashboards.
- Optimize data pipelines for performance and reliability.
- Integrate data from various sources into centralized data repositories.
- Ensure data quality and consistency across different data sets.
- Analyze large data sets to identify trends, patterns, and insights.
- Optimize PySpark applications for better performance and scalability.
- Continuously improve data processing workflows and infrastructure.
If you meet the qualifications and are interested in this opportunity, please share your updated resume along with your total experience, relevant experience in Python, PySpark, and SQL, current location, current CTC, expected CTC, and notice period. Your profile will be handled with strict confidentiality. Apply now and be a part of this amazing journey!
Thank you,
Syed Mohammad
syed.m@anlage.co.in
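For illustration, a short PySpark sketch of the kind of database-to-warehouse analysis the role describes, pulling from PostgreSQL over JDBC and aggregating a monthly trend; connection details, column names, and tables are placeholders:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("claims-trend-analysis").getOrCreate()

# Connection details and table names are placeholders; credentials would
# normally come from a secret scope rather than literals in code.
claims = (
    spark.read.format("jdbc")
    .option("url", "jdbc:postgresql://db-host:5432/analytics")
    .option("dbtable", "public.claims")
    .option("user", "etl_user")
    .option("password", "********")
    .option("fetchsize", 10000)   # larger fetches cut driver round-trips
    .load()
)

# Monthly trend per product line; cached because several views reuse it.
monthly = (
    claims.withColumn("month", F.date_trunc("month", F.col("claim_date")))
    .groupBy("month", "product_line")
    .agg(F.count("*").alias("claim_count"), F.sum("paid_amount").alias("total_paid"))
    .cache()
)

monthly.write.mode("overwrite").saveAsTable("reporting.claims_monthly")
```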
Posted 2 weeks ago