8 - 10 years
13 - 18 Lacs
Hyderabad, Bengaluru
Hybrid
Databricks Technical Leadership: Guide and mentor teams in designing and implementing Databricks solutions.
Architecture & Design: Develop scalable data pipelines and architectures using Databricks Lakehouse (a minimal pipeline sketch follows below).
Data Engineering: Lead the ingestion and transformation of batch and streaming data.
Performance Optimization: Ensure efficient resource utilization and troubleshoot performance bottlenecks.
Security & Compliance: Implement best practices for data governance, access control, and compliance.
Collaboration: Work closely with data engineers, analysts, and business stakeholders.
Cloud Integration: Manage Databricks environments on Azure, AWS, or GCP.
Monitoring & Automation: Set up monitoring tools and automate workflows for efficiency.
Qualifications:
7+ years of experience in Databricks, Apache Spark, and big data processing.
Proficiency in Python, Scala, or SQL.
Strong knowledge of Delta Lake, Unity Catalog, and MLflow.
Experience with ETL processes and cloud platforms.
Excellent problem-solving and leadership skills.
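For context on the kind of Lakehouse pipeline work this listing describes, here is a minimal PySpark sketch that ingests a raw batch and merges it into a Delta table; the paths, table layout, and `order_id` key are illustrative assumptions, not details from the posting.

```python
from pyspark.sql import SparkSession, functions as F
from delta.tables import DeltaTable  # requires the delta-spark package

spark = SparkSession.builder.appName("orders-ingest").getOrCreate()

# Read a raw batch landed by an upstream process (hypothetical path).
raw = (spark.read.json("/mnt/landing/orders/2024-06-01/")
       .withColumn("ingested_at", F.current_timestamp()))

# Merge into a curated Delta table so re-runs stay idempotent.
target = DeltaTable.forPath(spark, "/mnt/curated/orders")
(target.alias("t")
 .merge(raw.alias("s"), "t.order_id = s.order_id")
 .whenMatchedUpdateAll()
 .whenNotMatchedInsertAll()
 .execute())
```

On Databricks the same merge can target a Unity Catalog table name instead of a path; the upsert pattern is what keeps repeated batch and streaming loads consistent.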
Posted 1 month ago
6 - 10 years
11 - 21 Lacs
Bengaluru
Hybrid
RESPONSIBILITIES:
Choose, deploy, and operate the right technologies for our use cases.
Set up data stores for structured, semi-structured, and unstructured data.
Secure data at rest via encryption.
Implement tooling to securely access multiple data sources.
Implement solutions to run real-time analytics (a minimal streaming sketch follows below).
Use container technologies.
Required Experience & Skills:
Experience in one of the following: Elasticsearch, Cassandra, Hadoop, MongoDB.
Experience in Spark and Presto/Trino.
Experience with microservice-based architectures.
Experience with Kubernetes.
Experience with Unix/Linux environments is a plus.
Experience with Agile/Scrum development methodologies is a plus.
Cloud knowledge is a big plus (AWS/GCP, Kubernetes/Docker).
Be nice, respectful, and able to work in a team.
Willingness to learn.
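As a rough illustration of the real-time analytics item above, the sketch below consumes JSON events from Kafka with Spark Structured Streaming and computes one-minute averages; the broker address, topic, and schema are hypothetical, and the spark-sql-kafka connector is assumed to be on the classpath.

```python
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("realtime-analytics").getOrCreate()

schema = StructType([
    StructField("device_id", StringType()),
    StructField("metric", DoubleType()),
])

# Consume JSON events from a (hypothetical) Kafka topic.
events = (spark.readStream.format("kafka")
          .option("kafka.bootstrap.servers", "broker:9092")
          .option("subscribe", "events")
          .load()
          .select(F.col("timestamp"),
                  F.from_json(F.col("value").cast("string"), schema).alias("e"))
          .select("timestamp", "e.*"))

# One-minute averages per device; the watermark bounds state for late data.
agg = (events
       .withWatermark("timestamp", "2 minutes")
       .groupBy(F.window("timestamp", "1 minute"), "device_id")
       .agg(F.avg("metric").alias("avg_metric")))

(agg.writeStream
 .outputMode("append")
 .format("console")
 .option("checkpointLocation", "/tmp/checkpoints/events")
 .start()
 .awaitTermination())
```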
Posted 1 month ago
4 - 9 years
11 - 15 Lacs
Kochi
Work from Office
We are looking for a highly skilled and experienced Data Management Lead (Architect) with 4 to 9 years of experience to design, implement, and manage data lake environments. The ideal candidate will have a strong background in data management, architecture, and analytics.
### Roles and Responsibilities
Design and implement scalable, secure, and high-performing data lake architectures.
Select appropriate technologies and platforms for data storage, processing, and analytics.
Define and enforce data governance, metadata management, and data quality standards.
Collaborate with IT security teams to establish robust security measures.
Develop and maintain data ingestion and integration processes from various sources.
Provide architectural guidance and support to data scientists and analysts.
Monitor the performance of the data lake and recommend improvements.
Stay updated on industry trends and advancements in data lake technologies.
Liaise with business stakeholders to understand their data needs and translate requirements into technical specifications.
Create documentation and architectural diagrams to provide a clear understanding of the data lake structure and processes.
Lead the evaluation and selection of third-party tools and services to enhance the data lake's capabilities.
Mentor and provide technical leadership to the data engineering team.
Manage the full lifecycle of the data lake, including capacity planning, cost management, and decommissioning of legacy systems.
### Job Requirements
At least 4 years of hands-on experience in designing, implementing, and managing data lakes or large-scale data warehousing solutions.
Proficiency with data lake technologies such as Hadoop, Apache Spark, Apache Hive, or Azure Data Lake Storage.
Experience with cloud services like AWS (Amazon Web Services), Microsoft Azure, or Google Cloud Platform, especially their data storage and analytics offerings.
Knowledge of SQL and NoSQL database systems, including relational databases (e.g., MySQL, PostgreSQL) and NoSQL databases (e.g., MongoDB, Cassandra).
Expertise in data modeling techniques and tools for both structured and unstructured data.
Experience with ETL (Extract, Transform, Load) tools and processes, and understanding of data integration and transformation best practices.
Proficiency in programming languages commonly used for data processing and analytics, such as Python, Scala, or Java.
Familiarity with data governance frameworks and data quality management practices to ensure the integrity and security of data within the lake.
Knowledge of data security principles, including encryption, access controls, and compliance with data protection regulations (e.g., GDPR, HIPAA).
Experience with big data processing frameworks and systems, such as Apache Kafka for real-time data streaming and Apache Flink or Apache Storm for stream processing.
Familiarity with data pipeline orchestration tools like Apache Airflow, Luigi, or AWS Data Pipeline (a minimal DAG sketch follows below).
Understanding of DevOps practices, including continuous integration/continuous deployment (CI/CD) pipelines, and automation tools like Jenkins or GitLab CI.
Skills in monitoring data lake performance, diagnosing issues, and optimizing storage and processing for efficiency and cost-effectiveness.
Ability to manage projects, including planning, execution, monitoring, and closing, often using methodologies like Agile or Scrum.
Self-starter, independent thinker, curious and creative person with ambition and passion.
Bachelor's Degree: A bachelor's degree in Computer Science, Information Technology, Data Science, or a related field is typically required. This foundational education provides the theoretical knowledge necessary for understanding complex data systems.
Master's Degree (optional): A master's degree or higher in a relevant field such as Computer Science, Data Science, or Information Systems can be beneficial. It indicates advanced knowledge and may be preferred for more senior positions.
Certifications (optional): Industry-recognized certifications can enhance a candidate's qualifications. Examples include AWS Certified Solutions Architect, Azure Data Engineer Associate, Google Professional Data Engineer, Cloudera Certified Professional (CCP), or certifications in specific technologies like Apache Hadoop or Spark.
Power BI or other reporting platform experience is a must. Knowledge of Power Automate, QlikView, or any other reporting platform is an added advantage. ITIL Foundation certification is preferred.
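Since the requirements above call out pipeline orchestration tools such as Apache Airflow, here is a minimal sketch of a daily ingestion DAG; the DAG id, task logic, and schedule are illustrative assumptions only.

```python
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_source_data(**context):
    # Placeholder: pull the day's files from the source system.
    print("extracting for", context["ds"])


def load_to_raw_zone(**context):
    # Placeholder: write extracted files to the data lake's raw zone.
    print("loading for", context["ds"])


with DAG(
    dag_id="daily_lake_ingestion",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",  # Airflow 2.4+; older versions use schedule_interval
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract", python_callable=extract_source_data)
    load = PythonOperator(task_id="load", python_callable=load_to_raw_zone)
    extract >> load
```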
Posted 1 month ago
10 - 16 years
40 - 60 Lacs
Bengaluru
Hybrid
Key Skills: Scala, Apache Spark, SQL, Spark SQL, Core Java, Java
Roles and Responsibilities:
Lead technical initiatives and contribute as a senior team member to achieve project goals and deadlines.
Collaborate with team members to design, implement, and optimize software solutions aligned with organizational objectives.
Build scalable, efficient, and high-performance pipelines and workflows for processing large amounts of batch and real-time data.
Perform multidisciplinary work, supporting real-time streams, ETL pipelines, data warehouses, and reporting services.
Recommend and advocate for technology upgrades to company leaders to ensure infrastructure remains robust and competitive.
Design and develop microservices and data applications while ensuring seamless integration with other systems.
Leverage Big Data technologies like Kafka, AWS S3, EMR, and Spark to handle data ingestion, transformation, and querying.
Follow coding best practices, including unit testing, code reviews, code coverage, and maintaining comprehensive documentation.
Conduct thorough code reviews to maintain quality, mentor junior team members, and promote continuous learning within the team.
Enhance system performance through analysis and capacity planning, ensuring efficient and reliable software releases.
Actively bring new and innovative solutions to address challenging software issues that arise throughout the product lifecycle.
Implement and promote security protocols and data governance standards across development projects.
Actively engage in Agile processes to foster collaboration and innovation within the team.
Required job skills:
Strong software design capabilities with a deep understanding of design patterns and performance optimizations.
Proficiency in writing high-quality, well-structured code in Java and Scala.
Expertise in SQL and relational databases, with advanced skills in writing efficient, complex queries and optimizing database performance.
Expertise in cloud computing infrastructure, particularly AWS (Aurora MySQL, DynamoDB, EMR, Lambda, etc.).
Solid experience with Big Data tools such as Apache Spark and Kafka.
Ability to clearly document and communicate technical solutions to diverse audiences.
Experience mentoring and conducting constructive code reviews to support team development.
Familiarity with Agile methodologies and modern development tools.
Skills Required:
10+ years of experience in designing and developing enterprise-level software solutions
3 years of experience developing Scala/Java applications and microservices using Spring Boot
7 years of experience with large-volume data processing and big data tools such as Apache Spark, SQL, Scala, and Hadoop technologies
5 years of experience with SQL and relational databases
2 years of experience working with the Agile/Scrum methodology
Education: Bachelor's degree in a related field
Posted 1 month ago
11 - 12 years
25 - 30 Lacs
Hyderabad
Work from Office
Job Description: Lead Data Engineer
Position: Lead Data Engineer
Location: Hyderabad (Work from Office mandatory)
Experience: 10+ years overall | 8+ years relevant in Data Engineering
Notice Period: Immediate to 30 days
About the Role
We are looking for a strategic and hands-on Lead Data Engineer to architect and lead cutting-edge data platforms that empower business intelligence, analytics, and AI initiatives. This role demands a deep understanding of cloud-based big data ecosystems, excellent leadership skills, and a strong inclination toward driving data quality and governance at scale. You will define the data engineering roadmap, architect scalable data systems, and lead a team responsible for building and optimizing pipelines across structured and unstructured datasets in a secure and compliant environment.
Key Responsibilities
1. Technical Strategy & Architecture: Define the vision and technical roadmap for enterprise-grade data platforms (Lakehouse, Warehouse, Real-Time Pipelines). Lead evaluation of data platforms and tools, making informed build vs. buy decisions. Design solutions for long-term scalability, cost-efficiency, and performance.
2. Team Leadership: Mentor and lead a high-performing data engineering team. Conduct performance reviews, provide technical coaching, and participate in hiring/onboarding. Instill engineering best practices and a culture of continuous improvement.
3. Platform & Pipeline Engineering: Build and maintain data lakes, warehouses, and lakehouses using AWS, Azure, GCP, or Databricks. Architect and optimize data models and schemas tailored for analytics/reporting. Manage large-scale ETL/ELT pipelines for batch and streaming use cases.
4. Data Quality, Governance & Security: Enforce data quality controls, including automated validation, lineage, and anomaly detection (a minimal validation sketch follows below). Ensure compliance with data privacy and governance frameworks (GDPR, HIPAA, etc.). Manage metadata and documentation for transparency and discoverability.
5. Cross-Functional Collaboration: Partner with Data Scientists, Product Managers, and Business Teams to understand requirements. Translate business needs into scalable data workflows and delivery mechanisms. Support self-service analytics and democratization of data access.
6. Monitoring, Optimization & Troubleshooting: Implement monitoring frameworks to ensure data reliability and latency SLAs. Proactively resolve bottlenecks and failures and optimize system performance. Recommend platform upgrades and automation strategies.
7. Technical Leadership & Community Building: Lead code reviews, define development standards, and share reusable components. Promote innovation, experimentation, and cross-team knowledge sharing. Encourage open-source contributions and thought leadership.
Required Skills & Experience
10+ years of experience in data engineering or related domains.
Expert in PySpark, Python, and SQL.
Deep expertise in Apache Spark and other distributed processing frameworks.
Hands-on experience with cloud platforms (AWS, Azure, or GCP) and services like S3, EMR, Glue, Databricks, and Data Factory.
Proficient in data warehouse solutions (e.g., Snowflake, Redshift, BigQuery) and RDBMS like PostgreSQL or SQL Server.
Knowledge of orchestration tools (Airflow, Dagster, or cloud-native schedulers).
Familiarity with CI/CD tools, Git, and Infrastructure as Code (Terraform, CloudFormation).
Strong understanding of data modeling and data lifecycle management.
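To make the data quality responsibility above concrete, here is a minimal PySpark validation sketch of the kind of automated checks a pipeline might run before publishing a dataset; the dataset path, key column, and rules are hypothetical.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("dq-checks").getOrCreate()
df = spark.read.parquet("/data/curated/transactions")  # hypothetical path

total = df.count()
checks = {
    # The primary key must be present and unique.
    "null_ids": df.filter(F.col("transaction_id").isNull()).count(),
    "duplicate_ids": total - df.select("transaction_id").distinct().count(),
    # Amounts should never be negative in this (hypothetical) dataset.
    "negative_amounts": df.filter(F.col("amount") < 0).count(),
}

failed = {name: count for name, count in checks.items() if count > 0}
if failed:
    # Failing the run keeps bad data from reaching downstream consumers.
    raise ValueError(f"Data quality checks failed: {failed}")
print(f"All checks passed on {total} rows")
```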
Posted 1 month ago
4 - 7 years
6 - 9 Lacs
Noida
Work from Office
Role Objective: We are seeking a Software Data Engineer with 4-7 years of experience to join our Data Platform team. This role will report to the Manager of Data Engineering and be involved in the planning, design, and implementation of our centralized data warehouse solution for ETL, reporting, and analytics across all applications within the company.
Qualifications:
Deep knowledge of and experience working with Python/Scala and Apache Spark
Experienced with Azure Data Factory, Azure Databricks, Azure Blob Storage, Azure Data Lake, and Delta Lake
Experienced with the orchestration tool Apache Airflow
Experience working with SQL and NoSQL database systems such as MongoDB, and with Apache Parquet
Experience with Azure cloud environments
Experience acquiring and preparing data from primary and secondary disparate data sources
Experience working on large-scale data product implementations
Experience working with agile methodology preferred
Healthcare industry experience preferred
Responsibilities:
Collaborate with and across Agile teams to design, develop, test, implement, and support technical solutions
Work with other teams with deep experience in ETL processes and data science domains to understand how to centralize their data
Share your passion for experimenting with and learning new technologies
Perform thorough data analysis, uncover opportunities, and address business problems
Posted 1 month ago
6 - 8 years
8 - 10 Lacs
Hyderabad
Work from Office
Responsibilities:
Solve problems, analyze and isolate issues.
Provide technical guidance and mentoring to the team and help them adopt change as new processes are introduced.
Champion best practices and serve as a subject matter authority.
Develop solutions to support key business needs.
Engineer components and common services based on standard development models, languages, and tools.
Produce system design documents and lead technical walkthroughs.
Produce high-quality code.
Collaborate effectively with technical and non-technical partners.
As a team member, continuously improve the architecture.
Basic Qualifications:
6-8 years of experience in application development using Java or .NET technologies
Bachelor's/Master's degree in Computer Science, Information Systems, or equivalent
Knowledge of object-oriented design, the .NET framework, and design patterns
Command of essential technologies: Java and/or C#, ASP.NET
Experience developing solutions involving relational database technologies: SQL, stored procedures
Proficient with software development lifecycle (SDLC) methodologies like Agile and Test-Driven Development
Good communication and collaboration skills
Preferred Qualifications:
Search Technologies: Query and indexing content for Apache Solr, Elasticsearch
Big Data Technologies: Apache Spark, Spark SQL, Hadoop, Hive, Airflow
Data Science Search Technologies: Personalization and Recommendation models, Learning to Rank (LTR)
Preferred Languages: Python
Database Technologies: MS SQL Server platform, stored procedure programming experience using Transact-SQL
Ability to lead, train, and mentor.
Posted 1 month ago
10 - 12 years
30 - 35 Lacs
Hyderabad
Work from Office
Grade Level (for internal use): 11
The Team: Our team is responsible for the design, architecture, and development of our client-facing applications using a variety of tools that are regularly updated as new technologies emerge. You will have the opportunity every day to work with people from a wide variety of backgrounds and will be able to develop a close team dynamic with coworkers from around the globe.
The Impact: The work you do will be used every single day; it's the essential code you'll write that provides the data and analytics required for crucial, daily decisions in the capital and commodities markets.
What's in it for you:
Build a career with a global company.
Work on code that fuels the global financial markets.
Grow and improve your skills by working on enterprise-level products and new technologies.
Responsibilities:
Solve problems, analyze and isolate issues.
Provide technical guidance and mentoring to the team and help them adopt change as new processes are introduced.
Champion best practices and serve as a subject matter authority.
Develop solutions to support key business needs.
Engineer components and common services based on standard development models, languages, and tools.
Produce system design documents and lead technical walkthroughs.
Produce high-quality code.
Collaborate effectively with technical and non-technical partners.
As a team member, continuously improve the architecture.
Basic Qualifications:
10-12 years of experience designing/building data-intensive solutions using distributed computing.
Proven experience implementing and maintaining enterprise search solutions in large-scale environments.
Experience working with business stakeholders and users, providing research direction and solution design, and writing robust, maintainable architectures and APIs.
Experience developing and deploying search solutions in a public cloud such as AWS.
Proficient programming skills in high-level languages: Java, Scala, Python.
Solid knowledge of at least one machine learning research framework.
Familiarity with containerization, scripting, cloud platforms, and CI/CD.
5+ years of experience with Python, Java, Kubernetes, and data and workflow orchestration tools.
4+ years of experience with Elasticsearch, SQL, NoSQL, Apache Spark, Flink, Databricks, and MLflow.
Prior experience operationalizing data-driven pipelines for large-scale batch and stream processing analytics solutions.
Good to have: experience contributing to GitHub and open-source initiatives, research projects, and/or participation in Kaggle competitions.
Ability to quickly, efficiently, and effectively define and prototype solutions with continual iteration within aggressive product deadlines.
Strong communication and documentation skills for both technical and non-technical audiences.
Preferred Qualifications:
Search Technologies: Query and indexing content for Apache Solr, Elasticsearch, etc. (a minimal client sketch follows below).
Proficiency in search query languages (e.g., Lucene Query Syntax) and experience with data indexing and retrieval.
Experience with machine learning models and NLP techniques for search relevance and ranking.
Familiarity with vector search techniques and embedding models (e.g., BERT, Word2Vec).
Experience with relevance tuning using A/B testing frameworks.
Big Data Technologies: Apache Spark, Spark SQL, Hadoop, Hive, Airflow
Data Science Search Technologies: Personalization and Recommendation models, Learning to Rank (LTR)
Preferred Languages: Python, Java
Database Technologies: MS SQL Server platform, stored procedure programming experience using Transact-SQL
Ability to lead, train, and mentor.
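The preferred qualifications above emphasize querying and indexing for Apache Solr and Elasticsearch. As a small orientation sketch (not taken from the posting), here is the v8-style elasticsearch Python client indexing one document and running a match query; the endpoint, index name, and fields are invented for illustration.

```python
from elasticsearch import Elasticsearch  # pip install elasticsearch

es = Elasticsearch("http://localhost:9200")  # hypothetical endpoint

# Index a document, then make it searchable immediately.
es.index(index="articles", id="1", document={
    "title": "Spark performance tuning",
    "body": "Partitioning and caching strategies for large joins.",
})
es.indices.refresh(index="articles")

# Full-text relevance query; hits come back ranked by _score.
resp = es.search(index="articles", query={"match": {"body": "partitioning"}})
for hit in resp["hits"]["hits"]:
    print(hit["_score"], hit["_source"]["title"])
```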
Posted 1 month ago
5 - 7 years
27 - 30 Lacs
Pune
Work from Office
Must-Have Skills:
5+ years of experience as a Big Data Engineer
3+ years of experience with Apache Spark, Hive, HDFS, and Beam (optional)
Strong proficiency in SQL and either Scala or Python
Experience with ETL processes and working with structured and unstructured data
2+ years of experience with cloud platforms (GCP, AWS, or Azure)
Hands-on experience with software build management tools like Maven or Gradle
Experience in automation, performance tuning, and optimizing data pipelines
Familiarity with CI/CD, serverless computing, and infrastructure-as-code practices
Good-to-Have Skills:
Experience with Google Cloud services (BigQuery, Dataproc, Dataflow, Composer, Datastream)
Strong knowledge of data pipeline development and optimization
Familiarity with source control tools (SVN/Git, GitHub)
Experience working in Agile environments (Scrum, XP, etc.)
Knowledge of relational databases (SQL Server, Oracle, MySQL)
Experience with Atlassian tools (JIRA, Confluence) and GitHub
Key Responsibilities:
Extract, transform, and load (ETL) data from multiple sources using Big Data technologies
Develop, enhance, and support data ingestion jobs using GCP services and tools like Apache Spark, Dataproc, Dataflow, BigQuery, and Airflow (a minimal ingestion sketch follows below)
Work closely with senior engineers and cross-functional teams to improve data accessibility
Automate manual processes, optimize data pipelines, and enhance infrastructure for scalability
Modify data extraction pipelines to follow standardized, reusable approaches
Optimize query performance and data access techniques in collaboration with senior engineers
Follow modern software development practices, including microservices, CI/CD, and infrastructure-as-code
Participate in Agile development teams, ensuring best practices for software engineering and data management
Preferred Qualifications:
Bachelor's degree in Computer Science, Systems Engineering, or a related field
Self-starter with strong problem-solving skills and adaptability to shifting priorities
Cloud certifications (GCP, AWS, or Azure) are a plus
Skills: GCP services, Dataproc, Dataflow
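As a rough sketch of the ingestion work described above (reading from GCS and loading BigQuery with Spark on Dataproc), the snippet below uses the spark-bigquery connector; the bucket names, dataset, and columns are hypothetical, and the connector is assumed to be available on the cluster (bundled on recent Dataproc images or added via --packages).

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("gcs-to-bigquery").getOrCreate()

# Raw CSV files landed in a (hypothetical) GCS bucket.
raw = spark.read.csv("gs://example-landing/orders/*.csv",
                     header=True, inferSchema=True)

# Simple daily aggregate to be served from BigQuery.
daily = raw.groupBy("order_date").agg(F.sum("amount").alias("total_amount"))

# Write through the spark-bigquery connector; the staging bucket is needed
# for the indirect write method.
(daily.write.format("bigquery")
 .option("table", "example_dataset.daily_order_totals")
 .option("temporaryGcsBucket", "example-staging-bucket")
 .mode("overwrite")
 .save())
```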
Posted 1 month ago
5 - 10 years
10 - 17 Lacs
Gurugram
Work from Office
Job Type: Full-Time
Location: Gurgaon
Experience: 5-10 years
Role: Big Data Architect
About the Role: As a Big Data Engineer, you will play a critical role in integrating multiple data sources, designing scalable data workflows, and collaborating with data architects, scientists, and analysts to develop innovative solutions. You will work with rapidly evolving technologies to achieve strategic business goals.
Must-Have Skills:
4+ years of mandatory experience with Big Data.
4+ years of mandatory experience in Apache Spark.
Proficiency in Apache Spark, Hive on Tez, and Hadoop ecosystem components.
Strong coding skills in Python and PySpark.
Experience building reusable components or frameworks using Spark.
Expertise in data ingestion from multiple sources using APIs, HDFS, and NiFi.
Solid experience working with structured, unstructured, and semi-structured data formats (Text, JSON, Avro, Parquet, ORC, etc.).
Experience with UNIX Bash scripting and databases like Postgres, MySQL, and Oracle.
Ability to design, develop, and evolve fault-tolerant distributed systems.
Strong SQL skills, with expertise in Hive, Impala, MongoDB, and NoSQL databases.
Hands-on experience with Git and CI/CD tools.
Experience with streaming data technologies (Kafka, Spark Streaming, Apache Flink, etc.).
Proficiency with HDFS or similar data lake technologies.
Excellent problem-solving skills; you will be evaluated through coding rounds.
Key Responsibilities:
Must be capable of handling an existing or new Apache HDFS cluster, including name node, data node, and edge node commissioning and decommissioning.
Work closely with data architects and analysts to design technical solutions.
Integrate and ingest data from multiple source systems into big data environments.
Develop end-to-end data transformations and workflows, ensuring logging and recovery mechanisms.
Must be able to troubleshoot Spark job failures.
Design and implement batch, real-time, and near-real-time data pipelines.
Optimize Big Data transformations using Apache Spark, Hive, and Tez.
Work with Data Science teams to enhance actionable insights.
Ensure seamless data integration and transformation across multiple systems.
Posted 1 month ago
6 - 10 years
22 - 32 Lacs
Bengaluru
Work from Office
We're Nagarro. We are a Digital Product Engineering company that is scaling in a big way! We build products, services, and experiences that inspire, excite, and delight. We work at scale across all devices and digital mediums, and our people exist everywhere in the world (18,000+ experts across 38 countries, to be exact). Our work culture is dynamic and non-hierarchical. We're looking for great new colleagues. That's where you come in!
REQUIREMENTS:
Total experience of 6+ years.
Hands-on experience in Spark, Python/Scala, Azure Databricks, data pipelines, and SQL Server/NoSQL.
Strong working knowledge of Python, Scala, SQL, Airflow, and PySpark.
We are looking for a Senior Data Engineer with strong expertise in Apache Spark, Python, and Azure Databricks to design and implement scalable data pipelines. The ideal candidate will have hands-on experience in building ETL workflows, optimizing Spark jobs, and working with cloud-based data platforms on Azure.
6+ years of experience building data pipelines.
6+ years of experience building data frameworks for unit testing, data lineage tracking, and automation.
Fluency in a programming language; Python is required.
Working knowledge of Apache Spark.
Familiarity with streaming technologies (e.g., Kafka, Kinesis, Flink).
Excellent communication and collaboration skills.
RESPONSIBILITIES:
Writing and reviewing great quality code.
Understanding the client's business use cases and technical requirements and converting them into a technical design that elegantly meets the requirements.
Mapping decisions to requirements and translating the same to developers.
Identifying different solutions and narrowing down the best option that meets the client's requirements.
Defining guidelines and benchmarks for NFR considerations during project implementation.
Writing and reviewing design documents explaining the overall architecture, framework, and high-level design of the application for the developers.
Reviewing architecture and design on aspects like extensibility, scalability, security, design patterns, user experience, and NFRs, and ensuring that all relevant best practices are followed.
Developing and designing the overall solution for defined functional and non-functional requirements, and defining the technologies, patterns, and frameworks to materialize it.
Understanding and relating technology integration scenarios and applying these learnings in projects.
Resolving issues raised during code reviews through exhaustive systematic analysis of the root cause, and justifying the decisions taken.
Carrying out POCs to make sure that the suggested designs/technologies meet the requirements.
Posted 1 month ago
8 - 13 years
40 - 45 Lacs
Noida, Gurugram
Work from Office
Responsibilities:
Design and articulate enterprise-scale data architectures incorporating multiple platforms, including open-source and proprietary data platform solutions (Databricks, Snowflake, and Microsoft Fabric), to address customer requirements in data engineering, data science, and machine learning use cases.
Conduct technical discovery sessions with clients to understand their data architecture, analytics needs, and business objectives.
Design and deliver proof of concepts (POCs) and technical demonstrations that showcase modern data platforms in solving real-world problems.
Create comprehensive architectural diagrams and implementation roadmaps for complex data ecosystems spanning cloud and on-premises environments.
Evaluate and recommend appropriate big data technologies, cloud platforms, and processing frameworks based on specific customer requirements.
Lead technical responses to RFPs (Requests for Proposals), crafting detailed solution architectures, technical approaches, and implementation methodologies.
Create and review techno-commercial proposals, including solution scoping, effort estimation, and technology selection justifications.
Collaborate with sales and delivery teams to develop competitive, technically sound proposals with appropriate pricing models for data solutions.
Stay current with the latest advancements in data technologies, including cloud services, data processing frameworks, and AI/ML capabilities.
Qualifications:
Bachelor's or Master's degree in Computer Science, Data Science, or a related technical field.
8+ years of experience in data architecture, data engineering, or solution architecture roles.
Proven experience in responding to RFPs and developing techno-commercial proposals for data solutions.
Demonstrated ability to estimate project efforts, resource requirements, and implementation timelines.
Hands-on experience with multiple data platforms, including Databricks, Snowflake, and Microsoft Fabric.
Strong understanding of big data technologies, including the Hadoop ecosystem, Apache Spark, and Delta Lake.
Experience with modern data processing frameworks such as Apache Kafka and Airflow.
Proficiency in cloud platforms (AWS, Azure, GCP) and their respective data services.
Knowledge of system monitoring and observability tools.
Experience implementing automated testing frameworks for data platforms and pipelines.
Expertise in both relational databases (PostgreSQL, MySQL) and NoSQL databases (MongoDB).
Understanding of AI/ML technologies and their integration with data platforms.
Familiarity with data integration patterns, ETL/ELT processes, and data governance practices.
Experience designing and implementing data lakes, data warehouses, and machine learning pipelines.
Proficiency in programming languages commonly used in data processing (Python, Scala, SQL).
Strong problem-solving skills and the ability to think creatively to address customer challenges.
Relevant certifications such as Databricks, Snowflake, Azure Data Engineer, or AWS Data Analytics are a plus.
Willingness to travel as required to meet with customers and attend industry events.
If interested, please contact Ramya at 9513487487 or 9342164917.
Posted 1 month ago
12 - 19 years
45 - 55 Lacs
Noida, Hyderabad, Gurugram
Work from Office
Responsibilities
Lead a team of data engineers, providing technical mentorship, performance management, and career development guidance.
Design and oversee implementation of a modern data architecture leveraging cloud platforms (AWS, Azure, GCP) and industry-leading data platforms (Databricks, Snowflake, Microsoft Fabric).
Establish data engineering best practices, coding standards, and technical documentation processes.
Develop and execute data platform roadmaps aligned with business objectives and technical innovation.
Optimize data pipelines for performance, reliability, and cost-effectiveness.
Collaborate with data science, analytics, and business teams to understand requirements and deliver tailored data solutions.
Drive adoption of DevOps and DataOps practices, including CI/CD, automated testing, etc.
Qualifications
Bachelor's or Master's degree in Computer Science, Information Systems, or a related technical field.
12+ years of experience in data engineering roles, with at least 3 years in a leadership position.
Expert knowledge of big data technologies (Hadoop ecosystem) and modern data processing frameworks (Apache Spark, Kafka, Airflow).
Extensive experience with cloud platforms (AWS, Azure, GCP) and cloud-native data services.
Hands-on experience with industry-leading data platforms such as Databricks, Snowflake, and Microsoft Fabric.
Strong background in both relational (PostgreSQL, MySQL) and NoSQL (MongoDB) database systems.
Experience implementing and managing data monitoring solutions (Grafana, Ganglia, etc.).
Proven track record of implementing automated testing frameworks for data pipelines and applications.
Knowledge of AI/ML technologies and how they integrate with data platforms.
Excellent understanding of data modelling, ETL processes, and data warehousing concepts.
Outstanding leadership, communication, and project management skills.
If interested, please contact Ramya at 9513487487 or 9342164917.
Posted 1 month ago
7 - 11 years
50 - 60 Lacs
Mumbai, Delhi / NCR, Bengaluru
Work from Office
Role: Resident Solution Architect
Location: Remote
The Solution Architect at Koantek builds secure, highly scalable big data solutions to achieve tangible, data-driven outcomes, all while keeping simplicity and operational effectiveness in mind. This role collaborates with teammates, product teams, and cross-functional project teams to lead the adoption and integration of the Databricks Lakehouse Platform into the enterprise ecosystem and AWS/Azure/GCP architecture. This role is responsible for implementing securely architected big data solutions that are operationally reliable, performant, and deliver on strategic initiatives.
Specific requirements for the role include:
Expert-level knowledge of data frameworks, data lakes, and open-source projects such as Apache Spark, MLflow, and Delta Lake.
Expert-level hands-on coding experience in Python, SQL, Spark/Scala, or PySpark.
In-depth understanding of Spark architecture, including Spark Core, Spark SQL, DataFrames, Spark Streaming, RDD caching, and Spark MLlib.
IoT/event-driven/microservices in the cloud; experience with private and public cloud architectures, their pros/cons, and migration considerations.
Extensive hands-on experience implementing data migration and data processing using AWS/Azure/GCP services.
Extensive hands-on experience with the technology stack available in the industry for data management, data ingestion, capture, processing, and curation: Kafka, StreamSets, Attunity, GoldenGate, MapReduce, Hadoop, Hive, HBase, Cassandra, Spark, Flume, Impala, etc.
Experience using Azure DevOps and CI/CD as well as Agile tools and processes, including Git, Jenkins, Jira, and Confluence.
Experience in creating tables, partitioning, bucketing, and loading and aggregating data using Spark SQL/Scala (a minimal sketch follows below).
Able to build ingestion to ADLS and enable the BI layer for analytics, with a strong understanding of data modeling and defining conceptual, logical, and physical data models.
Proficient-level experience with architecture design, build, and optimization of big data collection, ingestion, storage, processing, and visualization.
Responsibilities:
Work closely with team members to lead and drive enterprise solutions, advising on key decision points, trade-offs, best practices, and risk mitigation.
Guide customers in transforming big data projects, including development and deployment of big data and AI applications.
Promote, emphasize, and leverage big data solutions to deploy performant systems that appropriately auto-scale, are highly available, fault-tolerant, self-monitoring, and serviceable.
Use a defense-in-depth approach in designing data solutions and AWS/Azure/GCP infrastructure.
Assist and advise data engineers in the preparation and delivery of raw data for prescriptive and predictive modeling.
Aid developers in identifying, designing, and implementing process improvements with automation tools to optimize data delivery.
Implement processes and systems to monitor data quality and security, ensuring production data is accurate and available for key stakeholders and the business processes that depend on it.
Employ change management best practices to ensure that data remains readily accessible to the business.
Implement reusable design templates and solutions to integrate, automate, and orchestrate cloud operational needs, with experience in MDM using data governance solutions.
Qualifications:
Overall experience of 12+ years in the IT field.
Hands-on experience designing and implementing multi-tenant solutions using Azure Databricks for data governance, data pipelines for near-real-time data warehouses, and machine learning solutions.
Design and development experience with scalable and cost-effective Microsoft Azure/AWS/GCP data architectures and related solutions.
Experience in a software development, data engineering, or data analytics field using Python, Scala, Spark, Java, or equivalent technologies.
Bachelor's or Master's degree in Big Data, Computer Science, Engineering, Mathematics, or a similar area of study, or equivalent work experience.
Good to have - advanced technical certifications: Azure Solutions Architect Expert, AWS Certified Data Analytics, DASCA Big Data Engineering and Analytics, AWS Certified Cloud Practitioner, Solutions Architect Professional, Google Cloud Certified.
Location: Mumbai, Delhi/NCR, Bengaluru, Kolkata, Chennai, Hyderabad, Ahmedabad, Pune, Remote
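The requirements above mention creating, partitioning, and bucketing tables with Spark SQL; here is a minimal PySpark sketch of that pattern (the Scala equivalent is analogous), with the source path, database, table name, and columns assumed for illustration.

```python
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("curate-sales")
         .enableHiveSupport()  # persistent bucketed tables need a metastore
         .getOrCreate())

raw = spark.read.parquet("/mnt/raw/sales")  # hypothetical source

spark.sql("CREATE DATABASE IF NOT EXISTS analytics")

# Partition by date and bucket by customer_id so downstream joins on
# customer_id can avoid a full shuffle.
(raw.write
 .partitionBy("sale_date")
 .bucketBy(16, "customer_id")
 .sortBy("customer_id")
 .mode("overwrite")
 .saveAsTable("analytics.sales_curated"))

# Aggregate the curated table with Spark SQL.
spark.sql("""
    SELECT sale_date, SUM(amount) AS total_amount
    FROM analytics.sales_curated
    GROUP BY sale_date
    ORDER BY sale_date
""").show()
```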
Posted 1 month ago
6 - 10 years
15 - 19 Lacs
Bengaluru
Work from Office
As a Principal Data Engineer on the Marketplace team, you will be responsible for analysing and interpreting complex datasets to generate insights that directly influence business strategy and decision-making. You will apply advanced statistical analysis and predictive modelling techniques to identify trends, predict future outcomes, and assess data quality. These insights will drive data-driven decisions and strategic initiatives across the organization.
The Marketplace team is responsible for building the services where our customers will go to purchase pre-configured software installations on the platform of their choice. The challenges here are across the entire stack, from back-end distributed services operating at cloud scale, to e-commerce transactions, to the actual web apps that users interact with. This is the perfect role for someone experienced in designing distributed systems, writing and debugging code across an entire stack (UI, APIs, databases, cloud infrastructure services), championing operational excellence, mentoring junior engineers, and driving development process improvements and excellence in a start-up style environment.
Career Level - IC4
Responsibilities
As a Principal Data Engineer, you will be at the forefront of Oracle's data initiatives, playing a pivotal role in transforming raw data into actionable insights. Collaborating with data scientists and business stakeholders, you will design scalable data pipelines, optimize data infrastructure, and ensure the availability of high-quality datasets for strategic analysis. This role goes beyond data engineering, requiring hands-on involvement in statistical analysis and predictive modeling. You will use techniques such as regression analysis, trend forecasting, and time-series modeling to extract meaningful insights from data, directly supporting business decision-making.
Basic Qualifications:
7+ years of experience in data engineering and analytics, with a strong background in designing scalable database architectures, building and optimizing data pipelines, and applying statistical analysis to deliver strategic insights across complex, high-volume data environments.
Deep knowledge of big data frameworks such as Apache Spark, Apache Flink, Apache Airflow, Presto, Kafka, and data warehouse solutions.
Experience working with other cloud platform teams and accommodating requirements from those teams (compute, networking, search, store).
Excellent written and verbal communication skills, with the ability to present complex information in a clear, concise manner to all audiences.
Design and optimize database structures to ensure scalability, performance, and reliability within Oracle ADW and OCI environments. This includes maintaining schema integrity, managing database objects, and implementing efficient table structures that support seamless reporting and analytical needs.
Build and manage data pipelines that automate the flow of data from diverse sources into Oracle databases, using ETL processes to transform data for analysis and reporting.
Conduct data quality assessments, identify anomalies, and validate the accuracy of data ingested into our systems. Working alongside data governance teams, you will establish metrics to measure data quality and implement controls to uphold data integrity, ensuring reliable data for stakeholders.
Mentor junior team members and share best practices in data analysis, modeling, and domain expertise.
Preferred Qualifications:
Solid understanding of statistical methods, hypothesis testing, data distribution, regression analysis, and probability.
Proficiency in Python for data analysis and statistical modeling. Experience with libraries like pandas, NumPy, and SciPy (a minimal sketch follows below).
Knowledge of methods and techniques for data quality assessment, anomaly detection, and validation processes.
Skills in defining data quality metrics, creating data validation rules, and implementing controls to monitor and uphold data integrity.
Familiarity with visualization tools (e.g., Tableau, Power BI, Oracle Analytics Cloud) and libraries (e.g., Matplotlib, Seaborn) to convey insights effectively.
Strong communication skills for collaborating with stakeholders and translating business goals into technical data requirements.
Ability to contextualize data insights in business terms and to present findings to non-technical stakeholders in a meaningful way.
Ability to cleanse, transform, and aggregate data from various sources, ensuring it's ready for analysis.
Experience with relational database management and design, specifically in Oracle environments (e.g., Oracle Autonomous Data Warehouse, Oracle Database).
Skills in designing, maintaining, and optimizing database schemas to ensure efficiency, scalability, and reliability.
Advanced SQL skills for complex queries, indexing, stored procedures, and performance tuning.
Experience with ETL tools such as Oracle Data Integrator (ODI) or other data integration frameworks.
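Since the qualifications above call for regression and trend analysis with pandas, NumPy, and SciPy, here is a small illustrative sketch fitting a linear trend to a synthetic daily metric; the data, column names, and growth assumption are invented purely for the example.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic daily order volumes; in practice these would come from the warehouse.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "day": pd.date_range("2024-01-01", periods=90, freq="D"),
    "orders": rng.poisson(lam=200, size=90) + np.arange(90),
})

# Ordinary least-squares trend: the slope is the average daily growth.
x = np.arange(len(df))
fit = stats.linregress(x, df["orders"])
print(f"trend: {fit.slope:.2f} orders/day, r^2 = {fit.rvalue ** 2:.3f}")

# Naive 30-day-ahead projection from the fitted line.
projection = fit.intercept + fit.slope * (len(df) + 29)
print(f"projected volume 30 days out: {projection:.0f}")
```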
Posted 1 month ago
3 - 7 years
16 - 20 Lacs
Gurugram
Hybrid
Senior Data Engineer
We are seeking a highly skilled Senior Data Engineer with strong expertise in big data, Spark, and modern cloud-native platforms. The ideal candidate will work closely with data architects, platform teams, and vendor products to build scalable, sustainable, and performant data pipelines and APIs in a dynamic offshore environment.
Location: Offshore (Gurugram)
Job-specific Duties and Responsibilities
Design, develop, and optimize data pipelines for processing large volumes of structured and unstructured data using Apache Spark and AWS technologies.
Develop APIs and microservices using container frameworks like OpenShift, Docker, and Kubernetes.
Work with diverse data formats such as Parquet, ORC, and CSV.
Leverage data streaming and messaging platforms, including Apache Kafka, for real-time data processing.
Build scalable solutions on AWS, leveraging services like Elasticsearch/OpenSearch.
Implement big data querying using tools such as Presto or Trino.
Collaborate with platform and vendor deployment teams to ensure seamless integration of data sources.
Work closely with data architects to provision and support sustainable infrastructure patterns.
Contribute to data access strategies and data modelling in alignment with architectural principles.
Communicate technical concepts effectively to non-technical stakeholders and vice versa.
Required Competencies and Skills
Advanced data engineering skills with strong experience using Spark, PySpark, SQL (Oracle/PostgreSQL or MySQL), Python, Kafka, and Airflow.
Experience building and delivering API-based microservices solutions using container frameworks like OpenShift, Docker, or Kubernetes.
Experience with various file format types such as Parquet, ORC, and CSV.
Experience with AWS and technologies such as Elasticsearch or OpenSearch.
Strong knowledge of big data querying tools such as Presto or Trino.
Experience with data architecture principles, including data access patterns and data modelling.
Has previously worked closely with data architects and platform/infrastructure teams to develop and provision sustainable infrastructure patterns.
Designs, develops, and optimises data pipelines for processing large volumes of structured and unstructured data using Apache Spark and AWS technologies.
Works closely with platform and vendor product deployment teams to ensure seamless integration of data sources.
Excellent at stakeholder management across multiple levels of engagement.
Proactive, with great communication skills.
Good at downstream analysis and understanding end-to-end data flow when multiple systems are involved.
A natural collaborator with a learning mindset, happy to share knowledge and learn from others.
Able to understand technical requirements and translate them into non-technical language and vice versa.
Experienced with working on Agile delivery.
Familiarity with data governance, data quality, and control frameworks would certainly be useful in this role.
Able to creatively use data and insights to uncover new opportunities, identify root causes and underlying risks, and recommend solutions to the business.
Required Experience and Qualifications
6+ years of professional experience in Data Engineering and Microservices development.
Proven experience with Spark, Scala, AWS, and modern data platforms.
Strong experience working in Agile delivery environments.
Bachelor's or Master's degree in Computer Science, Engineering, or a related quantitative field.
Prior experience in Financial Services or Banking is a plus.
AWS or equivalent cloud certification preferred.
Posted 1 month ago
6 - 8 years
8 - 10 Lacs
Hyderabad
Work from Office
What you will do
Let's do this. Let's change the world. In this vital role you will create and develop data lake solutions for scientific data that drive business decisions for Research. You will build scalable and high-performance data engineering solutions for large scientific datasets and collaborate with Research collaborators. You will also provide technical leadership to junior team members. The ideal candidate possesses experience in the pharmaceutical or biotech industry, demonstrates deep technical skills, is proficient with big data technologies, and has a deep understanding of data architecture and ETL processes.
Roles & Responsibilities:
Lead, manage, and mentor a high-performing team of data engineers
Design, develop, and implement data pipelines, ETL processes, and data integration solutions
Take ownership of data pipeline projects from inception to deployment; manage scope, timelines, and risks
Develop and maintain data models for biopharma scientific data, data dictionaries, and other documentation to ensure data accuracy and consistency
Optimize large datasets for query performance
Collaborate with global multi-functional teams, including research scientists, to understand data requirements and design solutions that meet business needs
Implement data security and privacy measures to protect sensitive data
Leverage cloud platforms (AWS preferred) to build scalable and efficient data solutions
Collaborate with Data Architects, Business SMEs, Software Engineers, and Data Scientists to design and develop end-to-end data pipelines that meet fast-paced business needs across geographic regions
Identify and resolve data-related challenges
Adhere to best practices for coding, testing, and designing reusable code/components
Explore new tools and technologies that will help to improve ETL platform performance
Participate in sprint planning meetings and provide estimations on technical implementation
What we expect of you
We are all different, yet we all use our unique contributions to serve patients. The [vital attribute] professional we seek is a [type of person] with these qualifications.
Basic Qualifications:
Doctorate degree, OR Master's degree with 4 - 6 years of experience in Computer Science, IT, Computational Chemistry, Computational Biology/Bioinformatics or a related field, OR Bachelor's degree with 6 - 8 years of experience in Computer Science, IT, Computational Chemistry, Computational Biology/Bioinformatics or a related field, OR Diploma with 10 - 12 years of experience in Computer Science, IT, Computational Chemistry, Computational Biology/Bioinformatics or a related field
Preferred Qualifications:
3+ years of experience in implementing and supporting biopharma scientific research data analytics (software platforms)
Functional Skills:
Must-Have Skills:
Proficiency in SQL and Python for data engineering, test automation frameworks (pytest), and scripting tasks (a minimal test sketch follows below)
Hands-on experience with big data technologies and platforms, such as Databricks, Apache Spark (PySpark, Spark SQL), workflow orchestration, and performance tuning for big data processing
Excellent problem-solving skills and the ability to work with large, complex datasets
Able to engage with business collaborators and mentor the team to develop data pipelines and data models
Good-to-Have Skills:
A passion for tackling complex challenges in drug discovery with technology and data
Good understanding of data modeling, data warehousing, and data integration concepts
Good experience using RDBMS (e.g., Oracle, MySQL, SQL Server, PostgreSQL)
Knowledge of cloud data platforms (AWS preferred)
Experience with data visualization tools (e.g., Dash, Plotly, Spotfire)
Experience with diagramming and collaboration tools such as Miro, Lucidchart, or similar tools for process mapping and brainstorming
Experience writing and maintaining technical documentation in Confluence
Understanding of data governance frameworks, tools, and best practices
Professional Certifications: Databricks Certified Data Engineer Professional preferred
Soft Skills:
Excellent critical-thinking and problem-solving skills
Good communication and collaboration skills
Demonstrated awareness of how to function in a team setting
Demonstrated presentation skills
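The must-have skills above include pytest-based test automation for data pipelines; as a minimal sketch (with a hypothetical transformation and column names), a unit test for a small PySpark function might look like this:

```python
import pytest
from pyspark.sql import SparkSession, functions as F


def add_assay_flag(df, threshold=0.5):
    """Hypothetical transformation: flag assay results at or above a threshold."""
    return df.withColumn("is_hit", F.col("activity") >= F.lit(threshold))


@pytest.fixture(scope="session")
def spark():
    # A small local session is enough for pipeline unit tests.
    return SparkSession.builder.master("local[1]").appName("tests").getOrCreate()


def test_add_assay_flag(spark):
    df = spark.createDataFrame(
        [("cmpd-1", 0.8), ("cmpd-2", 0.2)], ["compound_id", "activity"]
    )
    rows = add_assay_flag(df).collect()
    flags = {row["compound_id"]: row["is_hit"] for row in rows}
    assert flags == {"cmpd-1": True, "cmpd-2": False}
```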
Posted 1 month ago
3 - 7 years
7 - 10 Lacs
Bengaluru
Remote
• Design and implement scalable, efficient, and high-performance data pipelines
• Develop and optimize ETL/ELT workflows using modern tools and frameworks
• Work with cloud platforms (AWS, Azure, GCP)
Detailed JD will be given later.
Posted 1 month ago