2.0 - 3.0 years
6 - 7 Lacs
coimbatore
Work from Office
ETL Developer Job Title: ETL Developer FTE Location: Coimbatore Start Date: ASAP Job Summary: We are looking for an experienced ETL Developer with strong expertise in Apache Airflow, Redshift, and SQL-based data pipelines, with upcoming transitions to Snowflake. This is a contract role based in Coimbatore, ideal for professionals who can independently deliver high-quality ETL solutions in a cloud-native, fast-paced environment. Key Responsibilities: 1. ETL Design and Development: Design and develop scalable and modular ETL pipelines using Apache Airflow, with orchestration and monitoring capabilities. Translate business requirements into robust data transformation pipelines across cloud data platforms. Develop reusable ETL components to support a configuration-driven architecture. 2. Data Integration and Transformation: Integrate data from multiple sources: Redshift, flat files, APIs, Excel, and relational databases. Implement transformation logic such as cleansing, standardization, enrichment, and deduplication. Manage incremental and full loads, along with SCD handling strategies. 3. SQL and Database Development: Write performant SQL queries for data staging and transformation within Redshift and Snowflake. Utilize joins, window functions, and aggregations effectively. Ensure indexing and query tuning for high-performance workloads. 4. Performance Tuning: Optimize data pipelines and orchestrations for large-scale data volumes. Tune SQL queries and monitor execution plans. Implement best practices in distributed data processing and cloud-native optimizations. 5. Error Handling and Logging: Implement robust error handling and logging in Airflow DAGs. Enable retry logic, alerting mechanisms, and failure notifications. 6. Testing and Quality Assurance: Conduct unit and integration testing of ETL jobs. Validate data outputs against business rules and source systems. Support QA during UAT cycles and help resolve data defects.
7. Deployment and Scheduling: Deploy pipelines using Git-based CI/CD practices. Schedule and monitor DAGs using Apache Airflow and integrated tools. Troubleshoot failures and ensure data pipeline reliability. 8. Documentation and Maintenance: Document data flows, DAG configurations, transformation logic, and operational procedures. Maintain change logs and update job dependency charts. 9. Collaboration and Communication: Work closely with data architects, analysts, and BI teams to define and fulfill data needs. Participate in stand-ups, sprint planning, and post-deployment reviews. 10. Compliance and Best Practices: Ensure ETL processes adhere to data security, governance, and privacy regulations (HIPAA, GDPR, etc.). Follow naming conventions, version control standards, and deployment protocols. Required Skills & Experience: 3–6 years of hands-on experience in ETL development. Proven experience with Apache Airflow, Amazon Redshift, and strong SQL. Strong understanding of data warehousing concepts and cloud-based data ecosystems. Familiarity with handling flat files, APIs, and external sources. Experience with job orchestration, error handling, and scalable transformation patterns. Ability to work independently and meet deadlines. Preferred Skills: Exposure to Snowflake or plans to migrate to Snowflake platforms. Experience in healthcare, life sciences, or regulated environments is a plus. Familiarity with Azure Data Factory, Power BI, or other cloud BI tools. Knowledge of Git, Azure DevOps, or other version control and CI/CD platforms.
Posted 1 day ago
7.0 - 10.0 years
10 - 14 Lacs
hyderabad
Work from Office
About the Job : We are seeking a highly skilled and experienced Senior Data Engineer to join our dynamic team. In this pivotal role, you will be instrumental in driving our data engineering initiatives, with a strong emphasis on leveraging Dataiku's capabilities to enhance data processing and analytics. You will be responsible for designing, developing, and optimizing robust data pipelines, ensuring seamless integration of diverse data sources, and maintaining high data quality and accessibility to support our business intelligence and advanced analytics projects. This role requires a unique blend of expertise in traditional data engineering principles, advanced data modeling, and a forward-thinking approach to integrating cutting-edge AI technologies, particularly LLM Mesh for Generative AI applications. If you are passionate about building scalable data solutions and are eager to explore the cutting edge of AI, we encourage you to apply. Key Responsibilities : - Dataiku Leadership : Drive data engineering initiatives with a strong emphasis on leveraging Dataiku capabilities for data preparation, analysis, visualization, and the deployment of data solutions. - Data Pipeline Development : Design, develop, and optimize robust and scalable data pipelines to support various business intelligence and advanced analytics projects. This includes developing and maintaining ETL/ELT processes to automate data extraction, transformation, and loading from diverse sources. - Data Modeling & Architecture : Apply expertise in data modeling techniques to design efficient and scalable database structures, ensuring data integrity and optimal performance. - ETL/ELT Expertise : Implement and manage ETL processes and tools to ensure efficient and reliable data flow, maintaining high data quality and accessibility. - Gen AI Integration : Explore and implement solutions leveraging LLM Mesh for Generative AI applications, contributing to the development of innovative AI-powered features. 
- Programming & Scripting : Utilize programming languages such as Python and SQL for data manipulation, analysis, automation, and the development of custom data solutions. - Cloud Platform Deployment : Deploy and manage scalable data solutions on cloud platforms such as AWS or Azure, leveraging their respective services for optimal performance and cost-efficiency. - Data Quality & Governance : Ensure seamless integration of data sources, maintaining high data quality, consistency, and accessibility across all data assets. Implement data governance best practices. - Collaboration & Mentorship : Collaborate closely with data scientists, analysts, and other stakeholders to understand data requirements and deliver impactful solutions. Potentially mentor junior team members. - Performance Optimization : Continuously monitor and optimize the performance of data pipelines and data systems. Required Skills & Experience : - Proficiency in Dataiku : Demonstrable expertise in Dataiku for data preparation, analysis, visualization, and building end-to-end data pipelines and applications. - Expertise in Data Modeling : Strong understanding and practical experience in various data modeling techniques (e.g., dimensional modeling, Kimball, Inmon) to design efficient and scalable database structures. - ETL/ELT Processes & Tools : Extensive experience with ETL/ELT processes and a proven track record of using various ETL tools (e.g., Dataiku's built-in capabilities, Apache Airflow, Talend, SSIS, etc.). - Familiarity with LLM Mesh : Familiarity with LLM Mesh or similar frameworks for Gen AI applications, understanding its concepts and potential for integration. - Programming Languages : Strong proficiency in Python for data manipulation, scripting, and developing data solutions. Solid command of SQL for complex querying, data analysis, and database interactions. 
- Cloud Platforms : Knowledge and hands-on experience with at least one major cloud platform (AWS or Azure) for deploying and managing scalable data solutions (e.g., S3, EC2, Azure Data Lake, Azure Synapse, etc.). - Gen AI Concepts : Basic understanding of Generative AI concepts and their potential applications in data engineering. - Problem-Solving : Excellent analytical and problem-solving skills with a keen eye for detail. - Communication : Strong communication and interpersonal skills to collaborate effectively with cross-functional teams. Bonus Points (Nice to Have) : - Experience with other big data technologies (e.g., Spark, Hadoop, Snowflake). - Familiarity with data governance and data security best practices. - Experience with MLOps principles and tools. - Contributions to open-source projects related to data engineering or AI. Education : Bachelor's or Master's degree in Computer Science, Data Science, Engineering, or a related quantitative field.
Posted 3 days ago
7.0 - 10.0 years
10 - 14 Lacs
bengaluru
Work from Office
About the Job : We are seeking a highly skilled and experienced Senior Data Engineer to join our dynamic team. In this pivotal role, you will be instrumental in driving our data engineering initiatives, with a strong emphasis on leveraging Dataiku's capabilities to enhance data processing and analytics. You will be responsible for designing, developing, and optimizing robust data pipelines, ensuring seamless integration of diverse data sources, and maintaining high data quality and accessibility to support our business intelligence and advanced analytics projects. This role requires a unique blend of expertise in traditional data engineering principles, advanced data modeling, and a forward-thinking approach to integrating cutting-edge AI technologies, particularly LLM Mesh for Generative AI applications. If you are passionate about building scalable data solutions and are eager to explore the cutting edge of AI, we encourage you to apply. Key Responsibilities : - Dataiku Leadership : Drive data engineering initiatives with a strong emphasis on leveraging Dataiku capabilities for data preparation, analysis, visualization, and the deployment of data solutions. - Data Pipeline Development : Design, develop, and optimize robust and scalable data pipelines to support various business intelligence and advanced analytics projects. This includes developing and maintaining ETL/ELT processes to automate data extraction, transformation, and loading from diverse sources. - Data Modeling & Architecture : Apply expertise in data modeling techniques to design efficient and scalable database structures, ensuring data integrity and optimal performance. - ETL/ELT Expertise : Implement and manage ETL processes and tools to ensure efficient and reliable data flow, maintaining high data quality and accessibility. - Gen AI Integration : Explore and implement solutions leveraging LLM Mesh for Generative AI applications, contributing to the development of innovative AI-powered features. 
- Programming & Scripting : Utilize programming languages such as Python and SQL for data manipulation, analysis, automation, and the development of custom data solutions. - Cloud Platform Deployment : Deploy and manage scalable data solutions on cloud platforms such as AWS or Azure, leveraging their respective services for optimal performance and cost-efficiency. - Data Quality & Governance : Ensure seamless integration of data sources, maintaining high data quality, consistency, and accessibility across all data assets. Implement data governance best practices. - Collaboration & Mentorship : Collaborate closely with data scientists, analysts, and other stakeholders to understand data requirements and deliver impactful solutions. Potentially mentor junior team members. - Performance Optimization : Continuously monitor and optimize the performance of data pipelines and data systems. Required Skills & Experience : - Proficiency in Dataiku : Demonstrable expertise in Dataiku for data preparation, analysis, visualization, and building end-to-end data pipelines and applications. - Expertise in Data Modeling : Strong understanding and practical experience in various data modeling techniques (e.g., dimensional modeling, Kimball, Inmon) to design efficient and scalable database structures. - ETL/ELT Processes & Tools : Extensive experience with ETL/ELT processes and a proven track record of using various ETL tools (e.g., Dataiku's built-in capabilities, Apache Airflow, Talend, SSIS, etc.). - Familiarity with LLM Mesh : Familiarity with LLM Mesh or similar frameworks for Gen AI applications, understanding its concepts and potential for integration. - Programming Languages : Strong proficiency in Python for data manipulation, scripting, and developing data solutions. Solid command of SQL for complex querying, data analysis, and database interactions. 
- Cloud Platforms : Knowledge and hands-on experience with at least one major cloud platform (AWS or Azure) for deploying and managing scalable data solutions (e.g., S3, EC2, Azure Data Lake, Azure Synapse, etc.). - Gen AI Concepts : Basic understanding of Generative AI concepts and their potential applications in data engineering. - Problem-Solving : Excellent analytical and problem-solving skills with a keen eye for detail. - Communication : Strong communication and interpersonal skills to collaborate effectively with cross-functional teams. Bonus Points (Nice to Have) : - Experience with other big data technologies (e.g., Spark, Hadoop, Snowflake). - Familiarity with data governance and data security best practices. - Experience with MLOps principles and tools. - Contributions to open-source projects related to data engineering or AI. Education : Bachelor's or Master's degree in Computer Science, Data Science, Engineering, or a related quantitative field.
Posted 4 days ago
7.0 - 12.0 years
11 - 21 Lacs
hyderabad, pune, chennai
Work from Office
We are seeking a detail-oriented Data Analyst who will ensure that organizational data is accurately transformed, validated, and prepared for downstream consumption.
Posted 4 days ago
7.0 - 10.0 years
10 - 14 Lacs
mumbai
Work from Office
About the Job : We are seeking a highly skilled and experienced Senior Data Engineer to join our dynamic team. In this pivotal role, you will be instrumental in driving our data engineering initiatives, with a strong emphasis on leveraging Dataiku's capabilities to enhance data processing and analytics. You will be responsible for designing, developing, and optimizing robust data pipelines, ensuring seamless integration of diverse data sources, and maintaining high data quality and accessibility to support our business intelligence and advanced analytics projects. This role requires a unique blend of expertise in traditional data engineering principles, advanced data modeling, and a forward-thinking approach to integrating cutting-edge AI technologies, particularly LLM Mesh for Generative AI applications. If you are passionate about building scalable data solutions and are eager to explore the cutting edge of AI, we encourage you to apply. Key Responsibilities : - Dataiku Leadership : Drive data engineering initiatives with a strong emphasis on leveraging Dataiku capabilities for data preparation, analysis, visualization, and the deployment of data solutions. - Data Pipeline Development : Design, develop, and optimize robust and scalable data pipelines to support various business intelligence and advanced analytics projects. This includes developing and maintaining ETL/ELT processes to automate data extraction, transformation, and loading from diverse sources. - Data Modeling & Architecture : Apply expertise in data modeling techniques to design efficient and scalable database structures, ensuring data integrity and optimal performance. - ETL/ELT Expertise : Implement and manage ETL processes and tools to ensure efficient and reliable data flow, maintaining high data quality and accessibility. - Gen AI Integration : Explore and implement solutions leveraging LLM Mesh for Generative AI applications, contributing to the development of innovative AI-powered features. 
- Programming & Scripting : Utilize programming languages such as Python and SQL for data manipulation, analysis, automation, and the development of custom data solutions. - Cloud Platform Deployment : Deploy and manage scalable data solutions on cloud platforms such as AWS or Azure, leveraging their respective services for optimal performance and cost-efficiency. - Data Quality & Governance : Ensure seamless integration of data sources, maintaining high data quality, consistency, and accessibility across all data assets. Implement data governance best practices. - Collaboration & Mentorship : Collaborate closely with data scientists, analysts, and other stakeholders to understand data requirements and deliver impactful solutions. Potentially mentor junior team members. - Performance Optimization : Continuously monitor and optimize the performance of data pipelines and data systems. Required Skills & Experience : - Proficiency in Dataiku : Demonstrable expertise in Dataiku for data preparation, analysis, visualization, and building end-to-end data pipelines and applications. - Expertise in Data Modeling : Strong understanding and practical experience in various data modeling techniques (e.g., dimensional modeling, Kimball, Inmon) to design efficient and scalable database structures. - ETL/ELT Processes & Tools : Extensive experience with ETL/ELT processes and a proven track record of using various ETL tools (e.g., Dataiku's built-in capabilities, Apache Airflow, Talend, SSIS, etc.). - Familiarity with LLM Mesh : Familiarity with LLM Mesh or similar frameworks for Gen AI applications, understanding its concepts and potential for integration. - Programming Languages : Strong proficiency in Python for data manipulation, scripting, and developing data solutions. Solid command of SQL for complex querying, data analysis, and database interactions. 
- Cloud Platforms : Knowledge and hands-on experience with at least one major cloud platform (AWS or Azure) for deploying and managing scalable data solutions (e.g., S3, EC2, Azure Data Lake, Azure Synapse, etc.). - Gen AI Concepts : Basic understanding of Generative AI concepts and their potential applications in data engineering. - Problem-Solving : Excellent analytical and problem-solving skills with a keen eye for detail. - Communication : Strong communication and interpersonal skills to collaborate effectively with cross-functional teams. Bonus Points (Nice to Have) : - Experience with other big data technologies (e.g., Spark, Hadoop, Snowflake). - Familiarity with data governance and data security best practices. - Experience with MLOps principles and tools. - Contributions to open-source projects related to data engineering or AI. Education : Bachelor's or Master's degree in Computer Science, Data Science, Engineering, or a related quantitative field.
Posted 4 days ago
8.0 - 13.0 years
0 Lacs
hyderabad, telangana
On-site
As a Senior Data Engineer at Amgen, you will play a crucial role in driving the development and implementation of the company's data strategy. You will be responsible for designing, building, and optimizing data pipelines and platforms while also mentoring junior engineers. Your technical expertise and data-driven problem-solving skills will be essential in ensuring data quality, integrity, and reliability. **Key Responsibilities:** - Contribute to the design, development, and implementation of automations to monitor data and cloud platforms for reliability, cost, and maintenance. - Take ownership of platform management, managing scope, timelines, and risks. - Ensure data quality and integrity through rigorous testing and monitoring. - Build scalable and efficient data solutions using cloud platforms, with a preference for AWS. - Collaborate closely with data analysts, data scientists, and business stakeholders to understand data requirements. - Identify and resolve complex data-related challenges. - Adhere to data engineering best practices and standards. - Stay up-to-date with the latest data technologies and trends. - Execute POCs on new technologies to advance the capabilities and deliveries in the data strategy. **Qualifications:** - Doctorate degree / Master's degree / Bachelor's degree in Computer Science. **Functional Skills:** **Must-Have Skills:** - Hands-on experience with cloud platforms (AWS, Azure, GCP) for architecting cost-effective and scalable data solutions. - Proficiency in Python, PySpark, SQL, and big data ETL performance tuning. - Strong analytical and problem-solving skills for addressing complex data challenges. - Manage and administer cloud-based environments (AWS, Azure, GCP) and data platforms (Databricks, Snowflake). - Experience in troubleshooting cloud environment issues like networking and virtual machines. - Effective communication and interpersonal skills for collaboration with cross-functional teams. 
**Good-to-Have Skills:** - Experience with data modeling and performance tuning for both OLAP and OLTP databases. - Administering lifecycle and platform management policies on cloud resource usage. - Working with Apache Spark, Apache Airflow. - Software engineering best practices including version control (Git, Subversion, etc.), CI/CD (Jenkins, Maven, etc.), automated unit testing, and DevOps. - Familiarity with AWS, GCP, or Azure cloud services. Amgen is committed to providing equal opportunities for all individuals, including those with disabilities. Reasonable accommodations will be provided during the job application or interview process to ensure equal participation and access to benefits and privileges of employment.
Posted 4 days ago
3.0 - 7.0 years
0 Lacs
karnataka
On-site
As an AI/ML Developer, you will be responsible for utilizing programming languages such as Python for AI/ML development. Your proficiency in libraries like NumPy and Pandas for data manipulation; Matplotlib, Seaborn, and Plotly for data visualization; and Scikit-learn for classical ML algorithms will be crucial. Familiarity with R, Java, or C++ is a plus, especially for performance-critical applications. Your role will involve building models using Machine Learning & Deep Learning Frameworks such as TensorFlow and Keras for deep learning, PyTorch for research-grade and production-ready models, and XGBoost, LightGBM, or CatBoost for gradient boosting. Understanding model training, validation, hyperparameter tuning, and evaluation metrics like ROC-AUC, F1-score, and precision/recall will be essential. In the field of Natural Language Processing (NLP), you will work with text preprocessing techniques like tokenization, stemming, and lemmatization; vectorization techniques such as TF-IDF, Word2Vec, and GloVe; and Transformer-based models like BERT, GPT, and T5 using Hugging Face Transformers. Experience with text classification, named entity recognition (NER), question answering, or chatbot development will be required. For Computer Vision (CV), your experience with image classification, object detection, segmentation, and libraries like OpenCV, Pillow, and Albumentations will be utilized. Proficiency in pretrained models (e.g., ResNet, YOLO, EfficientNet) and transfer learning is expected. You will also handle Data Engineering & Pipelines by building and managing data ingestion and preprocessing pipelines using tools like Apache Airflow, Luigi, Pandas, and Dask. Experience with structured (CSV, SQL) and unstructured (text, images, audio) data will be beneficial. Furthermore, your role will involve Model Deployment & MLOps, where you will deploy models as REST APIs using Flask, FastAPI, or Django, as batch jobs, or as real-time inference services. 
Familiarity with Docker for containerization, Kubernetes for orchestration, and MLflow, Kubeflow, or SageMaker for model tracking and lifecycle management will be necessary. In addition, your hands-on experience with at least one cloud provider such as AWS (S3, EC2, SageMaker, Lambda), Google Cloud (Vertex AI, BigQuery, Cloud Functions), or Azure (Machine Learning Studio, Blob Storage) will be required. Understanding cloud storage, compute services, and cost optimization is essential. Your proficiency in SQL for querying relational databases (e.g., PostgreSQL, MySQL), NoSQL databases (e.g., MongoDB, Cassandra), and familiarity with big data tools like Apache Spark, Hadoop, or Databricks will be valuable. Experience with Git and platforms like GitHub, GitLab, or Bitbucket will be essential for Version Control & Collaboration. Familiarity with Agile/Scrum methodologies and tools like JIRA, Trello, or Asana will also be beneficial. Moreover, you will be responsible for writing unit tests and integration tests for ML code and using tools like pytest, unittest, and debuggers to ensure the quality of the code. This position is Full-time and Permanent with benefits including Provident Fund and Work from home option. The work location is in person.
Posted 4 days ago
2.0 - 6.0 years
0 Lacs
vadodara, gujarat
On-site
At Rearc, we are dedicated to empowering engineers like you to create exceptional products and experiences by providing you with the best tools possible. We value individuals who think freely, challenge the norm, and embrace alternative problem-solving approaches. If you are driven by the desire to make a difference and solve complex problems, you'll feel right at home with us. As a Data Engineer at Rearc, you will be an integral part of our data engineering team, contributing to the optimization of data workflows for efficiency, scalability, and reliability. Your role will involve designing and implementing robust data solutions in collaboration with cross-functional teams to meet business objectives and uphold data management best practices. **Key Responsibilities:** - **Collaborate with Colleagues:** Work closely with team members to understand customers' data requirements and contribute to developing tailored data solutions. - **Apply DataOps Principles:** Utilize modern data engineering tools like Apache Airflow and Apache Spark to create scalable data pipelines and architectures. - **Support Data Engineering Projects:** Assist in managing and executing data engineering projects, providing technical support and ensuring project success. - **Promote Knowledge Sharing:** Contribute to the knowledge base through technical blogs and articles, advocating for best practices in data engineering and fostering a culture of continuous learning and innovation. **Qualifications Required:** - 2+ years of experience in data engineering, data architecture, or related fields. - Proven track record in contributing to complex data engineering projects and implementing scalable data solutions. - Hands-on experience with ETL processes, data warehousing, and data modeling tools. - Understanding of data integration tools and best practices. - Familiarity with cloud-based data services and technologies such as AWS Redshift, Azure Synapse Analytics, Google BigQuery. 
- Strong analytical skills for data-driven decision-making. - Proficiency in implementing and optimizing data pipelines using modern tools and frameworks. - Excellent communication and interpersonal skills for effective collaboration with teams and stakeholders. Your journey at Rearc will begin with an immersive learning experience to help you get acquainted with our processes. In the initial months, you will have the opportunity to explore various tools and technologies as you find your place within our team.
Posted 4 days ago
7.0 - 10.0 years
10 - 14 Lacs
noida
Work from Office
About the Job : We are seeking a highly skilled and experienced Senior Data Engineer to join our dynamic team. In this pivotal role, you will be instrumental in driving our data engineering initiatives, with a strong emphasis on leveraging Dataiku's capabilities to enhance data processing and analytics. You will be responsible for designing, developing, and optimizing robust data pipelines, ensuring seamless integration of diverse data sources, and maintaining high data quality and accessibility to support our business intelligence and advanced analytics projects. This role requires a unique blend of expertise in traditional data engineering principles, advanced data modeling, and a forward-thinking approach to integrating cutting-edge AI technologies, particularly LLM Mesh for Generative AI applications. If you are passionate about building scalable data solutions and are eager to explore the cutting edge of AI, we encourage you to apply. Key Responsibilities : - Dataiku Leadership : Drive data engineering initiatives with a strong emphasis on leveraging Dataiku capabilities for data preparation, analysis, visualization, and the deployment of data solutions. - Data Pipeline Development : Design, develop, and optimize robust and scalable data pipelines to support various business intelligence and advanced analytics projects. This includes developing and maintaining ETL/ELT processes to automate data extraction, transformation, and loading from diverse sources. - Data Modeling & Architecture : Apply expertise in data modeling techniques to design efficient and scalable database structures, ensuring data integrity and optimal performance. - ETL/ELT Expertise : Implement and manage ETL processes and tools to ensure efficient and reliable data flow, maintaining high data quality and accessibility. - Gen AI Integration : Explore and implement solutions leveraging LLM Mesh for Generative AI applications, contributing to the development of innovative AI-powered features. 
- Programming & Scripting : Utilize programming languages such as Python and SQL for data manipulation, analysis, automation, and the development of custom data solutions. - Cloud Platform Deployment : Deploy and manage scalable data solutions on cloud platforms such as AWS or Azure, leveraging their respective services for optimal performance and cost-efficiency. - Data Quality & Governance : Ensure seamless integration of data sources, maintaining high data quality, consistency, and accessibility across all data assets. Implement data governance best practices. - Collaboration & Mentorship : Collaborate closely with data scientists, analysts, and other stakeholders to understand data requirements and deliver impactful solutions. Potentially mentor junior team members. - Performance Optimization : Continuously monitor and optimize the performance of data pipelines and data systems. Required Skills & Experience : - Proficiency in Dataiku : Demonstrable expertise in Dataiku for data preparation, analysis, visualization, and building end-to-end data pipelines and applications. - Expertise in Data Modeling : Strong understanding and practical experience in various data modeling techniques (e.g., dimensional modeling, Kimball, Inmon) to design efficient and scalable database structures. - ETL/ELT Processes & Tools : Extensive experience with ETL/ELT processes and a proven track record of using various ETL tools (e.g., Dataiku's built-in capabilities, Apache Airflow, Talend, SSIS, etc.). - Familiarity with LLM Mesh : Familiarity with LLM Mesh or similar frameworks for Gen AI applications, understanding its concepts and potential for integration. - Programming Languages : Strong proficiency in Python for data manipulation, scripting, and developing data solutions. Solid command of SQL for complex querying, data analysis, and database interactions. 
- Cloud Platforms : Knowledge and hands-on experience with at least one major cloud platform (AWS or Azure) for deploying and managing scalable data solutions (e.g., S3, EC2, Azure Data Lake, Azure Synapse, etc.). - Gen AI Concepts : Basic understanding of Generative AI concepts and their potential applications in data engineering. - Problem-Solving : Excellent analytical and problem-solving skills with a keen eye for detail. - Communication : Strong communication and interpersonal skills to collaborate effectively with cross-functional teams. Bonus Points (Nice to Have) : - Experience with other big data technologies (e.g., Spark, Hadoop, Snowflake). - Familiarity with data governance and data security best practices. - Experience with MLOps principles and tools. - Contributions to open-source projects related to data engineering or AI. Education : Bachelor's or Master's degree in Computer Science, Data Science, Engineering, or a related quantitative field.
Posted 4 days ago
6.0 - 10.0 years
6 - 10 Lacs
surat
Work from Office
We are seeking an experienced and driven Data Engineer with 5+ years of hands-on experience in building scalable data infrastructure and systems. You will play a key role in designing and developing robust, high-performance ETL pipelines and managing large-scale datasets to support critical business functions. This role requires deep technical expertise, strong problem-solving skills, and the ability to thrive in a fast-paced, evolving environment. Key Responsibilities : - Design, develop, and maintain scalable and reliable ETL/ELT pipelines for processing large volumes of data (terabytes and beyond). - Model and structure data for performance, scalability, and usability. - Work with cloud infrastructure (preferably Azure) to build and optimize data workflows. - Leverage distributed computing frameworks like Apache Spark and Hadoop for large-scale data processing. - Build and manage data lake/lakehouse architectures in alignment with best practices. - Optimize ETL performance and manage cost-effective data operations. - Collaborate closely with cross-functional teams including data science, analytics, and software engineering. - Ensure data quality, integrity, and security across all stages of the data lifecycle. Required Skills & Qualifications : - 7 to 10 years of relevant experience in big data engineering. - Advanced proficiency in Python. - Strong skills in SQL for complex data manipulation and analysis. - Hands-on experience with Apache Spark, Hadoop, or similar distributed systems. - Proven track record of handling large-scale datasets (TBs) in production environments. - Cloud development experience with Azure (preferred), AWS, or GCP. - Solid understanding of data lake and data lakehouse architectures. - Expertise in ETL performance tuning and cost optimization techniques. - Knowledge of data structures, algorithms, and modern software engineering practices.
Soft Skills : - Strong communication skills with the ability to explain complex technical concepts clearly and concisely. - Self-starter who learns quickly and takes ownership. - High attention to detail with a strong sense of data quality and reliability. - Comfortable working in an agile, fast-changing environment with incomplete requirements. Preferred Qualifications : - Experience with tools like Apache Airflow, Azure Data Factory, or similar. - Familiarity with CI/CD and DevOps in the context of data engineering. - Knowledge of data governance, cataloging, and access control principles. Skills : Python, SQL, AWS, Azure, Hadoop
Posted 4 days ago
6.0 - 10.0 years
6 - 10 Lacs
hyderabad
Work from Office
We are seeking an experienced and driven Data Engineer with 5+ years of hands-on experience in building scalable data infrastructure and systems. You will play a key role in designing and developing robust, high-performance ETL pipelines and managing large-scale datasets to support critical business functions. This role requires deep technical expertise, strong problem-solving skills, and the ability to thrive in a fast-paced, evolving environment. Key Responsibilities : - Design, develop, and maintain scalable and reliable ETL/ELT pipelines for processing large volumes of data (terabytes and beyond). - Model and structure data for performance, scalability, and usability. - Work with cloud infrastructure (preferably Azure) to build and optimize data workflows. - Leverage distributed computing frameworks like Apache Spark and Hadoop for large-scale data processing. - Build and manage data lake/lakehouse architectures in alignment with best practices. - Optimize ETL performance and manage cost-effective data operations. - Collaborate closely with cross-functional teams including data science, analytics, and software engineering. - Ensure data quality, integrity, and security across all stages of the data lifecycle. Required Skills & Qualifications : - 7 to 10 years of relevant experience in big data engineering. - Advanced proficiency in Python. - Strong skills in SQL for complex data manipulation and analysis. - Hands-on experience with Apache Spark, Hadoop, or similar distributed systems. - Proven track record of handling large-scale datasets (TBs) in production environments. - Cloud development experience with Azure (preferred), AWS, or GCP. - Solid understanding of data lake and data lakehouse architectures. - Expertise in ETL performance tuning and cost optimization techniques. - Knowledge of data structures, algorithms, and modern software engineering practices.
Soft Skills : - Strong communication skills with the ability to explain complex technical concepts clearly and concisely. - Self-starter who learns quickly and takes ownership. - High attention to detail with a strong sense of data quality and reliability. - Comfortable working in an agile, fast-changing environment with incomplete requirements. Preferred Qualifications : - Experience with tools like Apache Airflow, Azure Data Factory, or similar. - Familiarity with CI/CD and DevOps in the context of data engineering. - Knowledge of data governance, cataloging, and access control principles. Skills : Python, SQL, AWS, Azure, Hadoop
Posted 4 days ago
6.0 - 10.0 years
3 - 6 Lacs
kanpur
Work from Office
Job description : We are seeking an experienced and driven Data Engineer with 5+ years of hands-on experience in building scalable data infrastructure and systems. You will play a key role in designing and developing robust, high-performance ETL pipelines and managing large-scale datasets to support critical business functions. This role requires deep technical expertise, strong problem-solving skills, and the ability to thrive in a fast-paced, evolving environment. Key Responsibilities : - Design, develop, and maintain scalable and reliable ETL/ELT pipelines for processing large volumes of data (terabytes and beyond). - Model and structure data for performance, scalability, and usability. - Work with cloud infrastructure (preferably Azure) to build and optimize data workflows. - Leverage distributed computing frameworks like Apache Spark and Hadoop for large-scale data processing. - Build and manage data lake/lakehouse architectures in alignment with best practices. - Optimize ETL performance and manage cost-effective data operations. - Collaborate closely with cross-functional teams including data science, analytics, and software engineering. - Ensure data quality, integrity, and security across all stages of the data lifecycle. Required Skills & Qualifications : - 7 to 10 years of relevant experience in big data engineering. - Advanced proficiency in Python. - Strong skills in SQL for complex data manipulation and analysis. - Hands-on experience with Apache Spark, Hadoop, or similar distributed systems. - Proven track record of handling large-scale datasets (TBs) in production environments. - Cloud development experience with Azure (preferred), AWS, or GCP. - Solid understanding of data lake and data lakehouse architectures. - Expertise in ETL performance tuning and cost optimization techniques. - Knowledge of data structures, algorithms, and modern software engineering practices.
Soft Skills : - Strong communication skills with the ability to explain complex technical concepts clearly and concisely. - Self-starter who learns quickly and takes ownership. - High attention to detail with a strong sense of data quality and reliability. - Comfortable working in an agile, fast-changing environment with incomplete requirements. Preferred Qualifications : - Experience with tools like Apache Airflow, Azure Data Factory, or similar. - Familiarity with CI/CD and DevOps in the context of data engineering. - Knowledge of data governance, cataloging, and access control principles. Skills : Python, SQL, AWS, Azure, Hadoop
Posted 4 days ago
6.0 - 10.0 years
3 - 6 Lacs
pune
Work from Office
Job description : We are seeking an experienced and driven Data Engineer with 5+ years of hands-on experience in building scalable data infrastructure and systems. You will play a key role in designing and developing robust, high-performance ETL pipelines and managing large-scale datasets to support critical business functions. This role requires deep technical expertise, strong problem-solving skills, and the ability to thrive in a fast-paced, evolving environment. Key Responsibilities : - Design, develop, and maintain scalable and reliable ETL/ELT pipelines for processing large volumes of data (terabytes and beyond). - Model and structure data for performance, scalability, and usability. - Work with cloud infrastructure (preferably Azure) to build and optimize data workflows. - Leverage distributed computing frameworks like Apache Spark and Hadoop for large-scale data processing. - Build and manage data lake/lakehouse architectures in alignment with best practices. - Optimize ETL performance and manage cost-effective data operations. - Collaborate closely with cross-functional teams including data science, analytics, and software engineering. - Ensure data quality, integrity, and security across all stages of the data lifecycle. Required Skills & Qualifications : - 7 to 10 years of relevant experience in big data engineering. - Advanced proficiency in Python. - Strong skills in SQL for complex data manipulation and analysis. - Hands-on experience with Apache Spark, Hadoop, or similar distributed systems. - Proven track record of handling large-scale datasets (TBs) in production environments. - Cloud development experience with Azure (preferred), AWS, or GCP. - Solid understanding of data lake and data lakehouse architectures. - Expertise in ETL performance tuning and cost optimization techniques. - Knowledge of data structures, algorithms, and modern software engineering practices.
Soft Skills : - Strong communication skills with the ability to explain complex technical concepts clearly and concisely. - Self-starter who learns quickly and takes ownership. - High attention to detail with a strong sense of data quality and reliability. - Comfortable working in an agile, fast-changing environment with incomplete requirements. Preferred Qualifications : - Experience with tools like Apache Airflow, Azure Data Factory, or similar. - Familiarity with CI/CD and DevOps in the context of data engineering. - Knowledge of data governance, cataloging, and access control principles. Skills : Python, SQL, AWS, Azure, Hadoop
Posted 4 days ago
6.0 - 10.0 years
6 - 10 Lacs
ludhiana
Work from Office
We are seeking an experienced and driven Data Engineer with 5+ years of hands-on experience in building scalable data infrastructure and systems. You will play a key role in designing and developing robust, high-performance ETL pipelines and managing large-scale datasets to support critical business functions. This role requires deep technical expertise, strong problem-solving skills, and the ability to thrive in a fast-paced, evolving environment. Key Responsibilities : - Design, develop, and maintain scalable and reliable ETL/ELT pipelines for processing large volumes of data (terabytes and beyond). - Model and structure data for performance, scalability, and usability. - Work with cloud infrastructure (preferably Azure) to build and optimize data workflows. - Leverage distributed computing frameworks like Apache Spark and Hadoop for large-scale data processing. - Build and manage data lake/lakehouse architectures in alignment with best practices. - Optimize ETL performance and manage cost-effective data operations. - Collaborate closely with cross-functional teams including data science, analytics, and software engineering. - Ensure data quality, integrity, and security across all stages of the data lifecycle. Required Skills & Qualifications : - 7 to 10 years of relevant experience in big data engineering. - Advanced proficiency in Python. - Strong skills in SQL for complex data manipulation and analysis. - Hands-on experience with Apache Spark, Hadoop, or similar distributed systems. - Proven track record of handling large-scale datasets (TBs) in production environments. - Cloud development experience with Azure (preferred), AWS, or GCP. - Solid understanding of data lake and data lakehouse architectures. - Expertise in ETL performance tuning and cost optimization techniques. - Knowledge of data structures, algorithms, and modern software engineering practices.
Soft Skills : - Strong communication skills with the ability to explain complex technical concepts clearly and concisely. - Self-starter who learns quickly and takes ownership. - High attention to detail with a strong sense of data quality and reliability. - Comfortable working in an agile, fast-changing environment with incomplete requirements. Preferred Qualifications : - Experience with tools like Apache Airflow, Azure Data Factory, or similar. - Familiarity with CI/CD and DevOps in the context of data engineering. - Knowledge of data governance, cataloging, and access control principles. Skills : Python, SQL, AWS, Azure, Hadoop
Posted 5 days ago
6.0 - 10.0 years
6 - 10 Lacs
ahmedabad
Work from Office
We are seeking an experienced and driven Data Engineer with 5+ years of hands-on experience in building scalable data infrastructure and systems. You will play a key role in designing and developing robust, high-performance ETL pipelines and managing large-scale datasets to support critical business functions. This role requires deep technical expertise, strong problem-solving skills, and the ability to thrive in a fast-paced, evolving environment. Key Responsibilities : - Design, develop, and maintain scalable and reliable ETL/ELT pipelines for processing large volumes of data (terabytes and beyond). - Model and structure data for performance, scalability, and usability. - Work with cloud infrastructure (preferably Azure) to build and optimize data workflows. - Leverage distributed computing frameworks like Apache Spark and Hadoop for large-scale data processing. - Build and manage data lake/lakehouse architectures in alignment with best practices. - Optimize ETL performance and manage cost-effective data operations. - Collaborate closely with cross-functional teams including data science, analytics, and software engineering. - Ensure data quality, integrity, and security across all stages of the data lifecycle. Required Skills & Qualifications : - 7 to 10 years of relevant experience in big data engineering. - Advanced proficiency in Python. - Strong skills in SQL for complex data manipulation and analysis. - Hands-on experience with Apache Spark, Hadoop, or similar distributed systems. - Proven track record of handling large-scale datasets (TBs) in production environments. - Cloud development experience with Azure (preferred), AWS, or GCP. - Solid understanding of data lake and data lakehouse architectures. - Expertise in ETL performance tuning and cost optimization techniques. - Knowledge of data structures, algorithms, and modern software engineering practices.
Soft Skills : - Strong communication skills with the ability to explain complex technical concepts clearly and concisely. - Self-starter who learns quickly and takes ownership. - High attention to detail with a strong sense of data quality and reliability. - Comfortable working in an agile, fast-changing environment with incomplete requirements. Preferred Qualifications : - Experience with tools like Apache Airflow, Azure Data Factory, or similar. - Familiarity with CI/CD and DevOps in the context of data engineering. - Knowledge of data governance, cataloging, and access control principles. Skills : Python, SQL, AWS, Azure, Hadoop
Posted 5 days ago
6.0 - 10.0 years
6 - 10 Lacs
bengaluru
Work from Office
We are seeking an experienced and driven Data Engineer with 5+ years of hands-on experience in building scalable data infrastructure and systems. You will play a key role in designing and developing robust, high-performance ETL pipelines and managing large-scale datasets to support critical business functions. This role requires deep technical expertise, strong problem-solving skills, and the ability to thrive in a fast-paced, evolving environment. Key Responsibilities : - Design, develop, and maintain scalable and reliable ETL/ELT pipelines for processing large volumes of data (terabytes and beyond). - Model and structure data for performance, scalability, and usability. - Work with cloud infrastructure (preferably Azure) to build and optimize data workflows. - Leverage distributed computing frameworks like Apache Spark and Hadoop for large-scale data processing. - Build and manage data lake/lakehouse architectures in alignment with best practices. - Optimize ETL performance and manage cost-effective data operations. - Collaborate closely with cross-functional teams including data science, analytics, and software engineering. - Ensure data quality, integrity, and security across all stages of the data lifecycle. Required Skills & Qualifications : - 7 to 10 years of relevant experience in big data engineering. - Advanced proficiency in Python. - Strong skills in SQL for complex data manipulation and analysis. - Hands-on experience with Apache Spark, Hadoop, or similar distributed systems. - Proven track record of handling large-scale datasets (TBs) in production environments. - Cloud development experience with Azure (preferred), AWS, or GCP. - Solid understanding of data lake and data lakehouse architectures. - Expertise in ETL performance tuning and cost optimization techniques. - Knowledge of data structures, algorithms, and modern software engineering practices.
Soft Skills : - Strong communication skills with the ability to explain complex technical concepts clearly and concisely. - Self-starter who learns quickly and takes ownership. - High attention to detail with a strong sense of data quality and reliability. - Comfortable working in an agile, fast-changing environment with incomplete requirements. Preferred Qualifications : - Experience with tools like Apache Airflow, Azure Data Factory, or similar. - Familiarity with CI/CD and DevOps in the context of data engineering. - Knowledge of data governance, cataloging, and access control principles. Skills : Python, SQL, AWS, Azure, Hadoop
Posted 5 days ago
3.0 - 7.0 years
0 Lacs
noida, uttar pradesh
On-site
As an IAM Engineer (MIM) working on a long-term contract in India with a hybrid working model, you will be part of a prestigious consultancy's multi-year Identity & Access Management (IAM) transformation program. Your primary responsibility will be to apply your strong technical expertise in Microsoft Identity Manager (MIM) to the success of this project. The engagement spans multiple years, with a hybrid arrangement of approximately 2 days per week in the office and flexibility across various locations in India. You will collaborate closely with a highly skilled global team on a significant IAM migration program. Key skills and experience required for this role include proven proficiency in Microsoft Identity Manager (MIM, FIM, ILM), with particularly strong synchronization expertise within MIM. Proficiency in Python is essential, and familiarity with Apache Airflow would be advantageous, though not mandatory. Exposure to RSA IAM solutions would also be beneficial, though not a strict requirement. This position offers a hybrid role at a choice of company locations, with an initial 6-month rolling contract. The working hours are in Indian Standard Time (IST). To express your interest in this opportunity, please submit your CV, clearly highlighting your relevant experience in MIM, Python, and IAM. Join us in this exciting journey of contributing to a major IAM migration program and making a significant impact in the field of Identity & Access Management.
Posted 5 days ago
4.0 - 10.0 years
0 Lacs
pune, maharashtra
On-site
At Solidatus, we are revolutionizing the way organizations comprehend their data. We are an award-winning, venture-backed software company often referred to as the Git for Metadata. Our platform enables businesses to extract, model, and visualize intricate data lineage flows. Through our unique lineage-first approach and active AI development, we offer organizations unparalleled clarity and robust control over their data's journey and significance. As a rapidly growing B2B SaaS business with fewer than 100 employees, your contributions play a pivotal role in shaping our product. Renowned for our innovation and collaborative culture, we invite you to join us as we expand globally and redefine the future of data understanding. We are currently looking for an experienced Data Pipeline Engineer/Data Lineage Engineer to support the development of data lineage solutions for our clients' existing data pipelines. In this role, you will collaborate with cross-functional teams to ensure the integrity, accuracy, and timeliness of the data lineage solution. Your responsibilities will involve working directly with clients to maximize the value derived from our product and assist them in achieving their contractual objectives. **Experience:** - 4-10 years of relevant experience **Qualifications:** - Proven track record as a Data Engineer or in a similar capacity, with hands-on experience in constructing and optimizing data pipelines and infrastructure. - Demonstrated experience working with Big Data and related tools. - Strong problem-solving and analytical skills to diagnose and resolve complex data-related issues. - Profound understanding of data engineering principles and practices. - Exceptional communication and collaboration abilities to work effectively in cross-functional teams and convey technical concepts to non-technical stakeholders. - Adaptability to new technologies, tools, and methodologies within a dynamic environment.
- Proficiency in writing clean, scalable, and robust code using Python or similar programming languages. Background in software engineering is advantageous. **Desirable Languages/Tools:** - Proficiency in programming languages such as Python, Java, Scala, or SQL for data manipulation and scripting. - Experience with XML in transformation pipelines. - Familiarity with major Database technologies like Oracle, Snowflake, and MS SQL Server. - Strong grasp of data modeling concepts including relational and dimensional modeling. - Exposure to big data technologies and frameworks such as Databricks, Spark, Kafka, and MS Notebooks. - Knowledge of modern data architectures like lakehouse. - Experience with CI/CD pipelines and version control systems such as Git. - Understanding of ETL tools like Apache Airflow, Informatica, or SSIS. - Familiarity with data governance and best practices in data management. - Proficiency in cloud platforms and services like AWS, Azure, or GCP for deploying and managing data solutions. - Strong problem-solving and analytical skills for resolving complex data-related issues. - Proficiency in SQL for database management and querying. - Exposure to tools like Open Lineage, Apache Spark Streaming, Kafka, or similar for real-time data streaming. - Experience utilizing data tools in at least one cloud service - AWS, Azure, or GCP. **Key Responsibilities:** - Implement robust data lineage solutions utilizing Solidatus products to support business intelligence, analytics, and data governance initiatives. - Collaborate with stakeholders to comprehend data lineage requirements and translate them into technical and business solutions. - Develop and maintain lineage data models, semantic metadata systems, and data dictionaries. - Ensure data quality, security, and compliance with relevant regulations. - Uphold Solidatus implementation and data lineage modeling best practices at client sites. 
- Stay updated on emerging technologies and industry trends to enhance data lineage architecture practices continually. **Qualifications:** - Bachelor's or Master's degree in Computer Science, Information Systems, or a related field. - Proven experience in data architecture, focusing on large-scale data systems across multiple companies. - Proficiency in data modeling, database design, and data warehousing concepts. - Experience with cloud platforms (e.g., AWS, Azure, GCP) and big data technologies (e.g., Hadoop, Spark). - Strong understanding of data governance, data quality, and data security principles. - Excellent communication and interpersonal skills to thrive in a collaborative environment. **Why Join Solidatus?** - Participate in an innovative company that is shaping the future of data management. - Collaborate with a dynamic and talented team in a supportive work environment. - Opportunities for professional growth and career advancement. - Flexible working arrangements, including hybrid work options. - Competitive compensation and benefits package. If you are passionate about data architecture and eager to make a significant impact, we invite you to apply now and become a part of our team at Solidatus.
Posted 5 days ago
7.0 - 10.0 years
6 - 10 Lacs
gurugram
Work from Office
About the Job : We are seeking a highly skilled and experienced Senior Data Engineer to join our dynamic team. In this pivotal role, you will be instrumental in driving our data engineering initiatives, with a strong emphasis on leveraging Dataiku's capabilities to enhance data processing and analytics. You will be responsible for designing, developing, and optimizing robust data pipelines, ensuring seamless integration of diverse data sources, and maintaining high data quality and accessibility to support our business intelligence and advanced analytics projects. This role requires a unique blend of expertise in traditional data engineering principles, advanced data modeling, and a forward-thinking approach to integrating cutting-edge AI technologies, particularly LLM Mesh for Generative AI applications. If you are passionate about building scalable data solutions and are eager to explore the cutting edge of AI, we encourage you to apply. Key Responsibilities : - Dataiku Leadership : Drive data engineering initiatives with a strong emphasis on leveraging Dataiku capabilities for data preparation, analysis, visualization, and the deployment of data solutions. - Data Pipeline Development : Design, develop, and optimize robust and scalable data pipelines to support various business intelligence and advanced analytics projects. This includes developing and maintaining ETL/ELT processes to automate data extraction, transformation, and loading from diverse sources. - Data Modeling & Architecture : Apply expertise in data modeling techniques to design efficient and scalable database structures, ensuring data integrity and optimal performance. - ETL/ELT Expertise : Implement and manage ETL processes and tools to ensure efficient and reliable data flow, maintaining high data quality and accessibility. - Gen AI Integration : Explore and implement solutions leveraging LLM Mesh for Generative AI applications, contributing to the development of innovative AI-powered features.
- Programming & Scripting : Utilize programming languages such as Python and SQL for data manipulation, analysis, automation, and the development of custom data solutions. - Cloud Platform Deployment : Deploy and manage scalable data solutions on cloud platforms such as AWS or Azure, leveraging their respective services for optimal performance and cost-efficiency. - Data Quality & Governance : Ensure seamless integration of data sources, maintaining high data quality, consistency, and accessibility across all data assets. Implement data governance best practices. - Collaboration & Mentorship : Collaborate closely with data scientists, analysts, and other stakeholders to understand data requirements and deliver impactful solutions. Potentially mentor junior team members. - Performance Optimization : Continuously monitor and optimize the performance of data pipelines and data systems. Required Skills & Experience : - Proficiency in Dataiku : Demonstrable expertise in Dataiku for data preparation, analysis, visualization, and building end-to-end data pipelines and applications. - Expertise in Data Modeling : Strong understanding and practical experience in various data modeling techniques (e.g., dimensional modeling, Kimball, Inmon) to design efficient and scalable database structures. - ETL/ELT Processes & Tools : Extensive experience with ETL/ELT processes and a proven track record of using various ETL tools (e.g., Dataiku's built-in capabilities, Apache Airflow, Talend, SSIS, etc.). - Familiarity with LLM Mesh : Familiarity with LLM Mesh or similar frameworks for Gen AI applications, understanding its concepts and potential for integration. - Programming Languages : Strong proficiency in Python for data manipulation, scripting, and developing data solutions. Solid command of SQL for complex querying, data analysis, and database interactions. 
- Cloud Platforms : Knowledge and hands-on experience with at least one major cloud platform (AWS or Azure) for deploying and managing scalable data solutions (e.g., S3, EC2, Azure Data Lake, Azure Synapse, etc.). - Gen AI Concepts : Basic understanding of Generative AI concepts and their potential applications in data engineering. - Problem-Solving : Excellent analytical and problem-solving skills with a keen eye for detail. - Communication : Strong communication and interpersonal skills to collaborate effectively with cross-functional teams. Bonus Points (Nice to Have) : - Experience with other big data technologies (e.g., Spark, Hadoop, Snowflake). - Familiarity with data governance and data security best practices. - Experience with MLOps principles and tools. - Contributions to open-source projects related to data engineering or AI. Education : Bachelor's or Master's degree in Computer Science, Data Science, Engineering, or a related quantitative field.
Posted 5 days ago
6.0 - 10.0 years
6 - 10 Lacs
gurugram
Work from Office
We are seeking an experienced and driven Data Engineer with 5+ years of hands-on experience in building scalable data infrastructure and systems. You will play a key role in designing and developing robust, high-performance ETL pipelines and managing large-scale datasets to support critical business functions. This role requires deep technical expertise, strong problem-solving skills, and the ability to thrive in a fast-paced, evolving environment. Key Responsibilities : - Design, develop, and maintain scalable and reliable ETL/ELT pipelines for processing large volumes of data (terabytes and beyond). - Model and structure data for performance, scalability, and usability. - Work with cloud infrastructure (preferably Azure) to build and optimize data workflows. - Leverage distributed computing frameworks like Apache Spark and Hadoop for large-scale data processing. - Build and manage data lake/lakehouse architectures in alignment with best practices. - Optimize ETL performance and manage cost-effective data operations. - Collaborate closely with cross-functional teams including data science, analytics, and software engineering. - Ensure data quality, integrity, and security across all stages of the data lifecycle. Required Skills & Qualifications : - 7 to 10 years of relevant experience in bigdata engineering. - Advanced proficiency in Python, - Strong skills in SQL for complex data manipulation and analysis. - Hands-on experience with Apache Spark, Hadoop, or similar distributed systems. - Proven track record of handling large-scale datasets (TBs) in production environments. - Cloud development experience with Azure (preferred), AWS, or GCP. - Solid understanding of data lake and data lakehouse architectures. - Expertise in ETL performance tuning and cost optimization techniques. - Knowledge of data structures, algorithms, and modern software engineering practices. 
Soft Skills: - Strong communication skills with the ability to explain complex technical concepts clearly and concisely. - Self-starter who learns quickly and takes ownership. - High attention to detail with a strong sense of data quality and reliability. - Comfortable working in an agile, fast-changing environment with incomplete requirements. Preferred Qualifications: - Experience with tools like Apache Airflow, Azure Data Factory, or similar. - Familiarity with CI/CD and DevOps in the context of data engineering. - Knowledge of data governance, cataloging, and access control principles. Skills: Python, SQL, AWS, Azure, Hadoop
Posted 5 days ago
4.0 - 5.0 years
3 - 6 Lacs
nagpur
Work from Office
Job Title: Databricks Tech Lead (Contract) Contract Duration: 4 Months (Extendable based on Performance) Job Location: Remote Job Timings: India Evening Shift (till 11:30 PM IST) Experience Required: 7+ Years Job Description: We are looking for an experienced Databricks Tech Lead to join our team on a 4-month extendable contract. The ideal candidate will bring deep expertise in data engineering, big data platforms, and cloud-based data warehouse solutions, with the ability to work in a fast-paced remote environment. Key Responsibilities: - Lead the design, optimization, and management of large-scale data pipelines using Databricks, Spark (PySpark), and AWS data services. - Productionize and deploy Big Data platforms and applications across multi-cloud environments (AWS, Azure, GCP). - Build and manage data warehouse solutions, schema evolution, and data versioning. - Implement and manage workflow orchestration using Airflow or similar tools. - Work with complex business use cases and transform them into scalable data models and architectures. - Collaborate with cross-functional teams to deliver high-quality data engineering and analytics solutions. - Mentor team members and provide technical leadership on Databricks and modern data technologies. Required Skills & Experience: - 7+ years of experience in Data Warehouse, ETL, Data Modeling & Reporting. - 5+ years of experience in productionizing & deploying Big Data platforms. - 3+ years of hands-on Databricks experience. - Strong expertise with: a. SQL, Python, Spark, Airflow b. AWS S3, Redshift, Hive Data Catalog, Delta Lake, Parquet, Avro - Streaming platforms (Spark Streaming, Kafka, Hive) - Proven experience in building Enterprise Data Warehouses and implementing frameworks such as Databricks, Apache Spark, Delta Lake, Tableau, Hive Metastore, Kafka, Kubernetes, Docker, and CI/CD pipelines. - Hands-on experience with Machine Learning frameworks (TensorFlow, Keras, PyTorch) and MLOps practices.
- Strong leadership, problem-solving, and communication skills.
Posted 6 days ago
4.0 - 5.0 years
3 - 6 Lacs
pune
Work from Office
Job Title: Databricks Tech Lead (Contract) Contract Duration: 4 Months (Extendable based on Performance) Job Location: Remote Job Timings: India Evening Shift (till 11:30 PM IST) Experience Required: 7+ Years Job Description: We are looking for an experienced Databricks Tech Lead to join our team on a 4-month extendable contract. The ideal candidate will bring deep expertise in data engineering, big data platforms, and cloud-based data warehouse solutions, with the ability to work in a fast-paced remote environment. Key Responsibilities: - Lead the design, optimization, and management of large-scale data pipelines using Databricks, Spark (PySpark), and AWS data services. - Productionize and deploy Big Data platforms and applications across multi-cloud environments (AWS, Azure, GCP). - Build and manage data warehouse solutions, schema evolution, and data versioning. - Implement and manage workflow orchestration using Airflow or similar tools. - Work with complex business use cases and transform them into scalable data models and architectures. - Collaborate with cross-functional teams to deliver high-quality data engineering and analytics solutions. - Mentor team members and provide technical leadership on Databricks and modern data technologies. Required Skills & Experience: - 7+ years of experience in Data Warehouse, ETL, Data Modeling & Reporting. - 5+ years of experience in productionizing & deploying Big Data platforms. - 3+ years of hands-on Databricks experience. - Strong expertise with: a. SQL, Python, Spark, Airflow b. AWS S3, Redshift, Hive Data Catalog, Delta Lake, Parquet, Avro - Streaming platforms (Spark Streaming, Kafka, Hive) - Proven experience in building Enterprise Data Warehouses and implementing frameworks such as Databricks, Apache Spark, Delta Lake, Tableau, Hive Metastore, Kafka, Kubernetes, Docker, and CI/CD pipelines. - Hands-on experience with Machine Learning frameworks (TensorFlow, Keras, PyTorch) and MLOps practices.
- Strong leadership, problem-solving, and communication skills.
Posted 6 days ago
7.0 - 8.0 years
8 - 12 Lacs
jaipur
Work from Office
Job Title: Databricks Tech Lead (Contract) Contract Duration: 4 Months (Extendable based on Performance) Job Location: Remote Job Timings: India Evening Shift (till 11:30 PM IST) Experience Required: 7+ Years Job Description: We are looking for an experienced Databricks Tech Lead to join our team on a 4-month extendable contract. The ideal candidate will bring deep expertise in data engineering, big data platforms, and cloud-based data warehouse solutions, with the ability to work in a fast-paced remote environment. Key Responsibilities: - Lead the design, optimization, and management of large-scale data pipelines using Databricks, Spark (PySpark), and AWS data services. - Productionize and deploy Big Data platforms and applications across multi-cloud environments (AWS, Azure, GCP). - Build and manage data warehouse solutions, schema evolution, and data versioning. - Implement and manage workflow orchestration using Airflow or similar tools. - Work with complex business use cases and transform them into scalable data models and architectures. - Collaborate with cross-functional teams to deliver high-quality data engineering and analytics solutions. - Mentor team members and provide technical leadership on Databricks and modern data technologies. Required Skills & Experience: - 7+ years of experience in Data Warehouse, ETL, Data Modeling & Reporting. - 5+ years of experience in productionizing & deploying Big Data platforms. - 3+ years of hands-on Databricks experience. - Strong expertise with: a. SQL, Python, Spark, Airflow b. AWS S3, Redshift, Hive Data Catalog, Delta Lake, Parquet, Avro - Streaming platforms (Spark Streaming, Kafka, Hive) - Proven experience in building Enterprise Data Warehouses and implementing frameworks such as Databricks, Apache Spark, Delta Lake, Tableau, Hive Metastore, Kafka, Kubernetes, Docker, and CI/CD pipelines. - Hands-on experience with Machine Learning frameworks (TensorFlow, Keras, PyTorch) and MLOps practices.
- Strong leadership, problem-solving, and communication skills.
Posted 6 days ago
4.0 - 5.0 years
3 - 6 Lacs
mumbai
Work from Office
Job Title: Databricks Tech Lead (Contract) Contract Duration: 4 Months (Extendable based on Performance) Job Location: Remote Job Timings: India Evening Shift (till 11:30 PM IST) Experience Required: 7+ Years Job Description: We are looking for an experienced Databricks Tech Lead to join our team on a 4-month extendable contract. The ideal candidate will bring deep expertise in data engineering, big data platforms, and cloud-based data warehouse solutions, with the ability to work in a fast-paced remote environment. Key Responsibilities: - Lead the design, optimization, and management of large-scale data pipelines using Databricks, Spark (PySpark), and AWS data services. - Productionize and deploy Big Data platforms and applications across multi-cloud environments (AWS, Azure, GCP). - Build and manage data warehouse solutions, schema evolution, and data versioning. - Implement and manage workflow orchestration using Airflow or similar tools. - Work with complex business use cases and transform them into scalable data models and architectures. - Collaborate with cross-functional teams to deliver high-quality data engineering and analytics solutions. - Mentor team members and provide technical leadership on Databricks and modern data technologies. Required Skills & Experience: - 7+ years of experience in Data Warehouse, ETL, Data Modeling & Reporting. - 5+ years of experience in productionizing & deploying Big Data platforms. - 3+ years of hands-on Databricks experience. - Strong expertise with: a. SQL, Python, Spark, Airflow b. AWS S3, Redshift, Hive Data Catalog, Delta Lake, Parquet, Avro - Streaming platforms (Spark Streaming, Kafka, Hive) - Proven experience in building Enterprise Data Warehouses and implementing frameworks such as Databricks, Apache Spark, Delta Lake, Tableau, Hive Metastore, Kafka, Kubernetes, Docker, and CI/CD pipelines. - Hands-on experience with Machine Learning frameworks (TensorFlow, Keras, PyTorch) and MLOps practices.
- Strong leadership, problem-solving, and communication skills.
Posted 6 days ago
7.0 - 8.0 years
8 - 12 Lacs
hyderabad
Work from Office
Job Title: Databricks Tech Lead (Contract) Contract Duration: 4 Months (Extendable based on Performance) Job Location: Remote Job Timings: India Evening Shift (till 11:30 PM IST) Experience Required: 7+ Years Job Description: We are looking for an experienced Databricks Tech Lead to join our team on a 4-month extendable contract. The ideal candidate will bring deep expertise in data engineering, big data platforms, and cloud-based data warehouse solutions, with the ability to work in a fast-paced remote environment. Key Responsibilities: - Lead the design, optimization, and management of large-scale data pipelines using Databricks, Spark (PySpark), and AWS data services. - Productionize and deploy Big Data platforms and applications across multi-cloud environments (AWS, Azure, GCP). - Build and manage data warehouse solutions, schema evolution, and data versioning. - Implement and manage workflow orchestration using Airflow or similar tools. - Work with complex business use cases and transform them into scalable data models and architectures. - Collaborate with cross-functional teams to deliver high-quality data engineering and analytics solutions. - Mentor team members and provide technical leadership on Databricks and modern data technologies. Required Skills & Experience: - 7+ years of experience in Data Warehouse, ETL, Data Modeling & Reporting. - 5+ years of experience in productionizing & deploying Big Data platforms. - 3+ years of hands-on Databricks experience. - Strong expertise with: a. SQL, Python, Spark, Airflow b. AWS S3, Redshift, Hive Data Catalog, Delta Lake, Parquet, Avro - Streaming platforms (Spark Streaming, Kafka, Hive) - Proven experience in building Enterprise Data Warehouses and implementing frameworks such as Databricks, Apache Spark, Delta Lake, Tableau, Hive Metastore, Kafka, Kubernetes, Docker, and CI/CD pipelines. - Hands-on experience with Machine Learning frameworks (TensorFlow, Keras, PyTorch) and MLOps practices.
- Strong leadership, problem-solving, and communication skills.
Posted 6 days ago