4.0 - 8.0 years
0 Lacs
Delhi
On-site
The ideal candidate should possess extensive expertise in SQL, data modeling, ETL/ELT pipeline development, and cloud-based data platforms such as Databricks or Snowflake. You will be responsible for designing scalable data models, managing reliable data workflows, and ensuring the integrity and performance of critical financial datasets. Collaboration with engineering, analytics, product, and compliance teams is a key aspect of this role.

Responsibilities:
- Design, implement, and maintain logical and physical data models for transactional, analytical, and reporting systems.
- Develop and oversee scalable ETL/ELT pipelines to process large volumes of financial transaction data.
- Optimize SQL queries, stored procedures, and data transformations for enhanced performance.
- Create and manage data orchestration workflows using tools like Airflow, Dagster, or Luigi.
- Architect data lakes and warehouses utilizing platforms such as Databricks, Snowflake, BigQuery, or Redshift.
- Ensure adherence to data governance, security, and compliance standards (e.g., PCI-DSS, GDPR).
- Work closely with data engineers, analysts, and business stakeholders to understand data requirements and deliver solutions.
- Conduct data profiling, validation, and quality assurance to maintain clean and consistent data.
- Maintain comprehensive documentation for data models, pipelines, and architecture.

Required Skills & Qualifications:
- Proficiency in advanced SQL, including query tuning, indexing, and performance optimization.
- Experience in developing ETL/ELT workflows with tools like Spark, dbt, Talend, or Informatica.
- Familiarity with data orchestration frameworks such as Airflow, Dagster, or Luigi.
- Hands-on experience with cloud-based data platforms like Databricks, Snowflake, or similar technologies.
- Deep understanding of data warehousing principles, including star/snowflake schemas and slowly changing dimensions.
- Knowledge of cloud services (AWS, GCP, or Azure) and data security best practices.
- Strong analytical and problem-solving skills in high-scale environments.

Preferred Qualifications:
- Exposure to real-time data pipelines (e.g., Kafka, Spark Streaming).
- Knowledge of data mesh or data fabric architecture paradigms.
- Certifications in Snowflake, Databricks, or relevant cloud platforms.
- Familiarity with Python or Scala for data engineering tasks.
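To make the orchestration requirement concrete, here is a minimal sketch of a daily ELT DAG in Apache Airflow 2.x. The DAG id, task bodies, and schedule are hypothetical placeholders, not this employer's actual pipeline.

```python
# A minimal, hypothetical Airflow 2.x DAG sketching a daily ELT flow:
# extract raw transactions, transform them, then load to the warehouse.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():      # placeholder: pull from source systems / APIs
    print("extracting raw transactions")

def transform():    # placeholder: clean, dedupe, conform to the model
    print("transforming records")

def load():         # placeholder: write to warehouse tables
    print("loading to warehouse")

with DAG(
    dag_id="daily_transactions_elt",   # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    t1 = PythonOperator(task_id="extract", python_callable=extract)
    t2 = PythonOperator(task_id="transform", python_callable=transform)
    t3 = PythonOperator(task_id="load", python_callable=load)
    t1 >> t2 >> t3   # run strictly in sequence
```

The `>>` operator encodes task dependencies, so the transform step never runs against a partial extract.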
Posted 1 day ago
2.0 - 6.0 years
0 Lacs
Haryana
On-site
As a Data Engineer at our company, you will be responsible for building and maintaining scalable data pipelines and ETL processes using Python and related technologies. Your primary focus will be on developing efficient data pipelines that handle large volumes of data and optimize processing times. You will also collaborate closely with our team of data scientists and engineers at Matrix Space.

To qualify for this role, you should have 2-5 years of experience in data engineering or a related field, with strong proficiency in Python programming. You must be well-versed in libraries such as Pandas, NumPy, and SQLAlchemy, and have hands-on experience with data engineering tools like Apache Airflow, Luigi, or similar frameworks. A working knowledge of SQL and experience with relational databases such as PostgreSQL or MySQL is also required.

In addition to technical skills, we are looking for candidates with strong problem-solving abilities who can work both independently and as part of a team. Effective communication skills are essential, as you will be required to explain technical concepts to non-technical stakeholders. The ability to complete tasks efficiently and effectively is a key trait we value in potential candidates.

If you are an immediate joiner and can start within a week, we encourage you to apply for this opportunity. Join our team and be a part of our exciting projects in data engineering.
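As an illustration of the Pandas/SQLAlchemy pipeline work this posting describes, here is a minimal batch step; the connection string and table names are assumed placeholders, not the employer's actual schema.

```python
# Minimal pandas + SQLAlchemy batch step: read, clean, and reload a table.
# The connection string and table names are hypothetical placeholders.
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("postgresql://user:pass@localhost:5432/appdb")

# Extract: pull raw events added since the last run (placeholder query).
df = pd.read_sql("SELECT * FROM raw_events WHERE processed = false", engine)

# Transform: drop exact duplicates and normalise a timestamp column.
df = df.drop_duplicates()
df["event_ts"] = pd.to_datetime(df["event_ts"], utc=True)

# Load: append the cleaned rows into the curated table.
df.to_sql("clean_events", engine, if_exists="append", index=False)
```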
Posted 1 day ago
0.0 - 4.0 years
0 Lacs
Karnataka
On-site
We are looking for a Data Engineer to join our data team. You will be responsible for managing our master data set, developing reports, and troubleshooting data issues. To excel in this role, attention to detail, experience as a data analyst, and a deep understanding of popular data analysis tools and databases are essential.

Your responsibilities include:
- Building, maintaining, and managing data pipelines for efficient data flow between systems.
- Collaborating with stakeholders to design and manage customized data pipelines.
- Testing various ETL (Extract, Transform, Load) tools for data ingestion and processing.
- Assisting in scaling the data infrastructure to meet the organization's growing data demands.
- Monitoring data pipeline performance and troubleshooting data issues.
- Documenting pipeline architectures and workflows for future reference and scaling.
- Evaluating data formats, sources, and transformation techniques.
- Working closely with data scientists to ensure data availability and reliability for analytics.

We require the following skill sets/experience:
- Proficiency in Python, PySpark, and Big Data concepts such as Data Lakes and Data Warehouses.
- Strong background in SQL.
- Familiarity with cloud computing platforms like AWS, Azure, or Google Cloud.
- Basic knowledge of containerization technologies like Docker.
- Exposure to data orchestration tools like Apache Airflow or Luigi.

Pedigree:
- Bachelor's degree in Computer Science, Electrical Engineering, or IT.
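For context on the Python/PySpark and data-lake skills listed above, here is a hedged sketch of a batch job landing curated Parquet in a lake; the paths and column names are invented for illustration.

```python
# Sketch of a PySpark batch job moving raw files into a data-lake layer.
# Paths and column names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders_ingest").getOrCreate()

# Ingest raw CSV drops into a DataFrame.
raw = spark.read.option("header", True).csv("s3a://lake/raw/orders/")

# Basic cleansing: typed columns, filter out malformed rows.
clean = (
    raw.withColumn("amount", F.col("amount").cast("double"))
       .withColumn("order_date", F.to_date("order_date"))
       .filter(F.col("amount").isNotNull())
)

# Write partitioned Parquet to the curated zone of the lake.
clean.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3a://lake/curated/orders/"
)
```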
Posted 3 days ago
2.0 - 6.0 years
0 Lacs
Haryana
On-site
As a Data Engineer at our company, your primary responsibility will be the development and maintenance of scalable and efficient data pipelines and ETL processes using Python and related technologies. You will play a crucial role in optimizing the performance of these pipelines and queries to handle large volumes of data and improve processing times. Collaboration is key in this role, as you will work closely with our team of data scientists and engineers at Matrix Space.

To excel in this position, you should have 2-5 years of experience in data engineering or a related field with a strong focus on Python. Proficiency in Python programming is a must, including knowledge of libraries such as Pandas, NumPy, and SQLAlchemy. Additionally, hands-on experience with data engineering tools and frameworks like Apache Airflow, Luigi, or similar is highly desirable. A solid grasp of SQL and experience with relational databases such as PostgreSQL and MySQL will be beneficial.

In addition to technical skills, we value certain soft skills in our team members: problem-solving abilities, the capacity to work both independently and collaboratively, and effective communication skills. You should be able to articulate technical concepts to non-technical stakeholders and demonstrate a proven track record of completing tasks efficiently.

If you are an immediate joiner and can commence within a week, we encourage you to apply for this position. Join our team and be part of an exciting journey in data engineering where your skills and expertise will be valued and put to good use.
Posted 4 days ago
3.0 - 7.0 years
0 Lacs
Kolkata, West Bengal
On-site
You are a Data Engineer with 3+ years of experience, proficient in SQL and Python development. You will be responsible for designing, developing, and maintaining scalable data pipelines to support ETL processes using tools like Apache Airflow, AWS Glue, or similar. Your role involves optimizing and managing relational and NoSQL databases such as MySQL, PostgreSQL, MongoDB, or Cassandra for high performance and scalability.

You will write advanced SQL queries, stored procedures, and functions to efficiently extract, transform, and analyze large datasets. Additionally, you will implement and manage data solutions on cloud platforms like AWS, Azure, or Google Cloud, utilizing services such as Redshift, BigQuery, or Snowflake. Your contributions to designing and maintaining data warehouses and data lakes will support analytics and BI requirements. Automation of data processing tasks through script and application development in Python or other programming languages is also part of your responsibilities.

As a Data Engineer, you will implement data quality checks, monitoring, and governance policies to ensure data accuracy, consistency, and security. Collaboration with data scientists, analysts, and business stakeholders to understand data needs and translate them into technical solutions is essential. Identifying and resolving performance bottlenecks in data systems and optimizing data storage and retrieval are key aspects of the role. Maintaining comprehensive documentation for data processes, pipelines, and infrastructure is crucial, as is staying up to date with the latest trends in data engineering, big data technologies, and cloud services.

You should hold a Bachelor's or Master's degree in Computer Science, Information Technology, Data Engineering, or a related field. Proficiency in SQL, relational databases, NoSQL databases, and Python programming, along with experience with data pipeline tools and cloud platforms, is required. Knowledge of big data tools like Apache Spark, Hadoop, or Kafka is a plus. Strong analytical and problem-solving skills with a focus on performance optimization and scalability are essential, as are excellent verbal and written communication skills to convey technical concepts to non-technical stakeholders. You should be able to work collaboratively in cross-functional teams. Preferred certifications include AWS Certified Data Analytics, Google Professional Data Engineer, or similar. An eagerness to learn new technologies and adapt quickly in a fast-paced environment will be valuable in this role.
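As a sketch of the data quality checks mentioned above, assuming a hypothetical orders table in PostgreSQL and SQLAlchemy as the client library:

```python
# Simple data-quality checks: each query should return 0 rows on success.
# Table and column names are hypothetical.
from sqlalchemy import create_engine, text

engine = create_engine("postgresql://user:pass@localhost:5432/warehouse")

CHECKS = {
    "no_null_customer_ids": "SELECT COUNT(*) FROM orders WHERE customer_id IS NULL",
    "no_negative_amounts":  "SELECT COUNT(*) FROM orders WHERE amount < 0",
    "no_duplicate_orders": """
        SELECT COUNT(*) FROM (
            SELECT order_id FROM orders GROUP BY order_id HAVING COUNT(*) > 1
        ) d
    """,
}

with engine.connect() as conn:
    for name, sql in CHECKS.items():
        failures = conn.execute(text(sql)).scalar()
        status = "OK" if failures == 0 else f"FAILED ({failures} rows)"
        print(f"{name}: {status}")
```

In practice, checks like these are typically wired into the pipeline so a failure blocks the downstream load rather than just printing.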
Posted 1 week ago
8.0 - 12.0 years
0 Lacs
Karnataka
On-site
If working with data on a day-to-day basis excites you, and you are interested in building robust data architecture to identify data patterns and optimize data consumption for customers who will forecast and predict actions based on data, then working in our intelligent automation team at Schneider AI Hub is the perfect fit for you.

As a Lead Data Engineer at Schneider AI Hub, you will play a crucial role in the AI transformation of Schneider Electric by developing AI-powered solutions. Your responsibilities will include expanding and optimizing data and data pipeline architecture, ensuring optimal data flow and collection for cross-functional teams, and supporting software engineers, data analysts, and data scientists on data initiatives. You will be responsible for creating and maintaining optimal data pipeline architecture, designing the right schemas to support functional requirements, and building production data pipelines from ingestion to consumption. Additionally, you will create preprocessing and postprocessing for various forms of data, develop data visualization and business intelligence tools, and implement internal process improvements to automate manual data processes.

To qualify for this role, you should hold a bachelor's or master's degree in computer science, information technology, or another quantitative field and have a minimum of 8 years of experience as a data engineer supporting large data transformation initiatives related to machine learning. Strong analytical skills, experience with Azure cloud services, ETLs using Spark, and proficiency in scripting languages like Python and PySpark are essential requirements for this position.

As a team player committed to the success of the team and its projects, you will collaborate with various stakeholders to ensure the data delivery architecture is consistent and secure across multiple data centers. Join us at Schneider Electric, where we create connected technologies that reshape industries, transform cities, and enrich lives, with a diverse and inclusive culture that values the contribution of every individual. If you are passionate about success and eager to contribute to cutting-edge projects, we invite you to be part of our dynamic team at Schneider Electric in Bangalore, India.
Posted 1 week ago
5.0 - 9.0 years
0 Lacs
Pune, Maharashtra
On-site
As a DataOps Engineer, you will be responsible for designing and maintaining scalable ML model deployment infrastructure using Kubernetes and Docker. Your role will involve implementing CI/CD pipelines for ML workflows, ensuring security best practices are followed, and setting up monitoring tools to track system health, model performance, and data pipeline issues. You will collaborate with cross-functional teams to streamline the end-to-end lifecycle of data products and identify performance bottlenecks and data reliability issues in the ML infrastructure.

To excel in this role, you should have strong experience with Kubernetes and Docker for containerization and orchestration, hands-on experience deploying ML models in production environments, and proficiency with orchestration tools like Airflow or Luigi. Familiarity with monitoring tools such as Prometheus, Grafana, or the ELK Stack, along with knowledge of security protocols, CI/CD pipelines, and DevOps practices in a data/ML environment, is essential. Exposure to cloud platforms like AWS, GCP, or Azure is preferred.

Additionally, experience with MLflow, Seldon, or Kubeflow, knowledge of data governance, lineage, and compliance standards, and an understanding of data pipelines and streaming frameworks would be advantageous. Your combined expertise across these tools and practices will be key to your success in this position.
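To illustrate the model-performance monitoring this role covers, here is a minimal sketch using the prometheus_client library; the model call, metric names, and port are placeholders, not a prescribed setup.

```python
# Sketch: exposing model-serving metrics to Prometheus with prometheus_client.
# The predict() body and the port are hypothetical placeholders.
import random
import time

from prometheus_client import Counter, Histogram, start_http_server

PREDICTIONS = Counter("predictions_total", "Total predictions served")
LATENCY = Histogram("prediction_latency_seconds", "Prediction latency")

def predict(features):
    # Placeholder for a real model call (e.g., a loaded sklearn/torch model).
    time.sleep(random.uniform(0.01, 0.05))
    return 1

if __name__ == "__main__":
    start_http_server(8000)  # metrics served at http://localhost:8000/metrics
    while True:
        with LATENCY.time():   # records how long each prediction takes
            predict({"x": 1.0})
        PREDICTIONS.inc()
```

Prometheus can scrape the exposed endpoint on its normal schedule, and Grafana can then chart the resulting series for latency and throughput dashboards.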
Posted 1 week ago
6.0 - 10.0 years
0 Lacs
Kolkata, West Bengal
On-site
You must have knowledge of Azure Data Lake, Azure Functions, Azure Databricks, Azure Data Factory, and PostgreSQL. Working knowledge of Azure DevOps and Git flow would be an added advantage. Alternatively, you should have working knowledge of AWS Kinesis, AWS EMR, AWS Glue, AWS RDS, AWS Athena, and AWS Redshift. Demonstrable expertise in working with time-series data is essential. Experience in delivering data engineering/data science projects in Industry 4.0 is an added advantage, and knowledge of Palantir is required.

You must possess strong problem-solving skills with a focus on sustainable and reusable development. Proficiency in statistical computing languages and libraries such as Python/PySpark, Pandas, NumPy, and seaborn/matplotlib is necessary, and knowledge of Streamlit.io is a plus. Familiarity with Scala, GoLang, Java, and big data tools such as Hadoop, Spark, and Kafka is beneficial. Experience with relational databases like Microsoft SQL Server, MySQL, PostgreSQL, and Oracle, and with NoSQL databases including Hadoop, Cassandra, and MongoDB, is expected. Proficiency in data pipeline and workflow management tools like Azkaban, Luigi, and Airflow is required, as is experience in building and optimizing big data pipelines, architectures, and data sets. You should possess strong analytical skills related to working with unstructured datasets.

You will provide innovative solutions to data engineering problems, document technology choices and integration patterns, and apply best practices for project delivery with clean code, demonstrating innovation and proactiveness in meeting project requirements.

Reporting to: Director, Intelligent Insights and Data Strategy
Travel: Must be willing to be deployed at client locations worldwide for long and short terms, and be flexible for shorter durations within India and abroad.
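Since this posting stresses time-series expertise, here is a brief pandas sketch of routine time-series handling; the data is synthetic and the bucket sizes are arbitrary illustrations.

```python
# Sketch of routine time-series handling with pandas: regularise sensor
# readings into 5-minute buckets and smooth them. Data here is synthetic.
import numpy as np
import pandas as pd

idx = pd.date_range("2024-01-01", periods=500, freq="1min")
readings = pd.Series(np.random.normal(100, 5, len(idx)), index=idx)

# Downsample high-frequency readings to 5-minute means.
five_min = readings.resample("5min").mean()

# Fill short gaps, then compute a one-hour rolling average for trend analysis.
five_min = five_min.interpolate(limit=2)
trend = five_min.rolling(window=12).mean()  # 12 x 5min = 1 hour

print(trend.tail())
```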
Posted 1 week ago
4.0 - 5.0 years
5 - 8 Lacs
Bengaluru, Karnataka, India
On-site
Key Responsibilities:
- Design, develop, and maintain ETL workflows and pipelines using Python.
- Extract data from various sources (databases, APIs, flat files) and perform data transformations to meet business requirements.
- Load processed data into target systems such as data warehouses, data lakes, or databases.
- Optimize ETL processes for performance, scalability, and reliability.
- Collaborate with data architects and analysts to understand data requirements and design solutions.
- Implement data validation and error-handling mechanisms to ensure data quality.
- Automate routine ETL tasks and monitoring using scripting and workflow tools.
- Document ETL processes, data mappings, and technical specifications.
- Troubleshoot and resolve issues in ETL workflows promptly.
- Follow data governance, security policies, and compliance standards.

Required Skills:
- 4 to 5 years of hands-on experience in Python programming for ETL development.
- Strong knowledge of ETL concepts and data integration best practices.
- Experience with ETL frameworks/libraries such as Airflow, Luigi, Apache NiFi, Pandas, or similar.
- Proficiency in SQL and working with relational databases (Oracle, MySQL, SQL Server, etc.).
- Familiarity with data formats like JSON, XML, CSV, and Parquet.
- Experience with cloud platforms and tools such as AWS Glue, Azure Data Factory, or GCP Dataflow is a plus.
- Understanding of data warehousing concepts and architectures (star schema, snowflake schema).
- Experience with version control tools such as Git.
- Knowledge of containerization (Docker) and CI/CD pipelines is desirable.

Preferred Qualifications:
- Experience working with big data technologies such as Hadoop, Spark, or Kafka.
- Familiarity with NoSQL databases (MongoDB, Cassandra).
- Experience with data visualization and reporting tools.
- Certification in Python or data engineering tools.
- Knowledge of Agile methodologies and working in collaborative teams.

Soft Skills:
- Strong analytical and problem-solving skills.
- Excellent communication and collaboration abilities.
- Detail-oriented and committed to delivering high-quality work.
- Ability to manage multiple tasks and meet deadlines.
- Proactive and eager to learn new technologies and tools.
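As one way to realize the validation and error-handling responsibilities above, a hedged sketch; extract_batch and the validation rule are hypothetical stand-ins for real pipeline steps.

```python
# Sketch: a retry wrapper around one ETL step plus a simple row-level check.
# extract_batch is a hypothetical stand-in for a real extraction step.
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("etl")

def with_retries(step, attempts=3, delay=5):
    """Run an ETL step, retrying transient failures with a fixed backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return step()
        except Exception:
            log.exception("step failed (attempt %d/%d)", attempt, attempts)
            if attempt == attempts:
                raise
            time.sleep(delay)

def extract_batch():
    return [{"id": 1, "amount": 10.0}, {"id": 2, "amount": None}]

def validate(rows):
    bad = [r for r in rows if r["amount"] is None]
    if bad:
        log.warning("dropping %d rows failing validation", len(bad))
    return [r for r in rows if r["amount"] is not None]

rows = with_retries(extract_batch)
clean = validate(rows)
log.info("would load %d clean rows", len(clean))  # load step omitted here
```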
Posted 2 weeks ago
2.0 - 6.0 years
0 Lacs
Gwalior, Madhya Pradesh
On-site
As a Data Engineer at Synram Software Services Pvt. Ltd., a subsidiary of FG International GmbH, you will be an integral part of our team dedicated to providing innovative IT solutions in ERP systems, E-commerce platforms, Mobile Applications, and Digital Marketing. We are committed to delivering customized solutions that drive success across various industries.

In this role, you will be responsible for designing, building, and maintaining scalable data pipelines and infrastructure. Working closely with data analysts, data scientists, and software engineers, you will facilitate data-driven decision-making throughout the organization. Your key responsibilities will include developing, testing, and maintaining data architectures; designing and implementing ETL processes; optimizing data systems; collaborating with cross-functional teams to understand data requirements; ensuring data quality, integrity, and security; automating repetitive data tasks; monitoring and troubleshooting production data pipelines; and documenting systems, processes, and best practices.

To excel in this role, you should possess a Bachelor's/Master's degree in Computer Science, Information Technology, or a related field, along with at least 2 years of experience as a Data Engineer or in a similar role. Proficiency in SQL and in Python or Scala is essential, as well as experience with data pipeline tools like Apache Airflow and familiarity with big data tools such as Hadoop and Spark. Hands-on experience with cloud platforms like AWS, GCP, or Azure is preferred, along with knowledge of data warehouse solutions like Snowflake, Redshift, or BigQuery. Preferred qualifications include knowledge of CI/CD for data applications, experience with containerization tools like Docker and Kubernetes, and exposure to data governance and compliance standards.

If you are ready to be part of a data-driven transformation journey, apply now to join our team at Synram Software Services Pvt. Ltd. For inquiries, contact us at career@synram.co or +91-9111381555. Benefits of this full-time, permanent role include a flexible schedule, internet reimbursement, leave encashment, a day shift with fixed hours and weekend availability, a joining bonus, and a performance bonus. The ability to commute/relocate to Gwalior, Madhya Pradesh, is preferred. The application deadline is 20/07/2025, and the expected start date is 12/07/2025. We look forward to welcoming you aboard for a rewarding and challenging career in data engineering.
Posted 2 weeks ago
5.0 - 9.0 years
0 Lacs
Chennai, Tamil Nadu
On-site
As a skilled PySpark Data Engineer, you will be responsible for designing, implementing, and maintaining PySpark-based applications to handle complex data processing tasks, ensure data quality, and integrate with diverse data sources. Your role will involve developing, testing, and optimizing PySpark applications to process, transform, and analyze large-scale datasets from various sources, such as relational databases, NoSQL databases, batch files, and real-time data streams. You will collaborate with data analysts, data scientists, and data architects to understand data processing requirements and deliver high-quality data solutions.

Your key responsibilities will include designing efficient data transformation and aggregation processes, developing error-handling mechanisms for data integrity, optimizing PySpark jobs for performance, and working with distributed datasets in Spark. Additionally, you will design and implement ETL processes to ingest and integrate data from multiple sources, ensuring consistency, accuracy, and performance.

You should have a Bachelor's degree in Computer Science or a related field, along with 5+ years of hands-on experience in big data development. Proficiency in PySpark, Apache Spark, and ETL development tools is essential for this role. To succeed in this position, you should have a strong understanding of data processing principles, techniques, and best practices in a big data environment, excellent analytical and problem-solving skills with the ability to translate business requirements into technical solutions, and strong communication and collaboration skills for working effectively with data analysts, data architects, and other team members.

If you are looking to drive the development of robust data processing and transformation solutions within a fast-paced, data-driven environment, this role is ideal for you.
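To ground the mention of real-time data streams, here is a sketch of a PySpark Structured Streaming read from Kafka; the broker address, topic, and message schema are assumptions, and the job presumes the spark-sql-kafka package is on the classpath.

```python
# Sketch: consuming a Kafka topic with PySpark Structured Streaming.
# Broker, topic, and schema are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StringType, DoubleType

spark = SparkSession.builder.appName("payments_stream").getOrCreate()

schema = (
    StructType()
    .add("payment_id", StringType())
    .add("amount", DoubleType())
)

stream = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "payments")
    .load()
    # Kafka delivers bytes; parse the JSON value into typed columns.
    .select(F.from_json(F.col("value").cast("string"), schema).alias("p"))
    .select("p.*")
)

# For the sketch, just print micro-batches to the console.
query = stream.writeStream.format("console").outputMode("append").start()
query.awaitTermination()
```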
Posted 2 weeks ago
2.0 - 6.0 years
0 Lacs
Pune, Maharashtra
On-site
About Mindstix Software Labs: Mindstix accelerates digital transformation for the world's leading brands. We are a team of passionate innovators specialized in Cloud Engineering, DevOps, Data Science, and Digital Experiences. Our UX studio and modern-stack engineers deliver world-class products for our global customers, including Fortune 500 enterprises and Silicon Valley startups. Our work impacts a diverse set of industries: eCommerce, Luxury Retail, ISV and SaaS, Consumer Tech, and Hospitality. A fast-moving open culture powered by curiosity and craftsmanship. A team committed to bold thinking and innovation at the very intersection of business, technology, and design. That's our DNA.

Roles and Responsibilities: Mindstix is looking for a proficient Data Engineer. You are a collaborative person who takes pleasure in finding solutions to issues that add to the bottom line. You appreciate hands-on technical work and feel a sense of ownership. You require a keen eye for detail, work experience as a data analyst, and in-depth knowledge of widely used databases and technologies for data analysis. Your responsibilities include:
- Building outstanding domain-focused data solutions with internal teams, business analysts, and stakeholders.
- Applying data engineering practices and standards to develop robust and maintainable solutions.
- Being motivated by a fast-paced, service-oriented environment and interacting directly with clients on new features for future product releases.
- Being a natural problem-solver and intellectually curious across a breadth of industries and topics.
- Being acquainted with different aspects of Data Management like Data Strategy, Architecture, Governance, Data Quality, Integrity & Data Integration.
- Being extremely well-versed in designing incremental and full data load techniques.

Qualifications and Skills:
- Bachelor's or Master's degree in Computer Science, Information Technology, or allied streams.
- 2+ years of hands-on experience in the data engineering domain with DWH development.
- Must have experience with end-to-end data warehouse implementation on Azure or GCP.
- Must have SQL and PL/SQL skills, implementing complex queries and stored procedures.
- Solid understanding of DWH concepts such as OLAP, ETL/ELT, RBAC, Data Modelling, Data-Driven Pipelines, Virtual Warehousing, and MPP.
- Expertise in Databricks: Structured Streaming, Lakehouse Architecture, DLT, Data Modeling, Vacuum, Time Travel, Security, Monitoring, Dashboards, DBSQL, and Unit Testing.
- Expertise in Snowflake: Monitoring, RBACs, Virtual Warehousing, Query Performance Tuning, and Time Travel.
- Understanding of Apache Spark, Airflow, Hudi, Iceberg, Nessie, NiFi, Luigi, and Arrow (good to have).
- Strong foundations in computer science, data structures, algorithms, and programming logic.
- Excellent logical reasoning and data interpretation capability.
- Ability to interpret business requirements accurately.
- Exposure to work with multicultural international customers.
- Experience in the Retail/Supply Chain/CPG/eComm/Health industry is a plus.

Who Fits Best?
- You are a data enthusiast and problem solver.
- You are a self-motivated and fast learner with a strong sense of ownership and drive.
- You enjoy working in a fast-paced creative environment.
- You appreciate great design, have a strong sense of aesthetics, and have a keen eye for detail.
- You thrive in a customer-centric environment with the ability to actively listen, empathize, and collaborate with globally distributed teams.
- You are a team player who desires to mentor and inspire others to do their best.
- You love expressing ideas and articulating well with strong written and verbal English communication and presentation skills.
- You are detail-oriented with an appreciation for craftsmanship.

Benefits:
- Flexible working environment.
- Competitive compensation and perks.
- Health insurance coverage.
- Accelerated career paths.
- Rewards and recognition.
- Sponsored certifications.
- Global customers.
- Mentorship by industry leaders.

Location: This position is primarily based at our Pune (India) headquarters, requiring all potential hires to work from this location. A modern workplace is deeply collaborative by nature, while also demanding a touch of flexibility. We embrace deep collaboration at our offices with reasonable flexi-timing and hybrid options for our seasoned team members.

Equal Opportunity Employer.
Posted 3 weeks ago
3.0 - 8.0 years
9 - 19 Lacs
Hyderabad
Work from Office
Advantum Health Pvt. Ltd., a US healthcare MNC, is looking for a Senior AI/ML Engineer. Advantum Health Private Limited is a leading RCM and Medical Coding company, operating since 2013. Our Head Office is located in Hyderabad, with branch operations in Chennai and Noida. We are proud to be a Great Place to Work certified organization and a recipient of the Telangana Best Employer Award. Our office spans 35,000 sq. ft. in Cyber Gateway, Hitech City, Hyderabad.

Job Title: Senior AI/ML Engineer
Location: Hitech City, Hyderabad, India (work from office)
Ph: 9177078628, 7382307530, 9059683624
Address: Advantum Health Private Limited, Cyber Gateway, Block C, 4th floor, Hitech City, Hyderabad.
Map: https://www.google.com/maps/place/Advantum+Health+India/@17.4469674,78.3747158,289m/data=!3m2!1e3!5s0x3bcb93e01f1bbe71:0x694a7f60f2062a1!4m6!3m5!1s0x3bcb930059ea66d1:0x5f2dcd85862cf8be!8m2!3d17.4467126!4d78.3767566!16s%2Fg%2F11whflplxg?entry=ttu&g_ep=EgoyMDI1MDMxNi4wIKXMDSoASAFQAw%3D%3D

Job Summary: We are seeking a highly skilled and motivated Data Engineer to join our growing data team. In this role, you will be responsible for designing, building, and maintaining scalable data pipelines and infrastructure to support analytics, machine learning, and business intelligence initiatives. You will work closely with data analysts, scientists, and engineers to ensure data availability, reliability, and quality across the organization.

Key Responsibilities:
- Design, develop, and maintain robust ETL/ELT pipelines for ingesting and transforming large volumes of structured and unstructured data.
- Build and optimize data infrastructure for scalability, performance, and reliability.
- Collaborate with cross-functional teams to understand data needs and translate them into technical solutions.
- Implement data quality checks, monitoring, and alerting mechanisms.
- Manage and optimize data storage solutions (data warehouses, data lakes, databases).
- Ensure data security, compliance, and governance across all platforms.
- Automate data workflows and optimize data delivery for real-time and batch processing.
- Participate in code reviews and contribute to best practices for data engineering.

Required Skills and Qualifications:
- Bachelor's or Master's degree in Computer Science, Engineering, Information Systems, or a related field.
- 3+ years of experience in data engineering or related roles.
- Strong programming skills in Python, Java, or Scala.
- Proficiency with SQL and working with relational databases (e.g., PostgreSQL, MySQL).
- Experience with data pipeline and workflow orchestration tools (e.g., Airflow, Prefect, Luigi).
- Hands-on experience with cloud platforms (AWS, GCP, or Azure) and cloud data services (e.g., Redshift, BigQuery, Snowflake).
- Familiarity with distributed data processing tools (e.g., Spark, Kafka, Hadoop).
- Solid understanding of data modeling, warehousing concepts, and data governance.

Preferred Qualifications:
- Experience with CI/CD and DevOps practices for data engineering.
- Knowledge of data privacy regulations such as GDPR, HIPAA, etc.
- Experience with version control systems like Git.
- Familiarity with containerization (Docker, Kubernetes).

Follow us on LinkedIn, Facebook, Instagram, YouTube, and Threads for all updates:
Advantum Health LinkedIn Page: https://www.linkedin.com/showcase/advantum-health-india/
Advantum Health Facebook Page: https://www.facebook.com/profile.php?id=61564435551477
Advantum Health Instagram Page: https://www.instagram.com/reel/DCXISlIO2os/?igsh=dHd3czVtc3Fyb2hk
Advantum Health India YouTube link: https://youtube.com/@advantumhealthindia-rcmandcodi?si=265M1T2IF0gF-oF1
Advantum Health Threads link: https://www.threads.net/@advantum.health.india

HR Dept, Advantum Health Pvt Ltd, Cyber Gateway, Block C, Hitech City, Hyderabad. Ph: 9177078628, 7382307530, 9059683624
Posted 4 weeks ago
2.0 - 4.0 years
7 - 9 Lacs
Hyderabad, Chennai, Bengaluru
Hybrid
POSITION: Senior Data Engineer / Data Engineer
LOCATION: Bangalore/Mumbai/Kolkata/Gurugram/Hyderabad/Pune/Chennai
EXPERIENCE: 2+ years
JOB TITLE: Senior Data Engineer / Data Engineer

OVERVIEW OF THE ROLE: As a Data Engineer or Senior Data Engineer, you will be hands-on in architecting, building, and optimizing robust, efficient, and secure data pipelines and platforms that power business-critical analytics and applications. You will play a central role in the implementation and automation of scalable batch and streaming data workflows using modern big data and cloud technologies. Working within cross-functional teams, you will deliver well-engineered, high-quality code and data models, and drive best practices for data reliability, lineage, quality, and security.

Mandatory Skills:
- Hands-on software coding or scripting for a minimum of 3 years.
- Experience in product management for at least 2 years.
- Stakeholder management experience for at least 3 years.
- Experience in at least one of the GCP, AWS, or Azure cloud platforms.

Key Responsibilities:
- Design, build, and optimize scalable data pipelines and ETL/ELT workflows using Spark (Scala/Python), SQL, and orchestration tools (e.g., Apache Airflow, Prefect, Luigi).
- Implement efficient solutions for high-volume, batch, real-time streaming, and event-driven data processing, leveraging best-in-class patterns and frameworks.
- Build and maintain data warehouse and lakehouse architectures (e.g., Snowflake, Databricks, Delta Lake, BigQuery, Redshift) to support analytics, data science, and BI workloads.
- Develop, automate, and monitor Airflow DAGs/jobs on cloud or Kubernetes, following robust deployment and operational practices (CI/CD, containerization, infra-as-code).
- Write performant, production-grade SQL for complex data aggregation, transformation, and analytics tasks.
- Ensure data quality, consistency, and governance across the stack, implementing processes for validation, cleansing, anomaly detection, and reconciliation.
- Collaborate with data scientists, analysts, and DevOps engineers to ingest, structure, and expose structured, semi-structured, and unstructured data for diverse use cases.
- Contribute to data modeling, schema design, and data partitioning strategies, and ensure adherence to best practices for performance and cost optimization.
- Implement, document, and extend data lineage, cataloging, and observability through tools such as AWS Glue, Azure Purview, Amundsen, or open-source technologies.
- Apply and enforce data security, privacy, and compliance requirements (e.g., access control, data masking, retention policies, GDPR/CCPA).
- Take ownership of the end-to-end data pipeline lifecycle: design, development, code reviews, testing, deployment, operational monitoring, and maintenance/troubleshooting.
- Contribute to frameworks, reusable modules, and automation to improve development efficiency and maintainability of the codebase.
- Stay abreast of industry trends and emerging technologies, participating in code reviews, technical discussions, and peer mentoring as needed.

Skills & Experience:
- Proficiency with Spark (Python or Scala), SQL, and data pipeline orchestration (Airflow, Prefect, Luigi, or similar).
- Experience with cloud data ecosystems (AWS, GCP, Azure) and cloud-native services for data processing (Glue, Dataflow, Dataproc, EMR, HDInsight, Synapse, etc.).
- Hands-on development skills in at least one programming language (Python, Scala, or Java preferred); solid knowledge of software engineering best practices (version control, testing, modularity).
- Deep understanding of batch and streaming architectures (Kafka, Kinesis, Pub/Sub, Flink, Structured Streaming, Spark Streaming).
- Expertise in data warehouse/lakehouse solutions (Snowflake, Databricks, Delta Lake, BigQuery, Redshift, Synapse) and storage formats (Parquet, ORC, Delta, Iceberg, Avro).
- Strong SQL development skills for ETL, analytics, and performance optimization.
- Familiarity with Kubernetes (K8s), containerization (Docker), and deploying data pipelines in distributed/cloud-native environments.
- Experience with data quality frameworks (Great Expectations, Deequ, or custom validation), monitoring/observability tools, and automated testing.
- Working knowledge of data modeling (star/snowflake, normalized, denormalized) and metadata/catalog management.
- Understanding of data security, privacy, and regulatory compliance (access management, PII masking, auditing, GDPR/CCPA/HIPAA).
- Familiarity with BI or visualization tools (Power BI, Tableau, Looker, etc.) is an advantage but not core.
- Previous experience with data migrations, modernization, or refactoring legacy ETL processes to modern cloud architectures is a strong plus.
- Bonus: exposure to open-source data tools (dbt, Delta Lake, Apache Iceberg, Amundsen, Great Expectations, etc.) and knowledge of DevOps/MLOps processes.

Professional Attributes:
- Strong analytical and problem-solving skills; attention to detail and commitment to code quality and documentation.
- Ability to communicate technical designs and issues effectively with team members and stakeholders.
- Proven self-starter, fast learner, and collaborative team player who thrives in dynamic, fast-paced environments.
- Passion for mentoring, sharing knowledge, and raising the technical bar for data engineering practices.

Desirable Experience:
- Contributions to open-source data engineering/tools communities.
- Implementing data cataloging, stewardship, and data democratization initiatives.
- Hands-on work with DataOps/DevOps pipelines for code and data.
- Knowledge of ML pipeline integration (feature stores, model serving, lineage/monitoring integration) is beneficial.

EDUCATIONAL QUALIFICATIONS:
- Bachelor's or Master's degree in Computer Science, Data Engineering, Information Systems, or a related field (or equivalent experience).
- Certifications in cloud platforms (AWS, GCP, Azure) and/or data engineering (AWS Data Analytics, GCP Data Engineer, Databricks).
- Experience working in an Agile environment with exposure to CI/CD, Git, Jira, Confluence, and code review processes.
- Prior work in highly regulated or large-scale enterprise data environments (finance, healthcare, or similar) is a plus.
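As one concrete instance of the lakehouse work described above, here is a sketch of an idempotent Delta Lake upsert in PySpark; the table path, join key, and availability of the delta-spark package on the Spark session are assumptions.

```python
# Sketch of an idempotent upsert (merge) into a Delta table with PySpark.
# Assumes delta-spark is configured; the path and key are hypothetical.
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("customer_upsert").getOrCreate()

updates = spark.createDataFrame(
    [(1, "alice@example.com"), (3, "carol@example.com")],
    ["customer_id", "email"],
)

target = DeltaTable.forPath(spark, "s3a://lake/dim_customer")

(
    target.alias("t")
    .merge(updates.alias("s"), "t.customer_id = s.customer_id")
    .whenMatchedUpdateAll()      # refresh existing customers
    .whenNotMatchedInsertAll()   # insert genuinely new ones
    .execute()
)
```

Because the merge keys on customer_id, re-running the same batch leaves the table unchanged, which is the property that makes pipeline retries safe.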
Posted 1 month ago
7.0 - 12.0 years
9 - 14 Lacs
Pune, Hinjewadi
Work from Office
Job Summary: Synechron is seeking an experienced and technically proficient Senior PySpark Data Engineer to join our data engineering team. In this role, you will be responsible for developing, optimizing, and maintaining large-scale data processing solutions using PySpark. Your expertise will support our organization's efforts to leverage big data for actionable insights, enabling data-driven decision-making and strategic initiatives.

Software Requirements:
Required Skills:
- Proficiency in PySpark.
- Familiarity with Hadoop ecosystem components (e.g., HDFS, Hive, Spark SQL).
- Experience with Linux/Unix operating systems.
- Data processing tools like Apache Kafka or similar streaming platforms.
Preferred Skills:
- Experience with cloud-based big data platforms (e.g., AWS EMR, Azure HDInsight).
- Knowledge of Python (beyond PySpark), Java, or Scala relevant to big data applications.
- Familiarity with data orchestration tools (e.g., Apache Airflow, Luigi).

Overall Responsibilities:
- Design, develop, and optimize scalable data processing pipelines using PySpark.
- Collaborate with data engineers, data scientists, and business analysts to understand data requirements and deliver solutions.
- Implement data transformations, aggregations, and extraction processes to support analytics and reporting.
- Manage large datasets in distributed storage systems, ensuring data integrity, security, and performance.
- Troubleshoot and resolve performance issues within big data workflows.
- Document data processes, architectures, and best practices to promote consistency and knowledge sharing.
- Support data migration and integration efforts across varied platforms.

Strategic Objectives:
- Enable efficient and reliable data processing to meet organizational analytics and reporting needs.
- Maintain high standards of data security, compliance, and operational durability.
- Drive continuous improvement in data workflows and infrastructure.

Performance Outcomes & Expectations:
- Efficient processing of large-scale data workloads with minimal downtime.
- Clear, maintainable, and well-documented code.
- Active participation in team reviews, knowledge transfer, and innovation initiatives.

Technical Skills (by Category):
- Programming Languages: PySpark (essential) and Python (for scripting and automation) required; Java and Scala preferred.
- Databases/Data Management: experience with distributed data storage (HDFS, S3, or similar) and data warehousing solutions (Hive, Snowflake) required; experience with NoSQL databases (Cassandra, HBase) preferred.
- Cloud Technologies: familiarity with deploying and managing big data solutions on cloud platforms such as AWS (EMR), Azure, or GCP required; cloud certifications preferred.
- Frameworks and Libraries: Spark SQL and Spark MLlib (basic familiarity) required; integration with streaming platforms (e.g., Kafka) and data validation tools preferred.
- Development Tools and Methodologies: version control systems (e.g., Git) and Agile/Scrum methodologies required; CI/CD pipelines and containerization (Docker, Kubernetes) preferred.
- Security Protocols (optional): basic understanding of data security practices and compliance standards relevant to big data management.

Experience Requirements:
- Minimum of 7+ years of experience in big data environments with hands-on PySpark development.
- Proven ability to design and implement large-scale data pipelines.
- Experience working with cloud and on-premises big data architectures.
- Preference for candidates with domain-specific experience in finance, banking, or related sectors.
- Candidates with substantial related experience and strong technical skills in big data, even from different domains, are encouraged to apply.

Day-to-Day Activities:
- Develop, test, and deploy PySpark data processing jobs to meet project specifications.
- Collaborate in multi-disciplinary teams during sprint planning, stand-ups, and code reviews.
- Optimize existing data pipelines for performance and scalability.
- Monitor data workflows, troubleshoot issues, and implement fixes.
- Engage with stakeholders to gather new data requirements, ensuring solutions are aligned with business needs.
- Contribute to documentation, standards, and best practices for data engineering processes.
- Support the onboarding of new data sources, including integration and validation.

Decision-Making Authority & Responsibilities:
- Identify performance bottlenecks and propose effective solutions.
- Decide on appropriate data processing approaches based on project requirements.
- Escalate issues that impact project timelines or data integrity.

Qualifications:
- Bachelor's degree in Computer Science, Information Technology, or a related field; equivalent experience considered.
- Relevant certifications are preferred: Cloudera, Databricks, AWS Certified Data Analytics, or similar.
- Commitment to ongoing professional development in data engineering and big data technologies.
- Demonstrated ability to adapt to evolving data tools and frameworks.

Professional Competencies:
- Strong analytical and problem-solving skills, with the ability to model complex data workflows.
- Excellent communication skills to articulate technical solutions to non-technical stakeholders.
- Effective teamwork and collaboration in a multidisciplinary environment.
- Adaptability to new technologies and emerging trends in big data.
- Ability to prioritize tasks effectively and manage time in fast-paced projects.
- An innovation mindset, actively seeking ways to improve data infrastructure and processes.
Posted 1 month ago
6.0 - 8.0 years
10 - 15 Lacs
Hyderabad
Hybrid
Mega Walk-in Drive for Lead Software Engineer / Senior Software Engineer - Data Engineer (Python & Hadoop)

Your future duties and responsibilities:

Job Overview: CGI is looking for a talented and motivated Data Engineer with strong expertise in Python, Apache Spark, HDFS, and MongoDB to build and manage scalable, efficient, and reliable data pipelines and infrastructure. You'll play a key role in transforming raw data into actionable insights, working closely with data scientists, analysts, and business teams.

Key Responsibilities:
- Design, develop, and maintain scalable data pipelines using Python and Spark.
- Ingest, process, and transform large datasets from various sources into usable formats.
- Manage and optimize data storage using HDFS and MongoDB.
- Ensure high availability and performance of data infrastructure.
- Implement data quality checks, validations, and monitoring processes.
- Collaborate with cross-functional teams to understand data needs and deliver solutions.
- Write reusable and maintainable code with strong documentation practices.
- Optimize performance of data workflows and troubleshoot bottlenecks.
- Maintain data governance, privacy, and security best practices.

Required qualifications to be successful in this role:
- Minimum 6 years of experience as a Data Engineer or in a similar role.
- Strong proficiency in Python for data manipulation and pipeline development.
- Hands-on experience with Apache Spark for large-scale data processing.
- Experience with HDFS and distributed data storage systems.
- Strong understanding of data architecture, data modeling, and performance tuning.
- Familiarity with version control tools like Git.
- Experience with workflow orchestration tools (e.g., Airflow, Luigi) is a plus.
- Knowledge of cloud services (AWS, GCP, or Azure) is preferred.
- Bachelor's or Master's degree in Computer Science, Information Systems, or a related field.

Preferred Skills:
- Experience with containerization (Docker, Kubernetes).
- Knowledge of real-time data streaming tools like Kafka.
- Familiarity with data visualization tools (e.g., Power BI, Tableau).
- Exposure to Agile/Scrum methodologies.

Skills: Hadoop, Hive, Python, SQL, English

Notice Period: 0-45 days

Prerequisites: Aadhaar card copy, PAN card copy, UAN

Disclaimer: The selected candidates will initially be required to work from the office for 8 weeks before transitioning to a hybrid model with 2 days of work from the office each week.
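To illustrate the MongoDB storage responsibility above, a small pymongo sketch; the URI, database, and collection names are hypothetical placeholders.

```python
# Sketch: landing a processed batch into MongoDB with pymongo.
# URI, database, and collection names are hypothetical.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
collection = client["analytics"]["daily_user_stats"]

batch = [
    {"user_id": 1, "day": "2024-01-01", "events": 42},
    {"user_id": 2, "day": "2024-01-01", "events": 17},
]

# insert_many is a single round trip for the whole batch; ordered=False
# lets the remaining documents proceed if one insert fails.
result = collection.insert_many(batch, ordered=False)
print(f"inserted {len(result.inserted_ids)} documents")
```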
Posted 2 months ago