Sr. Data Engineer Azure Databricks

Pune

10 - 15 years

INR 6.0 - 10.0 Lacs P.A.

Work from Office

Full Time

About Fusemachines Fusemachines is a 10+ year old AI company, dedicated to delivering state-of-the-art AI products and solutions to a diverse range of industries. Founded by Sameer Maskey, Ph.D., an Adjunct Associate Professor at Columbia University, our company is on a steadfast mission to democratize AI and harness the power of global AI talent from underserved communities. With a robust presence in four countries and a dedicated team of over 400 full-time employees, we are committed to fostering AI transformation journeys for businesses worldwide. At Fusemachines, we not only bridge the gap between AI advancement and its global impact but also strive to deliver the most advanced technology solutions to the world. About the role This is a remote, contract position responsible for designing, building, and maintaining the infrastructure required for data integration, storage, processing, and analytics (BI, visualization and Advanced Analytics). We are looking for a skilled Senior Data Engineer with a strong background in Python, SQL, PySpark, Azure, Databricks, Synapse, Azure Data Lake, DevOps and cloud-based large scale data applications with a passion for data quality, performance and cost optimization. The ideal candidate will develop in an Agile environment, contributing to the architecture, design, and implementation of Data products in the Aviation Industry, including migration from Synapse to Azure Data Lake. This role involves hands-on coding, mentoring junior staff and collaboration with multi-disciplined teams to achieve project objectives. Qualification & Experience Must have a full-time Bachelors degree in Computer Science or similar At least 5 years of experience as a data engineer with strong expertise in Databricks, Azure, DevOps, or other hyperscalers. 5+ years of experience with Azure DevOps, GitHub. Proven experience delivering large scale projects and products for Data and Analytics, as a data engineer, including migrations. Following certifications: Databricks Certified Associate Developer for Apache Spark Databricks Certified Data Engineer Associate Microsoft Certified: Azure Fundamentals Microsoft Certified: Azure Data Engineer Associate Microsoft Exam: Designing and Implementing Microsoft DevOps Solutions (nice to have) Required skills/Competencies Strong programming Skills in one or more languages such as Python (must have), Scala, and proficiency in writing efficient and optimized code for data integration, migration, storage, processing and manipulation. Strong understanding and experience with SQL and writing advanced SQL queries. Thorough understanding of big data principles, techniques, and best practices. Strong experience with scalable and distributed Data Processing Technologies such as Spark/PySpark (must have: experience with Azure Databricks), DBT and Kafka, to be able to handle large volumes of data. Solid Databricks development experience with significant Python, PySpark, Spark SQL, Pandas, NumPy in Azure environment. Strong experience in designing and implementing efficient ELT/ETL processes in Azure and Databricks and using open source solutions being able to develop custom integration solutions as needed. Skilled in Data Integration from different sources such as APIs, databases, flat files, event streaming. Expertise in data cleansing, transformation, and validation. Proficiency with Relational Databases (Oracle, SQL Server, MySQL, Postgres, or similar) and NonSQL Databases (MongoDB or Table). Good understanding of Data Modeling and Database Design Principles. Being able to design and implement efficient database schemas that meet the requirements of the data architecture to support data solutions. Strong experience in designing and implementing Data Warehousing, data lake and data lake house, solutions in Azure and Databricks. Good experience with Delta Lake, Unity Catalog, Delta Sharing, Delta Live Tables (DLT). Strong understanding of the software development lifecycle (SDLC), especially Agile methodologies. Strong knowledge of SDLC tools and technologies Azure DevOps and GitHub, including project management software (Jira, Azure Boards or similar), source code management (GitHub, Azure Repos or similar), CI/CD system (GitHub actions, Azure Pipelines, Jenkins or similar) and binary repository manager (Azure Artifacts or similar). Strong understanding of DevOps principles, including continuous integration, continuous delivery (CI/CD), infrastructure as code (IaC Terraform, ARM including hands-on experience), configuration management, automated testing, performance tuning and cost management and optimization. Strong knowledge in cloud computing specifically in Microsoft Azure services related to data and analytics, such as Azure Data Factory, Azure Databricks, Azure Synapse Analytics, Azure Data Lake, Azure Stream Analytics, SQL Server, Azure Blob Storage, Azure Data Lake Storage, Azure SQL Database, etc. Experience in Orchestration using technologies like Databricks workflows and Apache Airflow. Strong knowledge of data structures and algorithms and good software engineering practices. Proven experience migrating from Azure Synapse to Azure Data Lake, or other technologies. Strong analytical skills to identify and address technical issues, performance bottlenecks, and system failures. Proficiency in debugging and troubleshooting issues in complex data and analytics environments and pipelines. Good understanding of Data Quality and Governance, including implementation of data quality checks and monitoring processes to ensure that data is accurate, complete, and consistent. Experience with BI solutions including PowerBI is a plus. Strong written and verbal communication skills to collaborate and articulate complex situations concisely with cross-functional teams, including business users, data architects, DevOps engineers, data analysts, data scientists, developers, and operations teams. Ability to document processes, procedures, and deployment configurations. Understanding of security practices, including network security groups, Azure Active Directory, encryption, and compliance standards. Ability to implement security controls and best practices within data and analytics solutions, including proficient knowledge and working experience on various cloud security vulnerabilities and ways to mitigate them. Self-motivated with the ability to work well in a team, and experienced in mentoring and coaching different members of the team. A willingness to stay updated with the latest services, Data Engineering trends, and best practices in the field. Comfortable with picking up new technologies independently and working in a rapidly changing environment with ambiguous requirements. Care about architecture, observability, testing, and building reliable infrastructure and data pipelines. Responsibilities Architect, design, develop, test and maintain high-performance, large-scale, complex data architectures, which support data integration (batch and real-time, ETL and ELT patterns from heterogeneous data systems: APIs and platforms), storage (data lakes, warehouses, data lake houses, etc), processing, orchestration and infrastructure. Ensuring the scalability, reliability, and performance of data systems, focusing on Databricks and Azure. Contribute to detailed design, architectural discussions, and customer requirements sessions. Actively participate in the design, development, and testing of big data products.. Construct and fine-tune Apache Spark jobs and clusters within the Databricks platform. Migrate out of Azure Synapse to Azure Data Lake or other technologies. Assess best practices and design schemas that match business needs for delivering a modern analytics solution (descriptive, diagnostic, predictive, prescriptive). Design and implement data models and schemas that support efficient data processing and analytics. Design and develop clear, maintainable code with automated testing using Pytest, unittest, integration tests, performance tests, regression tests, etc. Collaborating with cross-functional teams and Product, Engineering, Data Scientists and Analysts to understand data requirements and develop data solutions, including reusable components meeting product deliverables. Evaluating and implementing new technologies and tools to improve data integration, data processing, storage and analysis. Evaluate, design, implement and maintain data governance solutions: cataloging, lineage, data quality and data governance frameworks that are suitable for a modern analytics solution, considering industry-standard best practices and patterns. Continuously monitor and fine-tune workloads and clusters to achieve optimal performance. Provide guidance and mentorship to junior team members, sharing knowledge and best practices. Maintain clear and comprehensive documentation of the solutions, configurations, and best practices implemented. Promote and enforce best practices in data engineering, data governance, and data quality. Ensure data quality and accuracy. Design, Implement and maintain data security and privacy measures. Be an active member of an Agile team, participating in all ceremonies and continuous improvement activities, being able to work independently as well as collaboratively.

Sr. Software Engineer

Pune

3 - 6 years

INR 5.0 - 9.0 Lacs P.A.

Work from Office

Full Time

About Fusemachines Fusemachines is a 10+ year old AI company, dedicated to delivering state-of-the-art AI products and solutions to a diverse range of industries. Founded by Sameer Maskey, Ph.D., an Adjunct Associate Professor at Columbia University, our company is on a steadfast mission to democratize AI and harness the power of global AI talent from underserved communities. With a robust presence in four countries and a dedicated team of over 400 full-time employees, we are committed to fostering AI transformation journeys for businesses worldwide. At Fusemachines, we not only bridge the gap between AI advancement and its global impact but also strive to deliver the most advanced technology solutions to the world. About the Role: As a Senior Software Engineer on our team, you will be at the forefront of designing, building, and deploying enterprise-grade applications that harness the power of both traditional machine learning and cutting-edge AI models. You will be instrumental in developing a diverse range of impactful use cases, including demand forecasting services, large-scale AI factories, and innovative LLM-powered voice assistants for drive-through and telephony systems. Collaborating closely with our talented data science team, you will translate sophisticated models into scalable and reliable services that drive significant business value. Key Responsibilities Design and develop robust applications that effectively utilize machine learning models to address specific business and operational challenges, including areas like demand forecasting, large-scale AI factory implementations, and LLM-powered voice assistants. Partner closely with Data Scientists to seamlessly integrate trained models into production environments, transforming research prototypes into maintainable and scalable services. Optimize application performance with a strong focus on minimizing latency, ensuring scalability to handle large volumes, and maintaining high reliability. Tackle complex system design problems, contributing meaningfully to critical architectural decisions. Champion and contribute to team best practices in areas such as coding standards, comprehensive testing strategies, robust observability practices, and efficient CI/CD pipelines. Qualifications: Our ideal candidate will possess 3-6 years of demonstrable experience in building and deploying software applications. You have a minimum of 3 years of hands-on experience developing and deploying applications on cloud platforms (AWS, GCP, or Azure). You possess strong programming skills in Python. While experience with other languages like C, C++, Java, Go, or NodeJS is valued, the primary coding language will be Python. You have practical experience with API technologies, including REST, WebSocket, and gRPC. You have hands-on experience with containerization technologies (Docker) and orchestration frameworks (Kubernetes). You have a proven track record of building and operating large-scale, distributed systems. Preferred Qualifications: Experience in building applications leveraging Large Language Models (LLMs). Familiarity and experience working with both SQL and NoSQL database systems. Experience integrating with message queue systems such as Kafka, Pulsar, or RabbitMQ. Knowledge and practical experience in implementing security measures, encryption techniques, and authentication methods. Familiarity with ML engineering pipelines, model serving infrastructure, and feature stores

Middleware Engineer

Pune, Maharashtra, India

10 years

Not disclosed

On-site

Contractual

About Fusemachines Fusemachines is a 10+ year old AI company, dedicated to delivering state-of-the-art AI products and solutions to a diverse range of industries. Founded by Sameer Maskey, Ph.D., an Adjunct Associate Professor at Columbia University, our company is on a steadfast mission to democratize AI and harness the power of global AI talent from underserved communities. With a robust presence in four countries and a dedicated team of over 400 full-time employees, we are committed to fostering AI transformation journeys for businesses worldwide. At Fusemachines, we not only bridge the gap between AI advancement and its global impact but also strive to deliver the most advanced technology solutions to the world. This is a 1-year contractual role. About the role: We are seeking a Senior Middleware & Data Integration Engineer to design and build secure, scalable, and real-time data pipelines and middleware services across hybrid cloud environments. The ideal candidate will have strong proficiency in Java (Spring Boot), experience with data lakes (Azure, AWS), messaging systems (Kafka, Azure Service Bus, JMS), and an understanding of real-time and batch-based processing. You will work across full-stack components and integrate structured/unstructured data from upstream systems into platforms like Snowflake, while ensuring compliance and performance at scale. Roles and Responsibilities: Design, build, and deploy middleware services using Java Spring Boot with integrations across REST APIs, data lakes, and messaging systems Develop and manage real-time and batch data pipelines that extract, enrich, and transform data from upstream sources into systems like Snowflake Build resilient integrations using Kafka, Azure Service Bus, JMS, including handling retries, dead-letter queues, and throttling strategies Leverage data spine architecture for metadata exchange, data standardization, and integration logic across systems Integrate RESTful services (e.g., Spring Boot APIs) to facilitate ingestion and distribution of data across the platform Build and optimize workflows for data ingestion, event processing, and API interaction Implement crosswalk and data enrichment logic within data pipelines using technologies like PySpark or Java Streams Collaborate with architects and DevOps teams to ensure CI/CD readiness, monitoring, and alerting of data flows Install, configure and maintain middleware technologies (experience with any of these: Websphere, Weblogic, Tomcat, JBoss, Kafka, RabbitMQ or similar) Ensure high availability, scalability and reliability of middleware systems Design and implement solutions for system and application integration. Optimize middleware performance and recommend improvements Design and development of middleware components Design and implement API necessary for the integration and or data consumption Work independently and collaboratively on a multi-disciplined project team in an Agile development environment Be actively involved in the design, development and testing activities for Big data products Provide feedback to development teams on code/architecture optimization Design and implement secure data processing pipelines, including concepts like data spines, for handling sensitive information Architect and differentiate between event-driven and batch-based data pipelines, making informed decisions on their application Design and implement robust security measures for middleware systems processing PII or customer-sensitive data Design and develop middleware systems to process and enrich messages from multiple upstream sources, integrating with data warehouses like Snowflake Required Skills and Qualifications: Hands-on experience developing Java, Python Hands-on experience with Spring Boot, Spring Boot Oauth, Spring Security, Spring Data JPA, and Spring Batch Familiarity with Azure services Proven expertise in Kafka, JMS, or Azure Service Bus, including designing fault-tolerant, scalable message-driven applications Experience with data enrichment and transformation processes, preferably using PySpark or Java Streams Experience integrating with Snowflake, Redshift, BigQuery, or similar data platforms Deep understanding of event-driven architectures and batch-based workflows, including tradeoffs and ideal use cases Experience working with data enrichment, schema alignment, and crosswalk logic in enterprise-scale pipelines Proven experience with CI/CD. Proven experience with Jenkins, Ansible, Docker, Kubernetes In-depth understanding of event-driven and batch-based data pipeline architectures Experience with application servers like IBM WebSphere, Oracle WebLogic Server, Apache Tomcat, JBoss/WildFly Understanding Relational Databases, such as Oracle, SQL Server, MySQL, Postgres or similar Experience using software project tracking tools such as Jira Proven experience with version control (Github, Bitbucket) Familiarity with Linux OS/concepts Strong knowledge of data security best practices, especially concerning PII and sensitive data Strong written and verbal communication skills Self-motivated and ability to work well in a team Education Bachelor of Science degree from an accredited university Fusemachines is an Equal Opportunities Employer, committed to diversity and inclusion. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or any other characteristic protected by applicable federal, state, or local laws. Powered by JazzHR 7g0JrMG4fp Show more Show less

Sr. Data Analyst

Pune

10 - 15 years

INR 25.0 - 30.0 Lacs P.A.

Work from Office

Full Time

About Fusemachines Fusemachines is a 10+ year old AI company, dedicated to delivering state-of-the-art AI products and solutions to a diverse range of industries. Founded by Sameer Maskey, Ph.D., an Adjunct Associate Professor at Columbia University, our company is on a steadfast mission to democratize AI and harness the power of global AI talent from underserved communities. With a robust presence in four countries and a dedicated team of over 400 full-time employees, we are committed to fostering AI transformation journeys for businesses worldwide. At Fusemachines, we not only bridge the gap between AI advancement and its global impact but also strive to deliver the most advanced technology solutions to the world. This is a Full-time Remote (work from home) contract position. About the role: As a Business Intelligence Engineer, you will play a pivotal role in leveraging data to drive strategic decisions and enhance operational efficiency. You will be responsible for designing, developing, and maintaining Power BI dashboards and reports that provide valuable insights to various stakeholders across the organization. Your work will directly contribute to optimizing business processes, improving customer experiences, and shaping the future of private aviation. Responsibilities: Collaborate with cross-functional teams to understand business requirements and translate them into actionable insights using Fabric, SQL, and Power BI. Develop visually appealing and interactive dashboards and reports to effectively communicate key performance indicators (KPIs), trends, and anomalies. Optimize data models and queries to ensure efficient performance and scalability of Power BI solutions. Implement best practices for data visualization, ensuring clarity, consistency, and usability for end users. Work closely with data engineers to integrate data from various sources and maintain data accuracy and integrity. Provide training and support to end users to maximize adoption and utilization of Power BI tools. Stay updated on industry trends and advancements in data visualization and analytics technologies, recommending improvements and innovations as appropriate. Collaborate with IT teams to ensure compliance with data security and governance policies. An ideal candidate will have: Proven experience as a Power BI Developer or similar role, with a strong portfolio showcasing impactful dashboards and reports. Proficiency in SQL for data extraction, transformation, and manipulation. Solid understanding of data modeling concepts and experience in designing efficient data models. Strong analytical and problem-solving skills, with the ability to translate business requirements into technical solutions. Excellent communication and collaboration skills, with the ability to work effectively in cross-functional teams Experience with other BI tools (e.g., Tableau) is a plus.

HubSpot Integration Engineer

Pune, Maharashtra, India

10 years

Not disclosed

On-site

Contractual

About Fusemachines Fusemachines is a 10+ year old AI company, dedicated to delivering state-of-the-art AI products and solutions to a diverse range of industries. Founded by Sameer Maskey, Ph.D., an Adjunct Associate Professor at Columbia University, our company is on a steadfast mission to democratize AI and harness the power of global AI talent from underserved communities. With a robust presence in four countries and a dedicated team of over 400 full-time employees, we are committed to fostering AI transformation journeys for businesses worldwide. At Fusemachines, we not only bridge the gap between AI advancement and its global impact but also strive to deliver the most advanced technology solutions to the world. About the role: We are seeking a HubSpot Integration Engineer to support the expansion of HubSpot across multiple business units. You will play a critical role in scaling their CRM infrastructure beyond sales to include marketing, customer service, and membership operations. Responsibilities: Integrate HubSpot with key platforms including Google Suite, NetSuite, Piano.io, & SWOOGO Responsible for migrating customer, deal, and activity data from Microsoft Dynamics into HubSpot, ensuring data integrity and continuity across sales and marketing operations Collaborate with internal stakeholders to extend existing HubSpot workflows to new teams Build scalable, sustainable systems for lead management, event tracking, and member engagement Support data migration, user access configuration, and role-specific onboarding Provide technical expertise to optimize HubSpot for marketing, events, and subscription initiatives Qualifications: 3+ years of experience in HubSpot CRM development and integration Strong understanding of APIs, middleware tools, and CRM architecture Experience with event management and subscription platforms is a plus Able to work cross-functionally and translate business needs into technical solutions Experience in a media or membership-based organization Fusemachines is an Equal Opportunities Employer, committed to diversity and inclusion. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or any other characteristic protected by applicable federal, state, or local laws. Powered by JazzHR E2svigGBN2 Show more Show less

HubSpot Project Manager / Business Analyst

Pune, Maharashtra, India

10 years

Not disclosed

On-site

Contractual

About Fusemachines Fusemachines is a 10+ year old AI company, dedicated to delivering state-of-the-art AI products and solutions to a diverse range of industries. Founded by Sameer Maskey, Ph.D., an Adjunct Associate Professor at Columbia University, our company is on a steadfast mission to democratize AI and harness the power of global AI talent from underserved communities. With a robust presence in four countries and a dedicated team of over 400 full-time employees, we are committed to fostering AI transformation journeys for businesses worldwide. At Fusemachines, we not only bridge the gap between AI advancement and its global impact but also strive to deliver the most advanced technology solutions to the world. About The Role Fusemachines is seeking a HubSpot-savvy Project Manager / Business Analyst to partner with our Integration Engineer and business teams as we scale our CRM usage across departments. This role will act as a critical bridge between business needs and technical execution—gathering requirements, setting project scope, defining KPIs, and ensuring successful adoption of HubSpot tools. Responsibilities Work closely with stakeholders across sales, marketing, customer service, and membership to gather and document requirements Translate business needs into clear user stories, technical briefs, and HubSpot configurations Provide HubSpot admin support: create custom properties, workflows, reports, user permissions, and dashboards Lead user training, change management, and onboarding initiatives across teams Define and update success metrics and KPIs; deliver biweekly reports to stakeholders Collaborate with the Integration Engineer to validate feasibility, prioritize features, and manage scope Qualifications: 3+ years of experience working with CRM systems, including at least 2 years of hands-on HubSpot admin experience Strong skills in business analysis, stakeholder communication, and project scoping Proven ability to gather requirements and deliver scalable CRM solutions in a cross-functional environment Experience with reporting and analytics; ability to define and track KPIs Excellent communication and training skills Nice to Have: Familiarity with Microsoft Dynamics and NetSuite Background in media, events, or subscription-based businesses Fusemachines is an Equal Opportunities Employer, committed to diversity and inclusion. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or any other characteristic protected by applicable federal, state, or local laws Powered by JazzHR VXe3i7wXcd Show more Show less

HubSpot Project Manager / Business Analyst

Pune

10 - 15 years

INR 6.0 - 9.0 Lacs P.A.

Work from Office

Full Time

About Fusemachines Fusemachines is a 10+ year old AI company, dedicated to delivering state-of-the-art AI products and solutions to a diverse range of industries. Founded by Sameer Maskey, Ph.D., an Adjunct Associate Professor at Columbia University, our company is on a steadfast mission to democratize AI and harness the power of global AI talent from underserved communities. With a robust presence in four countries and a dedicated team of over 400 full-time employees, we are committed to fostering AI transformation journeys for businesses worldwide. At Fusemachines, we not only bridge the gap between AI advancement and its global impact but also strive to deliver the most advanced technology solutions to the world. About the Role Fusemachines is seeking a HubSpot-savvy Project Manager / Business Analyst to partner with our Integration Engineer and business teams as we scale our CRM usage across departments. This role will act as a critical bridge between business needs and technical execution gathering requirements, setting project scope, defining KPIs, and ensuring successful adoption of HubSpot tools. Responsibilities Work closely with stakeholders across sales, marketing, customer service, and membership to gather and document requirements. Translate business needs into clear user stories, technical briefs, and HubSpot configurations. Provide HubSpot admin support: create custom properties, workflows, reports, user permissions, and dashboards. Lead user training, change management, and onboarding initiatives across teams. Define and update success metrics and KPIs; deliver biweekly reports to stakeholders. Collaborate with the Integration Engineer to validate feasibility, prioritize features, and manage scope. Qualifications: 3+ years of experience working with CRM systems, including at least 2 years of hands-on HubSpot admin experience. Strong skills in business analysis, stakeholder communication, and project scoping. Proven ability to gather requirements and deliver scalable CRM solutions in a cross-functional environment. Experience with reporting and analytics; ability to define and track KPIs. Excellent communication and training skills. Nice to Have: Familiarity with Microsoft Dynamics and NetSuite Background in media, events, or subscription-based businesses Fusemachines is an Equal Opportunities Employer, committed to diversity and inclusion. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or any other characteristic protected by applicable federal, state, or local laws

Sr. Data Engineer AWS

Pune

10 - 15 years

INR 4.0 - 8.0 Lacs P.A.

Work from Office

Full Time

About Fusemachines Fusemachines is a 10+ year old AI company, dedicated to delivering state-of-the-art AI products and solutions to a diverse range of industries. Founded by Sameer Maskey, Ph.D., an Adjunct Associate Professor at Columbia University, our company is on a steadfast mission to democratize AI and harness the power of global AI talent from underserved communities. With a robust presence in four countries and a dedicated team of over 400 full-time employees, we are committed to fostering AI transformation journeys for businesses worldwide. At Fusemachines, we not only bridge the gap between AI advancement and its global impact but also strive to deliver the most advanced technology solutions to the world. About the role This is a remote full-time contractual position , working in the Travel Hospitality Industry , responsible for designing, building, testing, optimizing and maintaining the infrastructure and code required for data integration, storage, processing, pipelines and analytics (BI, visualization and Advanced Analytics) from ingestion to consumption, implementing data flow controls, and ensuring high data quality and accessibility for analytics and business intelligence purposes. This role requires a strong foundation in programming and a keen understanding of how to integrate and manage data effectively across various storage systems and technologies. Were looking for someone who can quickly ramp up, contribute right away and work independently as well as with junior team members with minimal oversight. We are looking for a skilled Sr. Data Engineer with a strong background in Python , SQL , Pyspark , Redshift, and AWS cloud-based large-scale data solutions with a passion for data quality, performance and cost optimization. The ideal candidate will develop in an Agile environment. This role is perfect for an individual passionate about leveraging data to drive insights, improve decision-making, and support the strategic goals of the organization through innovative data engineering solutions. Qualification / Skill Set Requirement: Must have a full-time Bachelors degree in Computer Science, Information Systems, Engineering, or a related field. 5+ years of real-world data engineering development experience in AWS (certifications preferred). Strong expertise in Python, SQL, PySpark and AWS in an Agile environment, with a proven track record of building and optimizing data pipelines, architectures, and datasets, and proven experience in data storage, modelling, management, lake, warehousing, processing/transformation, integration, cleansing, validation and analytics. A senior person who can understand requirements and design end-to-end solutions with minimal oversight. Strong programming Skills in one or more languages such as Python , Scala, and proficient in writing efficient and optimized code for data integration, storage, processing and manipulation. Strong knowledge SDLC tools and technologies, including project management software (Jira or similar), source code management (GitHub or similar), CI/CD system (GitHub actions, AWS CodeBuild or similar) and binary repository manager (AWS CodeArtifact or similar). Good understanding of Data Modelling and Database Design Principles. Being able to design and implement efficient database schemas that meet the requirements of the data architecture to support data solutions. Strong SQL skills and experience working with complex data sets, Enterprise Data Warehouse and writing advanced SQL queries. Proficient with Relational Databases (RDS, MySQL, Postgres, or similar) and NonSQL Databases (Cassandra, MongoDB, Neo4j, etc.). Skilled in Data Integration from different sources such as APIs, databases, flat files, and event streaming. Strong experience in implementing data pipelines and efficient ELT/ETL processes, batch and real-time, in AWS and using open source solutions, being able to develop custom integration solutions as needed, including Data Integration from different sources such as APIs (PoS integrations is a plus), ERP (Oracle and Allegra are a plus), databases, flat files, Apache Parquet, event streaming, including cleansing, transformation and validation of the data. Strong experience with scalable and distributed Data Technologies such as Spark/ PySpark , DBT and Kafka , to be able to handle large volumes of data. Experience with stream-processing systems: Storm, Spark-Streaming, etc. is a plus. Strong experience in designing and implementing Data Warehousing solutions in AWS with Redshift . Demonstrated experience in designing and implementing efficient ELT/ETL processes that extract data from source systems, transform it (DBT), and load it into the data warehouse. Strong experience in Orchestration using Apache Airflow. Expert in Cloud Computing in AWS, including deep knowledge of a variety of AWS services like Lambda, Kinesis, S3 , Lake Formation, EC2, EMR , ECS/ECR, IAM, CloudWatch, etc Good understanding of Data Quality and Governance, including implementation of data quality checks and monitoring processes to ensure that data is accurate, complete, and consistent. Good understanding of BI solutions, including Looker and LookML (Looker Modelling Language). Strong knowledge and hands-on experience of DevOps principles, tools and technologies (GitHub and AWS DevOps), including continuous integration, continuous delivery (CI/CD), infrastructure as code (IaC Terraform), configuration management, automated testing, performance tuning and cost management and optimization. Good Problem-Solving skills: being able to troubleshoot data processing pipelines and identify performance bottlenecks and other issues. Possesses strong leadership skills with a willingness to lead, create Ideas, and be assertive. Strong project management and organizational skills. Excellent communication skills to collaborate with cross-functional teams, including business users, data architects, DevOps/DataOps/MLOps engineers, data analysts, data scientists, developers, and operations teams. Essential to convey complex technical concepts and insights to non-technical stakeholders effectively. Ability to document processes, procedures, and deployment configurations. Responsibilities: Design, implement, deploy, test and maintain highly scalable and efficient data architectures, defining and maintaining standards and best practices for data management independently with minimal guidance. Ensuring the scalability, reliability, quality and performance of data systems. Mentoring and guiding junior/mid-level data engineers. Collaborating with Product, Engineering, Data Scientists and Analysts to understand data requirements and develop data solutions, including reusable components. Evaluating and implementing new technologies and tools to improve data integration, data processing and analysis. Design architecture, observability and testing strategies, and build reliable infrastructure and data pipelines. Takes ownership of storage layer, data management tasks, including schema design, indexing, and performance tuning. Swiftly address and resolve complex data engineering issues, incidents and resolve bottlenecks in SQL queries and database operations. Conduct a Discovery on the existing Data Infrastructure and Proposed Architecture. Evaluate and implement cutting-edge technologies and methodologies, and continue learning and expanding skills in data engineering and cloud platforms, to improve and modernize existing data systems. Evaluate, design, and implement data governance solutions: cataloguing, lineage, quality and data governance frameworks that are suitable for a modern analytics solution, considering industry-standard best practices and patterns. Define and document data engineering architectures, processes and data flows. Assess best practices and design schemas that match business needs for delivering a modern analytics solution (descriptive, diagnostic, predictive, prescriptive). Be an active member of our Agile team, participating in all ceremonies and continuous improvement activities. Fusemachines is an Equal opportunity employer, committed to diversity and inclusion. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or any other characteristic protected by applicable federal, state, or local laws.

Sr. Data Engineer AWS

Pune, Maharashtra, India

10 years

Not disclosed

Remote

Contractual

About Fusemachines Fusemachines is a 10+ year old AI company, dedicated to delivering state-of-the-art AI products and solutions to a diverse range of industries. Founded by Sameer Maskey, Ph.D., an Adjunct Associate Professor at Columbia University, our company is on a steadfast mission to democratize AI and harness the power of global AI talent from underserved communities. With a robust presence in four countries and a dedicated team of over 400 full-time employees, we are committed to fostering AI transformation journeys for businesses worldwide. At Fusemachines, we not only bridge the gap between AI advancement and its global impact but also strive to deliver the most advanced technology solutions to the world. About The Role This is a remote full-time contractual position , working in the Travel & Hospitality Industry , responsible for designing, building, testing, optimizing and maintaining the infrastructure and code required for data integration, storage, processing, pipelines and analytics (BI, visualization and Advanced Analytics) from ingestion to consumption, implementing data flow controls, and ensuring high data quality and accessibility for analytics and business intelligence purposes. This role requires a strong foundation in programming and a keen understanding of how to integrate and manage data effectively across various storage systems and technologies. We're looking for someone who can quickly ramp up, contribute right away and work independently as well as with junior team members with minimal oversight. We are looking for a skilled Sr. Data Engineer with a strong background in Python , SQL , Pyspark , Redshift, and AWS cloud-based large-scale data solutions with a passion for data quality, performance and cost optimization. The ideal candidate will develop in an Agile environment. This role is perfect for an individual passionate about leveraging data to drive insights, improve decision-making, and support the strategic goals of the organization through innovative data engineering solutions. Qualification / Skill Set Requirement: Must have a full-time Bachelor's degree in Computer Science, Information Systems, Engineering, or a related field 5+ years of real-world data engineering development experience in AWS (certifications preferred). Strong expertise in Python, SQL, PySpark and AWS in an Agile environment, with a proven track record of building and optimizing data pipelines, architectures, and datasets, and proven experience in data storage, modelling, management, lake, warehousing, processing/transformation, integration, cleansing, validation and analytics A senior person who can understand requirements and design end-to-end solutions with minimal oversight Strong programming Skills in one or more languages such as Python, Scala, and proficient in writing efficient and optimized code for data integration, storage, processing and manipulation Strong knowledge SDLC tools and technologies, including project management software (Jira or similar), source code management (GitHub or similar), CI/CD system (GitHub actions, AWS CodeBuild or similar) and binary repository manager (AWS CodeArtifact or similar) Good understanding of Data Modelling and Database Design Principles. Being able to design and implement efficient database schemas that meet the requirements of the data architecture to support data solutions Strong SQL skills and experience working with complex data sets, Enterprise Data Warehouse and writing advanced SQL queries. Proficient with Relational Databases (RDS, MySQL, Postgres, or similar) and NonSQL Databases (Cassandra, MongoDB, Neo4j, etc.) Skilled in Data Integration from different sources such as APIs, databases, flat files, and event streaming Strong experience in implementing data pipelines and efficient ELT/ETL processes, batch and real-time, in AWS and using open source solutions, being able to develop custom integration solutions as needed, including Data Integration from different sources such as APIs (PoS integrations is a plus), ERP (Oracle and Allegra are a plus), databases, flat files, Apache Parquet, event streaming, including cleansing, transformation and validation of the data Strong experience with scalable and distributed Data Technologies such as Spark/PySpark, DBT and Kafka, to be able to handle large volumes of data Experience with stream-processing systems: Storm, Spark-Streaming, etc. is a plus Strong experience in designing and implementing Data Warehousing solutions in AWS with Redshift. Demonstrated experience in designing and implementing efficient ELT/ETL processes that extract data from source systems, transform it (DBT), and load it into the data warehouse Strong experience in Orchestration using Apache Airflow Expert in Cloud Computing in AWS, including deep knowledge of a variety of AWS services like Lambda, Kinesis, S3, Lake Formation, EC2, EMR, ECS/ECR, IAM, CloudWatch, etc Good understanding of Data Quality and Governance, including implementation of data quality checks and monitoring processes to ensure that data is accurate, complete, and consistent Good understanding of BI solutions, including Looker and LookML (Looker Modelling Language) Strong knowledge and hands-on experience of DevOps principles, tools and technologies (GitHub and AWS DevOps), including continuous integration, continuous delivery (CI/CD), infrastructure as code (IaC – Terraform), configuration management, automated testing, performance tuning and cost management and optimization Good Problem-Solving skills: being able to troubleshoot data processing pipelines and identify performance bottlenecks and other issues Possesses strong leadership skills with a willingness to lead, create Ideas, and be assertive Strong project management and organizational skills Excellent communication skills to collaborate with cross-functional teams, including business users, data architects, DevOps/DataOps/MLOps engineers, data analysts, data scientists, developers, and operations teams. Essential to convey complex technical concepts and insights to non-technical stakeholders effectively Ability to document processes, procedures, and deployment configurations Responsibilities: Design, implement, deploy, test and maintain highly scalable and efficient data architectures, defining and maintaining standards and best practices for data management independently with minimal guidance Ensuring the scalability, reliability, quality and performance of data systems Mentoring and guiding junior/mid-level data engineers Collaborating with Product, Engineering, Data Scientists and Analysts to understand data requirements and develop data solutions, including reusable components Evaluating and implementing new technologies and tools to improve data integration, data processing and analysis Design architecture, observability and testing strategies, and build reliable infrastructure and data pipelines Takes ownership of storage layer, data management tasks, including schema design, indexing, and performance tuning Swiftly address and resolve complex data engineering issues, incidents and resolve bottlenecks in SQL queries and database operations Conduct a Discovery on the existing Data Infrastructure and Proposed Architecture Evaluate and implement cutting-edge technologies and methodologies, and continue learning and expanding skills in data engineering and cloud platforms, to improve and modernize existing data systems Evaluate, design, and implement data governance solutions: cataloguing, lineage, quality and data governance frameworks that are suitable for a modern analytics solution, considering industry-standard best practices and patterns Define and document data engineering architectures, processes and data flows Assess best practices and design schemas that match business needs for delivering a modern analytics solution (descriptive, diagnostic, predictive, prescriptive) Be an active member of our Agile team, participating in all ceremonies and continuous improvement activities Fusemachines is an Equal opportunity employer, committed to diversity and inclusion. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or any other characteristic protected by applicable federal, state, or local laws. Powered by JazzHR SC1hyFVwpp Show more Show less

Sr. Data Engineer Azure Databricks

Pune, Maharashtra, India

10 years

None Not disclosed

Remote

Contractual

About Fusemachines Fusemachines is a 10+ year old AI company, dedicated to delivering state-of-the-art AI products and solutions to a diverse range of industries. Founded by Sameer Maskey, Ph.D., an Adjunct Associate Professor at Columbia University, our company is on a steadfast mission to democratize AI and harness the power of global AI talent from underserved communities. With a robust presence in four countries and a dedicated team of over 400 full-time employees, we are committed to fostering AI transformation journeys for businesses worldwide. At Fusemachines, we not only bridge the gap between AI advancement and its global impact but also strive to deliver the most advanced technology solutions to the world. About The Role This is a remote, contract position responsible for designing, building, and maintaining the infrastructure required for data integration, storage, processing, and analytics (BI, visualization and Advanced Analytics). We are looking for a skilled Senior Data Engineer with a strong background in Python, SQL, PySpark, Azure, Databricks, Synapse, Azure Data Lake, DevOps and cloud-based large scale data applications with a passion for data quality, performance and cost optimization. The ideal candidate will develop in an Agile environment, contributing to the architecture, design, and implementation of Data products in the Aviation Industry, including migration from Synapse to Azure Data Lake. This role involves hands-on coding, mentoring junior staff and collaboration with multi-disciplined teams to achieve project objectives. Qualification & Experience Must have a full-time Bachelor's degree in Computer Science or similar At least 5 years of experience as a data engineer with strong expertise in Databricks, Azure, DevOps, or other hyperscalers 5+ years of experience with Azure DevOps, GitHub Proven experience delivering large scale projects and products for Data and Analytics, as a data engineer, including migrations Following certifications: Databricks Certified Associate Developer for Apache Spark Databricks Certified Data Engineer Associate Microsoft Certified: Azure Fundamentals Microsoft Certified: Azure Data Engineer Associate Microsoft Exam: Designing and Implementing Microsoft DevOps Solutions (nice to have) Required Skills/Competencies Strong programming Skills in one or more languages such as Python (must have), Scala, and proficiency in writing efficient and optimized code for data integration, migration, storage, processing and manipulation Strong understanding and experience with SQL and writing advanced SQL queries Thorough understanding of big data principles, techniques, and best practices Strong experience with scalable and distributed Data Processing Technologies such as Spark/PySpark (must have: experience with Azure Databricks), DBT and Kafka, to be able to handle large volumes of data Solid Databricks development experience with significant Python, PySpark, Spark SQL, Pandas, NumPy in Azure environment Strong experience in designing and implementing efficient ELT/ETL processes in Azure and Databricks and using open source solutions being able to develop custom integration solutions as needed Skilled in Data Integration from different sources such as APIs, databases, flat files, event streaming Expertise in data cleansing, transformation, and validation Proficiency with Relational Databases (Oracle, SQL Server, MySQL, Postgres, or similar) and NonSQL Databases (MongoDB or Table) Good understanding of Data Modeling and Database Design Principles. Being able to design and implement efficient database schemas that meet the requirements of the data architecture to support data solutions Strong experience in designing and implementing Data Warehousing, data lake and data lake house, solutions in Azure and Databricks Good experience with Delta Lake, Unity Catalog, Delta Sharing, Delta Live Tables (DLT) Strong understanding of the software development lifecycle (SDLC), especially Agile methodologies Strong knowledge of SDLC tools and technologies Azure DevOps and GitHub, including project management software (Jira, Azure Boards or similar), source code management (GitHub, Azure Repos or similar), CI/CD system (GitHub actions, Azure Pipelines, Jenkins or similar) and binary repository manager (Azure Artifacts or similar) Strong understanding of DevOps principles, including continuous integration, continuous delivery (CI/CD), infrastructure as code (IaC – Terraform, ARM including hands-on experience), configuration management, automated testing, performance tuning and cost management and optimization Strong knowledge in cloud computing specifically in Microsoft Azure services related to data and analytics, such as Azure Data Factory, Azure Databricks, Azure Synapse Analytics, Azure Data Lake, Azure Stream Analytics, SQL Server, Azure Blob Storage, Azure Data Lake Storage, Azure SQL Database, etc Experience in Orchestration using technologies like Databricks workflows and Apache Airflow Strong knowledge of data structures and algorithms and good software engineering practices Proven experience migrating from Azure Synapse to Azure Data Lake, or other technologies Strong analytical skills to identify and address technical issues, performance bottlenecks, and system failures Proficiency in debugging and troubleshooting issues in complex data and analytics environments and pipelines Good understanding of Data Quality and Governance, including implementation of data quality checks and monitoring processes to ensure that data is accurate, complete, and consistent Experience with BI solutions including PowerBI is a plus Strong written and verbal communication skills to collaborate and articulate complex situations concisely with cross-functional teams, including business users, data architects, DevOps engineers, data analysts, data scientists, developers, and operations teams Ability to document processes, procedures, and deployment configurations Understanding of security practices, including network security groups, Azure Active Directory, encryption, and compliance standards Ability to implement security controls and best practices within data and analytics solutions, including proficient knowledge and working experience on various cloud security vulnerabilities and ways to mitigate them Self-motivated with the ability to work well in a team, and experienced in mentoring and coaching different members of the team A willingness to stay updated with the latest services, Data Engineering trends, and best practices in the field Comfortable with picking up new technologies independently and working in a rapidly changing environment with ambiguous requirements Care about architecture, observability, testing, and building reliable infrastructure and data pipelines Responsibilities Architect, design, develop, test and maintain high-performance, large-scale, complex data architectures, which support data integration (batch and real-time, ETL and ELT patterns from heterogeneous data systems: APIs and platforms), storage (data lakes, warehouses, data lake houses, etc), processing, orchestration and infrastructure. Ensuring the scalability, reliability, and performance of data systems, focusing on Databricks and Azure Contribute to detailed design, architectural discussions, and customer requirements sessions Actively participate in the design, development, and testing of big data products. Construct and fine-tune Apache Spark jobs and clusters within the Databricks platform Migrate out of Azure Synapse to Azure Data Lake or other technologies Assess best practices and design schemas that match business needs for delivering a modern analytics solution (descriptive, diagnostic, predictive, prescriptive) Design and implement data models and schemas that support efficient data processing and analytics Design and develop clear, maintainable code with automated testing using Pytest, unittest, integration tests, performance tests, regression tests, etc Collaborating with cross-functional teams and Product, Engineering, Data Scientists and Analysts to understand data requirements and develop data solutions, including reusable components meeting product deliverables Evaluating and implementing new technologies and tools to improve data integration, data processing, storage and analysis Evaluate, design, implement and maintain data governance solutions: cataloging, lineage, data quality and data governance frameworks that are suitable for a modern analytics solution, considering industry-standard best practices and patterns Continuously monitor and fine-tune workloads and clusters to achieve optimal performance Provide guidance and mentorship to junior team members, sharing knowledge and best practices Maintain clear and comprehensive documentation of the solutions, configurations, and best practices implemented Promote and enforce best practices in data engineering, data governance, and data quality Ensure data quality and accuracy Design, Implement and maintain data security and privacy measures Be an active member of an Agile team, participating in all ceremonies and continuous improvement activities, being able to work independently as well as collaboratively Fusemachines is an Equal Opportunities Employer, committed to diversity and inclusion. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or any other characteristic protected by applicable federal, state, or local laws. Powered by JazzHR YyCZFu6HUv

Sr. Data Engineer AWS Snowflake

Pune

4 - 8 years

INR 4.0 - 8.0 Lacs P.A.

Work from Office

Full Time

About Fusemachines Fusemachines is a 10+ year old AI company, dedicated to delivering state-of-the-art AI products and solutions to a diverse range of industries. Founded by Sameer Maskey, Ph.D., an Adjunct Associate Professor at Columbia University, our company is on a steadfast mission to democratize AI and harness the power of global AI talent from underserved communities. With a robust presence in four countries and a dedicated team of over 400 full-time employees, we are committed to fostering AI transformation journeys for businesses worldwide. At Fusemachines, we not only bridge the gap between AI advancement and its global impact but also strive to deliver the most advanced technology solutions to the world. About the role: This is a remote, full time consulting position (contract) responsible for designing, building, and maintaining the infrastructure required for data integration, storage, processing, and analytics (BI, visualization and Advanced Analytics) to optimize digital channels and technology innovations with the end goal of creating competitive advantages for food services industry around the globe. We re looking for a solid lead engineer who brings fresh ideas from past experiences and is eager to tackle new challenges. We re in search of a candidate who is knowledgeable about and loves working with modern data integration frameworks, big data and cloud technologies. Candidates must also be proficient with data programming languages (Python and SQL), AWS cloud and Snowflake Data Platform. The data engineer will build a variety of data pipelines and models to support advanced AI/ML analytics projects, with the intent of elevating the customer experience and driving revenue and profit growth globally. Qualification & Experience: Must have a full-time Bachelors degree in Computer Science or similar from an accredited institution. At least 3 years of experience as a data engineer with strong expertise in Python, Snowflake, PySpark, and AWS. Proven experience delivering large-scale projects and products for Data and Analytics, as a data engineer. Skill Set Requirement: Vast background in all things data-related. 3+ years of real-world data engineering development experience in Snowflake and AWS (certifications preferred). Highly skilled in one or more programming languages, must have Python , and proficient in writing efficient and optimized code for data integration, storage, processing, manipulation and automation. Strong experience in working with ELT and ETL tools and being able to develop custom integration solutions as needed, from different sources such as APIs, databases, flat files, and event streaming. Including experience with modern ETL tools such as Informatica, Matillion, or DBT; Informatica CDI is a plus. Strong experience with scalable and distributed Data Technologies such as Spark/PySpark, DBT and Kafka, to be able to handle large volumes of data. Strong programming skills in SQL , with proficiency in writing efficient and optimized code for data integration, storage, processing, and manipulation. Strong experience in designing and implementing Data Warehousing solutions in AWS with Snowflake. Good understanding of Data Modelling and Database Design Principles. Being able to design and implement efficient database schemas that meet the requirements of the data architecture to support data solutions. Proven experience as a Snowflake Developer, with a strong understanding of Snowflake architecture and concepts. Proficient in Snowflake services such as Snowpipe, stages, stored procedures, views, materialized views, tasks and streams. Robust understanding of data partitioning and other optimization techniques in Snowflake. Knowledge of data security measures in Snowflake, including role-based access control (RBAC) and data encryption. Experience with Kafka, Pulsar, or other streaming technologies. Experience orchestrating complex task flows across a variety of technologies, Apache Airflow preferred. Expert in Cloud Computing in AWS, including deep knowledge of a variety of AWS services like Lambda, Kinesis, S3, Lake Formation, EC2, ECS/ECR, IAM, CloudWatch, EKS, API Gateway, etc Good understanding of Data Quality and Governance, including implementation of data quality checks and monitoring processes to ensure that data is accurate, complete, and consistent. Good Problem-Solving skills: being able to troubleshoot data processing pipelines and identify performance bottlenecks and other issues. Responsibilities: Follow established design and constructed data architectures. Developing and maintaining data pipelines (streaming and batch), ensuring data flows smoothly from source (point-of-sale, back of house, operational platforms and more of a Global Data Hub) to destination. Handle ETL/ELT processes, including data extraction, loading, transformation and loading data from various sources into Snowflake to enable best-in-class technology solutions. Play a key role in the Data Operations team - developing data solutions responsible for driving Growth. Contribute to standardizing and developing a framework to extend these pipelines globally, across markets and business areas. Develop on a data platform by building applications using a mix of open-source frameworks (PySpark, Kubernetes, Airflow, etc.) and best-in-breed SaaS tools (Informatica Cloud, Snowflake, Domo, etc.). Implement and manage production support processes around data lifecycle, data quality, coding utilities, storage, reporting and other data integration points. Ensure the reliability, scalability, and efficiency of data systems are maintained at all times. Assist in the configuration and management of Snowflake data warehousing and data lake solutions, working under the guidance of senior team members. Work with cross-functional teams, including Product, Engineering, Data Science, and Analytics teams to understand and fulfill data requirements. Contribute to data quality assurance through validation checks and support data governance initiatives, including cataloging and lineage tracking. Takes ownership of storage layer, SQL database management tasks, including schema design, indexing, and performance tuning. Continuously evaluate and integrate new technologies to enhance data engineering capabilities and actively participate in our Agile team meetings and improvement activities. Fusemachines is an Equal opportunity employer, committed to diversity and inclusion. All qualified applicants will receive consideration for employment without regard to race, colour, religion, sex, sexual orientation, gender identity, national origin, disability, or any other characteristic protected by applicable federal, state, or local laws.

Sr. Data Engineer AWS Snowflake

Pune, Maharashtra, India

10 years

None Not disclosed

Remote

Contractual

About Fusemachines Fusemachines is a 10+ year old AI company, dedicated to delivering state-of-the-art AI products and solutions to a diverse range of industries. Founded by Sameer Maskey, Ph.D., an Adjunct Associate Professor at Columbia University, our company is on a steadfast mission to democratize AI and harness the power of global AI talent from underserved communities. With a robust presence in four countries and a dedicated team of over 400 full-time employees, we are committed to fostering AI transformation journeys for businesses worldwide. At Fusemachines, we not only bridge the gap between AI advancement and its global impact but also strive to deliver the most advanced technology solutions to the world. About the role: This is a remote, full time consulting position (contract) responsible for designing, building, and maintaining the infrastructure required for data integration, storage, processing, and analytics (BI, visualization and Advanced Analytics) to optimize digital channels and technology innovations with the end goal of creating competitive advantages for food services industry around the globe. We’re looking for a solid lead engineer who brings fresh ideas from past experiences and is eager to tackle new challenges. We’re in search of a candidate who is knowledgeable about and loves working with modern data integration frameworks, big data and cloud technologies. Candidates must also be proficient with data programming languages (Python and SQL), AWS cloud and Snowflake Data Platform. The data engineer will build a variety of data pipelines and models to support advanced AI/ML analytics projects, with the intent of elevating the customer experience and driving revenue and profit growth globally. Qualification & Experience: Must have a full-time Bachelor's degree in Computer Science or similar from an accredited institution At least 3 years of experience as a data engineer with strong expertise in Python, Snowflake, PySpark, and AWS Proven experience delivering large-scale projects and products for Data and Analytics, as a data engineer Skill Set Requirement: Vast background in all things data-related 3+ years of real-world data engineering development experience in Snowflake and AWS (certifications preferred) Highly skilled in one or more programming languages, must have Python, and proficient in writing efficient and optimized code for data integration, storage, processing, manipulation and automation Strong experience in working with ELT and ETL tools and being able to develop custom integration solutions as needed, from different sources such as APIs, databases, flat files, and event streaming. Including experience with modern ETL tools such as Informatica, Matillion, or DBT; Informatica CDI is a plus Strong experience with scalable and distributed Data Technologies such as Spark/PySpark, DBT and Kafka, to be able to handle large volumes of data Strong programming skills in SQL, with proficiency in writing efficient and optimized code for data integration, storage, processing, and manipulation Strong experience in designing and implementing Data Warehousing solutions in AWS with Snowflake Good understanding of Data Modelling and Database Design Principles. Being able to design and implement efficient database schemas that meet the requirements of the data architecture to support data solutions Proven experience as a Snowflake Developer, with a strong understanding of Snowflake architecture and concepts Proficient in Snowflake services such as Snowpipe, stages, stored procedures, views, materialized views, tasks and streams Robust understanding of data partitioning and other optimization techniques in Snowflake Knowledge of data security measures in Snowflake, including role-based access control (RBAC) and data encryption Experience with Kafka, Pulsar, or other streaming technologies Experience orchestrating complex task flows across a variety of technologies, Apache Airflow preferred Expert in Cloud Computing in AWS, including deep knowledge of a variety of AWS services like Lambda, Kinesis, S3, Lake Formation, EC2, ECS/ECR, IAM, CloudWatch, EKS, API Gateway, etc Good understanding of Data Quality and Governance, including implementation of data quality checks and monitoring processes to ensure that data is accurate, complete, and consistent Good Problem-Solving skills: being able to troubleshoot data processing pipelines and identify performance bottlenecks and other issues Responsibilities: Follow established design and constructed data architectures. Developing and maintaining data pipelines (streaming and batch), ensuring data flows smoothly from source (point-of-sale, back of house, operational platforms and more of a Global Data Hub) to destination. Handle ETL/ELT processes, including data extraction, loading, transformation and loading data from various sources into Snowflake to enable best-in-class technology solutions Play a key role in the Data Operations team - developing data solutions responsible for driving Growth Contribute to standardizing and developing a framework to extend these pipelines globally, across markets and business areas Develop on a data platform by building applications using a mix of open-source frameworks (PySpark, Kubernetes, Airflow, etc.) and best-in-breed SaaS tools (Informatica Cloud, Snowflake, Domo, etc.) Implement and manage production support processes around data lifecycle, data quality, coding utilities, storage, reporting and other data integration points Ensure the reliability, scalability, and efficiency of data systems are maintained at all times Assist in the configuration and management of Snowflake data warehousing and data lake solutions, working under the guidance of senior team members Work with cross-functional teams, including Product, Engineering, Data Science, and Analytics teams to understand and fulfill data requirements Contribute to data quality assurance through validation checks and support data governance initiatives, including cataloging and lineage tracking Takes ownership of storage layer, SQL database management tasks, including schema design, indexing, and performance tuning Continuously evaluate and integrate new technologies to enhance data engineering capabilities and actively participate in our Agile team meetings and improvement activities Fusemachines is an Equal opportunity employer, committed to diversity and inclusion. All qualified applicants will receive consideration for employment without regard to race, colour, religion, sex, sexual orientation, gender identity, national origin, disability, or any other characteristic protected by applicable federal, state, or local laws. Powered by JazzHR s7fNo7jx6d

Senior Fullstack Engineer

Pune

10 - 15 years

INR 9.0 - 13.0 Lacs P.A.

Work from Office

Full Time

We re hiring a Senior Software Engineer with expertise in Node.js , React.js , and Firebase to lead key technical initiatives on our engineering team. This role is ideal for someone who enjoys building and scaling full-stack applications in a cloud-native, serverless environment, and who takes pride in writing clean, maintainable code. What You ll Do Lead development of core features across both front-end and back-end systems. Contribute to and improve an established codebase with minimal oversight. Plan and deliver scalable solutions in collaboration with other team members. Build and maintain integrations with external platforms and APIs. Mentor junior engineers and participate in peer code reviews. Continuously explore tools and techniques that improve team productivity and product quality. What We re Looking For 10+ years of software development experience, with senior-level responsibilities in recent roles. 8+ years of experience with Node.js and React.js , especially in serverless environments like Cloud Functions or AWS Lambda. 5+ years working with NoSQL databases such as Firebase Realtime Database or equivalent. Ability to deliver high-quality solutions quickly and efficiently with minimal supervision. Strong communication, collaboration, and leadership skills in remote or distributed teams. Nice to Have Familiarity with Firebase suite (Realtime Database, Cloud Functions). Experience in cross-functional collaboration and mentoring. Equal Opportunity Employer: Fusemachines is committed to fostering a diverse and inclusive workplace. We welcome applications from all qualified individuals regardless of race, color, religion, sex, sexual orientation, gender identity, national origin, age, genetic information, disability, protected veteran status, or any other legally protected status.

Login to

Please Verify Your Phone or Email

Confirm Action

Search

Profile

Upskill and Grow with AI

Fusemachines

Fusemachines

Start Your Job Search Today

Please Verify Your Phone or Email

Job Application AI Bot

Download the Mobile App

Setup Job Alerts

Similar Companies

Job Titles Overview

Before You Leave... Find Your Perfect Job!

Login to

Please Verify Your Phone or Email

Confirm Action

Contact Us

Search

Profile

Upskill and Grow with AI

Personal Settings

Personal Settings

Fusemachines

Fusemachines

Visit Company Website

Start Your Job Search Today

Please Verify Your Phone or Email

Job Application AI Bot

Download the Mobile App

Setup Job Alerts

Similar Companies

Job Titles Overview