Big Data Engineer - Hadoop/PySpark

Experience

0 years

Salary

0 Lacs

Posted: 6 days ago | Platform: LinkedIn


Work Mode

On-site

Job Type

Full Time

Job Description

We are looking for a highly skilled Big Data Engineer with expertise in cloud technologies to join our team. The ideal candidate will be responsible for designing, developing, and maintaining scalable big data solutions, ensuring efficient data processing, storage, and analytics. The role involves working with distributed systems, cloud platforms, and modern data frameworks to support real-time and batch data pipelines. These skillsets will be applied to creating content and labs.

Responsibilities

Design, implement, and manage scalable big data architectures on AWS, Azure, or GCP.
Develop ETL pipelines for ingesting, processing, and transforming large datasets (a minimal PySpark sketch appears after this section).
Work with Python, Apache Spark, Hadoop, and Kafka to build efficient data processing solutions.
Implement data lakes, data warehouses, and streaming architectures (see the streaming sketch below).
Optimize database and query performance for large-scale datasets.
Collaborate with SMEs, clients, and software engineers to deliver content.
Ensure data security, governance, and compliance with industry standards.
Automate workflows using Apache Airflow or other orchestration tools (see the Airflow sketch below).
Monitor and troubleshoot data pipelines to ensure reliability and scalability.

Requirements

Minimum educational qualification: B.E., B.Sc, M.Sc, or MCA.
Proficiency in Python, Java, or Scala for data processing.
Hands-on experience with Apache Spark, Hadoop, Kafka, Flink, and Storm.
Hands-on experience with SQL and NoSQL databases.
Strong expertise in cloud-based data solutions (AWS, Google Cloud, or Azure).
Hands-on experience building and managing ETL/ELT pipelines.
Knowledge of containerization and orchestration (Docker or Kubernetes).
Hands-on experience with real-time data streaming and serverless data processing.
Familiarity with machine learning pipelines and AI-driven analytics.
Strong understanding of CI/CD and ETL pipelines for data workflows.

Technical Skills

Big Data Technologies: Apache Spark, Hadoop, Kafka, Flink, Storm
Cloud Platforms: AWS, Google Cloud, Azure
Programming Languages: Python, Java, Scala, SQL, PySpark
Data Storage and Processing: data lakes, data warehouses, ETL/ELT pipelines
Orchestration: Apache Airflow, Prefect, Dagster
Databases: SQL (PostgreSQL, MySQL), NoSQL (MongoDB, Cassandra)
Security and Compliance: IAM, data governance, encryption
DevOps Tools: Docker, Kubernetes, Terraform, CI/CD pipelines

Soft Skills

Strong problem-solving and analytical skills.
Excellent communication and collaboration abilities.
Ability to work in an agile, fast-paced environment.
Attention to detail and data accuracy.
Self-motivated and proactive.

Certifications

Any cloud or data-related certifications.

(ref:hirist.tech)
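To make the ETL responsibilities above concrete, here is a minimal PySpark batch-pipeline sketch of the ingest-transform-load pattern the role describes. The bucket paths, column names, and cleaning rules are hypothetical placeholders, not details from this posting.

```python
# Minimal PySpark batch ETL sketch: read raw CSV, clean it, and write
# partitioned Parquet. All paths and column names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("events-batch-etl")
    .getOrCreate()
)

# Ingest: read raw events (header row assumed; schema kept simple here).
raw = spark.read.option("header", True).csv("s3a://raw-bucket/events/")

# Transform: drop malformed rows, normalize types, derive a date column.
clean = (
    raw.dropna(subset=["event_id", "event_ts"])
       .withColumn("event_ts", F.to_timestamp("event_ts"))
       .withColumn("event_date", F.to_date("event_ts"))
       .dropDuplicates(["event_id"])
)

# Load: write columnar output partitioned by date for efficient querying.
(
    clean.write
         .mode("overwrite")
         .partitionBy("event_date")
         .parquet("s3a://curated-bucket/events/")
)

spark.stop()
```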
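For the real-time side, the following Spark Structured Streaming sketch consumes a Kafka topic and appends it to a data lake path, assuming the spark-sql-kafka connector is on the classpath. The topic name, broker address, and paths are again hypothetical.

```python
# Minimal Spark Structured Streaming sketch: Kafka topic -> Parquet sink.
# Requires the spark-sql-kafka connector; names below are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("events-stream").getOrCreate()

# Source: subscribe to a Kafka topic; Kafka delivers key/value as binary.
stream = (
    spark.readStream
         .format("kafka")
         .option("kafka.bootstrap.servers", "broker-1:9092")
         .option("subscribe", "events")
         .load()
)

# Decode the message payload and keep the Kafka timestamp for lineage.
decoded = stream.select(
    F.col("value").cast("string").alias("payload"),
    F.col("timestamp").alias("kafka_ts"),
)

# Sink: append micro-batches as Parquet, with checkpointing for recovery.
query = (
    decoded.writeStream
           .format("parquet")
           .option("path", "s3a://lake-bucket/events-stream/")
           .option("checkpointLocation", "s3a://lake-bucket/_chk/events/")
           .outputMode("append")
           .start()
)

query.awaitTermination()
```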
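Finally, a minimal Apache Airflow DAG sketch showing how such stages might be orchestrated (recent Airflow 2.x syntax assumed). The DAG id, schedule, and script paths are illustrative only.

```python
# Minimal Airflow DAG sketch orchestrating a daily ETL job.
# DAG id, schedule, and the wrapped commands are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="daily_events_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    # Each task wraps one pipeline stage; Airflow handles retries/ordering.
    extract = BashOperator(
        task_id="extract",
        bash_command="python /opt/pipelines/extract_events.py",
    )
    transform = BashOperator(
        task_id="transform",
        bash_command="spark-submit /opt/pipelines/transform_events.py",
    )
    load = BashOperator(
        task_id="load",
        bash_command="python /opt/pipelines/load_warehouse.py",
    )

    # Linear dependency chain: extract -> transform -> load.
    extract >> transform >> load
```

The `>>` operator declares the dependency chain Airflow uses to schedule, retry, and monitor each stage independently.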
