Big Data Architect

4.0 years

0.0 Lacs P.A.

Gurugram, Haryana, India

Posted: 1 week ago | Platform: LinkedIn

Skills Required

data, apache, spark, hive, hadoop, coding, python, pyspark, nifi, json, unix, scripting, mysql, oracle, design, sql, nosql, git, kafka, commissioning, logging, integration

Work Mode

On-site

Job Type

Full Time

Job Description

About the Role:
As a Big Data Engineer, you will play a critical role in integrating multiple data sources, designing scalable data workflows, and collaborating with data architects, scientists, and analysts to develop innovative solutions. You will work with rapidly evolving technologies to achieve strategic business goals.

Must-Have Skills:
4+ years of mandatory experience with Big Data.
4+ years of mandatory experience with Apache Spark.
Proficiency in Apache Spark, Hive on Tez, and Hadoop ecosystem components.
Strong coding skills in Python and PySpark.
Experience building reusable components or frameworks using Spark.
Expertise in data ingestion from multiple sources using APIs, HDFS, and NiFi.
Solid experience working with structured, unstructured, and semi-structured data formats (Text, JSON, Avro, Parquet, ORC, etc.).
Experience with UNIX Bash scripting and databases like Postgres, MySQL, and Oracle.
Ability to design, develop, and evolve fault-tolerant distributed systems.
Strong SQL skills, with expertise in Hive, Impala, MongoDB, and NoSQL databases.
Hands-on experience with Git and CI/CD tools.
Experience with streaming data technologies (Kafka, Spark Streaming, Apache Flink, etc.).
Proficiency with HDFS or similar data lake technologies.
Excellent problem-solving skills; you will be evaluated through coding rounds.

Key Responsibilities:
Handle existing or new Apache HDFS clusters, including name node, data node, and edge node commissioning and decommissioning.
Work closely with data architects and analysts to design technical solutions.
Integrate and ingest data from multiple source systems into big data environments.
Develop end-to-end data transformations and workflows, ensuring logging and recovery mechanisms.
Troubleshoot Spark job failures.
Design and implement batch, real-time, and near-real-time data pipelines.
Optimize Big Data transformations using Apache Spark, Hive, and Tez.
Work with Data Science teams to enhance actionable insights.
Ensure seamless data integration and transformation across multiple systems.

Recruin
Not specified
No locations

Employees

3 Jobs
