This role is for one of our clients
Industry: IT Services and IT Consulting
Seniority level: Associate
Min Experience: 4 years
Location: India
Job Type: Full-time

We are looking for a highly skilled Data Infrastructure Specialist to design, implement, and maintain robust data platforms and pipelines that empower analytics, machine learning, and business intelligence initiatives. The ideal candidate brings hands-on expertise with modern data engineering tools, cloud platforms, and distributed systems, enabling seamless data-driven decision-making across the organization.

Key Responsibilities
- Architect, build, and optimize scalable ETL/ELT pipelines for large-scale structured and unstructured datasets using Spark, Databricks, Hive, or Glue.
- Develop and maintain data integration workflows leveraging tools such as Informatica, Talend, or SSIS.
- Write, optimize, and maintain complex SQL queries to support analytics and reporting.
- Collaborate with cross-functional teams, including data scientists, analysts, and product stakeholders, to translate requirements into robust technical solutions.
- Deploy and manage automated data workflows through CI/CD pipelines, ensuring smooth and reliable releases.
- Monitor and enhance data pipeline performance, scalability, and reliability, proactively resolving bottlenecks.
- Ensure data quality, governance, security, and regulatory compliance across all data workflows.
- Work with cloud-based platforms such as Azure (ADF, Synapse, Databricks) or AWS (EMR, Glue, S3, Athena) to support distributed data processing.
- Maintain clear and detailed documentation for data architectures, processes, and workflows.
- Mentor junior team members, providing guidance on best practices, coding standards, and problem-solving approaches.
- Stay up to date with emerging data technologies, tools, and industry best practices to continuously improve data infrastructure.

Candidate Profile
- Bachelor’s degree in Computer Science, IT, or a related technical discipline.
- 4+ years of hands-on experience designing and implementing data pipelines and distributed data processing solutions.
- Strong expertise with Databricks, Spark, or equivalent technologies (EMR, Hadoop).
- Proficiency in Python or Scala for data processing and transformation tasks.
- Experience with modern data warehouses such as Snowflake, Redshift, or Oracle.
- Solid understanding of distributed storage systems (HDFS, ADLS, S3) and data formats such as Parquet and ORC.
- Familiarity with orchestration and workflow management tools such as ADF, Airflow, or Step Functions.
- Databricks Data Engineering Professional certification is a plus.
- Exposure to multi-cloud environments or cloud migration projects is advantageous.

Core Skills
Cloud Data Platforms | Databricks | Spark | ETL/ELT Pipeline Development | SQL & Python/Scala | Data Warehousing | Distributed Storage | Data Governance & Security | CI/CD for Data | Workflow Orchestration | Performance Optimization | Team Mentorship

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are made by humans. If you would like more information about how your data is processed, please contact us.