Job Description
POSITION - Software Engineer – Data Engineering
LOCATION - Bangalore/Mumbai/Kolkata/Gurugram/Hyderabad/Pune/Chennai
EXPERIENCE - 5-9 Years
ABOUT HASHEDIN
We are software engineers who solve business problems with a Product Mindset for leading global organizations.
By combining engineering talent with business insight, we build software and products that can create new enterprise value.
The secret to our success is a fast-paced learning environment, an extreme ownership spirit, and a fun culture.
JOB TITLE:
Software Engineer – Data Engineering
OVERVIEW OF THE ROLE:
As a Data Engineer or Senior Data Engineer, you will be hands-on in architecting, building, and optimizing robust, efficient, and secure data pipelines and platforms that power business-critical analytics and applications. You will play a central role in the implementation and automation of scalable batch and streaming data workflows using modern big data and cloud technologies. Working within cross-functional teams, you will deliver well-engineered, high-quality code and data models, and drive best practices for data reliability, lineage, quality, and security.
Mandatory Skills:
• Hands-on software coding or scripting experience for a minimum of 4 years
• Experience in product management for at least 4 years
• Stakeholder management experience for at least 4 years
• Experience in at least one of the GCP, AWS, or Azure cloud platforms
Key Responsibilities:
• Design, build, and optimize scalable data pipelines and ETL/ELT workflows using Spark (Scala/Python), SQL, and orchestration tools (e.g., Apache Airflow, Prefect, Luigi).
• Implement efficient solutions for high-volume batch, real-time streaming, and event-driven data processing, leveraging best-in-class patterns and frameworks.
• Build and maintain data warehouse and lakehouse architectures (e.g., Snowflake, Databricks, Delta Lake, BigQuery, Redshift) to support analytics, data science, and BI workloads.
• Develop, automate, and monitor Airflow DAGs/jobs on cloud or Kubernetes, following robust deployment and operational practices (CI/CD, containerization, infra-as-code); an illustrative DAG sketch follows this list.
• Write performant, production-grade SQL for complex data aggregation, transformation, and analytics tasks.
• Ensure data quality, consistency, and governance across the stack, implementing processes for validation, cleansing, anomaly detection, and reconciliation.
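As a hedged illustration of the orchestration responsibility above, the sketch below wires a daily extract-transform-load sequence as an Airflow 2.x DAG; the DAG id, task callables, and schedule are assumptions for the example, not a prescribed implementation.

```python
# Minimal illustrative Airflow 2.x DAG: a daily extract -> transform -> load pipeline.
# The dag_id, task names, and callables are hypothetical placeholders for real pipeline logic.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    # Pull raw data from a source system (e.g., an API or object store).
    print("extracting raw data")


def transform():
    # Apply cleansing/aggregation logic, typically delegated to a Spark job.
    print("transforming data")


def load():
    # Write curated output to the warehouse/lakehouse target.
    print("loading curated data")


with DAG(
    dag_id="daily_sales_pipeline",        # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_load = PythonOperator(task_id="load", python_callable=load)

    t_extract >> t_transform >> t_load    # linear dependency chain
```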
General Skills & Experience:
• Proficiency with Spark (Python or Scala), SQL, and data pipeline orchestration (Airflow, Prefect, Luigi, or similar).
• Experience with cloud data ecosystems (AWS, GCP, Azure) and cloud-native services for data processing (Glue, Dataflow, Dataproc, EMR, HDInsight, Synapse, etc.).
• Hands-on development skills in at least one programming language (Python, Scala, or Java preferred); solid knowledge of software engineering best practices (version control, testing, modularity).
• Deep understanding of batch and streaming architectures (Kafka, Kinesis, Pub/Sub, Flink, Structured Streaming, Spark Streaming).
• Expertise in data warehouse/lakehouse solutions (Snowflake, Databricks, Delta Lake, BigQuery, Redshift, Synapse) and storage formats (Parquet, ORC, Delta, Iceberg, Avro).
• Strong SQL development skills for ETL, analytics, and performance optimization.
• Familiarity with Kubernetes (K8s), containerization (Docker), and deploying data pipelines in distributed/cloud-native environments.
• Experience with data quality frameworks (Great Expectations, Deequ, or custom validation), monitoring/observability tools, and automated testing; a minimal custom-validation sketch follows this list.
• Working knowledge of data modeling (star/snowflake, normalized, denormalized) and metadata/catalog management.
• Understanding of data security, privacy, and regulatory compliance (access management, PII masking, auditing, GDPR/CCPA/HIPAA).
• Familiarity with BI or visualization tools (PowerBI, Tableau, Looker, etc.) is an advantage but not core.
• Previous experience with data migrations, modernization, or refactoring legacy ETL processes to modern cloud architectures is a strong plus.
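As a hedged illustration of the "custom validation" option mentioned in the data quality bullet above, the sketch below runs a simple PySpark check on row counts, null keys, and duplicates; the dataset path, key column, and thresholds are assumptions for the example only.

```python
# Illustrative custom data-quality check in PySpark (not tied to any specific framework).
# The input path, key column, and thresholds below are assumptions for the example.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("dq_check").getOrCreate()

df = spark.read.parquet("s3://example-bucket/curated/orders/")  # hypothetical dataset

total_rows = df.count()
null_keys = df.filter(F.col("order_id").isNull()).count()        # assumed key column
duplicate_keys = total_rows - df.dropDuplicates(["order_id"]).count()

errors = []
if total_rows == 0:
    errors.append("dataset is empty")
if total_rows and null_keys / total_rows > 0.01:                  # >1% null keys
    errors.append(f"null key rate too high: {null_keys}/{total_rows}")
if duplicate_keys > 0:
    errors.append(f"{duplicate_keys} duplicate keys found")

if errors:
    # Failing loudly lets the orchestrator (e.g., Airflow) mark the run as failed.
    raise ValueError("data quality check failed: " + "; ".join(errors))

print(f"data quality check passed for {total_rows} rows")
```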
Bonus:
Exposure to open-source data tools (dbt, Delta Lake, Apache Iceberg, Amundsen, Great Expectations, etc.) and knowledge of DevOps/MLOps processes.
EDUCATIONAL QUALIFICATIONS:
• Bachelor’s or Master’s degree in Computer Science, Data Engineering, Information Systems, or related field (or equivalent experience).
• Certifications in cloud platforms (AWS, GCP, Azure) and/or data engineering (AWS Data Analytics, GCP Data Engineer, Databricks).
• Experience working in an Agile environment with exposure to CI/CD, Git, Jira, Confluence, and code review processes.
• Prior work in highly regulated or large-scale enterprise data environments (finance, healthcare, or similar) is a plus.