Experience: 5 - 10 Years
Location: Kochi
Work Mode: Hybrid
Job Type: Full-time
Mandatory Skills:
- Python, PySpark (Scala or Spark), SQL, REST APIs, orchestration, Delta Live Tables
- Delta Lake, Databricks, Data Factory, SSIS, ETL, data pipelines
- Snowflake, SnowSQL, Snowpipe, performance tuning
- Microsoft Fabric: Dataflows, pipelines, OneLake, integration with Power BI
- Infrastructure as Code (Terraform, Bicep)
- Certifications: Databricks, Snowflake, and/or Fabric certification is a plus
Qualifications
- Education: Bachelor's degree in Computer Science, Data Engineering, Information Systems, or a related field; a Master's degree is a plus
- Experience:
- 5+ years of professional experience in data engineering
- Proven experience with Databricks, Snowflake, and Microsoft Fabric
- Experience working with on-premise relational databases (SQL Server,
Oracle, DB2, or similar)
- Strong understanding of ETL/ELT processes, data warehousing, and data
lakehouse concepts
- Demonstrated ability to work with both structured and unstructured data
- Technical Skills:
- Databricks: PySpark/Scala/Spark SQL, Delta Lake, MLflow integration, notebook orchestration, Delta Live Tables
- Snowflake: SnowSQL, Snowpipe, performance tuning, security/role management, data sharing
- Microsoft Fabric: Dataflows, pipelines, OneLake, integration with Power BI
- Advanced SQL skills
- Python, Scala, Java
- Git, CI/CD for data solutions, Infrastructure as Code (Terraform, Bicep)
- Experience with metadata management and data lineage documentation
- Experience with SSIS, ADF, and other ETL/data pipeline tools
- Real-world experience using Databricks notebooks to interact with REST APIs (an illustrative sketch follows this list)
- Soft Skills:
- Strong communication and collaboration skills, with the ability to work
effectively in a team-oriented environment
- Detail-oriented and committed to maintaining high data quality and reliability
- Ability to manage multiple priorities and deliver results in a fast-paced
environment
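To illustrate the REST API item under Technical Skills, the following is a minimal Python/PySpark sketch of the kind of task the role involves: call an API from a Databricks-style notebook, then land the response in a Delta table. The endpoint URL, schema, and table name are hypothetical placeholders, not details from this posting.

    # Minimal sketch: pull JSON from a REST API and append it to a Delta table.
    # The endpoint and table names below are hypothetical placeholders.
    import requests
    from pyspark.sql import SparkSession

    # A Databricks notebook already provides `spark`; getOrCreate() keeps the
    # sketch runnable outside a notebook as well.
    spark = SparkSession.builder.getOrCreate()

    def fetch_records(url: str) -> list:
        """Call the REST endpoint and return its JSON payload (a list of dicts)."""
        resp = requests.get(url, timeout=30)
        resp.raise_for_status()  # fail fast on HTTP errors
        return resp.json()

    records = fetch_records("https://api.example.com/v1/orders")  # hypothetical endpoint
    df = spark.createDataFrame(records)

    # On Databricks, Delta is the default table format; format("delta") makes it explicit.
    df.write.format("delta").mode("append").saveAsTable("raw.orders_api")  # hypothetical table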
Preferred Qualifications
- Certifications: Databricks, Snowflake, and/or Microsoft Fabric certifications are a plus
- Additional Skills: Experience in CI/CD practices for deploying data pipelines and
ETL jobs
Key Responsibilities
- Design, develop, and optimize data pipelines and ETL jobs across cloud and on-
prem environments
- Integrate and manage data between Databricks, Snowflake, Fabric, and on-prem
databases
- Collaborate with business and analytics teams to understand data needs and
deliver high-quality, reusable datasets
- Adhere to data governance standards and ensure compliance with data security
and privacy policies in all aspects of data handling
- Support real-time, streaming, and batch data processing use cases
- Partner with architects to define scalable data lakehouse and warehouse strategies
- Ensure data quality, consistency, and performance through monitoring,
automation, and optimization
- Collaborate with cross-functional teams, including data governance, analytics, and IT, to understand data requirements and support various data initiatives
- Document processes, pipeline configurations, and data models to ensure transparency and knowledge sharing within the team
- Identify and implement process improvements for data ingestion, transformation,
and pipeline automation
- Stay updated with the latest features and best practices in Databricks, Snowflake,
and Fabric technologies to enhance platform capabilities
Skills: Python, PySpark, SQL, REST APIs, ETL, SSIS, Data Factory, data pipelines, orchestration, Snowflake, Databricks, Delta Lake, Microsoft Fabric, Power BI integration, performance tuning