Key Responsibilities
Technical Leadership & Project Execution
- Provide technical thought leadership, comparing technologies against business requirements and cost-control drivers.
- Work with Business and IT groups to design and deliver a data lake platform.
- Produce & maintain the overall solution design for the entire Data Lake Platform.
- Execute the data strategy and help design and architect solutions on the platform.
- Enforce technical best practices for Big Data management and solutions, from software selection to technical architectures and implementation processes.
- Document and publish best practices, guidelines, and training information.
- Ensure all functional solutions and components of the Data Lake platform service are designed and implemented so that they consistently meet SLAs.
- Contribute to the continuous improvement of the support & delivery functions by staying aware of technology developments and recommending enhancements to application services.
- Focus on data quality throughout the ETL & data pipelines, driving improvements to data management processes, data storage, and data security to meet the needs of the business and customers.
Required Skills and Experience
Education & Experience
- Bachelor's degree in Computer Science, Engineering, or a related field.
- 8+ years' experience in data engineering.
- 8+ years' experience building massively scalable, distributed data processing solutions.
- 8+ years' experience in database design & development.
- Experience building and optimizing complex transformations in data pipelines.
- Experience building data pipelines & ETL jobs using cloud-native technologies & design patterns.
- Experience designing resilient systems & creating disaster recovery plans.
- Experience working in Agile Scrum or Kanban teams & deploying solutions using Continuous Delivery best practices.
- Experience using automated database migration tools, with strong opinions on version control best practices for SQL scripts.
Technical Skills
- Big Data
- Apache Spark (PySpark)
- Databricks
- Snowflake
- AWS Glue
- AWS EMR
- Table formats: Delta Lake, Apache Iceberg
- File formats: Parquet
- RDBMS - PostgreSQL
- Languages
- Proficient in Python
- Proficient in SQL
- Proficient in PySpark
- Cloud Technologies and Tools
- Experience in designing cloud-based data pipelines & solutions
- AWS: EMR, AWS Glue, S3, EC2, RDS, Aurora PostgreSQL, Lambda.
Preferred/Desirable Skills
- AI
- Exposure to GenAI technologies is an added advantage.
- Experience in Agile/Scrum environments and collaboration with distributed teams across regions.
- Ability to manage and coordinate work with other team members.
Soft Skills
- Proven leadership and mentoring capabilities.
- Strong communication and collaboration skills across cross-functional teams.
- Self-driven and proactive with a high degree of ownership and accountability.
- Adaptable to fast-paced, dynamic work environments.
- Strong analytical and problem-solving mindset with a focus on delivering quality outcomes.
What will you be doing in this role?
- You will work directly with Product and Technology Development leads and managers to understand the business problem.
- You will define and implement our test platform strategy in the cloud, have a meaningful impact on our customers, and work in our high-energy, innovative, fast-paced Agile culture.
- You will be a key contributor to architectural decisions and application workflows.
- You will also create technical documentation for the broader team.
- You will gather feedback from customers effectively and translate it into clear technical requirements for their solutions.