Role Purpose    
    The purpose of this role is to design, test, and maintain software programs for operating systems or applications to be deployed at the client end, and to ensure they meet 100% of quality assurance parameters.
 
     
Responsibilities:
- Design and implement data modeling, data ingestion, and data processing for various datasets.
- Design, develop, and maintain the ETL framework for new data sources.
- Develop data ingestion using AWS Glue/EMR and data pipelines using PySpark, Python, and Databricks (see the sketch after this list).
- Build orchestration workflows using Airflow and Databricks job workflows.
- Develop and execute ad hoc data ingestion to support business analytics.
- Proactively interact with vendors on open questions and report status accordingly.
- Explore and evaluate tools and services to support business requirements.
- Help create a data-driven culture and impactful data strategies.
- Aptitude for learning new technologies and solving complex problems.
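For illustration only, a minimal sketch of the kind of PySpark ingestion pipeline described above. Bucket names, paths, and columns are hypothetical placeholders and not part of the role definition:

    # Minimal PySpark ingestion sketch: read raw CSV from S3, apply a light
    # transformation, and write partitioned Parquet to a curated zone.
    # All paths and column names below are hypothetical examples.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("sample_ingestion").getOrCreate()

    raw_df = (
        spark.read
        .option("header", "true")
        .option("inferSchema", "true")
        .csv("s3://example-raw-bucket/sales/2024/")    # hypothetical source path
    )

    clean_df = (
        raw_df
        .withColumn("ingest_date", F.current_date())   # audit column for lineage
        .dropDuplicates(["order_id"])                   # assumes an order_id key exists
    )

    (
        clean_df.write
        .mode("overwrite")
        .partitionBy("ingest_date")
        .parquet("s3://example-curated-bucket/sales/")  # hypothetical target path
    )

In practice, a job like this would be packaged as an AWS Glue or EMR job (or a Databricks task) and scheduled from an orchestrator such as Airflow.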
Qualifications:
- Minimum of a bachelor's degree, preferably in Computer Science, Information Systems, or Information Technology.
- Minimum 5 years of experience on cloud platforms such as AWS, Azure, or GCP.
- Minimum 5 years of experience with Amazon Web Services, including VPC, S3, EC2, Redshift, RDS, EMR, Athena, IAM, Glue, DMS, Data Pipeline & API, Lambda, etc.
- Minimum 5 years of experience in ETL and data engineering using Python, AWS Glue, AWS EMR/PySpark, and Airflow for orchestration (see the sketch after this list).
- Minimum 2 years of experience in Databricks, including Unity Catalog, data engineering job workflow orchestration, and dashboard generation based on business requirements.
- Minimum 5 years of experience in SQL, Python, and source control such as Bitbucket, with CI/CD for code deployment.
- Experience in PostgreSQL, SQL Server, MySQL, and Oracle databases.
- Experience in MPP systems such as AWS Redshift, AWS EMR, and Databricks SQL warehouses and compute clusters.
- Experience in distributed programming with Python, Unix scripting, MPP, and RDBMS databases for data integration.
- Experience building distributed high-performance systems using Spark/PySpark and AWS Glue, and developing applications for loading/streaming data into Databricks SQL warehouse and Redshift.
- Experience in Agile methodology.
- Proven ability to write technical specifications for data extraction and good-quality code.
- Experience with big data processing techniques using Sqoop, Spark, and Hive is an additional plus.
- Experience in data visualization tools, including Power BI and Tableau.
- Nice to have: experience in UI development using the Python Flask framework and Angular.
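As an illustrative sketch of the Airflow-to-Databricks orchestration referenced above (the connection id, job id, and schedule are assumed placeholders, not specifics of this role):

    # Minimal Airflow DAG sketch: trigger an existing Databricks job workflow
    # on a daily schedule via the Databricks provider. The connection id and
    # job id below are hypothetical.
    from datetime import datetime

    from airflow import DAG
    from airflow.providers.databricks.operators.databricks import DatabricksRunNowOperator

    with DAG(
        dag_id="daily_databricks_ingestion",
        start_date=datetime(2024, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        run_ingestion_job = DatabricksRunNowOperator(
            task_id="run_ingestion_job",
            databricks_conn_id="databricks_default",  # assumed Airflow connection
            job_id=12345,                             # hypothetical Databricks job id
        )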
Mandatory Skills: Python for Insights
Experience: 5-8 years.