Job Description
1. ETL: Hands-on experience building data pipelines. Proficiency in two or more data integration platforms such as Ab Initio, Apache Spark, Talend and Informatica.
2. Big Data: Experience with big data platforms such as Hadoop, Hive or Snowflake for data storage and processing.
3. Data Warehousing & Database Management: Understanding of data warehousing concepts and of relational (Oracle, MSSQL, MySQL) and NoSQL (MongoDB, DynamoDB) database design.
4. Data Modeling & Design: Good exposure to data modeling techniques; design, optimization and maintenance of data models and data structures.
5. Languages: Proficiency in one or more programming languages commonly used in data engineering, such as Python, Java or Scala.
6. DevOps: Exposure to DevOps concepts and enablers - CI/CD platforms, version control, and automated quality control management.

Ab Initio: Experience developing Co>Op graphs and the ability to tune them for performance. Demonstrable knowledge across the full suite of Ab Initio toolsets, e.g. GDE, Express>It, Data Profiler, Conduct>It, Control>Center and Continuous>Flows.
Cloud: Good exposure to public cloud data platforms such as S3, Snowflake, Redshift, Databricks, BigQuery, etc. Demonstrable understanding of the underlying architectures and trade-offs.
Data Quality & Controls: Exposure to data validation, cleansing, enrichment and data controls.
Containerization: Fair understanding of containerization platforms such as Docker and Kubernetes.
File Formats: Exposure to working with event/file/table formats such as Avro, Parquet, Protobuf, Iceberg and Delta.
Others: Basics of a job scheduler such as Autosys. Basics of entitlement management.

Certification on any of the above topics would be an advantage.