Technical Proficiency & Cognitive Skills:
- Hands-on experience as a Site Reliability Engineer applying SRE practices and principles, including implementing and managing SLOs, error budgets, observability, incident response, and automation in high-availability environments.
- Proven track record in configuring and supporting F5 infrastructure, including advanced ADC configurations such as TLS termination and caching.
- In-depth knowledge and practical experience with Cloudflare's DDoS protection and Web Application Firewall (WAF) capabilities.
- Strong understanding and hands-on experience in setting up and managing PKI systems, including CloudHSM-backed PKI lifecycle management.
- Experience in supporting middleware environments for various platforms such as Java, .NET, Node.js, and Angular.
- Solid understanding of database concepts and their application in modern infrastructure.
- Proficiency in modern DevOps practices, including continuous integration and continuous deployment (CI/CD) processes.
- Experience with infrastructure as code using tools like Terraform and Ansible for automating secure deployments.
- Strong knowledge of Azure Active Directory (AD) for authentication and authorization.
- Expertise in configuring and managing load balancers (F5, NGINX, ALB, App GW) and monitoring tools such as Splunk.
- Experience with AWS services including EC2, VPC, CloudFront, S3, Route53, RDS, Lambda, and more.
- Experience with Azure services such as Virtual Machines, Storage, App Service, Azure Functions, Azure SQL, PostgreSQL, AKS, and more.
- Client Understanding and Advising: Advocates for client needs and perspectives.
- Learning Orientation: Keeps up with new SRE, cloud, middleware, Application Delivery Controller (ADC), and automation trends.
- Foundation Architecture Knowledge: Supports standards for hybrid cloud and on-prem load balancing and PKI infrastructure.
- Strategic Technology Planning: Contributes to technology roadmaps, especially for SRE and cloud (platform as a product).
- Modernize and Innovate
- Deliver Results for Clients
- Decision Making: Makes informed decisions, especially in incident response.
Roles & Responsibilities:
Independent contributor IT professional providing advanced expertise to ensure the effective performance of one or more elements of the organization's technical infrastructure.
Maintain and modernize the existing load balancing environment, ensuring high availability and optimal performance.
Support the bank's enterprise load balancer infrastructure and its associated modules (F5, Cloudflare, cloud-native load balancing services).
Set up and configure Cloudflare services, including DNS, CDN, and security features such as DDoS protection and WAF.
Implement and maintain WAF rules and page rules.
Monitor website performance and security using Cloudflare analytics and logs.
Optimize caching strategies and content delivery to improve load times and user experience.
Oversee the lifecycle management of SSL certificates issued by both external and internal certificate authorities (CAs).
Manage the internal PKI and its associated private keys through CloudHSM, and identify enhancements and opportunities to modernize the PKI infrastructure.
Develop new and support existing applications that support services provided by the Platform Engineering team.
Plan, install, configure, and maintain Azure and AWS services, including but not limited to ALB, App GW, and App Proxy.
Automate repetitive manual tasks and make them available as self-service catalog items.
Build tools to reduce errors and improve customer experience.
Review work done by junior team members and provide technical support and mentorship.
Embrace Site Reliability Engineering (SRE) practices to enhance resilience and operational efficiency.
Good knowledge of cloud technologies is essential for this role, to help support the cloud migration roadmap.
Follow best practices by enforcing standards across various technologies.
When given an objective to improve performance in an area of technology, develop and implement the action plans needed to effect the change.
Provide technical support and mentorship to team members.
Support the adoption of cloud-native middleware services.
- Bachelor's or Master's degree with at least 5 years of relevant experience.
- Experience applying Site Reliability Engineering practices at work. An SRE certification is mandatory.
- Experience working in Agile environments. A SAFe Agile certification is mandatory.
- Strong experience configuring and supporting load balancing and PKI infrastructure.
- Good understanding of multiple middleware technologies and of hosting custom and COTS products.
- Experience with Azure DevOps (as both developer and administrator).
- Solid knowledge of modern DevOps practices, including CI/CD, Git, and Docker.
- Experience with Infrastructure as Code tools (Terraform, Chef, etc.).
- Demonstrated experience working in Agile environments
- Hands-on experience with AWS and Azure cloud services. A cloud certification in Azure or AWS is an added advantage.