Do you like collaborating across teams to solve complex problems?
Do you have a passion for cutting-edge technologies and tackling system problems?
Join our highly skilled Site Reliability team!
Our team designs, develops, and manages applications and infrastructure that support Akamai's Compute products and services. We create solutions that manage our Compute platform, focusing on cloud interfaces: Compute Portals and APIs. We do this while maintaining Akamai's mission to make life better for billions of people, billions of times a day.
Partner with the best
In this role, you'll ensure the operation and uptime of our Compute services and infrastructure. You'll supervise and maintain our critical infrastructure. You'll collaborate with cross-functional teams to create tooling and software that monitors and improves the reliability of our systems. You'll work with various technologies as we release brand-new applications and modernize our existing tooling.
As a Senior II Site Reliability Engineer, you will be responsible for:
- Providing technical leadership, mentorship, and support to SRE and project teams, fostering collaboration and motivation.
- Defining requirements during the product lifecycle to influence design, standards, and operational readiness.
- Partnering with engineering, operations, and support teams to ensure availability, reliability, scalability, and usability of platforms.
- Developing and enhancing automation tools to streamline daily operations, reduce manual effort (toil), and improve performance.
- Troubleshooting and resolving complex system issues through proactive investigation, automation, and systems programming.
- Managing and improving Compute identity & access management platforms to accelerate issue detection and remediation.
- Participating in on-call rotations, leading incident resolution, and contributing to robust, stable code delivery alongside other teams.
Do what you love
To be successful in this role you will:
- Have a Bachelor's degree in Computer Science or equivalent, with relevant hands-on experience in infrastructure and software architecture at scale.
- Be experienced in infrastructure automation tools like SaltStack, Terraform, and Ansible, and CI/CD tools such as Jenkins or CloudBees.
- Have expertise in Linux administration, Docker-based environments, and Kubernetes, and be skilled in optimizing performance using tools like Redis.
- Be familiar with observability tools such as Prometheus, Grafana, Loki, Sentry, and New Relic, and with web proxies such as Nginx, Envoy, and HAProxy.
- Have an understanding of SLOs and system reliability principles.
Work in a way that works for you
Learn what makes Akamai a great place to work
Connect with us on social and see what life at Akamai is like!
We power and protect life online, by solving the toughest challenges, together.
At Akamai, we're curious, innovative, collaborative and tenacious. We celebrate diversity of thought and we hold an unwavering belief that we can make a meaningful difference. Our teams use their global perspectives to put customers at the forefront of everything they do, so if you are people-centric, you'll thrive here.
Working for you
At Akamai, we will provide you with opportunities to grow, flourish, and achieve great things. We provide benefits surrounding all aspects of your life:
- Your health
- Your finances
- Your family
- Your time at work
- Your time pursuing other endeavours
Our benefit plan options are designed to meet your individual needs and budget, both today and in the future.
About us
Join us
Are you seeking an opportunity to make a real difference in a company with a global reach and exciting services and clients? Come join us and grow with a team of people who will energize and inspire you!
#LI-Remote