We are seeking a self-motivated and enthusiastic DevOps Engineer to join Trimble Connect's Site Reliability Engineering team, which is responsible for provisioning and operating our core services in the public cloud.
Key Responsibilities
-
Articulate technical characteristics of services and technology areas and guide development teams to engineer and add capabilities to internal tools.
-
Quickly grasp and analyze new or new-to-you systems that are complex and rapidly changing.
-
Investigate, evaluate, and prototype service offerings and tools from public cloud infrastructure providers.
-
Perform root cause analysis for production issues
-
Identify problems and opportunities for improvements that are common across many teams and services.
-
Develop automation and monitoring solutions
-
Utilize best practices in cloud security and operations
-
Optimize applications for maximum speed and scalability
-
Collaborate with other team members and stakeholders
-
Evaluate new tools, technologies, and processes to improve speed, efficiency, and scalability of continuous integration environments
-
Fix compliance issues and address requirements raised by SecOps tools
-
Optimize cost across cloud services, logging, and monitoring tools
-
Foster collaboration with software product development, architecture, and engineering teams to ensure releases are delivered with repeatable and auditable processes
-
Define capacity planning, design cost controls, and roll out the cost optimization strategy
-
Learn and be passionate about cloud computing
Required Skills and Experience
-
3+ years of hands-on experience demonstrating deep knowledge of AWS, monitoring, troubleshooting, and related DevOps technologies.
-
Strong experience in core AWS compute services, including EC2, ECS, and Lambda.
-
Strong expertise in AWS Networking, including VPCs, Subnets, Security Groups, and NACLs.
-
In-depth knowledge of AWS disaster recovery solutions, including the data layer.
-
Proficiency in infrastructure automation using Terraform and AWS CloudFormation.
-
Experience with CI/CD pipelines using Jenkins and GitHub Actions.
-
Skilled in creating golden images for instances using Packer or similar tools.
-
Experience with Cloud Orchestration frameworks and providing SRE support for these systems.
-
Proven ability to build and automate patching for both Linux and Windows OS images.
-
Deep understanding of Linux/Unix operating systems.
-
Familiarity and experience with architectural design principles on AWS.
-
Hands-on experience with serverless technologies and architectures.
-
Experience with scalability, security, and performance engineering for web services on AWS.
-
A collaborative team player open to feedback and new ideas.
-
Ability to support and troubleshoot scalability, high availability, performance, monitoring, and backup and restore operations across various environments.
-
Proven ability to work independently across multiple platforms and applications to understand and manage dependencies.
-
Strong experience with scripting and process automation using Shell and Python.
Desirable Skills and Experience
-
Experience with monitoring tools like Datadog, ELK, AWS CloudWatch, InfluxDB, Grafana, and Prometheus
-
Experience with Atlassian tools such as Bitbucket, Jira, and Confluence
-
Experience with containers
-
Experience with serverless application models
-
Experience with microservices
-
Experience with NoSQL databases
-
Experience with enterprise messaging
-
Azure knowledge is an added advantage.