Posted: 1 week ago | Remote | Contractual
Seeking an Infrastructure as Code engineer to contribute to a large-scale production AWS/Kubernetes platform supporting AI/ML workloads across multiple environments (dev, int, prod) and 9+ AWS regions.
**3-4 years** in DevOps/Platform/Infrastructure Engineering roles
### Core Technologies (2-3 years minimum)
- **TypeScript/Python**
  - Primary language for all IaC
  - Experience with the Node.js ecosystem and npm, and with Python packaging
- **Containerization**
  - Multi-stage Docker builds, container registry management
  - Understanding of key container registry concepts such as layer caching and pull-through cache
- **Pulumi** or **Terraform** (or similar IaC tool) - 2+ years production experience
  - Understanding of declarative infrastructure
  - Resource dependencies and state management
- **AWS** - 2+ years production experience
  - Core services: EKS, VPC, IAM, RDS, SQS, S3, SSM Parameter Store
  - Preference for experience with accelerated compute management (NVIDIA, AMD, Inferentia)
- **Kubernetes** - 2+ years production experience
  - AWS EKS experience preferred
  - Knowledge of networking controllers such as Cilium, Istio, Envoy Gateway
  - Experience with cluster autoscaling technologies, preference for Karpenter
  - Basic kubectl proficiency
### Required Knowledge Areas
- **CI/CD** - Experience with automated deployment pipelines
  - GitHub Actions (preferred) or similar
  - Understanding of deployment strategies (rolling, blue/green)
  - Experience with multi-environment promotion workflows
- **Observability** - Experience with common platform observability tooling
  - Log aggregation, preference for Loki
  - Time-series operational metrics, preference for Prometheus
  - OpenTelemetry-based tracing, preference for Tempo
  - Grafana dashboard creation
- **Git/Version Control** - Proficient
  - Branching strategies, pull requests, code review
  - Collaborative development workflows
- **Linux/Bash**
  - Comfortable with command line and scripting
  - Experience working in and customizing a terminal environment
- **Security Best Practices**
  - Secrets management (not storing credentials in code)
  - Principle of least privilege
  - Understanding of encryption at rest and in transit
### Soft Skills
- **Self-directed learner** - Can research and understand existing patterns with little direction
- **Communication** - Can document work and explain technical decisions clearly in writing, as much of the communication will be async
- **Collaborative** - Experience with pull request code review processes
Tekgence Inc
Salary: Not disclosed