Description
Job Title : DevOps Engineer
Experience : 5+ years
Employment type : Part Time
Timings : 3 to 4 hours daily, starting from 7 pm
Key Responsibilities
Cloud Infrastructure Management (Azure & AWS)
- Manage and scale cloud-based environments, ensuring high availability, fault tolerance, and security.
- Implement and maintain multi-tenant architectures in cloud platforms (Azure and AWS), ensuring resource isolation and efficient cost management.
- Configure and optimize resources across both Azure and AWS platforms, leveraging best practices for cloud security, networking, and performance.
CI/CD Pipeline Maintenance
- Design, implement, and optimize continuous integration and continuous deployment (CI/CD) pipelines across development, staging, and production environments.
- Ensure high availability and minimal downtime for production releases by automating rollbacks, canary deployments, and blue-green deployments.
- Integrate monitoring and alerting into the pipelines to catch issues before they reach production.
AI/ML Ops Integration
- Work closely with the AI/ML teams to deploy and monitor machine learning models in production environments, ensuring smooth integration with CI/CD pipelines.
- Implement and maintain automated workflows for ML model training, testing, and deployment.
- Use tools like Azure ML, SageMaker, or other relevant services to enable ML lifecycle management.
Test-Driven Development (TDD) Integration
- Promote and implement Test-Driven Development (TDD) practices within CI/CD pipelines to improve software quality and reduce defects in production.
- Maintain code quality by integrating automated unit and integration tests into the pipelines and ensuring sufficient test coverage.
Automation & Scripting
- Develop automation scripts and tools to streamline repetitive tasks across various environments.
- Implement Infrastructure as Code (IaC) using tools like Terraform, CloudFormation, or ARM Templates.
- Create and manage cloud-native services such as Kubernetes (AKS, EKS), containerized applications, and serverless architectures.
Monitoring & Performance Tuning
- Set up and manage monitoring systems (e.g., Azure Monitor, CloudWatch, Prometheus, Grafana) for tracking system performance, cost, and security metrics.
- Implement logging and alerting systems to ensure rapid detection of issues in production.
- Continuously improve infrastructure performance by analyzing bottlenecks and identifying resource optimization opportunities.
Collaboration & Mentoring
- Collaborate with software engineers to ensure smooth integration of new code into the pipelines, supporting Agile development cycles.
- Provide guidance and mentorship to junior DevOps engineers and developers on best practices for cloud infrastructure, CI/CD, and automation.
Security & Compliance
- Ensure cloud infrastructure complies with industry standards for security, privacy, and governance.
- Implement security best practices, including identity and access management (IAM), encryption, and network security.
Essential Skills & Qualifications
Cloud Experience
- Extensive experience with Azure and AWS cloud platforms, with a strong understanding of their respective services (e.g., Azure Kubernetes Service, AWS EC2, S3, Lambda, CloudFormation, and Azure Resource Manager).
- Experience with multi-tenant architecture at scale on cloud platforms, including designing isolated environments and managing cross-tenant resources.
CI/CD & Automation
- Strong experience with CI/CD tools such as Jenkins, GitLab CI, Azure DevOps, or AWS CodePipeline.
- Familiarity with Infrastructure as Code (IaC) tools such as Terraform, CloudFormation, or Azure ARM templates.
- Proficiency in containerization and orchestration technologies such as Docker and Kubernetes, including managed platforms (AKS, EKS).
AI/ML Ops Knowledge
- Familiarity with AI/ML Ops practices and tools (e.g., Azure Machine Learning, AWS SageMaker, MLflow, Kubeflow).
- Experience automating the deployment and monitoring of machine learning models in production environments.
Programming & Scripting
- Strong programming skills in languages such as Python, Go, or Ruby.
- Experience with scripting in Bash, PowerShell, or Python for automating tasks.
Test-Driven Development (TDD)
- Proven experience integrating Test-Driven Development (TDD) into CI/CD pipelines.
- Familiarity with testing frameworks like JUnit, pytest, Mocha, or equivalent for unit, integration, and functional tests.
Monitoring & Logging
- Experience with monitoring tools such as Prometheus, Grafana, CloudWatch, Azure Monitor, and Datadog.
- Familiarity with logging frameworks such as ELK Stack, Fluentd, or Splunk.
Security Best Practices
- Experience with cloud security practices, including IAM, security groups, VPNs, and encryption standards.
(ref:hirist.tech)