Posted: 3 months ago
Hybrid
Full Time
Proven experience in an SRE, DevOps, or infrastructure engineering role with a focus on monitoring, automation, and orchestration.
Strong knowledge of the networking and security domains, with the ability to critically analyse network designs and propose innovative improvements to enhance performance, reliability, stability, and security.
Expertise in monitoring tools (Prometheus, ELK), with the ability to optimize monitoring systems and integrate ML/AI models to improve visibility, anomaly detection, and proactive issue resolution.
Extensive hands-on experience with automation tools such as Terraform, Ansible, and Jenkins, along with proficiency in CI/CD pipelines, to streamline and optimize network operations and workflows.
Proficiency in scripting languages (Bash, Python, Go).
Proficiency with containerization and orchestration (Docker, Kubernetes).
Understanding of cloud platforms such as AWS, Azure, or Google Cloud.
Familiarity with microservices architecture and distributed systems.
Work closely with developers, QA, and operations teams to foster a DevOps culture focused on security, reliability, and automation.
Monitoring & Alerting:
• Design, implement, and manage comprehensive monitoring solutions using tools such as Prometheus, Grafana, and the ELK stack.
• Develop and maintain alerting systems that proactively provide insights into system health and performance.
• Integrate ML/Gen AI models for anomaly detection, trend analysis, and proactive alerts to enhance observability.
• Identify and implement innovative features to improve visibility into system performance and reliability.
• Define and track SLIs, SLOs, and SLAs for critical services and ensure continuous compliance.
Automation & Infrastructure Management:
• Automate infrastructure provisioning and management using tools such as Ansible or Terraform to eliminate manual intervention.
• Build and maintain CI/CD pipelines (GitLab CI) to streamline deployments and ensure system consistency.
• Implement automated testing and validation processes for infrastructure and applications.
Orchestration & Infrastructure as Code:
• Leverage containerization and orchestration technologies (Docker, Kubernetes) to manage scalable, resilient, and fault-tolerant services.
• Use Infrastructure as Code (IaC) to automate and standardize environment provisioning and configuration management.
Networking & Security:
• Review network designs and propose enhancements using emerging technologies and industry best practices for efficiency and innovation.
• Ensure the security and compliance of infrastructure by implementing best practices in network security, including encryption, firewall management, access controls, and intrusion detection.
• Perform regular security audits and vulnerability assessments to identify and mitigate risks.
• Monitor network traffic and optimize performance through network tuning and troubleshooting.
NMS Consultant
Pune, Trivandrum
10.0 - 20.0 Lacs P.A.