Reliability Architect

Growel Softech Pvt Ltd

10 years

18 - 35 Lacs

Mulshi, Maharashtra, India

Posted: 9 hours ago

Skills Required

reliability, optimization, mentoring, support, collaboration, monitoring, automation, software, automate, analysis, Splunk, analyze, troubleshooting, visualize, metrics, reports, scheduling, reporting, logging, AI, ML, integration, learning, development, service, testing, planning, design, scalability, debugging, management, Chef, Ansible, Terraform, GitLab, documentation, consistency

Work Mode

On-site

Job Type

Full Time

Job Description

Work Location: All pan-India locations except Mumbai

Reliability Architect with over 10 years of experience in proactive monitoring, automation, and observability. Skilled in AIOps/MLOps, infrastructure management, and performance optimization using modern tools and practices. Adept at leading incident response, mentoring support teams, and driving cross-functional collaboration to ensure system reliability and scalability.

Key Responsibilities

Monitoring and Automation: Proactively monitor software systems to prevent incidents and automate routine operational tasks.
Effective Monitoring: Design monitoring systems that trigger alerts based on symptoms rather than outages, ensuring early detection and resolution.
Application Performance Monitoring (APM): Implement and manage APM tools like New Relic or Dynatrace to track application performance, identify bottlenecks, and optimize resource usage.
Log Analysis with Splunk: Use Splunk to analyze logs for troubleshooting, anomaly detection, and improving system reliability.
Dashboards Preparation: Build intuitive dashboards to visualize system health, performance metrics, and operational KPIs.
Alerts Setup: Configure intelligent alerts based on thresholds and anomalies to ensure timely incident response.
Reports Scheduling: Automate regular reporting to provide insights into system performance, reliability, and trends.
Reliability Metrics: Define and track metrics such as SLOs, SLIs, and error budgets to measure and maintain system reliability.
Observability Skills: Apply observability practices including distributed tracing, logging, and metrics collection to gain deep insights into system behavior.
AI-Driven Monitoring & Automation: Utilize AIOps techniques to proactively detect anomalies, automate incident response, and enable self-healing systems through intelligent alerting and predictive analytics.
Observability & ML Integration: Integrate machine learning models with observability tools to enhance system insights, optimize performance, and ensure reliability of AI-powered services in production.
Cross-Team Collaboration: Work closely with development and support teams to enhance service reliability through rigorous testing and release procedures.
Capacity Planning: Participate in system design reviews and capacity planning to ensure scalability and performance.
Debugging and Incident Response: Lead incident response efforts, analyze debugging information, and manage rollbacks of faulty software deployments.
Mentoring Support Teams: Guide and mentor L1/L2 support teams to establish best practices in monitoring and observability.
Infrastructure Management: Manage infrastructure using tools like Chef, Ansible, Terraform, GitLab CI/CD, and Kubernetes.
Documentation: Maintain comprehensive documentation of processes and procedures to ensure operational consistency and reduce redundancy.
Proactive Mindset: Approach challenges with enthusiasm, ownership, and a continuous improvement mindset.
