Posted: 16 hours ago
On-site
Full Time
At Times Internet, we create premium digital products that simplify and enhance the lives of
millions. As India’s largest digital products company, we have a significant presence across a
wide range of categories, including News, Sports, Fintech, and Enterprise solutions.
Our portfolio features market-leading and iconic brands such as TOI, ET, NBT, Cricbuzz, Times
Prime, Times Card, Indiatimes, Whatshot, Abound, Willow TV, Techgig, and Times Mobile,
among many more. Each of these products is crafted to enrich your experiences and bring you
closer to your interests and aspirations.
As an equal opportunity employer, Times Internet strongly promotes inclusivity and diversity. We
are proud to have achieved overall gender pay parity in 2018, verified by an independent audit
conducted by Aon Hewitt.
We are driven by the excitement of new possibilities and are committed to bringing innovative
products, ideas, and technologies to help people make the most of every day. Join us and take
us to the next level!
We are looking for a Site Reliability Engineer (SRE) to join our News Team. The SRE will be
responsible for maintaining the reliability, scalability, and performance of our critical
infrastructure, ensuring high availability for our services.
In this role, you will play a key part in migration activities, including Kubernetes cluster
upgrades and application re-platforming. A significant part of the work involves migrating
applications to Kubernetes while ensuring seamless deployment, high availability, and
minimal downtime.
You will also configure and maintain Elasticsearch and Kafka
clusters, ensuring optimal performance, availability, and reliability. You will work on tuning
Elasticsearch for efficient search and indexing, managing Kafka for real-time data streaming,
and troubleshooting any issues related to these services.
You will work on automating operational tasks, optimizing infrastructure, and proactively
resolving issues to maintain system reliability. Additionally, you will collaborate with
development, DevOps, and infrastructure teams to implement best practices for security,
observability, and scalability. Your expertise will be crucial in improving deployment pipelines,
incident response, and overall system performance.
● Ensure uptime of IT services and infrastructure.
● Implement monitoring, alerting, and incident response processes.
● Automate repetitive ops tasks (deployments, scaling, failover).
● Respond to outages and production incidents (on-call duties).
● Perform root cause analysis (RCA) and drive postmortems.
● Measure and optimize system performance (latency, throughput, resource usage).
● Support reliable and safe code releases.
● Ensure systems are patched, hardened, and compliant with standards.
● Collaborate with technology teams on new requirements and deliver them.
● 8+ years of experience in Site Reliability Engineering, or a related role.
● Proficiency in Kubernetes, Docker, and container orchestration.
● Experience with CI/CD tools.
● Strong knowledge of Linux systems and scripting (Bash, Python).
● Familiarity with configuration management tools like Ansible and Helm.
● Experience with monitoring and logging tools (ELK Stack or New Relic).
● Strong troubleshooting skills and incident management experience.
● Experience with Elasticsearch and Kafka.
● Knowledge of networking concepts, load balancers, and DNS.
● Experience in performance tuning and optimization.
● Systems & OS Knowledge
● Linux/Unix administration (process management, system tuning, networking)
● Understanding of filesystems, memory, CPU, kernel basics (CentOS / Ubuntu)
● Scripting for automation: Bash, Python
● Knowledge of cloud platforms: AWS, GCP, Azure
● Networking and Protocols
● TCP/IP, DNS, HTTP/HTTPS, CDN concepts
● Debugging latency, connectivity, and routing issues
● CI/CD and DevOps Practices
● Jenkins, GitHub Actions, GitLab CI, Bitbucket, Git
● Working knowledge of Apache, Tomcat, Nginx
● Knowledge of DNS, Load Balancer, WAF, Firewall.
● Working knowledge of monitoring tools and ELK
● Knowledge of hypervisors such as VMware.
● Strong knowledge of virtualization technologies, Docker, and Kubernetes
● Knowledge of Database concepts
● Bachelor’s degree in Electronics and Telecommunication, Computer Science, Information
Technology, or a related field.
● 8+ years of experience in Site Reliability Engineering