Inviting applications for the role of Site Reliability - Principal Site Reliability Engineer

Principal Consultant-Site Reliability Engineer-Hyderabad
Basic Information
Ref # ITO066255
Location Hyderabad
Date Published: Friday, May 12, 2023
Band 4C
Points 600
Description

With a start-up spirit and 90,000+ curious and courageous minds, we have the expertise to go deep with the world’s biggest brands—and we have fun doing it! We dream in digital, dare in reality, and reinvent the ways companies work to make an impact far bigger than just our bottom line. We’re harnessing the power of technology and humanity to create meaningful transformation that moves us forward in our pursuit of a world that works better for people. 

Now, we’re calling upon the thinkers and doers, those with a natural curiosity and a hunger to keep learning, keep growing. People who thrive on fearlessly experimenting, seizing opportunities, and pushing boundaries to turn our vision into reality. And as you help us create a better world, we will help you build your own intellectual firepower.

Welcome to the relentless pursuit of better.

As a Principal Site Reliability Engineer, you will lead the quality assurance department in any industry. You will keep the team of QA engineers on track throughout the project, resolve conflicts across the team, review schedules and plans, mitigate risks, check quality at each phase, update management, and build a challenging and motivating environment.

Responsibilities

Developing and maintaining monitoring and alerting systems to quickly detect and respond to problems.
Maintaining the incident management process, tooling, and automation (runbooks, dashboards, alerting, engagement, etc.).
Automating routine operations tasks to reduce manual intervention and improve efficiency.
Participating in on-call rotations to respond to incidents and ensure system availability.
Developing and implementing performance testing and capacity planning strategies to ensure systems can handle expected loads.
The ideal candidate for an SRE role should have a strong background in software engineering and operations, with a deep understanding of distributed systems, networking, and cloud technologies. They should also have excellent communication skills, as they will often be required to collaborate across teams to identify and address issues.

Qualifications/Minimum qualifications

The ability to break down complex problems into solvable components.
Experience with cloud platforms: Amazon Web Services, Microsoft Azure, GCP.
Experience with infrastructure-as-code tools: Terraform or CloudFormation.
Experience as an SRE, Software Engineer, or Production Engineer.
Experience writing code with one or more programming languages: Python, Java, C, C++, Go, JavaScript, or Ruby.
Experience with log aggregation solutions: Splunk, ELK, Sumo Logic.
Experience with metrics monitoring platforms: SignalFX, Datadog, Dynatrace, AppDynamics or other enterprise APM.
Strong desire to learn and grow.
Strong interest in SRE topics like SLIs, SLOs, resilience, scaling, and performance.

Skills Required:

VMware ESX/ESXi – up to v7.x
VMware Cloud
SAN/NAS Storage (administration, VMware touch points, FC/iSCSI)
AWS native load balancers
AWS EC2, ECS, Containers
Terraform
Splunk
CloudWatch
CI/CD
Jenkins
Automation utilizing Python
Incident management

Preferred Qualifications/Skills

Experience supporting large scale distributed systems.
Experience with infrastructure configuration and automation tools: Terraform, Puppet, Ansible.
Good working knowledge of build automation and continuous integration/delivery ecosystem: Git, Gerrit, Maven/Gradle, Jenkins, Docker, Nexus, Artifactory, Selenium.
Experience with security in the cloud: Intrusion, penetration, and vulnerability scanning.
