Site Reliability Engineer
- For Career Direction
- Work From Home
- 2 to 5 Years
Key Responsibilities
- Drive scalability, reliability, and operational excellence across critical systems and infrastructure
- Define and implement SRE best practices, including SLOs, SLIs, SLAs, and resilience engineering
- Lead incident management, root-cause analysis, and post-mortems to continuously improve system reliability
- Design and maintain high availability (HA) and disaster recovery (DR) strategies
- Build and optimize incident response playbooks to reduce MTTR
- Implement and enhance monitoring, logging, and observability using modern tooling
- Perform capacity planning, load testing, and performance tuning
- Collaborate with engineering, DevOps, and security teams to embed reliability into the SDLC
- Mentor junior engineers and promote a culture of reliability and automation
- Advocate for and support chaos engineering, game days, and resilience testing
Must-Have Skills
- Strong hands-on experience with cloud platforms (AWS, GCP, or Azure)
- Expertise in Kubernetes and container orchestration
- Solid experience with monitoring, logging, and observability tools (Prometheus, Grafana, ELK, OpenTelemetry)
- Strong scripting and automation skills using Python, Go, or Bash
- Deep understanding of networking concepts (DNS, CDN, load balancing, security)
- Hands-on experience with incident management, HA, and DR strategies
- Experience implementing reliability metrics (SLOs, SLIs, SLAs)
- Good communication and collaboration skills for cross-functional work
Good to Have
- Experience with chaos engineering and resilience testing
- Exposure to AIOps or ML-based anomaly detection
- Familiarity with security best practices and compliance (SOC 2, ISO 27001)
- Experience working in fast-paced, agile environments
- SRE-related certifications (Google SRE, Datadog, etc.)
- Experience supporting multi-timezone production systems
- Experience building or scaling B2B products from scratch
- Experience leading or contributing to SRE transformations at scale
About the job
Opportunities here come in two forms: roles within Career Direction and roles that fill our clients' requirements. Most internal positions are Work From Home, providing flexibility, comfort, and a balanced work environment. Client-based roles may require Hybrid or Work From Office setups, depending on project demands, collaboration needs, and company policies. This structure lets candidates explore a wide range of opportunities and apply for the roles that best align with their skills, preferences, and long-term career goals.
Career Direction benefits apply only to Career Direction employees and may include health and wellness support, financial wellbeing programs, flexibility and time off, family care assistance, community involvement opportunities, and personal development support. These benefits, along with growth opportunities, are provided based on factors such as performance, dedication, skill level, assessment results, and available projects. Offerings may vary as they depend on organizational policies, project requirements, and client expectations.
Note: Salary range information is omitted because it applies to US-based roles. Compensation details will be discussed during the interview.