Site Reliability Engineer (US – Remote)

US - Remote

Sysdig is the secure DevOps company, and we’re at the forefront of the container, Kubernetes, and cloud revolution. We are passionate, technical problem-solvers, continually innovating and delivering powerful solutions to confidently run cloud-native applications. Our consistent contributions to open source software projects reflect our commitment to the open cloud movement.

We value diversity and open dialog to spur ideas, working closely together to achieve goals. And we're a great place to work too — we were awarded the 2021 Bay Area Best Places to Work Award from San Francisco Business Times and the Silicon Valley Business Journal. We are looking for team members who share our commitment to customers and are willing to dig deeper, understand problems and deliver innovative solutions. Does this sound like the right place for you?

Your Opportunity

As a Site Reliability Engineer, you’ll be responsible for the availability, performance, and resilience of the Sysdig platform in our largest on-premises customer environments. You will collaborate with high-performing infrastructure and engineering teams, both within Sysdig and in customer organizations, to help drive the scalability and stability of our platform.

Your Responsibilities
  • Participate in a globally distributed team of Site Reliability Engineers, supporting multiple Sysdig applications across our most critical on-premises customers.
  • Produce best-practice recommendations for on-premises customers to improve customer experiences.
  • Implement disaster recovery and reliability improvement initiatives, including performance tuning and infrastructure optimization.
  • Maintain and support the production environments and communicate directly with customer stakeholders.
  • Participate in an on-call rotation with other Site Operations Engineers.
Your Background
  • Required experience includes:
    - Deploying Kubernetes workloads in a production environment
    - Diagnosing and troubleshooting customer-facing production service outages
    - Writing applications or automation using Python, Golang, or Bash
  • At least 2 years of experience with DevOps-related tools in 3 or more of the following areas:
    - Version Control: Git
    - Configuration Management: Helm
    - Infrastructure as Code: Terraform
    - Application Monitoring: Prometheus, Grafana
  • Experience managing database clusters such as Cassandra, Elasticsearch, Kafka, and PostgreSQL is highly preferred.
  • Experience with Kubernetes Operators is a big plus.
  • Strong sense of ownership and a focus on customer delight
  • Strong analytical and written communication skills
  • Ability to work independently and as part of a team
Key Technologies
Kubernetes, Golang, Python, Cassandra, Kafka, Elasticsearch, PostgreSQL, Terraform, Helm

When you join Sysdig, you can expect:

  • Competitive salary
  • Top-notch health insurance coverage 

Additionally, we offer a variety of benefits and perks, such as:

  • 401k with company matching up to 3%
  • Flexible vacation policy 
  • A monthly allowance that can be used for the following types of expenses: employee wellness, house cleaning services, home internet, phone expenses, office supplies, and office furniture

Are you ready to join us?

We're excited to receive your application.