We’re looking for a Site Operations Engineer to help us lead the container revolution. You’ll be joining a globally distributed engineering team, and will be responsible for monitoring and maintaining private Sysdig Platform infrastructure at our most critical on-premises customers.
Sysdig is the cloud-native intelligence company, and we’re at the forefront of container and microservices adoption in the enterprise. We’re the people who are making reliable, secure containers a reality for enterprises everywhere.
We're passionate about solving the most complex operational challenges that companies face when they transition to Kubernetes, Docker, and cloud-native architectures on a massive scale. We offer the best of both worlds: we're a well-funded startup ($121.5 million) with more than 300 enterprise customers and counting. And we're a great place to work too: we received the 2019 Bay Area Best Places to Work Award from the San Francisco Business Times and the Silicon Valley Business Journal. Have we gotten your attention yet?
Sysdig was born from open source, so your work here will cross the divide between developer-led OSS and battle-tested commercial software at scale. We’re proud that our open source tools are widely used and loved by technologists and developers. Falco, our open source container security project, is now part of the Cloud Native Computing Foundation and is growing rapidly. We’re big fans of Prometheus too!
As a Site Operations Engineer, you’ll be responsible for the availability, performance, and resilience of the Sysdig platform in our largest on-premises customer environments. You will collaborate with high-performing infrastructure and engineering teams, both within Sysdig and in customer organizations, to help drive the scalability and stability of our platform.
What you’ll be doing:
- Participate in a globally distributed team of Site Operations Engineers, supporting multiple Sysdig application stacks across our most critical on-premises customers
- Manage the services that make up the Sysdig platform (Kubernetes, Cassandra, Elasticsearch, Redis, etc.)
- Implement disaster recovery and reliability improvement initiatives, including performance tuning and infrastructure optimization
- Maintain and support the production environments and communicate directly with customer stakeholders
- Participate in an on-call rotation with other Site Operations Engineers
Technologies you’ll work with: Kubernetes, Docker, Python, Cassandra, Kafka, Terraform, public/private cloud ecosystems
What you should bring:
- Experience managing Kubernetes clusters in a production environment
- Experience working with container runtimes such as Docker, rkt (Rocket), or containerd
- Aptitude for troubleshooting complex problems in high-throughput web applications and network services
- Solid understanding of Linux systems and networking
- Experience in diagnosing and troubleshooting customer-facing production service outages
- Command of a scripting language such as Python or Bash
- Strong sense of ownership and a focus on customer delight
Things we love to see:
- Experience managing any of these clustered systems: Cassandra, Elasticsearch, Kafka, Redis, HBase
- Proficiency with configuration management tools. We love Terraform, but you may have experience with Puppet, Chef, or SaltStack
- Experience creating and tuning Kafka, Cassandra, or Redis clusters
- Experience with log aggregation services such as Elasticsearch or Splunk
- Experience supporting a customer-facing product hosted in a public or private cloud ecosystem
Why Join Sysdig:
Cloud-native is fundamentally changing how organizations build and run applications to take full advantage of the cloud computing model. Sysdig is the cloud-native intelligence company making it happen. Join us and you’ll be working at the cutting edge of infrastructure technology and at the birth of an entirely new industry. Be the one who solves the hard challenges of operating Kubernetes and containers at scale, and have fun doing it with a great group of people.
When you join Sysdig, you can expect:
- Competitive salary
- Top-notch health insurance coverage
- The stability of a well-funded startup ($121.5 million) with more than 300 enterprise customers and counting
Additionally, we offer a variety of benefits and perks, such as:
- 401(k) with company matching up to 3%
- Flexible vacation policy
- Monthly self-improvement grant – spend on yourself however you see fit!
- Weekly team lunches and snacks every day of the week
- Monthly house cleaning allowance
- Fun team with company events and lots of espresso
Are you ready to join us?
We're excited to receive your application.