Cloud Operations System Engineer / Site Reliability Engineer

C3 Energy, Redwood City, CA

RESPONSIBILITIES:

* Maximize system uptime and availability, ensuring functional and performance SLAs are met

* Establish end-to-end monitoring and alerting on all critical aspects

* Solve complex problems for critical services and build automation to prevent problem recurrence

* Influence and create new designs, architectures, standards, and methods for supporting the platform

* Initiate and lead scripting and automation to streamline system updates and upgrades

* Set up critical infrastructure, tools, and frameworks to streamline the deployment cycle

* Work cross-functionally with Services and Engineering teams

REQUIREMENTS:

* Demonstrated experience deploying, managing, and operating scalable, fault-tolerant Linux/JVM-based infrastructure in AWS (or another public cloud)

* Expertise in Linux Operating Systems, Networking, and Database concepts

* Experience with Cassandra (or another NoSQL database)

* Expertise in cloud providers, such as Amazon Web Services

* Experience with configuration management systems such as Chef or Puppet

* Experience using Ruby or Python to automate and monitor systems

* Excellent problem-solving, critical-thinking, and communication skills

* Experience supporting commercial SaaS solutions as a DevOps engineer or system administrator

* BS or MS in Computer Science, related field, or equivalent professional experience
