Site Reliability Engineer

  • Job category
  • Job level
  • Contract type
    Full Time
  • Location
  • Salary
    S$6000 - S$9000

Job Description

Get to know the Role:

  • As a Site Reliability Engineer (SRE) you will help build a meaningful engineering discipline, combining software and systems to develop creative engineering solutions to operations problems.
  • Much of our support and software development focuses on optimizing existing systems, building infrastructure and reducing work through automation.
  • You’ll join a team of curious problem solvers with a diverse set of perspectives who are thinking big and taking risks.
  • As an SRE you’ll be focused on running better production applications and systems.
  • SREs are key contributors to core infrastructure and functional development teams throughout the software life cycle, helping them build software for reliability and scale.
  • Key areas of focus include automation, application/platform uptime and quality, packaging/distribution techniques, platform design “operability”, analytics, deployment, adoption, and tool development, among others.
  • The role wears many hats: owning day-to-day health and performance, identifying incidents and developing remediation plans, working with open source software and proven packaging techniques, and working with development teams while contributing to the strategic roadmap and its execution.
  • Candidates from a variety of software, platform, or automation engineering backgrounds will be considered for this position.

The day-to-day activities:

  • Design, code, test and deliver software to automate manual operational work
  • Troubleshoot priority incidents, facilitate blameless post-mortems and ensure permanent closure of incidents
  • Engage with development teams throughout the life cycle to help develop software for reliability and scale, minimizing later refactoring or changes
  • Perform analytics on previous incidents and usage patterns to better predict issues and take proactive actions
  • Perform L1/L2/L3 support activities for the Production Support project, including analysis and design work and assessing the impact of requirements across all system components
  • Build self-healing and resiliency patterns and drive their adoption
  • Design automated software and product upgrades, change management, and release management solutions
  • Participate in the 24x7 support coverage as needed

The must haves:

  • Bachelor's degree in information systems, information technology, computer science, or similar.
  • 3+ years of professional experience in a software management position.
  • Experience with Docker, containers, and Kubernetes (k8s).
  • Direct production operations experience in a cloud environment.
  • Experience contributing to technology and product strategy.
  • Experience leading capability building initiatives across diverse areas such as infrastructure and operations automation, software quality, delivery automation and other core engineering.
  • Demonstrated experience driving operational efficiency and transparency in a growing engineering organization.

Closing on 09 Dec 2021
