Senior Site Reliability Engineer
GXS BANK PTE. LTD.
S$7500 - S$11250
Get to know our Team:
We are living in dynamic times. Technology is reshaping how we live, and we want to use it to redefine how financial services are offered. Grab is the leading technology company in Southeast Asia offering everyday services to the masses. Singtel is Asia’s leading communications group connecting millions of consumers and enterprises to essential digital services. This is why we are coming together to unlock big dreams, and financial inclusion for people in our region is just one of them. We want to build a digital bank with the right foundation, using data, technology and trust to solve problems and serve customers. If you have what it takes, help build this new Digibank with us.
Get to know the Role:
- As a Site Reliability Engineer (SRE) you will help build a meaningful engineering discipline, combining software and systems to develop creative engineering solutions to operations problems.
- Much of our support and software development focuses on optimizing existing systems, building infrastructure and reducing work through automation.
- You’ll join a team of curious problem solvers with a diverse set of perspectives who are thinking big and taking risks.
- As an SRE you’ll be focused on running better production applications and systems.
- The SRE is a key contributor to core infrastructure and functional development teams throughout the life cycle, helping support software for reliability and scale.
- Key areas of focus include automation, application/platform uptime and quality, packaging/distribution techniques, platform design “operability”, analytics, deployment, adoption, and tool development, among others.
- The position wears many hats: owning day-to-day health and performance, identifying incidents and developing remediation plans, working with open source software and packaging techniques, and working with development teams to contribute to the strategic roadmap and its execution.
- Candidates from a variety of software, platform, or automation engineering backgrounds will be considered for this position.
The day-to-day activities:
- Design, code, test and deliver software to automate manual operational work
- Troubleshoot priority incidents, facilitate blameless post-mortems and ensure permanent closure of incidents
- Engage with development teams throughout the life cycle to help develop software for reliability and scale, ensuring minimal refactoring or changes
- Perform analytics on previous incidents and usage patterns to better predict issues and take proactive actions
- Perform L1/L2/L3 support activities for the Production Support project, including analysis and design work and assessing the impact of requirements across all system components
- Build and drive adoption for greater self-healing and resiliency patterns
- Design automated software and product upgrades, change management, and release management solutions
- Participate in 24x7 support coverage as needed
The must haves:
- Bachelor's degree in information systems, information technology, computer science, or similar.
- 3+ years professional experience in a software management position.
- Experience with Docker, containers, and Kubernetes (k8s).
- Direct production operations experience in a cloud environment.
- Experience contributing to technology and product strategy.
- Experience leading capability building initiatives across diverse areas such as infrastructure and operations automation, software quality, delivery automation and other core engineering.
- Demonstrated experience driving operational efficiency and transparency in a growing engineering organization.
Closing on 10 Dec 2021