DevOps Engineer (Artificial Intelligence Platforms)
GOVERNMENT TECHNOLOGY AGENCY
Public / Civil Service
The Government Technology Agency (GovTech) aims to transform the delivery of Government digital services by taking an "outside-in" view, putting citizens and businesses at the heart of everything we do. We also develop the Smart Nation infrastructure and applications, and facilitate collaboration with citizens and businesses to co-develop technologies.
Join us as we support Singapore’s vision of building a Smart Nation - a nation of possibilities empowered through info-communications technology and related engineering.
Who we are
GovTech's Data Science and Artificial Intelligence (DSAI) division uses technology and data to help deliver high-quality digital services to citizens and businesses in Singapore. We build software products for government agencies to better understand and use their data to improve operations and decision making.
What the role is
You will work on both small- and large-scale projects, building and maintaining the infrastructure behind them. We are fully aligned with the Government’s cloud-first policy, and you will bring the capability to help us realize it. The role includes:
- Managing the development, deployment, orchestration and maintenance of data pipelines for our Data Science products
- Providing DevOps architecture implementation and operational support
- Architecting and planning cloud deployments (private and public cloud)
- Developing and managing processes, automation, best practices and documentation
- Developing and operating continuous integration and deployment pipelines
What it is like working here
We build products that serve a variety of agency users, who use them to solve highly meaningful problems pertinent to our society, from transportation, to education, to healthcare. The public sector is full of opportunities where even the simplest software can have a big impact on people’s lives. We are here to improve how we live as a society through what we can offer as a government.
- Rapid Prototyping - Instead of spending too much time debating ideas, we prefer to test them. This identifies potential problems quickly and, more importantly, conveys what is possible to others easily.
- Reliable Productization - To scale an idea, a prototype or a Minimum Viable Product to a software product, we scrutinize and commit to its usability, reliability, scalability and maintainability.
- Ownership - In addition to technical responsibilities, this means having ideas on how things should be done and taking responsibility for seeing them through. Building something that you believe in is the best way to build something good.
- Continuous Learning - Working on new ideas often means not fully understanding what you are working on. Taking time to learn new architectures, frameworks, technologies, and even languages is not just encouraged but essential.
As we often deal with big data and computing requirements, you should also be able to take a long-term strategic view of the platforms you work on, and help provide this perspective to the team. To do so, you will:
- Effectively prioritize and execute tasks in a high-pressure environment
- Develop and maintain internal engineering productivity tools and environments
- Perform independent research into product and environment issues as required
- Automate monitoring to effectively detect, predict and prevent issues in the environment and code base
- Future-proof the technical environments and ensure extremely high levels of automation, availability, scalability and resilience
- Code hands-on and mentor others, working in highly collaborative teams and building quality environments
- Know, and/or continuously learn, a wide range of open source technologies and configurations
What we are looking for
The customers for our products are normally agency users, which means that breadth of knowledge in government IT infrastructure and experience in government networks will help. Since our direction is cloud-first, you will likely have some experience in patch/update scheduling, and knowledge of security incident response procedures. A disciplined approach and strong problem-solving instincts are fundamental to success. Your aptitude for completing tasks and your attitude to continuous learning are valued more than any formal certification. To succeed, you will need to possess some of the following:
- Excellent problem solving and methodical troubleshooting skills
- Strong knowledge and experience in DevOps automation, containerization and orchestration using tools such as Ansible, Airflow, Docker, Kubernetes, Terraform, Artifactory/Sonatype Nexus
- Cloud computing deployment and management experience on AWS and/or GCP
- Strong understanding of Apache Spark/Flink, Hadoop, distributed file systems and resource scaling/scheduling, streaming message queues (RabbitMQ, Kafka)
- Strong understanding of virtualization and networking concepts
- Experience with patch maintenance, regression testing and security incident response
- Experience with interactive workloads, machine learning toolkits and how they integrate with cloud computing e.g. Databricks, KX
- Experience with highly scalable distributed systems
- Experience with on-premises deployments, government applications and networking infrastructure/routing
- Breadth of knowledge - OS, networking, distributed computing, cloud computing
Closing on 20 Jan 2022