SINGAPORE TOURISM BOARD
Public / Civil Service, Others
Purpose of job:
Support the Data Science team in:
- Ensuring optimised data collection and data flow
- Helping project manage and coordinate DT's data ingestion and data processing pipelines across platforms
- Ensuring that all data systems meet our business requirements and enable scalability of business processes
- Collaborate with the data science team to create data tools to assist the team in building and optimising our data-related initiatives
- Help set up, configure, deploy and validate machine learning models and analytics scripts on Amazon SageMaker
- Develop data integrations (through API, SFTP, etc.) between AWS S3, Redshift instances and on-premise database instances (e.g. HANA)
- Work closely with team to identify, define, ingest and process data from multiple sources in support of model development
- Analyse and assess the effectiveness and accuracy of new data sources (e.g. datasets received from stakeholders) and of the annotation/labelling of new training inputs
- Assemble large, complex datasets that meet functional and non-functional business requirements
- Identify, design and implement internal process improvements: automating manual processes, optimising data delivery, re-designing infrastructure for greater scalability, etc.
- Work closely with vendors and internal stakeholders to project manage and coordinate DS&A’s data ingestion and data processing pipelines across platforms which can include mobile apps, SaaS platforms, on-premise databases and partner systems
- Support the integration and deployment of developed algorithms, machine learning and analytical models into current analytics system/production
- Develop monitoring toolkits that confirm integrations have executed successfully and raise alerts when integrations fail
- Help architect DS&A’s data integrations and data processing flows between external / 3rd party data sources, AWS cloud data warehouses (e.g. Redshift) and internal on-premise database instances for workloads at scale
- Provide guidance to internal teams on best practices for cloud to on-premise data integrations
- Develop set processes for data mining, data modelling and data production
- Recommend different ways to constantly improve data reliability and quality, including helping review and enhance the existing data collection procedures to include data for building analytics models relevant for industry transformation
- Help in the implementation of CI/CD and deployment of ML models in production
Requirements:
- At least 5 years of experience in a related field with real-world skills and testimonials from former employers.
- Working experience with structured and unstructured datasets is essential.
- Experience with big data tools: Hadoop, Spark, Hive, Sqoop, etc.
- Experience with relational SQL and NoSQL databases, including Postgres and Cassandra.
- Experience with AWS cloud services: EC2, S3, EMR, Redshift, RDS, Lambda functions.
- Experience with AWS SageMaker.
- Experience with stream-processing systems: Storm, Spark Streaming, Kafka, etc.
- Experience with data pipeline and workflow management tools.
- Experience with object-oriented / functional scripting languages: Python, R, Java, etc.
- Experience building and optimising ‘big data’ data pipelines, architectures and datasets.
- Experience in performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
- Strong analytics skills related to working with structured and unstructured datasets.
- Experience building processes that support data transformation, data structures, metadata, dependency and workload management.
- Intellectual curiosity to find new and unusual ways to solve data management issues.
- Experienced data pipeline builder and data wrangler who enjoys optimising data systems and building them from ground up.
- Strong project management and organisational skills.
- Experience supporting and working with cross-functional teams in a dynamic environment.
- Experience in using Qlik Sense will be advantageous.
- Certified Scrum Master/ Agile Developer
- Certified AWS Cloud Architect
We regret that only shortlisted candidates will be notified.
Closing on 21 May 2021