Do You Know Your AI and Data Science Heroes?

By Curafy.ai


Apart from having a robust data repository and architecture, choosing the right people to support the end-to-end data pipeline is equally important.

In an Enterprise Artificial Intelligence (AI) and Data Science team, getting the right composition of data engineers, data scientists and architects is important.

Let's take a look at the various roles in the data pipeline.

Business Owner

The business owner works with the data science team to define the scope of the AI project and how AI will be used to support the business.

For example, AI can be used to improve the operational efficiency of logistics routes, or to understand customer behavior for tailored customer engagement.

In an Agile framework, the business owner also writes user stories for the project.

AI Translator

An AI translator works closely with the business owners to determine how AI can be used to meet business goals.

The AI translator converts the business requirements of the project into AI functional specifications.

In an Agile framework, the AI translator converts the business owner’s user stories into a technical scope for development.

Solutions Architect

The solutions architect designs the pipeline architecture and strategic blueprint for implementation. The solutions architect defines how data is ingested, stored, integrated and consumed by the different data systems.

They also align the data architecture with the enterprise strategy and the business systems that already exist within the organization.

For example, the solutions architect has to assess the existing systems and infrastructure before designing a data architecture that can co-exist with them.

Data Engineer

A data engineer prepares data for analytical and operational use by building and managing the data pipeline to pull data from various source systems.

Data engineers support the ingestion, integration, consolidation and cleansing of data, then structure it into a ready-to-use form that data scientists can feed into their machine learning algorithms or that the business can use for data analytics.
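As a rough illustration of the cleansing and consolidation steps described above, here is a minimal Python sketch. The two source systems, the field names and the sample records are all hypothetical and exist only for this example; a real pipeline would typically read from databases or files and apply far richer rules.

```python
from typing import Iterable

# Hypothetical records from two source systems (illustrative data only).
CRM_ROWS = [
    {"customer_id": "c001", "email": " Ana@Example.com "},
    {"customer_id": "C002", "email": ""},
]
BILLING_ROWS = [
    {"customer_id": "C001", "email": "ana@example.com"},
    {"customer_id": "C003", "email": "raj@example.com"},
]


def cleanse(row: dict) -> dict:
    """Normalize whitespace and letter casing so records can be compared."""
    return {
        "customer_id": row["customer_id"].strip().upper(),
        "email": row["email"].strip().lower(),
    }


def consolidate(*sources: Iterable[dict]) -> list:
    """Merge all sources, drop rows without an email, keep one row per customer."""
    seen = {}
    for source in sources:
        for raw in source:
            row = cleanse(raw)
            if not row["email"]:
                continue  # incomplete record: skip it
            # The first occurrence of a customer wins; later duplicates are ignored.
            seen.setdefault(row["customer_id"], row)
    return sorted(seen.values(), key=lambda r: r["customer_id"])


# The ready-to-use dataset handed to data scientists or analysts.
READY = consolidate(CRM_ROWS, BILLING_ROWS)
```

Here READY holds one cleaned row each for customers C001 and C003. C002 is dropped because it has no email address, and the duplicate C001 record from the billing system is ignored.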

Being a data engineer requires familiarity with programming languages such as C#, Java, Python, Ruby, Scala and SQL. Common technologies used by data engineers include Hadoop data lakes, NoSQL databases, Apache Spark and the Lambda architecture.

Data Scientist

A data scientist designs algorithms and uses machine learning techniques to extract insights that support business decision-making. Data scientists require technical competencies in mathematics, statistics and programming.

They need to be familiar with a host of machine learning strategies and statistical concepts such as supervised learning, unsupervised learning and reinforcement learning.

Each business case calls for a different mix of statistical concepts, so data scientists must choose the right machine learning strategies and turn the results into actionable business insights that solve real-world problems.

Machine Learning Engineer

A machine learning engineer optimizes the models developed by data scientists to make them more effective and scalable when deployed.

A machine learning engineer needs a good statistical understanding of machine learning and deep learning models, coupled with programming skills.

In addition, a machine learning engineer drives best practices in model lifecycle management (MLM) and monitors the performance of these models in production.
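To give a feel for what monitoring a model in production can involve, here is a minimal Python sketch that tracks accuracy over a sliding window of recent predictions. The AccuracyMonitor class, its window size and its threshold are assumptions made for this illustration and are not taken from any particular MLM tool.

```python
from collections import deque


class AccuracyMonitor:
    """Tracks accuracy over the most recent predictions (hypothetical helper)."""

    def __init__(self, window=100, threshold=0.8):
        # Only the last `window` outcomes are kept; older ones fall off.
        self.outcomes = deque(maxlen=window)
        self.threshold = threshold

    def record(self, predicted, actual):
        """Store whether one prediction matched the observed outcome."""
        self.outcomes.append(predicted == actual)

    def accuracy(self):
        """Return accuracy over the window, or None if nothing is recorded yet."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_attention(self):
        """True when recent accuracy has fallen below the threshold."""
        acc = self.accuracy()
        return acc is not None and acc < self.threshold
```

A machine learning engineer might call record() each time the true outcome of a prediction becomes known, and review or retrain the model when needs_attention() returns True.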

Data Visualization Specialist

A data visualization specialist produces graphical representations of data for building reports, interactive dashboards and data mapping for business insights. This requires a mix of front-end application programming and UX/UI design skillsets to customize organization-specific applications for tactical and strategic business requirements.

For example, companies may build data visualization tools for graph network analysis to understand the relationships between individuals, companies and subsidiaries for the purpose of corporate unwrapping.
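The analysis behind such a tool can be sketched in a few lines of Python. The example below assumes a small, made-up list of ownership links and uses a breadth-first search to find every entity connected to a given individual. The names and relationships are purely illustrative, and a real tool would render the result as an interactive graph.

```python
from collections import defaultdict, deque

# Hypothetical ownership links as (owner, owned) pairs, for illustration only.
OWNERSHIP = [
    ("Alice", "Holdco A"),
    ("Holdco A", "Subsidiary X"),
    ("Holdco A", "Subsidiary Y"),
    ("Bob", "Company B"),
]


def build_graph(edges):
    """Build an undirected adjacency map from (owner, owned) pairs."""
    graph = defaultdict(set)
    for owner, owned in edges:
        graph[owner].add(owned)
        graph[owned].add(owner)
    return graph


def connected_entities(graph, start):
    """Return every entity reachable from `start`, using breadth-first search."""
    visited = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(neighbour)
    visited.discard(start)  # report only the other entities
    return sorted(visited)


GRAPH = build_graph(OWNERSHIP)
ALICE_NETWORK = connected_entities(GRAPH, "Alice")
```

With this sample data, ALICE_NETWORK contains Holdco A and its two subsidiaries, while Bob and Company B form a separate, unconnected group.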

Being a data visualization specialist requires familiarity with Relational Database Management Systems (RDBMS), SQL and JavaScript libraries such as d3.js, crossfilter.js and chart.js.

A data visualization specialist is responsible for taking data and turning it into visuals (e.g. pie charts, graphs, maps and infographics).