Becoming a nation of data professionals


ACCORDING to Forbes magazine, by 2025, most of the world’s major companies will collectively generate approximately 180 zettabytes of data. To put this into perspective, one zettabyte is enough to store 36 million years’ worth of high-definition video. Little wonder that data has been called the new oil, with companies increasingly monetising it as a main source of revenue.

Enter a new breed of professionals: the data scientists. A data scientist employs a range of statistical and computational skills to analyse and interpret complex data in order to assist businesses in their decision-making.

In 2012, a Harvard Business Review article called the data scientist’s role the “sexiest job of the 21st century”. Meanwhile, one survey released in Singapore last year revealed that a junior data scientist’s salary could reach RM130,000 per annum.

Despite its increasing popularity, the data scientist’s role is only one of a growing number of data science-related jobs, whose holders we may collectively call data professionals. They include data modellers, data analysts, and data engineers. According to the Malaysia Digital Economy Corporation (MDEC), Malaysia is one of a few countries in the world that prioritise data science excellence as part of their national strategy. By 2020, the country aims to produce 20,000 data professionals. This column offers several thoughts in relation to this national goal.

 

Cultivate data-driven thinking

 

Given the anticipated future demand for data professionals, we should start thinking about cultivating a data-driven mindset early among children. Data science curricula should begin making their way into primary and secondary school classrooms. There is already evidence of how this could be done.

Aspiring Minds (www.aspiringminds.com), an India-based employability assessment company, recently piloted a data science education project among fifth- to ninth-graders (10- to 15-year-olds) in India and the United States, in which students were given half-day, hands-on tutorials on how to perform a full-cycle data science task.

The project adopted a data science pedagogical design that aims at maximising student engagement while minimising prerequisite knowledge. The students were fully engaged as they were given highly relatable problem statements such as predicting if a particular kid is ‘friend-worthy’. To do this, they learned to construct a friendship dataset from scratch and build a predictive model from the collected data.

All that is required is a basic knowledge of counting, addition, percentages and comparisons, together with basic computer skills. The children’s responses were overwhelmingly positive. A similar approach could be adopted in our schools.
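To give a sense of how little machinery such an exercise needs, here is a minimal Python sketch of a "friend-worthy" predictor built only from counting, percentages and a comparison. It is not the Aspiring Minds material: the two traits, the function names and every row of the toy dataset are invented purely for illustration, and the averaging rule is just one simple choice among many.

```python
# Toy data invented for illustration only; it does not come from the
# Aspiring Minds project. Each row describes one classmate as
# (shares_snacks, likes_same_games, is_friend).
CLASSMATES = [
    (True, True, True),
    (True, False, True),
    (True, True, True),
    (False, True, False),
    (False, False, False),
    (True, False, False),
    (False, True, True),
    (False, False, False),
]


def friend_percentage(rows, trait_index, trait_value):
    """Percentage of friends among classmates who have the given trait value."""
    matching = [row for row in rows if row[trait_index] == trait_value]
    if not matching:
        return 0.0  # no evidence either way for this trait value
    friends = sum(1 for row in matching if row[2])  # counting
    return 100.0 * friends / len(matching)  # percentage


def predict_friend_worthy(rows, shares_snacks, likes_same_games):
    """Average the two trait percentages and compare the result with 50."""
    snack_score = friend_percentage(rows, 0, shares_snacks)
    games_score = friend_percentage(rows, 1, likes_same_games)
    return (snack_score + games_score) / 2 > 50.0  # comparison


if __name__ == "__main__":
    # A hypothetical new classmate who shares snacks and likes the same games.
    print(predict_friend_worthy(CLASSMATES, True, True))
```

In this toy dataset, three of the four classmates who share snacks are friends, so that trait scores 75 per cent. A new classmate is predicted to be friend-worthy only when the average score across both traits is above 50 per cent.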

 

Not just for computer geeks

 

People from diverse educational backgrounds, not only those with computer- or statistics-related degrees, should be encouraged to explore careers as data professionals. In fact, some of the most impactful data professionals in history had no computer experience. It is insatiable curiosity, a relentless drive to solve problems, and communication prowess that often make a great data scientist.

Florence Nightingale is widely regarded as the founder of modern nursing. But many of us probably do not realise that Nightingale was also a prodigious statistician and a true pioneer of data visualisation. At the height of the Crimean War in the 19th century, Nightingale set about analysing soldier mortality data from various British military camps. Her findings were more than revealing. Her analysis showed that more British soldiers had died in these camps from wound infections than had been killed on the battlefield. Using a pie chart-like visualisation known as the ‘Coxcomb Diagram’, Nightingale showed a strong correlation between the soldiers’ mortality rate and the camps’ level of hygiene. Subsequent improvements to the camps’ sanitary system reduced the death rate from 42 per cent to merely 2 per cent, prompting a nationwide sanitary reform by the British government.

 

Industry-academia-government synergies

 

More intensive industry-academia synergies are needed to train truly market-ready data professionals. There are already some good examples of such synergies. The European Union-funded Edison project (www.edison-project.eu) recently released the Edison Data Science Framework. The framework provides a comprehensive set of model data science curricula that can be adopted by universities worldwide. Importantly, the Edison project serves as an excellent venue for academia-industry dialogue aimed at creating more industry-aligned data science curricula.

Closer to home, the newly established Asean Data Analytics Exchange (Adax) (www.adax.asia) in Kuala Lumpur aims to become a regional collaborative hub for businesses, academia, governments and start-ups that wish to rapidly adopt data science solutions as an integral part of their operations.

Moreover, its brand-new Data Science Finishing School for Graduates initiative gives our fresh university graduates the opportunity to take part in a six-month paid data science internship programme with various industry partners.

In short, the need to create more data professionals in Malaysia is real. But if we are serious about growing highly capable data professionals for the future, we must put effort into training our young to develop data-centric thinking, encouraging multidisciplinary interest in the profession, and creating stronger synergies between industry, academia and government.

 

Dr Yakub Sebastian is a lecturer with the Faculty of Engineering, Computing and Science at Swinburne University of Technology Sarawak Campus.