Data Scientist Dr Lau Cher Han explains the use of JavaScript in his job and what it takes to become a data scientist.
With over 15 years of experience in software development, database technologies, data mining and analytics, Dr Lau Cher Han is a Data Scientist and full-stack developer with a wealth of experience and knowledge in big data analysis. He now teaches software engineering and big data topics at universities, conducts training programmes, and offers consultancy services to corporations in sectors such as travel, finance and tech startups.
We caught up with him to find out what it takes to become a data scientist and how JavaScript is used in managing big data.
Dr Lau Cher Han
What exactly does a data scientist do?
Our daily tasks involve gathering data, massaging data, cleaning data, modelling data, and creating visualisations. We also spend a significant amount of time pre-processing data to ensure we get the highest quality of data before we move on to the actual tasks. We use these findings to answer questions that achieve specific business goals (such as reducing costs, increasing productivity, etc.). Then we have to communicate the results to clients or stakeholders.
As chief data scientist, I no longer spend as much time on these tasks as I did when I was a data scientist. I still love to keep my hands busy and code sometimes. Now, I spend most of my time looking at architecture and efficiency, and evaluating performance. I help to build and grow a team, and make sure that we map the right person to the position in which he/she can perform best.
Could you tell us a bit about your background? How did you end up becoming a data scientist?
I started coding when I was nine, using Lotus 1-2-3 and DBase in DOS. I wrote programs in C and Basic for business applications such as EPF and SOCSO calculators, and to control a 9-pin printer to print receipts. From there, I picked up Windows programming, evolved into an ASP programmer, and developed web services using Microsoft .NET. I am fortunate to have picked up SQL and JavaScript, languages that are still going strong today.
It was never my plan to become a data scientist. I got my Diploma in Computer and Networking Technology from Singapore Polytechnic. I was amazed by networking and routing, and wanted to build my own LAN gaming facilities using hubs and crossover cables. I always thought that I would become a web developer. The real turning point was my bachelor’s degree. I picked the database major, where I learnt data mining and machine learning techniques. That eventually led me to a PhD in machine learning, focused on unstructured data.
What does a data scientist use JavaScript for?
We use JavaScript for visualisations, asynchronous tasks, and handling real-time data. We use D3.js to create stunning visualisations and interactive applications for users to explore complex business data. We also use NodeJS plus SocketIO to handle real-time data. It might seem counter-intuitive to use JavaScript as a language for data science tasks, but JavaScript plays a crucial part in the entire data science workflow.
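To give a rough idea of what handling real-time data can look like, here is a minimal sketch that keeps a rolling average over readings as they arrive. It is only an illustration: Node's built-in EventEmitter stands in for a SocketIO connection, and the event name 'reading', the helper names and the sample values are all assumptions made up for this example.

```javascript
// Sketch only: EventEmitter stands in for a real-time connection such as SocketIO.
const { EventEmitter } = require('node:events');

// Keeps the last `size` readings and returns their mean on every new value.
function createRollingAverage(size) {
  const recent = [];
  return function push(value) {
    recent.push(value);
    if (recent.length > size) recent.shift(); // drop the oldest reading
    const sum = recent.reduce((acc, v) => acc + v, 0);
    return sum / recent.length;
  };
}

// Hypothetical wiring: 'reading' is an event name chosen for this example.
function attachAverager(feed, size, onAverage) {
  const push = createRollingAverage(size);
  feed.on('reading', (value) => onAverage(push(value)));
}

const feed = new EventEmitter();
const averages = [];
attachAverager(feed, 3, (avg) => averages.push(avg));

// Simulate readings arriving one at a time (made-up values).
for (const value of [10, 20, 30, 40]) {
  feed.emit('reading', value);
}

console.log(averages); // [ 10, 15, 20, 30 ]
```

In a real application the handler would be attached to a live connection rather than a local emitter, and the averages could be pushed on to a chart instead of an array.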
Most data scientists still prefer to use Python / R for conventional data science tasks. With the recent rise in machine learning, however, libraries like TensorFlow are now available in JS (https://js.tensorflow.org/), and we can already build ML models in browsers.
How does JavaScript help to manage all that medical data? How does it make things easier for the data science community?
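As a loose illustration of model training in JavaScript, the sketch below fits a straight line with plain gradient descent. It deliberately does not use TensorFlow.js or any other library; the function name, learning rate, step count and toy data are assumptions chosen for this example, and a real project would more likely rely on a library for this.

```javascript
// Sketch only: plain-JavaScript gradient descent, not the TensorFlow.js API.
// Fits y = w * x + b by minimising the mean squared error.
function fitLine(xs, ys, learningRate = 0.05, steps = 2000) {
  let w = 0;
  let b = 0;
  const n = xs.length;
  for (let step = 0; step < steps; step++) {
    let gradW = 0;
    let gradB = 0;
    for (let i = 0; i < n; i++) {
      const error = w * xs[i] + b - ys[i];
      gradW += (2 / n) * error * xs[i]; // derivative of the loss w.r.t. w
      gradB += (2 / n) * error;         // derivative of the loss w.r.t. b
    }
    w -= learningRate * gradW;
    b -= learningRate * gradB;
  }
  return { w, b };
}

// Toy data generated from y = 2x + 1 (made up for this example).
const xs = [0, 1, 2, 3, 4];
const ys = xs.map((x) => 2 * x + 1);

const model = fitLine(xs, ys);
console.log(model.w.toFixed(3), model.b.toFixed(3));
```

Because this is ordinary JavaScript with no dependencies, the same function runs unchanged in a browser console or under NodeJS.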
JavaScript runs on almost all platforms including wearables. Using JavaScript reduces compatibility issues, and it enables data scientists to collect data from medical devices, and run algorithms in a streamlined fashion.
Why JavaScript over other languages such as Python?
I recommend JavaScript over other languages for beginners and startups. Its learning curve is not as steep, since we can use JavaScript for both client-side and server-side programming. NodeJS is efficient because of its single-threaded, event-driven callback mechanism, which enables us to develop scalable real-time applications. It’s all about picking the right tool for the right task. Although Python has developed a robust ecosystem of data science tools that help data scientists perform analytical work, I believe that JavaScript will develop an ecosystem of suitable tools of its own flavour in the near future.
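The single-threaded callback mechanism mentioned above can be seen in a small sketch like the following, which records the order in which pieces of work run. The function name and the labels are made up for this example; the point is only that callbacks are queued and run after the current code has finished, rather than blocking it.

```javascript
// Sketch only: shows the order in which NodeJS runs queued callbacks.
function runDemo() {
  const order = [];
  return new Promise((resolve) => {
    order.push('start');

    // Timer callback: queued, runs only after the current code has finished.
    setTimeout(() => {
      order.push('timer callback');
      resolve(order);
    }, 0);

    // Promise callback: also queued, but runs before timer callbacks.
    Promise.resolve().then(() => order.push('promise callback'));

    order.push('end');
  });
}

runDemo().then((order) => console.log(order.join(' -> ')));
// start -> end -> promise callback -> timer callback
```

Nothing here waits idly: the main code runs to completion first, and the queued callbacks are picked up afterwards on the same thread, which is what lets one process juggle many slow operations such as network requests.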
Find out more here: www.cherhan.net