Data Scientists and AI professionals are in high demand, are well compensated and have considerable influence on business decisions.
Learn Data Science / AI and you can join these elite professionals!
However, many aspiring Data Scientists find learning Data Science challenging and even intimidating because of the long list of subjects to master, such as Linear Algebra and Statistics.
Furthermore, some who have learned Data Science find it difficult to actually get a job. Why is this? One reason is that most of the courses tend to focus more on the theory and less on the practical application in a business environment.
The aim of this article is to help you be an effective Data Scientist in the industry. That is in contrast with a Data Science researcher who has, perhaps, very deep knowledge of Data Science but less ability to deliver true business value.
Information, courses and university degrees in Data Science (DS) and Artificial Intelligence (AI) are readily available to anyone. However, few people manage to start, sustain their learning and reach a level where they can effectively perform the role of a Data Scientist in the industry. This low rate of success stems from the apparently overwhelming number of skills, concepts and techniques needed, and from the intimidating nature of some of them, such as Linear Algebra, Calculus and Statistics. Attempting to plough through all of these can be very frustrating. Consequently, many give up or are too scared even to try.
Based on my experience in learning, mentoring and teaching, I believe that with the appropriate strategies, learning and advancing Data Science skills can be made much more accessible.
So, if you are an aspiring Data Scientist, using the following proven strategies, you can do it!
These strategies include:
- Learning how to learn better
- Developing a data-driven mindset
- Applying the concept of Minimal Viable Learning
- Developing your communication skills
- Leveraging your existing expertise and strengths
- Creating your own Data Science project portfolio
The rest of the article explains these strategies and how to apply them, so that you can keep learning until you can effectively perform the role of a Data Scientist in the industry, and then continue to improve your performance.
Learning how to learn better
The main purpose of learning is to be able to apply this learning in real-life situations. In the case of Data Science skills, learning effectively means that you are able to apply your skills to uncover insights in data to help business do better.
In many Data Scientist job interviews, graduates of 4-year Data Science degrees can list many Machine Learning algorithms, but stumble when asked how to deal with a real, perhaps novel, business challenge.
People learn differently. Some learn better in structured classroom courses. Some are bored by lectures and prefer to learn at their own pace. Others prefer hands-on learning by experimenting and making things. I call these learning approaches modes of learning.
Modes of learning include, among many others:
- Listening to, say, a podcast
- Watching a YouTube video
- Reading an article or a book
- Attending a class
- Doing an online course
Based on my own experience in learning new technologies, mentoring and teaching, I believe that multi-modal learning is the most effective way for anyone to learn any topic more deeply and with less frustration.
Multi-modal learning simply means that you use all the above modes alternately and repeatedly. I have also observed that these modes of learning are not equally effective. Some are passive (for example, watching a YouTube video or reading a book). Others are active: you do things and get results back, as in programming or collaborating with others.
The diagram below shows how these different modes of learning rank in their effectiveness in making you able to apply the learning. Remember that the aim of learning is not to restate what you have learned but to apply it to a new problem. The diagram shows that the more actively you are involved in the learning, the more you will remember it and know how to apply it.
Note that the diagram does not imply a sequence of these learning modes. It shows the ranking of effectiveness. The ideal approach is to use them all. If you are learning a concept or a technique and cannot fully understand it using one of these modes, switch to another and alternate between the passive and active modes until it sinks in. It is, of course, ideal to have a trainer who guides you through this process, but it is not the only way to learn. Even with structured classroom learning, you have to guide yourself. After all, you are the person best placed to assess your knowledge and your level of confidence in applying it.
Developing a data-driven mindset
The above diagram shows that Questioning is the most effective way to learn. Questioning is what gives you the motivation to learn whatever it takes to answer the questions you care about. Data Science is hard and fuzzy, and without motivation you may easily give up.
Questioning is especially significant in performing the role effectively, as I believe that one of the most important skills of an effective Data Scientist is the ability to critically ask better questions of data.
Critically asking better questions means that you maintain a healthy dose of scepticism. Ask:
- What is the question we are trying to answer?
- What is the value of answering it?
- Can we answer it reliably?
- What is the source of data?
- Can we trust it?
It is easy to draw premature conclusions. Machine Learning algorithms are powerful but can be opaque. Presenting conclusions that cannot be explained plainly to stakeholders is useless! So, question your questions, the data and the conclusions you are making.
Data Science inspires more effective ways of identifying, framing and solving problems. It allows the data to reveal insights that would otherwise stay hidden. The ability to critically question, interrogate and communicate these insights is what I call the data-driven mindset. I believe that a data-driven mindset is one of the most important skills a Data Scientist must have to be effective. This skill is much more important than Math or programming!
A data-driven mindset makes you look at almost any topic or question as a data question. This mindset makes you ask the following questions, among others:
- How can I investigate this topic or answer this question in an evidence-based way?
- What data do I need?
- What data is available?
- How reliable is this data?
- What conclusion can I make from this data?
- How confident am I in these conclusions?
Developing and cultivating the data-driven mindset is simple. Look at every topic or question you may have and consider it your next Data Science project. It could be your diet, job prospects or the upcoming election.
Start by identifying and collecting data needed to understand the topic and, hopefully, end up with evidence-based conclusions. The data-driven mindset can be outlined by the following activities:
- State the question you are seeking to answer
- Identify and collect data needed to answer the question
- Assess, analyse and understand the data
- Apply the appropriate algorithm to uncover insights pertaining to your question
- Communicate to stakeholders and get feedback
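The five activities above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the question, the sleep_log values and the variable names are invented for the example, and the "algorithm" is nothing more than a comparison of two averages using the standard library.

```python
from statistics import mean

# 1. State the question (a hypothetical, personal example).
question = "Do I sleep longer on weekends than on weekdays?"

# 2. Identify and collect data. These values are invented for illustration.
sleep_log = [
    ("Mon", 6.5), ("Tue", 7.0), ("Wed", None), ("Thu", 6.0),
    ("Fri", 6.5), ("Sat", 8.5), ("Sun", 8.0),
]

# 3. Assess the data: find and drop nights with a missing measurement.
clean = [(day, hours) for day, hours in sleep_log if hours is not None]
missing = len(sleep_log) - len(clean)

# 4. Apply a simple analysis: compare the two group averages.
weekend = [hours for day, hours in clean if day in ("Sat", "Sun")]
weekday = [hours for day, hours in clean if day not in ("Sat", "Sun")]
difference = mean(weekend) - mean(weekday)

# 5. Communicate the result in plain language, including the data caveat.
summary = (
    f"{question} Based on {len(clean)} nights ({missing} missing), "
    f"weekend sleep averaged {difference:.2f} hours more than weekday sleep."
)
print(summary)
```

With only a handful of nights, a result like this is a prompt for further questions (is one week typical? how was sleep measured?) rather than a conclusion, which is exactly the habit the data-driven mindset is meant to build.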
Without a data-driven mindset, even the most powerful Machine Learning algorithms will not help you become an effective Data Scientist! So, start right now! You do not need any deep Python, R or mathematics knowledge. You just need the data-driven mindset.
Applying the concept of Minimal Viable Learning
The naive approach of learning all Data Science skills, concepts and techniques in depth before starting to apply them to real problems is futile. Even an experienced Data Scientist is unlikely to know all Machine Learning techniques and algorithms. These algorithms are developing and multiplying at an increasingly rapid pace.
A more effective approach is to “learn enough” to tackle small but real-life problems, apply this learning, understand the gaps in your knowledge and learn how to fill these gaps on demand, if possible, or by going back to the learning mode.
You need to know what is “good enough” as a starting point and to know how much you need to improve it or optimise it for the question you are tackling.
“Good enough” in learning is analogous to the concept of the Minimal Viable Product (MVP) in a startup: obtaining maximum validated value with the least effort. Minimal Viable Learning (MVL) aims to move you as quickly as possible from a learning mode to a creating mode, where you create solutions for real-life problems by applying what you have learned. Applying your learning is the only way that you actually learn! Or, as Barbara Oakley put it in her great book “A Mind For Numbers”, “Learning Is Creating”.
So, what is “good enough” in learning Data Science? When can you start applying your skills on real data? Much sooner than you think!
Based on my experience in guiding aspiring Data Scientists to successful careers, I believe that learning the following skills can make you ready to tackle real-life data questions:
- A critical data-driven mindset (do not worry, this itself develops as you deal with real data)
- Statistical Thinking (understanding that we always deal with samples of data)
- Programming (expressing your thoughts in systematic and reproducible steps). Python is an ideal programming language for Data Science and is relatively easy to learn compared with other languages.
- Visualisation (show the results in an accessible way). Many Python packages provide you with what you need.
- Communicating effectively
Note that the above list of skills does not include mathematics, which I know intimidates many aspiring Data Scientists. The reason is that in Data Science we use Computational Mathematics, which is much more accessible and easier to apply than the Mathematics you learned in school or university. You still need to understand the underlying concepts. I suggest you learn them through programming and visualisation.
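As an illustration of learning a concept through programming, the sketch below explores the idea behind Statistical Thinking (we only ever see a sample) by simulation instead of formulas. It uses a percentile bootstrap, which is my choice for the example and not something this article prescribes. The sample is simulated, and the function name bootstrap_means and the number of resamples are hypothetical choices.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is reproducible

# A hypothetical sample of 30 measurements (simulated, not real data).
sample = [random.gauss(50, 10) for _ in range(30)]


def bootstrap_means(data, n_resamples=2000):
    """Resample the data with replacement and return the mean of each resample."""
    return [mean(random.choices(data, k=len(data))) for _ in range(n_resamples)]


means = sorted(bootstrap_means(sample))

# The middle 95% of the resampled means gives a rough interval for the true mean.
lower = means[int(0.025 * len(means))]
upper = means[int(0.975 * len(means))]

print(f"sample mean: {mean(sample):.1f}")
print(f"approximate 95% interval: {lower:.1f} to {upper:.1f}")
```

Changing the sample size from 30 to 300 and rerunning shows the interval narrowing, which conveys how sample size affects confidence without requiring the formula first.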
In conclusion, with the Minimal Viable Learning mindset, you should be able to move quickly to deal with real data and real questions to get your Data Science skills development underway.
Developing your communication skills
Insights from data or from any other sources are almost worthless unless they influence a decision. To influence a decision, insights must be communicated in a clear, appealing and easy to digest way to stakeholders.
The first step in communication is to know your audience. If your stakeholders are business oriented, you must be able to speak their language. You must understand the domain of the business. You must understand the business objectives. You must understand the value of answering the question at hand. Without this understanding, you are probably doing an academic exercise that may be interesting but will not bring true business value.
The most effective way to communicate insights from data is to visualise them. Learning how to present your analysis in a way that demonstrates how these insights can be used to make a business decision is crucial to your role as a Data Scientist.
Communicating effectively is just the start of being an effective Data Scientist. Collaborating with others within your organisation to learn, gain buy-in and win support is also crucial.
Leveraging your existing expertise and strengths
Unless you are a fresh graduate from university, you are likely to have previous experience in other fields. Because of the popularity of the Data Scientist role, many people from IT or business are seeking to add this skill to their portfolio.
The good news is that whatever your previous experience is, it is likely to be valuable to your new aspiration. Performing the Data Scientist role requires a very wide range of skills and knowledge. It is unlikely that somebody would have all these skills when they start. Examine how your current expertise can be leveraged to effectively perform the role of a Data Scientist.
So, leverage your current skills to differentiate yourself and strengthen your credentials for a role as a Data Scientist.
Creating your own Data Science project portfolio
Using the above learning strategies, you should quickly reach a stage where you are doing Data Science projects. Collect these in a form that you can share with others. The best form for this is Jupyter notebooks.
Select projects that are in the domain of your expertise and your interests and share them publicly on GitHub. These projects will be an essential part of your resume.
Summary, conclusion and call to action
This article aims to help you be an effective Data Scientist in the industry. The above strategies are proven to help aspiring Data Scientists make headway, not just towards getting a job, but also towards becoming an effective Data Scientist.
I encourage anyone who finds the Data Scientist job appealing to apply these strategies to learn and advance their knowledge and skills in this exciting field.