Intuition is probably the most undervalued quality in data science, even though it has played a prominent role in science throughout the years. Even in mathematics, intuition is very important, since it illuminates avenues of research and novel ways of tackling a particular problem. Yet although intuition is held in high regard in most scientific fields today, in data science it is not valued much, especially lately, when the emphasis is on the engineering and modeling aspects of the field.
Data science involves a lot of nitty-gritty work, which is why Kaggle competitions are a bit misleading when it comes to introducing the field to newcomers. Despite their practical value, they emphasize one particular aspect of data science (the most interesting one), creating the belief that it’s all about clever feature engineering and models. So, when people go deeper into the field, they tend to shift to the other extreme and focus on the data engineering aspects of it, which constitute well over 80% of the actual work a data scientist does. Preparing the data, formatting it, playing around with the variables, and turning them into features are all parts of the data engineering stage of the pipeline, and they require more grit than intuition or even intelligence. That’s all fine, but many people forget to get back to the bigger picture afterwards: what data science is all about. If you are thinking “insights” at this point, you are on the right track. However, to bridge the gap between the data and these insights, we need some intuition.
We need intuition to figure out the most information-rich features and build them. Without intuition, we wouldn’t be able to figure out which models would be best to try out (contrary to what many people think, there are A LOT of models we can use, not just the more popular ones that appear in textbooks and data science MOOCs). Also, if we are to employ deep learning, which is a great way to tackle the most challenging problems out there, especially if we have a truckload of data at our disposal, then we need intuition there too. It helps us figure out what architecture to employ and how to best leverage the meta-features that these deep ANNs construct once they are trained. Things are not as plug-and-play as some people like to claim, especially when it comes to these modern tools. The need for a broader perspective and strategic thinking, both of which stem from intuition, is evident in all data science projects.
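A hunch about which features are information-rich can be sanity-checked with a simple score. What follows is a minimal sketch, not a prescribed method: it assumes discrete features and uses mutual information as one possible measure of how much a feature tells us about the target. The function names (mutual_information, rank_features) and the toy data are hypothetical and made up purely for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("xs and ys must be non-empty and of equal length")
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # Each term is p(x, y) * log2(p(x, y) / (p(x) * p(y))),
        # written with raw counts to avoid extra divisions.
        mi += (c / n) * log2(c * n / (count_x[x] * count_y[y]))
    return mi


def rank_features(features, target):
    """Return (name, score) pairs sorted from most to least informative."""
    scores = {
        name: mutual_information(values, target)
        for name, values in features.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data (assumption): a binary target and three candidate features.
target = [1, 1, 0, 0, 1, 0, 1, 0]
features = {
    "noise": [0, 1, 0, 1, 0, 0, 1, 1],           # independent of the target
    "partial": [1, 1, 0, 0, 1, 0, 0, 1],         # agrees with the target 6 times out of 8
    "copy_of_target": [1, 1, 0, 0, 1, 0, 1, 0],  # identical to the target
}

ranking = rank_features(features, target)
for name, score in ranking:
    print(f"{name}: {score:.3f} bits")
```

A score like this does not replace intuition. It only gives a quick way to test whether the features we suspect to be promising actually carry signal, and it says nothing about features we have not yet thought of building.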
How do we develop intuition? Well, that’s the million-dollar question. In my experience, it stems from intelligence as well as lateral thinking. When we think about things more openly, much like an artist does, we tend to make more use of the part of our mind that is related to intuition. If you manage to come up with a fairly original way of dealing with the data (even if someone else somewhere has come up with it too), if you figure out some clever heuristic that will cut down the computational cost of your process, and if you build a novel ensemble to harness the signals from various models, then you are using your intuition constructively and you are thinking like a data science creative.
Intuition is closely related to creativity, which is why people who build data science teams often look for this characteristic in their recruits, especially for more senior positions. However, for some reason they don’t use the word “intuition” much, since it has some undesirable connotations. Oftentimes, intuition is considered to be in the domain of pseudo-science, since its fruits are not readily understood by the more down-to-earth practitioners of data science. Nevertheless, intuition has been used successfully by many inventors, debunking the claim that it is the domain of crackpots. The problem is that it is very hard for most people to assess intuition in an individual, which is why it is often neglected in more hands-on fields. However, if you have used your intuition in a project and have come up with a creative approach to it, one that is not only original but also apparent to someone who views your work, then that’s a sign that these people cannot ignore.
So, even if intuition is not so fashionable today, when fancy A.I. tech is all the rage, it still has a place in data science. Just as the fabled warrior-magicians in the Star Wars saga manage to combine mastery of hands-on techniques with an intuitive approach to life (through the Force), so can we, as data scientists, combine technical skill with intuition to tackle the challenges of big data problems and derive actionable insight from the chaotic data we are given.
Zacharias Voulgaris, PhD
Passionate data scientist with a foxy flair when it comes to technology, technique, and tests.