Questions are an integral part of scientific work, especially where research is concerned. Without questions, we can't have experiments, nor the ability to test our observations against our current understanding of the world. Naturally, this applies to all aspects of science, including one of its most modern expressions: data science. This article explores the value of questions in science, how they relate to hypotheses, and how data science comes into the picture.
Questions are what we ask to figure out how things fit into the puzzle of scientific work. They can be general or specific (the latter being more common), and they are always linked to hypotheses in one way or another. Hypotheses are the formal expressions of questions and usually take a true or false label, based on the tests involved. Tests are related to experimentation, whereby we see how well the accumulated evidence (data) fits the assumptions we make. Without questions and hypotheses, we cannot have scientific theories or anything else useful in this paradigm of thought. Note that all this is possible because we are open to being wrong and are genuinely curious to find out what's going on in the area we are investigating.
Questions and hypotheses are also vital because they are a crucial part of the scientific method, the cornerstone of science. Since its inception a few centuries ago, the scientific method has provided the framework for scientific work, particularly in forming new theories and developing a solid understanding of the world. It's closely tied to experimentation since, at its core, science is empirical and relies on observable data rather than predefined ideas. Reasoning from predefined ideas belongs mostly to the realm of philosophy; it's not as prominent as it used to be, although it still has a role in science. The scientific method is precise and relies on a disciplined methodology, but it's also open to creativity. After all, not all questions are deterministic and predictable; some of them arise from a deeper understanding of the world, guided by intuition.
But how does all this relate to data science? First of all, data science is the application of scientific methodologies to problems beyond scientific research. In this sense, it is a broader application of scientific principles and the scientific method, involving data from all domains. Because of this, it's crucial to think about questions and try to answer them as systematically as we can, using the various data analysis methodologies at our disposal. Not all of these methodologies are as clear-cut and easy as statistics. Still, all of them involve models that try to describe the data or predict what would happen when new data comes into play, something fundamental in scientific work.
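To make the path from question to hypothesis to test a bit more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the question ("does a new page layout change the time visitors spend on a page?"), the numbers, the function name permutation_test, and the 0.05 threshold are all made up for this example, and a permutation test is just one of many ways to test such a hypothesis.

```python
import random

def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Estimate a two-sided p-value for the difference in group means.

    Null hypothesis: the group labels don't matter, so shuffling them
    should produce differences as large as the observed one fairly often.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n_a = len(group_a)
    n_b = len(group_b)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)

    pooled = list(group_a) + list(group_b)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign the labels
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if diff >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_permutations

# Hypothetical measurements (minutes on page); not real data.
old_layout = [12.1, 11.8, 12.5, 13.0, 12.2, 11.9, 12.7, 12.4]
new_layout = [13.4, 13.1, 13.9, 12.8, 13.6, 13.3, 14.0, 13.2]

p_value = permutation_test(old_layout, new_layout)
alpha = 0.05  # conventional threshold, chosen here as an assumption
reject_null = p_value < alpha

print(f"p-value: {p_value:.4f}")
print("Reject the null hypothesis" if reject_null else "Fail to reject the null hypothesis")
```

The true or false label comes from comparing the p-value with the chosen threshold. Rejecting the null hypothesis doesn't prove the layout caused the change; it only says the data would be surprising if the labels made no difference.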
If you wish to learn more about this topic and data science's application of the scientific method, check out my latest book, Doing Science Using Data Science. In this book, my co-author and I explore various science-related topics, emphasizing how data science methodologies are applied in practice. The book is accompanied by a series of Jupyter notebooks, using Julia and Python. Check it out when you have a moment. Cheers!
Zacharias Voulgaris, PhD
Passionate data scientist with a foxy approach to technology, particularly related to A.I.