Scientific knowledge is a greatly misunderstood matter, especially today. As we are bombarded with scientific innovations regularly and see scientists featured in various media, or even have films made about them (e.g. the classic "A Beautiful Mind" and "The Theory of Everything"), we may conclude that science is easy or that anyone with enough determination and some intelligence can make it in the scientific world. However, science is anything but easy, and although someone's mental prowess and willpower play an important role, they are not the best predictors of success. Sometimes, people are just at the right place at the right time (as in the case of Einstein). In any case, to get a better understanding of all this, let's break scientific knowledge down into its fundamental components through a high-level model of sorts and see how they fit together to bring it about.
First of all, scientific knowledge is knowledge that comes about through the scientific method, draws on preexisting knowledge or information (e.g. observations or raw data), and concerns a particular problem. The problem may be something concrete (e.g. a machine that can transform chemical energy into work) or abstract (e.g. a mathematical model that explains how two variables relate to each other and how one of them can act as a predictor for the other). A problem may also be a weakness of an existing theory or model that requires further understanding before it can be more widely useful.
Scientific knowledge has three primary aspects: research, fidelity, and application. Research has to do with the integration of information into a theory and/or new knowledge that supplements existing knowledge. This doesn't have to be groundbreaking, since even a meta-analysis on a subject can provide crucial insight into the problem at hand from a more holistic perspective. Perhaps that's why the first steps in a scientific project involve exploration and a critical analysis of the literature. In any case, it's hard to imagine scientific knowledge without research at its core, since otherwise it can become static, dogmatic and even superficial. The variety of approaches to research and the value people place on it in all scientific institutions (including the non-academic ones) attest to that.
Fidelity has to do with developing confidence in something, particularly something new or different from what's already known. If the product of one's research doesn't carry confidence with it and is just speculation or a thin interpretation of the data, it isn't of much use. A scientist needs to attack the new knowledge with everything she's got to ensure that it holds water. That's why experimentation is so important, as is peer review in some cases. The latter is very useful though not always essential: if the experiments are carried out properly and the scientist has no vested interest in the new knowledge (i.e. her intentions are pure), any issues with it will surface sooner rather than later. If the new knowledge remains firm, its fidelity will grow along with the scientist's confidence in what it can do. Naturally, this confidence level will never reach 100%, since in science there is always room for disproving something.
This brings us to the next aspect: application. This involves applying the new knowledge to a real-world problem or some other situation that's somewhat different from the original one. It has to do with making predictions using the new theory, predictions that are of some value to someone beyond the scientist. "Application is the ultimate and most sacred form of theory," a Greek philosopher once wrote. Even if he wasn't a scientist, he must have been on to something. After all, most of today's new scientific output is geared towards applications of one form or another. That's not to say that purely theoretical research has no value. However, purely theoretical research is still knowledge in progress. Once this research finds its way to robustness (through fidelity) and to a model or a physical system that solves a real problem, it will have completed its evolutionary journey.
Naturally, research, fidelity, and application are not isolated from each other. There is a great deal of interaction between them, partly because they are part of an organic whole and partly because one stems from the other. Without research, we cannot talk about fidelity or application. Technology (which is linked to the latter) doesn’t come out of thin air. Also, no matter how much research we do, unless we test the new knowledge to ensure a level of fidelity in it, we cannot use it elsewhere without contaminating the existing pool of knowledge. What’s more, if an application doesn’t work well enough, we often need to go back to the research stage to refine the underlying knowledge. Finally, sometimes it is the application that drives both research and fidelity, giving everything an end goal and a quality standard. Otherwise, we could be researching for research’s sake without ever producing any new knowledge that can benefit others.
Because of all this, it makes sense to picture these three aspects as the corners of a triangle. We can also draw a circle around this triangle as a way to picture another important aspect of scientific knowledge: its scope (see figure below). No scientific theory aims to explain everything, except perhaps some ambitious projects in Physics that aim to unify all existing theories regarding the universe (though none of them has been successful so far, and their chances of success are highly debatable). That’s where scope fits in to put all this into perspective.
In data science, scope is beautifully illustrated by the "No Free Lunch Theorem", which says, roughly, that if a model or algorithm has an edge over the alternatives on one kind of dataset, it is bound to be weaker than those same alternatives on some other kind of dataset. In other words, no model always outperforms all its alternatives, just like there is no car out there that's better than all other cars on all sorts of terrain. Naturally, as the scientific knowledge of a particular domain grows, new knowledge may come about that has a larger scope (e.g. a theory that explains more of the observed phenomena and the corresponding data). Still, most new knowledge tends to have a very specific scope that, although small in relation to the whole domain, is still useful, as there is value in niche systems (think of a pickup truck that, although a bit specialized, addresses certain use cases very effectively and efficiently).
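To make the "No Free Lunch" idea a bit more tangible, here is a minimal toy sketch in Python. Everything in it is an assumption made for illustration and not part of the model described in this article: a tiny domain of four points, two made-up learners (`majority_learner` and `contrarian_learner`), and accuracy measured only on the points the learners never saw. Averaged over every possible way of labeling the four points, the two learners tie; each one only pulls ahead when we narrow the scope to a particular family of problems.

```python
from itertools import product

# Hypothetical toy setup: four inputs, the first two seen in training,
# the last two held out for evaluation.
TRAIN_X = [0, 1]
TEST_X = [2, 3]


def majority_learner(train_labels):
    """Predict the most common training label (ties go to 0)."""
    return 1 if sum(train_labels) * 2 > len(train_labels) else 0


def contrarian_learner(train_labels):
    """Predict the opposite of whatever the majority learner predicts."""
    return 1 - majority_learner(train_labels)


def off_training_accuracy(learner, target):
    """Accuracy of a learner on the held-out points of one target labeling."""
    train_labels = [target[x] for x in TRAIN_X]
    prediction = learner(train_labels)
    hits = sum(prediction == target[x] for x in TEST_X)
    return hits / len(TEST_X)


def mean_accuracy(learner, targets):
    """Average held-out accuracy over a collection of target labelings."""
    return sum(off_training_accuracy(learner, t) for t in targets) / len(targets)


# Every possible binary labeling of the four inputs (16 in total).
ALL_TARGETS = list(product([0, 1], repeat=4))
# Narrower scopes: labelings that are constant everywhere, and labelings
# where the held-out points carry the opposite label of the training points.
CONSTANT_TARGETS = [t for t in ALL_TARGETS if len(set(t)) == 1]
FLIPPED_TARGETS = [
    t for t in ALL_TARGETS if t[0] == t[1] and t[2] == t[3] and t[0] != t[2]
]

if __name__ == "__main__":
    for name, targets in [
        ("all labelings", ALL_TARGETS),
        ("constant labelings", CONSTANT_TARGETS),
        ("flipped labelings", FLIPPED_TARGETS),
    ]:
        print(
            name,
            mean_accuracy(majority_learner, targets),
            mean_accuracy(contrarian_learner, targets),
        )
```

The majority learner is the pickup truck for "constant" problems, and the contrarian is the better vehicle for "flipped" ones; neither has an edge once the scope is widened to every possible problem. This is only a sketch of the flavor of the theorem on a four-point domain, not a statement of its formal version.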
The aforementioned aspects of scientific knowledge are the basis of the high-level model proposed here as an effort to describe it. Naturally, this is just the beginning, as other factors come into play once we examine things from a larger time frame. This, however, is something that deserves its own article, so stay tuned...
Zacharias Voulgaris, PhD
Passionate data scientist with a foxy approach to technology, particularly related to A.I.