Although when people think of math in data science, it’s usually Calculus, Linear Algebra, and Graph Theory that comes to mind, Geometry is also a very important aspect of our craft. After all, once we have formatted the data and turned into a numeric matrix (or a numeric data frame), it’s basically a bunch of points in an m-dimensional space.
Of course, most people don’t linger at this stage to explore the data much since there are various tools that can do that for you. Some people just proceed to data modeling or dimensionality reduction, using PCA or some other method. However, oftentimes we need to look at the data and explore it, something that is done with Clustering to some extent. The now trending methodology of Data Visualization is very relevant here and if you think about it, it is based on Geometry.
Geometry does more than just help us visualize the data though. Many data models use geometry to make sense of the data, for example, particularly those models based on distances. I talked about distances recently, but it’s hard to do the topic justice in a blog post, especially without the context that geometry offers.
Perhaps geometry seems old-fashioned to those people used to fancy methods that other areas of math offer. However, it is through geometry that revolutionary ideas in science took root (e.g. Theory of Relativity) while cutting edge research in Quantum Physics is also using geometry as a way to understand those other dimensions and how the various fundamental particles of our world relate to each other.
In data science, geometry may not be in the limelight, unless you are doing research in the field. However, understanding it can help you gain a better appreciation of the data science work and the possibilities that exist in the field. After all, a serious mistake someone can make when delving into data science is to think that the theory in a course curriculum or some book is all there is to it. When you reduce data science to a set of methods and algorithms you are basically limiting the potential of it and how you can use the field as a data scientist. If however, you maintain a sense of mystery, such as that which geometry can offer, you are bound to have a healthier relationship with the craft and a channel for new ideas. After all, data science is still in its infancy as a field while the best data science methods are yet to come...
Zacharias Voulgaris, PhD
Passionate data scientist with a foxy approach to technology, particularly related to A.I.