This book, which is probably my last solo project, is one I wrote after discussing it with a technical publisher last summer. Although he was quite keen on the idea and willing to offer me a good deal, I decided to stick with my current publisher (Technics Publications) for various reasons. So, a few months later, this book came along. It wasn't an easy journey, since at the same time I had to finalize the Data Scientist Bedside Manner book, which I co-authored with Yunus Bulut. With enough patience and perseverance, however, this one followed shortly after the Bedside Manner book was finalized.
Julia for Machine Learning is a fresh take on Julia's data science potential, with a focus on machine learning models. Although I don't cover AI-based models in it, I make references to Julia packages you can use for them. The book is accompanied by a few Jupyter notebooks as well as three .jl script files containing heuristics never before seen in this language (a couple of them are brand-new ones I developed over the past year, available only in Julia).
Although the book is not yet available on the publisher's website, you can find it on Amazon, in both paperback and Kindle formats. Happy Julia programming!
The concept of antifragility is well established by Dr. Taleb and has even been adopted by the mainstream to some extent (e.g., on Investopedia). It is a vast concept, and it’s unlikely that I can do it justice, especially in a blog post. That’s why I suggest you familiarize yourself with it before reading the rest of this article.
Antifragility is not only desirable but also essential to some extent, particularly when it comes to data science / AI work. Even though most data models are antifragile by nature (particularly the more sophisticated ones, which manage to extract every drop of signal from the data they are given), there are fragilities all over the place in how these models are used. A clear example of this is the computer code around them. I’m not referring to the code used to implement them, which usually comes from specialized packages; that code is fine and usually better than most code found in data science / AI projects. The code around the models, however, be it the code handling ETL work, feature engineering, or even data visualization, may not always be good enough.
Antifragility applies to computer code in various ways. Here are the ones I’ve found so far:
All this may seem like a lot of work, and it may not fit your time constraints, particularly if you have strict deadlines. However, you can always improve your code after you’ve cleared a milestone. This way, you can avoid certain Black Swans, like an error being thrown once your program is already in production. Cheers!
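To make the point above a bit more concrete, here is a minimal sketch of what hardening the "code around the models" can look like in practice. The function name and the rating-range rule are my own illustrative inventions, not something from any particular project; the idea is simply that an ETL step which tolerates malformed records degrades gracefully instead of crashing in production.

```python
import math

def clean_ratings(raw):
    """Return validated ratings as floats, skipping malformed entries
    instead of letting one bad record crash the whole pipeline."""
    cleaned = []
    for value in raw:
        try:
            x = float(value)
        except (TypeError, ValueError):
            continue  # non-numeric entry (e.g. "oops", None): skip it
        if math.isnan(x) or not (0.0 <= x <= 5.0):
            continue  # NaN or outside the assumed 0-5 rating scale: skip it
        cleaned.append(x)
    return cleaned

print(clean_ratings(["4.5", "oops", None, 3, 9.9, float("nan")]))  # [4.5, 3.0]
```

A naive version of this loop would throw on the first `None` it met; this one survives messy inputs and, if you also log what it skips, gets more informative the messier the data is.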
Last week I had to perform a major operation on my computer. Namely, I had to replace the hard drive, as it was failing (regular warnings from the computer’s SMART diagnostics reminded me of that fact). The fact that there wasn’t a single computer shop around that was a) open for business and b) willing to undertake such a task didn’t help things either. So, after waiting about a month for a new drive to arrive by post, I took out my toolbox and began the operation of replacing my computer’s SSD. Naturally, I had backed up all my data beforehand and prepared a USB disk with an OS image installed, so that I could use it once the new drive was in place.
I won’t go into detail regarding the unbelievable challenges this process entailed (from stuck screws to a failing USB disk, to archive files that were apparently corrupt and couldn’t restore their contents to the new drive). Instead, I’d like to focus on the gist of this whole experience, something that’s far more relatable than the specifics of my situation. In essence, this whole situation was a “close to the metal” kind of experience, one that was both grounding and educational in a hands-on sense. Planning things is fairly easy, but executing the plan and improvising alternative routes due to unforeseen (and possibly unforeseeable) circumstances is something we can all learn from. For example, at one point I had to find a different way to get the system running (an alternative USB disk), do a video call with a friend of mine (thanks Matt!) to troubleshoot the issue, and even come up with a contingency for backing up data in the future, so that it’s less prone to issues.
How does all this relate to data science? Well, in data science / AI projects we often have to deal with challenging situations that require us to get out of our comfort zone. We may even need to venture into “closer to the metal” territory, e.g. the OS shell, for ETL tasks and the like. We may also have to re-examine the architecture of the model used (e.g. the number of nodes in each layer, in the case of an ANN), the data used for training the model (do we really need all of the variables / data points?), and other factors we often don’t think about.
Being close to the metal is not something that concerns only programmers or computer technicians. It’s a state of mind that can come in very handy, even in high-level professions such as ours. Just as a good leader in a company has good relations with every echelon of the organization, even people he doesn’t interact with on a regular basis, a good data scientist ought to do the same. Detachment is useful in problem-solving, but let’s not make it our default way of being. Sometimes we need to roll up our sleeves and handle tools we don’t usually use (e.g. the aforementioned screwdriver). With the right attitude, this can be a growth experience. Cheers!
If you don’t know what the word hyperthesis means, don’t worry; it’s a term I came up with myself. Stemming from the Greek “υπέρθεση”, which means “hyperposition” or “superposition” depending on how you translate it, it describes transcendence of the binary state, but in a dynamic context (not to be confused with quantum superposition, which is somewhat different). In other words, it has to do with the controlled oscillation between extreme states until an equilibrium state is attained, at least at a reasonable robustness level predefined in the specs of the project at hand.
The Hyperthesis Principle, therefore, describes the behavior of a complex system that exhibits this hyperthetical behavior. Namely, if a system's state oscillates between two extremes until it reaches an equilibrium of sorts, it exhibits hyperthetical behavior. If this behavior is a function of the parameters of the data the system relies on, then the system can, in theory, attain a stable evolutionary course resulting in equilibrium, namely a robust state.
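As a toy illustration of this idea (the function name and the `step` parameter below are my own inventions, not part of any established method), consider a state that is deliberately pushed past its target on each iteration, so that it swings alternately above and below it, with each swing smaller than the last, until it settles within a predefined tolerance:

```python
def hyperthetic_search(target, start, step=1.6, tol=1e-6, max_iter=100):
    """Oscillate around a target value with shrinking overshoots
    until the state settles within a predefined tolerance."""
    x = start
    trajectory = [x]
    for _ in range(max_iter):
        x = x - step * (x - target)  # step > 1, so each update overshoots the target
        trajectory.append(x)
        if abs(x - target) < tol:
            break  # equilibrium (robust state) reached
    return x, trajectory

x, traj = hyperthetic_search(target=0.5, start=2.0)
# successive states land alternately above and below 0.5,
# each swing 60% the size of the previous one, until x converges
```

With `step` between 1 and 2 the error flips sign and shrinks on every iteration, which is the controlled oscillation toward equilibrium described above; with `step` at 2 or beyond, the swings grow and the system never stabilizes.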
“What does this have to do with data science, doc?” I can hear you say. Well, if you have been reading my blog, you may recall that predictive data models, especially the more sophisticated ones, are in essence complex systems. As such, they may be anywhere on the high bias – high variance spectrum. Now, we can tweak their parameters like a drunkard, hoping we get them right, or we can do so through an understanding of the data and the model at hand. One way to accomplish the latter is through grid search, though this may not always be easy or computationally affordable. Imagine an SVM, for example, trained on a large dataset. It may take a while to find the optimal parameters for that model through a grid search, which is why we often resort to more stochastic approaches. This is where AI creeps in, even if we don't call it that. Whenever a sophisticated optimization method is applied, the system exhibits a form of rudimentary intelligence; the more advanced the optimizer, the more it fits the bill, and calling it AI comes effortlessly.
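The contrast between the two search strategies can be sketched in a few lines. To keep the example self-contained, a synthetic error surface stands in for the cross-validated error of an actual SVM (where the two parameters would typically be C and gamma); the function names and ranges are illustrative assumptions, not any library's API.

```python
import itertools
import random

# Stand-in for cross-validated model error; in a real project this would
# train and score a model (e.g., an SVM) for each (C, gamma) pair.
def validation_error(C, gamma):
    return (C - 1.0) ** 2 + (gamma - 0.1) ** 2

def grid_search(Cs, gammas):
    """Exhaustive search: cost grows with the product of the grid sizes."""
    return min(itertools.product(Cs, gammas),
               key=lambda pair: validation_error(*pair))

def random_search(n_trials, seed=42):
    """Stochastic alternative: sample the parameter space a fixed number
    of times instead of evaluating every grid point."""
    rng = random.Random(seed)
    candidates = [(rng.uniform(0.01, 10), rng.uniform(0.001, 1))
                  for _ in range(n_trials)]
    return min(candidates, key=lambda pair: validation_error(*pair))

best_grid = grid_search(Cs=[0.1, 1, 10], gammas=[0.01, 0.1, 1])
best_rand = random_search(n_trials=30)
```

The grid version evaluates every combination (9 here, but thousands on a finer grid), while the random version caps the budget at `n_trials` regardless of how finely the space is explored, which is exactly the trade-off that makes stochastic approaches attractive on large datasets.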
Anyway, if we apply intelligence, artificial or otherwise, to a problem like that, we are in essence applying the hyperthesis principle. How well we do so depends on how well we understand the problem we are trying to solve. Being aware of this principle and applying it consciously, however, can greatly facilitate the whole process. After all, all this is done iteratively, oftentimes involving several rounds of training and testing. Setting up the corresponding experiments can be aligned with the aforementioned principle, optimizing the whole process. So, instead of tweaking the model haphazardly, we make changes to it that make sense, navigating it toward a point in the parameter space that optimizes performance and robustness.
Understanding all this is the most important step in truly understanding AI and allowing this understanding to enhance our thinking. Also, it is at the core of the data science mindset. Cheers!
Zacharias Voulgaris, PhD
Passionate data scientist with a foxy approach to technology, particularly related to A.I.