I’ve talked about cyber security in the past and even described what a robust cyber security system would look like (namely the system I dubbed Thunderstorm). However, I never really talked about how essential cyber security is in a data science setting (not on this blog anyway). Is it truly necessary, or just a nice-to-have?
Clearly, a lot of work is done off-site nowadays when it comes to data analytics on big data. This doesn't mean that data scientists always work remotely (though that would be quite feasible, if management approved). Rather, a lot of the heavy lifting is done on the cloud or on computer clusters that may not be where the data scientists work. So, in order for the transfer of data (both raw and processed) to be safe and private, a certain level of security is necessary. This usually takes the form of SSL/TLS, VPN, or other secure channels for moving information efficiently and without much risk (there is always some risk, especially with today’s black-hat hackers!).
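To make the "secure channel" idea a bit more concrete, here is a minimal sketch in Python of a client-side TLS configuration that a data pipeline could use when pulling data from a remote server. The function name `make_strict_tls_context` and the choice of TLS 1.2 as the minimum version are assumptions of mine for illustration, not a recommendation tailored to any particular setup.

```python
import ssl


def make_strict_tls_context() -> ssl.SSLContext:
    """Build a client-side TLS context (hypothetical helper, for illustration)."""
    # create_default_context() turns on certificate verification and
    # hostname checking, using the system's trusted CA certificates.
    context = ssl.create_default_context()
    # Assumption: refuse protocol versions older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = make_strict_tls_context()
```

A context like this can then be handed to a standard library call such as `urllib.request.urlopen(url, context=context)`, so that the connection fails if the server's certificate cannot be verified.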
Of course, the need for cyber security increases when members of the team need to exchange information, data, and programming code related to the data science project at hand. These people may be in the same general location, but more often than not, their data exchanges take place over the Internet (e.g. over the company’s private GitHub account). Fortunately, most of these systems have security embedded in them, so cyber security in this case is not something we become aware of. Imagine how things would change, though, if the embedded security protocol got broken and you had to rely on a different avenue for exchanging all this sensitive information with your colleagues.
Beyond these more or less apparent cases where cyber security is necessary, there are several others that are more relevant to data science tasks. For example, anonymization of data falls under the same umbrella, although it is quite different from the aforementioned cyber security processes. Naturally, the space of a blog article is insufficient for doing this topic justice. Suffice it to say that cyber security may be ubiquitous, but it’s definitely not something to take for granted. If you can enrich your knowledge of this discipline, I’d urge you to do so. One way to manage that (or at least get started) is through my Ethics for Data Science video, as well as my latest book.
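As a small illustration of the anonymization point above, here is a rough Python sketch of pseudonymization, where a direct identifier is replaced by a keyed hash (HMAC-SHA256) before the data is shared. The records, the field name, the helper names, and the key are all made up for this example. Note that this is only one building block: pseudonymized data can still be re-identified through other fields, so it does not amount to full anonymization on its own.

```python
import hashlib
import hmac


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed hash (hypothetical helper)."""
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def pseudonymize_records(records, field, secret_key):
    """Return copies of the records with one field pseudonymized."""
    return [{**r, field: pseudonymize(r[field], secret_key)} for r in records]


# Made-up sample data, for illustration only.
records = [
    {"email": "alice@example.com", "purchases": 3},
    {"email": "bob@example.com", "purchases": 7},
    {"email": "alice@example.com", "purchases": 1},
]

# Assumption: in practice the key would come from a secrets store,
# never from the source code itself.
key = b"hypothetical-demo-key"

masked = pseudonymize_records(records, "email", key)
```

Because the same input and key always give the same hash, records belonging to the same person can still be grouped or joined after masking, while the original e-mail address is no longer visible in the shared data.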
Zacharias Voulgaris, PhD
Passionate data scientist with a foxy flair when it comes to technology, technique, and tests.