Docker, Micro-services, and Cloud Computing in General, for Data Science Projects
Cloud computing has taken the world by storm lately, as it has effectively democratized computing power and storage. This has inevitably impacted data science work too, especially when it comes to the deployment stage of the pipeline. Two of the most popular methods for deployment are containers and micro-services, both of which enable your programs to run on a remote server with minimal overhead.
The value of this technology is immense. Apart from the cost savings that derive from the overhead reduction, it makes the whole process easier and faster. Getting acquainted with a container program like Docker is no more challenging than learning any other software a data scientist uses, and there are lots of Docker images available for the vast majority of applications (including a variety of open-source OSes). Cloud computing in general is quite accessible, especially if you consider the options companies like Amazon offer.
The key value-add of all this is that, as a data scientist, you can now deploy a model or system as an application or an API on the cloud, where it can live as long as necessary, without burdening your coworkers or the company’s servers. Also, through this deployment method, you can scale up your program as needed, without having to worry about computational bandwidth and other such limitations.
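To make the idea of "a model as an API" a bit more concrete, here is a minimal sketch using only Python's standard library. It is not a recommended production setup, just an illustration of the kind of program you might package into a Docker image. The names `predict`, `WEIGHTS`, `BIAS`, `PredictHandler`, and `serve`, as well as port 8000, are all hypothetical, and the "model" is a toy linear scorer standing in for whatever you have actually trained.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical stand-in for a trained model: a fixed linear scorer.
WEIGHTS = [0.5, -0.25]
BIAS = 1.0


def predict(features):
    """Return a score for one observation (placeholder for a real model)."""
    return BIAS + sum(w * x for w, x in zip(WEIGHTS, features))


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON request body, e.g. {"features": [2.0, 4.0]}
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        result = {"prediction": predict(payload.get("features", []))}
        body = json.dumps(result).encode("utf-8")
        # Send the prediction back as JSON
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    # Bind to all interfaces so the service is reachable from outside a container.
    HTTPServer(("0.0.0.0", port), PredictHandler).serve_forever()
```

Calling `serve()` starts the service, and any client that can reach the port can then POST a JSON payload to it and get a prediction back. In practice you would probably swap the toy scorer for a real model and use a proper web framework, but the overall shape stays the same.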
Cloud computing can also be quite useful when it comes to storage. Many databases nowadays are available on the cloud, since it's much easier to store and maintain data there, and most cloud storage providers offer a decent level of cybersecurity. Also, having the data live on the cloud makes it easier to share it with the aforementioned data products deployed as Docker images, for example. In any case, such solutions are more appealing for companies today since not many of them can afford to have their own data center or other in-house solutions.
Of course, all this is the situation today. How are things going to fare in the years to come? Given that data science projects may span a long time (particularly if they are successful), it makes sense to think about this thoroughly before investing in cloud technology. Considering that more and more people are working remotely these days (either from home due to COVID-19, or from a remote location because of a lifestyle choice), cloud computing is bound to remain popular. Also, as more cloud-based solutions become available (e.g. Kubernetes), this trend is likely to continue and even expand in the foreseeable future.
Hopefully, it has become clear from all this that there are several angles to a data science project, beyond data wrangling and modeling. Unfortunately, not many people in our field try to explain this aspect of our work in a manner that's comprehensible and thorough enough. Fortunately, a fellow data scientist and I have attempted to fill this gap through one of our books: Data Scientist Bedside Manner. In it, we talk about all these topics and outline how a data science project can come to fruition. Feel free to check it out. Cheers!
Remote Working from Lisbon
Being quite international, I often travel, and as I got a bit restless lately, I decided to travel more. So, these months I’m on the road, so to speak, as I work remotely. Since most of my work activities lately revolve around my new book (co-authored with Yunus E. Bulut) for Technics Publications, I can work from anywhere and do so fairly easily. So, for this month or so I’m in Lisbon, Portugal.
Working remotely isn’t easy, but if you are adaptable and flexible, it’s quite feasible. Besides, the companies I work with are quite trusting and flexible, so working for them remotely is not only feasible but preferable. Although it’s much easier in places like the US or the UK, where internet connections are reliable and fairly fast, it is possible to work in other places too, as long as I feel comfortable enough with the language and the everyday routine. Basically, the main things one needs are a temporary office and a good internet connection, as well as places to hang out and make the most of one’s free time. Fortunately, Lisbon offers all of that.
At first I looked at co-working spaces, but I decided against them afterwards. The one I liked the most (at least on paper) was quite challenging to get to (you have to take the elevator from the nearby building, walk down a long corridor, climb some stairs, and then hope you’ll be let into the office space itself). The fact that the people there didn't make much of an effort to help with any of that (they somehow assumed you’d intuitively find your way in, as if you were a detective in training!) discouraged me from using that space. Also, the fact that they didn't reply to my email made me think that they weren't really that professional. I did find another co-working space where people were more professional, but it was quite far from where I’m staying and I didn't want to take a cab every day to get there. So, I ended up working from a nice coffee shop in a trendy spot of the city instead.
Even though co-working spaces were not a viable option for me in Lisbon, I have found the city very enjoyable so far. It’s much cooler than Bologna (temperature-wise), people are very friendly, and well, there is access to the ocean. What more could someone ask of a city if he’s staying there for a month? Now, I don’t know how the place is in the winter time, but I’d rather keep it this way. The houses here are not so great with insulation, and since it seems that most of the people visiting Lisbon do so in the summertime, I’d expect it to be less bustling with activity in the winter. Nevertheless, since it’s quite far south, it’s bound to be warmer and sunnier than other parts of the continent.
The internet connections here are surprisingly good. At least they are good enough for a video conference and that’s good enough for me. If you want to upload or download really large files it may take a while, but here the pace of life is slower, so it doesn't seem much of a problem if you need to wait a few more minutes for syncing some files with the cloud.
Lately I have come across various digital nomads who live and work in Lisbon. Some of them were more on the expat side of the spectrum, but all of them were very interesting and fun to talk to. It's also interesting that they were in a variety of professions, so the idea that you have to be a developer in order to have this lifestyle doesn't hold any water.
With remote work becoming more and more acceptable in various data science related organizations, staying at cool destinations is an increasingly appealing option. If you find yourself in that boat, Lisbon is definitely a place to consider, particularly if you are big on cities with character and beautiful natural scenery, and especially during the summer time.
Zacharias Voulgaris, PhD
Passionate data scientist with a foxy approach to technology, particularly related to A.I.