Being quite international, I travel often, and as I got a bit restless lately I decided to travel more. So, these months I’m on the road, so to speak, working remotely. Since most of my work activities lately revolve around my new book (co-authored with Yunus E. Bulut) for Technics Publications, I can work from anywhere and do so fairly easily. So, for this month or so I’m in Lisbon, Portugal.
Working remotely isn’t easy, but if you are adaptable and flexible, it’s quite feasible. Besides, the companies I work with are quite trusting and flexible, so working for them remotely is not only feasible but preferable. Although it’s much easier in places like the US or the UK, where internet connections are reliable and fairly fast, it is possible to work in other places too, as long as I feel comfortable enough with the language and the everyday routine. Basically, the main things one needs are a temporary office and a good internet connection, as well as places to hang out and make the most of one’s free time. Fortunately, Lisbon offers all of that.
At first I looked at co-working spaces, but I decided against them afterwards. The one I liked the most (at least on paper) was quite challenging to get to: you have to take the elevator from the nearby building, walk down a long corridor, climb some stairs, and then hope you’ll be let into the office space itself. The fact that the people there didn't make much of an effort to help with any of that (they somehow assumed you’d intuitively find your way in, as if you were a detective in training!) discouraged me from using that space. Also, the fact that they didn't reply to my email made me think that they weren't really that professional. I did find another co-working space where people were more professional, but it was quite far from where I’m staying and I didn't want to take a cab every day to get there. So, I ended up working from a nice coffee shop in a trendy spot of the city instead.
Even though co-working spaces were not a viable option for me in Lisbon, I have found the city very enjoyable so far. It’s much cooler than Bologna (temperature-wise), people are very friendly, and, well, there is access to the ocean. What more could someone ask of a city if he’s staying there for a month? Now, I don’t know what the place is like in the winter, and I’d rather keep it that way. The houses here are not so great with insulation, and since most of the people visiting Lisbon seem to do so in the summertime, I’d expect it to be less bustling with activity then. Nevertheless, since it’s quite far south, it’s bound to be warmer and sunnier than other parts of the continent.
The internet connections here are surprisingly good. At least they are good enough for a video conference and that’s good enough for me. If you want to upload or download really large files it may take a while, but here the pace of life is slower, so it doesn't seem much of a problem if you need to wait a few more minutes for syncing some files with the cloud.
Lately I came across various digital nomads who live and work in Lisbon. Some of them were more on the expat side of the spectrum, but all of them were very interesting and fun to talk to. It's also interesting that they worked in a variety of professions, so the idea that you have to be a developer in order to have this lifestyle doesn't hold any water.
With remote work becoming more and more acceptable in various data science related organizations, staying at cool destinations is a more appealing option. If you find yourself in that boat, Lisbon is definitely a place to consider, particularly during the summer and if you are big on cities with character and beautiful natural scenery.
Recently I attended JuliaCon 2018, a conference about the Julia language. There, people talked about the various cool things the language has to offer and how it benefits the world (not just the scientific world but the other parts of the world too). Yet, as often happens at open-minded conferences like this one, some unusual ideas and insights floated around during the more relaxed parts of the conference. One such idea was the Nim language (formerly known as Nimrod), a very promising alternative to Julia, which one Julia user spoke very highly of.
As I’m by no means married to Julia, I always explore alternatives to it, since my commitment is to science, not to the tools for it. So, even though Julia was at an all-time high in terms of popularity that week, I found myself investigating the merits of Nim, partly out of curiosity and partly because it seemed like a more powerful language than the tools that dominate the data science scene these days.
I’m still investigating this language, but so far I’ve found out various things about it that I believe are worth sharing. First of all, Nim is like C but friendlier, so it’s basically a high-level language (much like Julia) that exhibits low-level language performance. This high performance stems from the fact that Nim code compiles to C, something unusual for a high-level language.
Since I didn’t know about Nim before then, I thought that it was a Julia clone or something, but then I discovered that it is actually older than Julia (by about 4 years). So, how come few people have heard about it? Well, unlike Julia, Nim doesn’t have a large user community, nor is it backed by a company. Therefore, progress in its code base is somewhat slower. Also, unlike Julia, it’s still in version 0.x (with x being 18 at the time of this writing). In other words, it’s not considered production ready.
Who cares though? If Nim is as powerful as it is shown to be, it could still be useful in data science and A.I., right? Well, theoretically yes, but I don’t see it happening soon. The reason is three-fold. First of all, there are not many libraries in that language, and as data scientists love libraries, it’s hard for the language to be anyone’s favorite. Also, there isn’t a REPL yet, so for a Nim script to run you need to compile it first. Finally, Nim doesn’t integrate with popular environments such as Jupyter and Atom, and as data scientists love their IDEs, it’s quite difficult for Nim to win over many professionals in our field without such integration.
Beyond these reasons, there are several more that make Nim an interesting but not particularly viable option for a data science / A.I. practitioner. Nevertheless, the language holds a lot of promise for various other applications, and the fact that it’s been around for so long (esp. considering that it exists without a company to support its development) is quite commendable. What’s more, there is at least one book out there on the language, so there must be a market for it, albeit quite a niche one.
So, should you try Nim? Sure. After all, the latest release of it seems quite stable. Should you use it for data science or A.I. though? Well, unless you are really fond of developing data science / A.I. libraries from scratch, you may want to wait a bit.
The previous week was intense, as I was working on part of the proposal for a new project, attending a conference, and figuring out some things about my publication-related endeavors. With all that in mind, it was natural that I didn’t post anything on the blog, even though I wanted to. However, as my focus is always on quality, I didn’t want to just publish a rushed post or a simple announcement. That’s why I waited until now to get a new post out.
The Event of the Decade
On 8/8/18 the new release of Julia came out. This wasn’t just any release though, but the big one: 1.0. It is really hard to overestimate the importance of this release, even if the most conservative Julia users still feel that it will take a few months before the full force of v. 1.0 reaches the world. After all, just because Julia is now production ready, it doesn’t mean that everyone using it can benefit from this in the same way, since the packages people depend on may take some time before they are fully compatible with the new release. Nevertheless, those of us who prefer to rely primarily on our own code can experience the benefits of Julia right now. Whatever the case, the fact is that Julia has now entered a new era, since it has proven itself to be robust and faster than ever before.
To give you an example of that, at the conference there was a talk about how Julia is applied in Robotics, via a specialized package a Robotics researcher developed recently. Even though he had worked with C++ before for the same project, he eventually shifted to Julia for the vast majority of the code, since it was good enough (i.e. sufficiently fast and reliable) to perform challenging optimization-related tasks in real-time. To be exact, the operations were 36% faster than real-time, enabling a robot operation frequency of 1000 Hz, at least in the simulations he was conducting. At the time of this writing, no other language has accomplished that without having significant dependencies on C libraries.
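To put those figures in perspective, here is a back-of-the-envelope sketch in Python (not code from the talk). It assumes that "36% faster than real-time" means the computation advances 1.36 time units for every unit of wall-clock time; that interpretation is an assumption, and all the names below are made up for illustration.

```python
# Back-of-the-envelope check of the real-time budget (illustrative only).
CONTROL_RATE_HZ = 1000          # operation frequency quoted in the post
SPEEDUP_OVER_REAL_TIME = 1.36   # assumption: "36% faster" = 1.36x real-time

# Time available for each control cycle, in milliseconds.
budget_ms = 1000.0 / CONTROL_RATE_HZ

# Time spent computing in each cycle under the assumed speed-up.
compute_ms = budget_ms / SPEEDUP_OVER_REAL_TIME

# Spare time left over in each cycle.
headroom_ms = budget_ms - compute_ms

print(f"budget per cycle:  {budget_ms:.3f} ms")
print(f"compute per cycle: {compute_ms:.3f} ms")
print(f"headroom:          {headroom_ms:.3f} ms")
```

Under that assumption, each 1 ms control cycle needs roughly 0.74 ms of computation, which leaves about a quarter of a millisecond to spare.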
Ramifications of Version 1.0 in Data Science and A.I.
But how does all this affect us, as data science and A.I. professionals? Well, Julia isn’t evolving merely on the Base package or the fairly niche application of Robotics. In fact, there are now full-fledged packages that cover a variety of data science related applications, including deep learning models. At the conference there was a talk about the Knet package, for example, which is a deep learning package built entirely on Julia. Personally, I don’t know any other deep learning tool that has been built entirely on a data science language (I don’t consider C++ to be such a language, by the way, since data scientists tend to use high-level languages mainly). What’s more, this deep learning tool has performance comparable to that of other, more established frameworks, while in one of the benchmarks it outperformed all of them.
But data science is not just deep learning. There is a significant part of it that has to do with more conventional methods, mainly deriving from Statistics. What about Julia’s role in all that? Well, Julia has a number of fairly mature packages in Stats, including Bayesian Stats. What’s more, there is a new book being written right now on Stats with Julia, by a couple of academics who teach Stats at a university in Australia. So, it’s safe to say that Julia is pretty evolved in this aspect of data science too.
More specialized parts of data science, such as Graph Analytics, also have corresponding packages in Julia, while the LightGraphs package I talked about in my Julia for Data Science book is still out there, now better than ever. Data engineering packages also exist, and there are several packages on optimization too, something data science can benefit from greatly when tackling its more challenging problems.
From all this, I believe it’s fair to say that the age-old argument that “Julia is not ready for DS / A.I. because x, y, z” is now as ridiculous as the belief that the number of available libraries is what makes a language more suitable for data science. Sure, packages can help, but it’s mostly due to their quality, not their quantity, while how fast a language runs is an important factor when analyzing the truckload of factors in a modern data model. That’s not to say that Python, Scala, and other data science languages are not useful any more, but ignoring the value of Julia in the data science / A.I. arena is silly and to some extent unprofessional.
Recently I decided to do something a bit more experimental, which very few people have tried covering in a video. So, I tackled a more niche sub-topic of Natural Language Processing, related to custom-made features and their construction. Despite its seemingly simple nature, this skill is something that can differentiate you from a newcomer in NLP. This A.I. video assumes some knowledge of NLP but you don’t need to be a seasoned data scientist to follow. Also, I provide several examples, as well as an original taxonomy to help you organize all this information in your mind. So, check it out on Safari when you have a moment.
Note that a subscription to the Safari portal is required in order to view the video in its entirety. With the subscription you have access to a large number of books and videos, across various publishers and domains.
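As a rough illustration of what custom-made features can look like, here is a minimal Python sketch that turns a piece of text into a handful of hand-crafted numeric features. It is not taken from the video; the function name `custom_features` and the particular features chosen are hypothetical.

```python
import string


def custom_features(text: str) -> dict:
    """Compute a few hand-crafted features for a piece of text (illustrative)."""
    words = text.split()
    n_words = len(words)
    n_chars = len(text)

    # Word lengths, ignoring punctuation attached to either end of a word.
    word_lengths = [len(w.strip(string.punctuation)) for w in words]

    return {
        "n_words": n_words,
        "avg_word_length": sum(word_lengths) / n_words if n_words else 0.0,
        "uppercase_ratio": sum(c.isupper() for c in text) / n_chars if n_chars else 0.0,
        "n_exclamations": text.count("!"),
        "n_questions": text.count("?"),
    }


example = "Julia 1.0 is out! Have you tried it yet?"
features = custom_features(example)
for name, value in features.items():
    print(f"{name}: {value}")
```

Features like these can be appended to whatever representation a model already uses; which ones are worth building depends on the problem at hand.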
Zacharias Voulgaris, PhD
Passionate data scientist with a foxy approach to technology, particularly related to A.I.