Recently I had a nice chat with a fellow data scientist who works at LinkedIn. After bouncing some ideas off him, I decided to make another video, based on a topic of mutual interest, partly to demonstrate to him how straightforward the process is once you have done the research on the topic. This video is now published on Safari here (subscription required). Enjoy!
With so many options for publishing videos online nowadays, someone may wonder “why would I want to jump through hoops to get something published on Safari?” This is a valid question, and it’s equivalent to asking “why should I get published through a publishing house when I can self-publish on Amazon, or some other platform?” Although there is merit in self-publishing, there are two main issues with it: quality assurance (QA), and marketing.
Before I get into the details of all this, let me inform you that I've been down the self-publishing path and it wasn't as glamorous as people make it out to be. I published not just 1, but 3 e-books, created a website for them, and even hired people to help promote them. A few years later the only real benefit I've seen through all this was the experience I’d gained through the whole process. So, if this is your sole motivation, that’s fine. If you however want to make enough money to make the whole thing worthwhile, then there are better options out there.
Getting published on Safari (or any other professional video platform) ensures a certain quality standard. Of course not all videos there are great, but at least you won’t find many that are a total waste of time or riddled with inaccurate information, like you would on YouTube, for example. The reason is that for a video to get on the Safari site, it first goes through some QA process. If there is an issue about it, you will need to revise it. This doesn't happen often, if you know what you are doing, but it’s a good fail-safe.
Marketing is another matter where platforms like Safari excel. If something is on Safari, people will see it and may watch/read it. If you have a video on YouTube, few people will notice it and even fewer will watch the whole thing. Especially now with the new strict policies that YouTube has adopted, content creators have it hard. Unless you create a lot of content regularly, your exposure on YouTube is bound to be very limited. Of course, if you create a lot of content, the quality is bound to drop, but YouTube doesn't seem to care much about this. As long as they get lots of people watching the videos they host, and keep the ad money rolling, they are fine. And if your vid gets flagged because some oversensitive person finds it problematic for whatever reason, that’s your problem, not YouTube’s.
I’m not trying to say that YouTube is bad. Every video hosting platform has its use cases. However, for quality content that you expect to at least pay for the effort you've put into creating it, a more professional platform like Safari makes more sense. You can create a promo video and put it on YouTube, or Vimeo. But if you spend a week creating a data science or A.I. video, you are better off publishing it through proper channels, like Safari.
To give you an idea of the profits that a Safari video can yield, last year I published a book. I spent about 9 months writing it and editing it. It was considered successful and helped me get some traction in the field, while also promoting the programming language it was about. One of the videos I created and published for Safari yielded about the same revenue. It had taken me about a week to create it and edit it, and I also enjoyed it more, since it felt more like a creative endeavor than work. Since I don’t have a huge following, I doubt that the same video could yield the same revenue if it were published on YouTube or some other open platform.
If you find that you have content you wish to share with the world, in a professional manner, I’d recommend you consider Safari as an option. If you find that it entails too much work and you are unsure as to where you need to start, you can always go through a publisher, like Technics Publications, like I did. As Nelson Mandela eloquently said, “it always seems impossible until it's done.”
Recently someone on LI recommended that I bring more JOY to the world instead of merely complaining about it (I wasn’t complaining but apparently she thought I was!). I’m not an entertainer, nor a psychology expert, but perhaps you don’t need to be in these lines of work in order to bring joy to the people you interact with. I thought about it and decided that perhaps data science could be a source of joy to other people. However, for this to happen, it needs first and foremost to be joyful to you.
Deriving joy from a challenging and oftentimes frustrating procedure such as a data science project is not easy. In fact, many people can’t stand the largest part of the work such a project entails. However, with the right mindset, even the more tedious aspects of the work can be enjoyable (i.e. be conducive to joy). So, what is this mindset that turns boredom to beauty and drudgery to delight?
Although there is no magic formula for making things more enjoyable in data science, if you have the attitude of the data science amateur when you approach a problem, your chances of enjoying it are better. This doesn’t mean being sloppy and checking Stackoverflow or Quora every 5 minutes. The amateur’s attitude is, as the word amateur implies, an attitude based on love for what you are doing. The amateur doesn’t care if they get paid for their work. They may even never get paid, but they do it anyway because they find it fulfilling. It’s like a hobby for them.
However, a data scientist still needs to be professional about her work. There are deadlines, meetings with stakeholders, and of course debugging scripts that throw errors at the worst possible time! Handling these matters takes professionalism, but it doesn’t need to be a mechanical and draining process. If you see part of your work as a data scientist (even the debugging stage) as a learning experience and have what is known in Zen as the beginner’s mind, you are bound to find everything a bit more enjoyable. It’s the joy that comes from detachment and lack of rigid expectations from your work, something that every professional knows.
Remembering all this, especially on a Monday morning, is not as straightforward as it may seem when you think of it. However, being joyful is a matter of perspective and at the end of the day a matter of habit. Aristotle famously said that “virtue is a matter of habit” and some could argue that joy is a kind of virtue. Maybe not something you would put on your resume or talk about in an interview, but definitely something worth keeping in mind in those long mornings when you may be tempted to question your career choices. After all, if you could be joyful about data science as a field once, you can be joyful about data science work too. And if you still feel that you need some help to get your enthusiasm flowing and cultivate a joyful mindset, you can always read my book Data Science – Mindset, Methodologies, and Misconceptions. :-)
When people nowadays talk about A.I., they usually refer to the deep learning methodology and other ANN frameworks. This is great, considering that ANNs were almost considered a dead-end once, due to the inability of technology to help them exhibit their potential. Yet, now computers are more powerful than ever and GPUs are commonplace as add-ons, enabling deep learning and other ANN-based systems to function at greater scales. However, there are some other A.I. methodologies that are equally valid and were in practical use well before ANNs became mainstream. These I refer to as the “hipsters of A.I.” since they were part of the A.I. field before A.I. was cool.
The A.I. hipster methodologies are A.I. frameworks that are not ANN-related. These are systems like Fuzzy Logic (FL), which came about years before ANNs reached a level of development that made them worth using in machine learning. FL systems were used heavily in data analytics, while they were even implemented in hardware. At one point, researchers even experimented with a hybrid system that is part FL and part ANN (this was called ANFIS and was in essence an Artificial Neural network that optimized the membership functions of a Fuzzy Inference System).
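To make the idea of membership functions a bit more tangible, here is a minimal sketch of a fuzzy inference step in plain Python. Everything in it is an assumption made for illustration: the temperature sets, the fan speeds, and the names `triangular`, `fuzzify`, and `fan_speed` are hypothetical, and the rule evaluation is a simple weighted average (in the style of a zero-order Sugeno system), not the method of any particular FL library. In an ANFIS-like hybrid, the numbers defining each triangle are what the neural part would tune.

```python
def triangular(x, a, b, c):
    """Membership degree of x in a triangular fuzzy set with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def fuzzify(temp):
    """Map a crisp temperature (Celsius) to degrees of membership.

    The three sets and their breakpoints are hypothetical values.
    """
    return {
        "cold": triangular(temp, -10.0, 0.0, 15.0),
        "warm": triangular(temp, 10.0, 20.0, 30.0),
        "hot": triangular(temp, 25.0, 35.0, 50.0),
    }


# Hypothetical rules: each fuzzy set maps to a constant fan speed (percent).
FAN_SPEED = {"cold": 0.0, "warm": 40.0, "hot": 100.0}


def fan_speed(temp):
    """Combine the fired rules with a membership-weighted average."""
    memberships = fuzzify(temp)
    total = sum(memberships.values())
    if total == 0.0:
        # Input lies outside every set, so no rule fires in this sketch.
        return 0.0
    weighted = sum(memberships[name] * FAN_SPEED[name] for name in memberships)
    return weighted / total
```

Calling `fan_speed(12.5)` blends the "cold" and "warm" rules, since that temperature belongs partly to both sets. This gradual blending, rather than a hard threshold, is the main appeal of the approach.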
Another hipster methodology is the family of optimization methods. These are systems like Genetic Algorithms, Simulated Annealing, and Particle Swarm Optimization (as well as its many variants). Although the scope of these A.I. fields is limited to finding optima of particular functions (aka fitness functions), their usefulness covers a variety of fields. Even dimensionality reduction processes sometimes make use of GAs or some other optimization tool. Note that these systems are not the same as the analytical optimization methods known from Calculus, since they tackle very complex search spaces, with oftentimes dozens of variables, and use a stochastic process in the back-end.
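As a rough illustration of the stochastic process behind such optimizers, below is a minimal sketch of Simulated Annealing using only Python's standard library. The test function (Rastrigin, a common multimodal benchmark), the cooling schedule, and all parameter values are assumptions picked for the example rather than recommended settings, and here the function is minimized, so it plays the role of a cost rather than a fitness.

```python
import math
import random


def rastrigin(x):
    """Multimodal benchmark function; its global minimum is 0 at the origin."""
    return 10 * len(x) + sum(xi * xi - 10 * math.cos(2 * math.pi * xi) for xi in x)


def simulated_annealing(f, x0, n_iter=20000, t0=10.0, cooling=0.9995,
                        step=0.5, bounds=(-5.12, 5.12), seed=0):
    """Minimize f starting from x0; returns (best_point, best_value).

    All default parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)  # seeded so that runs are repeatable
    lo, hi = bounds
    current = list(x0)
    current_val = f(current)
    best, best_val = list(current), current_val
    t = t0
    for _ in range(n_iter):
        # Propose a random neighbor, clipped to the search bounds.
        candidate = [min(hi, max(lo, xi + rng.gauss(0.0, step))) for xi in current]
        cand_val = f(candidate)
        delta = cand_val - current_val
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature drops.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, current_val = candidate, cand_val
            if current_val < best_val:
                best, best_val = list(current), current_val
        t *= cooling  # geometric cooling schedule
    return best, best_val
```

A call like `simulated_annealing(rastrigin, [4.0, -3.0, 2.5])` returns the best point found and its value. The occasional acceptance of worse solutions is what lets the search escape local optima, something a purely analytical, gradient-following approach would struggle with on a function like this.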
If there is one take-away from these hipster A.I. systems it is that there is more than meets the eye when it comes to artificial intelligence. That’s not to say that deep learning systems are not worth your while, but it’s good to keep an open mind about other A.I. systems that may not be as popular today, but may have played (and still play) an important role in the evolution of the field.
Also, having a solid understanding of A.I. through its various methodologies, allows us to be able to think forward in a creative way. Instead of merely trying to extend the methodologies we know, we may come up with new ones, enriching A.I. in ways that we wouldn't be able to fathom if our understanding were limited to a single A.I. framework. Isn't that what A.I. is about, finding novel ways to solve problems, leveraging clever heuristics and imaginative architectures?
So, my latest video is now available online at the Safari portal. I didn't post this yesterday, as I had already published an article for the blog. As I have been writing more articles than I can get published on DSP, I had to resort to this blog again. Also, I am not currently working on a book, so I have more time for writing for other channels (e.g. this blog, beBee, etc.).
Anyway, if you have a subscription for Safari, check out my video. I’m certain it would be worth your time. As always, I’m open to feedback via the “contact” page of this blog.
Short answer: yes. Longer answer: definitely, as long as they make a conscious effort to cultivate the necessary parts of this mindset and integrate them into a functional whole. Easier said than done, right? Perhaps. Maybe that’s why some companies ask for someone that has 15+ years of experience in the field, even if the field didn't exist 15 years ago! What they may really be asking is for someone who knows what this field entails and knows how to make things happen, using the corresponding methodologies. So, the question that naturally arises is “how can someone get this understanding of the field without having to spend a large part of their career in it?”
There are several strategies to accomplish that, none of which are easy or something that you can learn in a bootcamp. Even really good data science courses may not be sufficient for this purpose. The reason is that the mindset of a data scientist is very diverse and not something you can put into a syllabus. There is a reason why the brightest data science practitioners seek a mentor, or some kind of personal learning experience, in order to gain some kind of mastery of the craft. Yet, as I’ve explained in the Mentoring in Data Science video, the mentor is not there to answer all your questions, even if he could answer most of them. The role of the mentor is to help you become your own mentor eventually. Of course there are exceptional people out there that don’t require a mentor, since they know everything they need to know, or they have the resources and resourcefulness to obtain this knowledge on their own. When I meet one such person I’ll be sure to blog about them!
Apart from being part of a mentorship, you can learn about the mindset of the data scientist by practicing science, in a data analytics setting. This is quite different from taking this or the other tool, applying it, and then creating some insightful visuals from the results. Practicing science also involves conducting experiments, asking deep questions, and challenging yourself and what you know. It’s realizing that all scientific theories are disprovable and not taking anything as gospel, since you are secure in the knowledge that everything in science is in flux. The only thing that’s perhaps immune to this constant change, is the mindset, the essence of the role of the data scientist. One robust way to attain this understanding is to strip away all the transient aspects of the role, one by one, through scientific research. In other words, you need to become the craft, rather than merely practice it like a technician of sorts.
In my latest book I underline several aspects of the data science craft that I’ve found, through both experience and research. They are relevant and useful for bringing about the data science mindset in someone. Of course, it is next to impossible to cover all the angles in a single book, but it is a good start. Applicable to all levels of data science practitioners, this book can at the very least make you fascinated about data science and motivate you to learn more about it, without getting consumed by the techniques or the aspects of it that are more in vogue these days (e.g. artificial intelligence). After all, just like everything else in science, data science is more of a process than anything else. It’s up to you to make it an insightful and intriguing one...
Geometry is probably one of the most undervalued aspects of Mathematics. So much so, that people consider it something that is relevant mainly for those pursuing that particular discipline, as in their minds geometry is divorced from other, more practical fields, such as data analytics. However, geometry has always been an applied discipline, intertwined with engineering. As data science and data analytics in general are closely linked to engineering, at least in certain principles, it makes sense to at least consider the relationship between geometry and data analytics.
Geometry involves the study and use of visual mathematical concepts, such as the line, the circle, and other curves, to solve various problems or prove relationships that may be used to solve other, more complex problems. The latter are referred to as theorems and are the core of the scientific literature of geometry. So, unlike other more theoretical parts of mathematics, geometry is practical at its core since it endeavors to solve real-world problems. Although the latter have become increasingly sophisticated since geometry was in its glory days (antiquity), many problems today still rely on geometry for their solution (e.g. the field of optics, the calculation of trajectories of rockets, and more). Besides, since the times of Descartes, the famous philosopher-mathematician, geometry has become more quantifiable, particularly with his invention of analytical geometry.
Data analytics is in essence a field of applied mathematics, with an emphasis on numeric data, the kind that features heavily in geometry. Although direct connections between the shapes and the proportions of geometry with the data analytics concepts are few and far between, the mindset is very similar. After all, both disciplines require the practitioner to find out some unknown quantity using some known data, in a methodical and logical manner. In geometry, these correspond to a particular point, shape, or mathematical relationship. In data analytics, these are variables that take the form of features (through refinement, selection, and processing in general) and target variables. Of course, data analytics (esp. data science), has a variety of tools available that facilitate all these, while in geometry it’s just the practitioner’s imagination, a pencil, some paper, and a couple of utensils. However, the mental discipline behind both fields is of the same caliber, while creativity plays an important role in both.
I’m not saying that geometry alone will make someone a good data analytics professional, or that you should give up your data science courses to take up geometry. However, if you have the time and you can also see something elegant in geometry problems, then it can be a very useful pastime, much more useful than other, strictly analytical endeavors. After all, imagination hasn't gone out of fashion, at least not in the applied sciences, so anything that can foster this faculty, while at the same time encouraging mental discipline, is bound to be helpful. As a bonus, spending time with geometry is bound to help your visualization skills and enable you to view certain data analytics problems from a different angle (no pun intended). Besides, the same mindset that helped people build pyramids and accomplish several other architectural feats, is what forged many modern algorithms in machine learning, for example, turning some abstract idea or question into something concrete and measurable, be it a design or a process. Isn't that one of the key attributes of a data analytics project?
Why the Role of A.I. in the Job Market Is Very Much a Business Decision Technical Professionals Can Contribute to
Lately there is a lot of talk about AIs potentially taking people’s jobs in the future and how this is either catastrophic, or some kind of utopia (or, less often, some other stance in between). Although we as data science and A.I. professionals have little to do with the high-level decisions that have some influence on this future, perhaps we are not so detached from the reality of the situation. I’m not talking about the A.I. choir that is happy to recite its fantasies about an A.I.-based future that is akin to the sci-fi films that monetize this idea. I’m talking about grounded professionals who have some experience in the development of A.I. systems, be it for data science or other fields of application.
The problem with business decisions is that they are by their nature related to quite complex problems. As such, it is practically impossible to solve them in a clear-cut manner that doesn't invite reactions, or at least some debate. That’s why those individuals who have the courage to make these decisions are paid so handsomely. It’s not the time they put in, but the responsibility they undertake, that makes their role of value. However, it is important to make these decisions as future-proof as possible, something that these individuals may not be able to do on their own. That’s why they have advisors and consultants, after all. Besides, even if some of the decision-makers are technical and can understand the A.I. matters, they may lack the granularity of comprehension that an A.I. professional has.
People who make business decisions often see A.I. as a valuable resource that can help their organization in many ways (particularly cut down on some costs, via automation or increased efficiency in time-consuming or expensive processes). However, they may not always see the implications of these moves and the shortcomings of this, still not yet mature, technology. A.I. systems are not objective, nor immune to errors. After all, most of them are black boxes, so whatever processes they have in place for their outputs are usually beyond our reach, and oftentimes beyond our comprehension. Just like it is impossible to be sure what processes drive our decisions based on our brain patterns, it is perhaps equally challenging to pinpoint how exactly the decisions of an A.I. are forged. That’s something that is probably not properly communicated to the decision makers on A.I. matters, along with the fact that AIs cannot undertake responsibility for these decisions, no matter how sophisticated these marvels of computing are.
Perhaps some more education and investigation into the nature of A.I. and its limitations is essential for everyone who has a say in this matter. It would be irresponsible to expect one set of people to navigate through this on their own and then blame them if their decisions are not good enough or able to withstand the test of time. This is a matter that concerns us all and as such we all need to think about it and find ways to contribute to the corresponding decisions. A.I. can be a great technology and integrate well in the job market, if we approach it responsibly and with views based on facts rather than wishful thinking.
Sentiment Analysis is a popular NLP topic that I've been involved in for a while now. I even wrote an article about it for a friend of mine, who is an editor at a marketing blog. Anyway, after I finally finished my latest book (Technics Publications, ETA: Fall 2017), I had some time to work on a video for Safari Books Online. This video is now online at Safari and is probably going to be followed by similar ones on NLP and NLU related topics. Any suggestions are welcome!
Bugs are terrible and high-level mistakes are even worse! Yet, most data science books out there don't say much about them, or how we can deal with them when they arise in our data science work. Reading these books may give someone the impression that everything in the data science world is smooth and filled with rainbows, something that is (sadly) far from the truth! So, instead of being in denial about this very important matter, we can choose to tackle it calmly and intelligently. This is why I made this video, which is now available on Safari Books Online for everyone interested in having a better and more bug-free data science life. Enjoy!
Zacharias Voulgaris, PhD
Passionate data scientist with a foxy flair when it comes to technology, technique, and tests.