With more and more people getting into data science and AI these days, certain aspects of the field are inevitably over-emphasized while others are neglected. Naturally, those providing the corresponding know-how are not professional educators, even if they are competent practitioners and very knowledgeable people. As a result, a lot of emphasis is given to the technical aspects, such as math and programming related skills, data visualization, etc. What about domain knowledge though? Where does that fit in the whole picture?
Domain knowledge is all that knowledge that is specific to the domain data science or AI is applied on. If you are in the finance industry, it involves economics theory as well as how certain econometric indexes come into play. In the epidemiology sector, it involves some knowledge as to how viruses come about, how they propagate, and their effects on the organisms they exploit. Naturally, even if domain knowledge is specialized, it may play an important role in many cases. How much exactly depends on the problem at hand as well as how deep the data scientist or AI practitioner wants to go into the subject.
Domain knowledge may also include certain business-related aspects that also factor in data science work. Understanding the role of the different individuals who participate in a project is very important, especially if you are tackling a problem that is too complex for data professionals alone. Oftentimes, in projects like this, subject matter experts (SMEs) are utilized and as a data scientist or AI professional you need to liaise with them. This is not always easy as there is limited common ground that can be used as a frame of reference. That's where some general-purpose business knowledge comes in handy.
Naturally, incorporating domain knowledge in a data science project is a challenge in and of itself. Even if you do have this non-technical knowledge, you need to find ways to include it in the project organically, adding value to your analysis. That's why certain questions, particularly high-level questions that the stakeholders may want to be answered, are very important. Pairing these questions with other, more low-level questions that have to do with the data at hand, is crucial. Part of being a holistic, well-rounded data science / AI professional involves being able to accomplish this.
Of course, exploring this vast topic in a single or even multiple blog posts isn’t practical. Besides, how much can someone go into depth about this subject without getting difficult to read, especially if you are accessing this blog site via a mobile device? For this purpose, my co-author and I have gathered all the material we have accumulated on this topic and put it in a more refined form, namely a technical book. We are now at the final stages of this book, which is titled “Data Scientist Bedside Manner” and is published by Technics Publications. The book should be available before the end of the season. Stay tuned for more details...
Lately, I've made some progress on a data science research project I've been working on for the past couple of years. I’ve hinted about it in previous posts, though due to the nature of this work I’ve abstained from going into any details. Besides, most people are not that open to new ideas, unless they are marketed by some established company or some renowned professor.
Anyway, the other day I made a breakthrough in this work, something that can have significant implications in how we deal with private data. What’s more, I've developed a new way of summarizing a dataset (which is innately different from sampling it), with minimal loss of information. This opens new avenues of research and the possibilities of new data science and A.I. methods are vast. Naturally, I'll need to look into this more, so any online writing I do will have to take second priority.
Parallel to that, I’ve been working on another project lately, something I plan to continue for the foreseeable future. However, an important part of it is completed, which I’ll make sure I’ll announce in the next few days.
As a result to all this, I’m now more open to hosting other people’s articles on data science and A.I. topics, given that they are not spammy in any way. Back-links are also acceptable, given that they are towards relevant sites to the articles. So, if you have something you’d like to contribute to the blog, now is a great opportunity to do so.
Whatever the case, I plan to continue writing on this blog albeit at a slower pace for the time being, so stay tuned!
Rhythm in learning is something that most people don't think about, mostly because they take it for granted. If you were educated in a structure-oriented country, like most countries in the West, this would be instilled in you (contrary to countries like Greece where disorder and lack of any functional structure reign supreme). However, even then you may not value it so much because it is not something you're conscious of always. The need to be aware of it and make conscious effort comes about when you are on your own, be it as a freelancer or a learner in a free-form kind of course (i.e. not a university course of a boot camp). And just like any other real needs, this needs to be fulfilled in one way or another.
The idea of this article came about from a real situation, namely a session with one of my mentees. Although she is a very conscientious learner and a very good mentee, she was struggling with rhythm, mostly due to external circumstances in her life. Having been there myself, I advised her accordingly. The distillation of this is what follows.
So, rhythm is not something you need to strive for as it's built-in yourself as an innate characteristic. In other words, it's natural, like breathing and should come by on its own. If it doesn't, it's because you've put something in its way. So, you just need to remove this obstacle and rhythm will start flowing again on its own. This action of removal may take some effort but it's a one-time thing (unless you are in a very demanding situation in your life, in which case you need to re-set your boundaries). But how does rhythm manifest in practice? It's all about being able to do something consistently, even if it's a small amount certain days.
In my experience with writing (a truly challenging task in the long run, particularly when there is a deadline looming over you), I make it a habit of writing a bit every day, even if it's just a single paragraph or the headings and subheadings structure of a new chapter. Sometimes I don't feel like working on a book at all, in which case I take the time to annotate the corresponding Jupyter notebooks or write an article on this blog. Whatever the case, I avoid idleness like the plague since it's the killer of rhythm.
When it comes to learning data science and A.I., rhythm manifests as follows. You cultivate the habit of reading/coding/writing something related to the topic of your study plan or course curriculum. Even a little bit can go a long way since it's not that bit that makes the difference but the maintenance of your momentum. It's generally harder to pick up something that has gone rusty in your mind, particularly coding. However, if you coded a bit the previous day, it's so much easier. If you get stuck somewhere, you can always work on another drill or project. The important thing is to never give up and go idle.
Frustration is oftentimes inevitable but if you leverage it properly, it can be a powerful force as it has elements of willpower in it, willpower that doesn't have a proper outlet and it trapped. This is what can cause the break of rhythm but what can also remedy it. You always have the energy to carry on, even at a slower pace sometimes. You just need to tap into it and apply yourself. That's when having a mentor can do wonders, yet even without one, you can still manage, but with a bit more effort. It's all up to you!
Being a data science author is not a simple matter. With the bookshelves brimming with data science books these days, one may come to think of this as being something easy and accessible to everyone. Perhaps the latter is true since nowadays everyone can publish a data science book through some publisher with very low standards or he can publish the book himself, thanks to Amazon and other sites that are happy to make your book available to everyone. Some people stoop so low as to give away their book for free, something that says more about the quality of their book than it does for their generosity (of course there are exceptions to this, since many academics prefer this approach since the academic publishers make their books inaccessible to most of their students due to the high price tag they force on them). Whatever the case, being a data science author involves more than just putting a book out there for the world to view and perhaps read.
In my experience for the past 10 years or so, authoring a book is quite different to just writing one and making it accessible to the public. Authoring a book is all about providing a certain level of quality and going through the oftentimes exhausting process of revisions and edits, once the first draft is completed. Fortunately, the first book I authored was on something I had spent 5 years working on, namely my PhD project. The book was my PhD thesis, which is much like a normal technical book, though geared towards a more limited audience.
Other books I've authored were mostly through a publisher, except for some ebooks and a novel ("I, AGI: the adventures of an advanced Artificial Intelligence"). Every time it was a challenge of sorts, through one through which I could grow as a writer. Here is a list of the things I learned that are necessary to author a book:
Beyond these, several other things are necessary for authoring a book, perhaps too many to list in a blog article. However, for anyone serious about writing, these are a good place to start. Cheers!
These days I'm working feverishly on a book project so there is no time for any new data science / A.I. related post here. If you want something else to read, feel free to check my articles on beBee, such as the latest one, available here. Parallel to all this, I'm preparing another educational project, something I'll talk more about later on. Stay tuned!
So, the royalties for the last 3-month period came for my self-published novel today ("I, AGI; the adventures of an advanced AI") and they were quite underwhelming. In fact, with the money I received I couldn't even cover my expenses for this book. Yes, I did pay others to help out, such as an editor and someone to handle the formatting that Kindle Publishing expects of its books, including the cover design. After all, I have a lot of respect for my audience, even if probably most of the people who read the book chose to not pay for it (there are loopholes when it comes to Amazon Kindle). Still, the reviews I got about it, from reliable sources like Goodreads, were quite positive, so I must have done something right!
Anyway, I could have published this book elsewhere and perhaps if I had 6 months to a year to spend, I could have found a literary publisher for it (unfortunately my regular publisher doesn't do novels!). Yet, even then it's not really worth it for the revenue a fiction book can bring. After all, the standards for sci-fi these days are quite high and I'm more of a non-fiction author. So, why did I bother with this whole project? Well, mostly because I enjoy writing, all kinds, not just non-fiction. And if you have a story in your head that you wish to share with others, the low revenue that stems from a publication of this story doesn't pose a real obstacle.
Also, and perhaps more importantly, I had a message to get to the world, regarding the safety aspect of A.I. and AGI. Of course, I've made this point through other forms, such as a video on the topic and numerous articles on this blog. However, if you care about reaching as many people as possible, you need to be creative about how you promote your idea. And that's exactly what I did.
So, even if Amazon Kindle is not the most profitable way to publish an ebook, even if the people reading this book probably have dozen other books on their to-read list and are less likely to value it the same way we used to value books before the Internet era, even if people are mesmerized about the benefits of A.I. today and are quite reluctant to view any of the potential shortcomings, I'm glad I published this book. At the very least, it was a learning experience and a way to gauge the literary market first hand. And who knows, if things go well, I may author a sequel to this novel as there is more to the story!
Someone may wonder why would someone write a technical book these days, especially when the earnings of such an endeavor are small and getting smaller. I wondered about the same thing once until eventually, I came to realize some key reasons why such an endeavor is indeed worthwhile, particularly in the Data Science area. I’d like to share with you here the most important of these reasons (benefits), knowing fully well that these are just my own insights and that you’d be able to find your own ones, should you ever consider writing a technical book.
First of all, when you discover something, even if it’s not entirely new, the natural next step is to try to share this with others, be it to advance your career, personal branding, or for whatever other reason drives you. If you are serious about this task, you’d want to make sure that whatever you deliver has a certain quality standard, so writing a book on the topic could help you accomplish that. If you are more inclined to use film as your medium, you may decide to go with a video instead, but that would require a longer time and sufficiently more expertise. So, a technical book would be a more viable option, especially if what you have to say can have enough commercial value to attract a publisher. You as an author may not be motivated by the royalties you’ll receive from this project, but no publisher would publish something that isn’t going to pay for the paper or digital storage it is going to need.
Also, through writing such a book you realize what you don’t know and develop a more balanced approach to the whole subject since you are more aware of what’s out there. The arrogance you may have harbored as a newcomer will gradually give way to humility and a deeper appreciation of the field, as the research required to write this book is bound to cultivate in you. Besides, even the stuff you may know well, you may realize that you may not be able to express comprehensibly, something that the editor will be more than happy to let you know! So, your development as a professional in this field will be (at least) two-fold: related to the knowledge of subtle aspects of the field and related to your ability to express all that effectively and eloquently.
Finally, writing a technical book, particularly one that is marketed professionally by a publisher, enables your thoughts to cross lots of borders, reaching out to people you wouldn’t normally find on your own. This will expose you to a larger variety of feedback for your work that can help you grow further as a professional. Not all of the feedback is going to be useful, but at least some of it is bound to be. Besides, the people who would normally read your work are likely to be people who have valued it enough to pay for it beforehand, be it through the publisher or through a subscription to a technical knowledge platform. Either way, they would most likely be people who are driven by genuine care about the subject, not just curiosity.
I could go on about this for a while, perhaps even write a book on this topic! However, as I respect your time, I’d leave it to this. What other benefits of writing a technical book can you think of? Do they justify to you the undertaking of such a project?
With so many ways to get a book out there, even in a fairly challenging subject such as data science, you may wonder what this process entails and what is the best way to go about it. After all, these days it’s easier than ever to reach an audience online and promote your work, all while branding yourself as a professional in the field.
Writing a book in data science is first and foremost an education initiative, targeting a particular audience. Usually, this is data science learners though it may be other professionals involved in data science, such as managers, developers, etc. A data science book generally tries to explain what data science can do, what its various methodologies are, and how all of that can be useful for solving particular problems (emphasis on the last part!). If you see a book that focuses a lot of the methods, particularly those of a particular methodology, it may be too specialized to be of most audiences, unless you are targeting that particular niche that requires this specific know-how.
A key thing to note when exploring the option of writing a book is a publisher. Even if you prefer to self-publish, your book must be able to compete with other books in this area and a publisher is usually the best way to figure that out. If a publisher is interested in your book, then it’s likely to be somewhat successful. Also, if you are new to book authoring, you may want to start with a publisher since there are a lot of things you’d never learn without one. Also, a book published through a publisher is bound to have more credibility and a larger life-span.
Understandably, you may have explored the various deals publishers make with their authors and figured out that you’ll never make a lot of money by publishing books. Fair enough; you’ll probably never make a living by selling your words (although it is possible still). However, if your book is good, you’ll probably make enough money to justify the time you’ve put into this project. Also, remember that most publishing deals provide you with a passive income, even if the publisher wants you to promote your book to some extent. So, even though you won’t make a lot of cash, you’ll have a revenue stream for the duration of your book’s lifetime.
With all the data science material available on the web these days, acquiring all the relevant information and compiling it into a book is a fairly straight-forward task. However, just because it is fairly feasible, it doesn’t mean that it’s what the readers need. Without someone to guide you through the whole process and give you honest feedback (that’s also useful feedback), it’s really hard to figure out what is necessary to put in the book, what should be included in an appendix, and what should be mentioned in a link. Your readers may or may not be able to provide you with this information, while if your main means of interacting with them is how many of them download your book or visit your website, you are just satisfying your ego!
A publisher's honest feedback often hurts but that’s what gradually turns you into a real author, namely one who has some authority in his/her written works. Otherwise, you’ll be yet another writer, which is fine if you just want to talk about writing a book or how you have written a book that you have on Amazon, things that are bound to be forgotten quicker than you may think…
After noticing a subtle but clear gap in the data science education of today, and after discussing this matter with a couple of my associates, I decided that a new data science book would be in order. So, after some negotiations and refinements of this idea, over the space of 3 months, we are now ready to initiate this publication project. So, once the paperwork is done, I'll be working on a new title, one that would appeal to a large audience of data science related professionals. We expect the first draft to be ready by the beginning of summer, and if all goes well, the book should be available for purchase by early autumn.
A big thanks to my publisher Technics Publications and to all of you, particularly those buying my books and watching the videos of mine that are made available on Safari. Cheers!
I've been writing a lot about A.I. lately and AGI has been a recurring topic lately. Although the possibility of this technology becoming a reality is still a bit futuristic, we can still ponder on the possibility and explore how such an A.I. system would affect us. Hence this fiction book I wrote in the past few months and published this week on Amazon (Kindle version only). Feel free to check it out when you have a minute.
The book is dedicated to researchers of A.I. Safety.
Zacharias Voulgaris, PhD
Passionate data scientist with a foxy approach to technology, particularly related to A.I.