Being a data science author is not a simple matter. With the bookshelves brimming with data science books these days, one may come to think of this as something easy and accessible to everyone. Perhaps the latter is true, since nowadays anyone can publish a data science book through a publisher with very low standards, or self-publish it, thanks to Amazon and other sites that are happy to make your book available to everyone. Some people stoop so low as to give away their book for free, something that says more about the quality of their book than about their generosity. Of course, there are exceptions to this: many academics prefer this approach because academic publishers make their books inaccessible to most of their students through the high price tag they put on them. Whatever the case, being a data science author involves more than just putting a book out there for the world to view and perhaps read.
In my experience over the past 10 years or so, authoring a book is quite different from just writing one and making it accessible to the public. Authoring a book is all about providing a certain level of quality and going through the often exhausting process of revisions and edits once the first draft is completed. Fortunately, the first book I authored was on something I had spent 5 years working on, namely my PhD project. The book was my PhD thesis, which is much like a normal technical book, though geared towards a more limited audience.
Other books I've authored were mostly through a publisher, except for some ebooks and a novel ("I, AGI: the adventures of an advanced Artificial Intelligence"). Every time it was a challenge of sorts, though one through which I could grow as a writer. Here is a list of the things I learned that are necessary to author a book:
Beyond these, several other things are necessary for authoring a book, perhaps too many to list in a blog article. However, for anyone serious about writing, these are a good place to start. Cheers!
I have been on vacation these past few days (this image should give you a hint!), so unfortunately there is no post this week. However, as of next week (or even later this week, depending on my workload), I should have something for you. In the meantime, you can check out some of my older posts. Until next time!
These days I'm working feverishly on a book project so there is no time for any new data science / A.I. related post here. If you want something else to read, feel free to check my articles on beBee, such as the latest one, available here. Parallel to all this, I'm preparing another educational project, something I'll talk more about later on. Stay tuned!
So, recently I decided to make a video on this topic, based on some things I've observed in data science candidates. The hope is that this may help them and anyone else who may be looking into becoming a more holistic data scientist, instead of just a data science technician. The video I made is now available online on O'Reilly and although it's a bit longer than others I've made (not counting the quiz ones), it's fairly easy to follow. Enjoy!
Everyone wants to do business, especially when it comes to data science. The more someone is aware of the merits of this field and the value it can bring, the keener that person usually is. Whether it is for a hands-on project or something more high level, the more people get to know you and what you can do for them, the more they will want to collaborate with you on a project. However, just because you can work with someone on a potentially interesting and lucrative project, it doesn't mean that you should. There are certain red flags you ought to be aware of which, once spotted, should make you rethink the whole endeavor.
First of all, there is the lack of organization when it comes to the first meeting (and the ones that may follow). Many people want to meet but they often lack the basics of organizing a meeting. Sometimes the time is vague (e.g. they set a day but not a clear time) or the place is unclear (e.g. there is agreement about using a VoIP system but no mention of which system or which room, as in the case of Zoom). If your potential client fails to provide such crucial information, they are probably still new to doing business and there are bound to be other discrepancies down the line.
What’s more, the lack of clear objectives is something to be wary of. Some people want to do wonders with data science (esp. when A.I. is also leveraged) but they have no idea how. There are no clear objectives or deadlines, and the whole project feels more like a plan drafted by a 5-year-old. Situations like this spell trouble since no matter how hard you work, they won’t be satisfied with your deliverables.
Moreover, it is a red flag when someone doesn’t have a solid understanding of the field and has irrational expectations because of this. This ties into the previous point, since the lack of clear objectives often stems from the lack of a solid understanding of what data science is and what it can do. With a perception tainted by the hype around data science and A.I., the client may be unaware of what is feasible and what isn't, leading to a set of expectations so unrealistic that, no matter how good you are, you are unlikely to meet them.
Furthermore, the lack of access to the actual data is a serious issue for a data science project. If I had a dime for every time I encountered this situation, I wouldn't need to work anymore! Yes, many people may have a clear plan and a solid understanding of data science, but the data is not there. Sometimes they do have it but it is inaccessible, and you have to go through miles of red tape just to get a glimpse of it. Cybersecurity and privacy processes are something completely unknown to clients like this, so they are overly protective of the data they have, granting you access to it only after you have signed a contract. However, embarking on a data science project without some exploratory data analysis first is asking for trouble, and they don't usually understand that either.
Finally, if the paperwork is not properly handled (contracts, NDAs, etc.), that’s a big red flag. This is the other extreme, whereby the client is very open about everything but has no idea of how the world works and doesn't bother with NDAs, formal contracts, etc. This way, if there are issues (something quite likely), you are left exposed: there are no legal guarantees for the project, which makes any pending payments as likely to become actual revenue as a lottery ticket! Also, the ownership of the IP involved in such a project can become a nightmare.
Note that all of these are red flags I’ve experienced myself, so this list is by no means complete. Hopefully, it can give you an idea of things to look out for, so that your data science expertise is not exploited or wasted on projects that are unlikely to yield any benefit for you.
Zacharias Voulgaris, PhD
Passionate data scientist with a foxy approach to technology, particularly related to A.I.