About five months ago, I started writing a new technical book. I didn't have to, but the idea was burning me, and as I had been working on a particular data science topic the previous months, I felt it deserved to be shared with a larger audience. Of course, I could have just shared some code or perhaps put together some article, but the idea deserved more. So, I reached out to my publisher, sent him a proposal with an outline of the book, and before long, I was good to go!
During these months I had been writing regularly (pretty much daily, at least on weekdays), with just a couple of weeks off, one for the Christmas holidays and one for the preparation of my podcast. This book project involved both text, images, and code (in the form of a new type of code notebook, for Julia). So, at least, I didn't get bored, plus I was a bit imaginative with a couple of the problems I tackled in the code notebooks.
Although it's a bit soon to tell when the book will be out, at least the most time-consuming part is behind me now. Hopefully, I'll be able to focus more on other projects now. I'll keep you posted through this blog regarding updates on the book. Cheers!
For about a month now, I’ve been working on a new technical book for Technics Publications. This is a project that I've been thinking about for a while, which is why it took me so long to start. Just like my previous book, this one will be hands-on, and I'll be using Julia for all the code notebooks involved. Also, I'll be tackling a niche topic that hasn't been done before in this breadth and depth, in non-academic books. Because of this book, I won't be writing on this blog as regularly as before.
If you are interested in technical books from Technics Publications, as well as any other material made available from this place, you can use the DSML coupon code to get a 20% discount. This discount applies to most of the books there and the PebbleU subscriptions. So, check them out when you have a moment!
Recently a new educational video platform was launched on the web. Namely, Pebble U (short for Pebble University) made its debut as a way to provide high-quality knowledge and know-how on various data-related topics. The site is subscription-based, while it requires a registration for watching the videos and any other material available on it (aka pebbles). On the bright side, it doesn't have any vexing ads! Additionally, you can request a short trial of it, for some of the available material, before you subscribe to it. Win-win!
Pebble U has a unique selection of features that are very useful when consuming technical content. You can, for example, make notes and highlight parts and add bookmarks, on the books you read. As for the videos, many of them are accompanied by quizzes to embed your understanding of the topic covered. The whole platform is also available as an app for both Android and iOS devices.
The topics of Pebble U cover data science (particularly machine learning and A.I., though there are some Stats related videos too), Programming (particularly Python), and Business, among other categories. As the platform grows, it is expected to include additional topics and a larger number of content creators. All the videos are organized in meaningful groups called disciplines, making it easy to build on your knowledge. Of course, if you care for a particular discipline only, you can subscribe to material of that area only, saving you some money.
In the screenshot above, you can see some of my own material that are available on PebbleU right now. Many of them are from my Safari days, but there are also some newer ones, particularly on the topic of Cybersecurity.
By the way, if you find the subscription price a bit steep, remember that you can use the coupon code DSML I've mentioned in previous posts, to get a 20% discount. So, check it out when you have some time. This may be the beginning of something great!
The world of data professionals is sophisticated and diverse, especially nowadays. In involves professionals whose expertise ranges from the design of data flows to databases, data analytics models, machine learning systems, and APIs that connect the users to a cloud-based solution. It's not a simple matter, while the variety and depth of all these roles leave people bewildered and uncertain about what this ecosystem is and what it can do for an organization.
We can attempt to gain an understanding of this world by reviewing the various professionals found in it. First of all, we have the data architects (aka data modelers) responsible for designing data/information flows, facilitating communication among the people in an organization, and developing the infrastructures for all movement and storage of the organization's data. They are often involved in database solutions as well as ETL processes and the creation of glossaries. Data architects are essential in an organization, mainly when there is plenty of data involved, or the data plays a vital role in the organization's workflow. Most modern organizations are like that, while the abundance of data makes these professionals necessary.
Beyond this role, there are also data analytics professionals, particularly data scientists. This sort of professionals is involved in deriving values from the available data, usually through discovering insights. Data scientists are more geared towards messy (e.g., unstructured or highly noisy) data and more advanced models. All data analytics professionals work with databases through focused querying of them, while the creation of visuals based on the data is an essential part of their pipeline. Naturally, this role involves some programming (more so in the case of data scientists) and communication with each project's stakeholders. The creation of dashboards is a typical deliverable in this role, though other kinds of data products are sometimes developed instead.
Data engineers are also an essential kind of professionals in this ecosystem. This role entails data governance, particularly when big data is involved as well as various ETL processes that facilitate data analytics work. Managing containers in the cloud and specialized software like Spark is part of these professionals' job descriptions. Data engineers are heavy on programming and often deal with computer clusters, be it physical or virtual. Their communication with the project stakeholders is relatively limited, although they liaise with data scientists quite a bit. Some data engineers are well-versed in data science methods, particularly the development and deployment of predictive models.
Finally, business intelligence (BI) folks also have a role to play in the data world. This kind of professionals involves liaising with the managers and other project stakeholders. BI professionals tend to be more knowledgeable regarding the inner workings of an organization. Simultaneously, their use of data is limited to basic models, useful graphics, and descriptions of the problem at hand. BI professionals are more related to data analysts, though they tend to be more involved in high-level tasks. Also, their use of programming is minimal.
If you want to learn more about the data professionals' world, I invite you to check out some great books, like those available at the Technics Publications' site. Although geared more towards data modeling, this publisher covers the subject quite well, providing practical knowledge from various professionals in the fields as mentioned earlier. If you use the coupon code DSML, you can get a 20% discount on any books purchased. Check it out when you have the chance. Cheers!
What is an API?
In Computer Science, an API is short for Application Programming Interface. This is in essence a facilitator for an organization (e.g. a company) to share information with its clients and partners over the internet, oftentimes bypassing websites. And API is designed for computer programs so it’s usually developers that deal with this tech, though many data scientists and business people are getting involved in this promising piece of technology.
Why are APIs important?
APIs make prototyping a service super-fast, while they enable easier and more scalable leveraging of data. The latter can come from all sorts of sources and systems since APIs are platform-agnostic. So, if you were to create a mobile app that employs geo-location data, along with various security processes (e.g. for user authentication), you can do this easily using APIs. Also, if you have a website already for handling this sort of information exchanges, you can use an API for your target audience to interact with your online system, without even having to go to your site (the API becomes a proxy for the back-end of your site enabling them to access it through the app). For these and other reasons APIs are very important today and an essential part of any data-driven organization.
Thoughts on the "API Success" book
So, what about the "API success" book by Nelson Petracek (Technics Publications)? Well, this book covers the topic from various angles, with a strong focus on the business side of it. It provides lots of examples justifying the value-add of APIs and where they fit in in a modern organization. The book is well-written and easy to read, despite the large number of acronyms used in it. Interestingly, the book covers marketing as well, making a strong case for using APIs in a business project, be it as the main product or part of a package. It even explores how APIs can facilitate partnerships with other organizations and the fostering of long-term business relationships. The author, who is a very hands-on person, has a good sense of humor and writes in a way that's engaging and easy to follow.
The strongest part of the book, in my view, is the various architectural and design-related tips and lots of advice on the life cycle of an API, along with the corresponding diagrams that make this quite comprehensible. As for shortcomings, the lack of any hands-on material or reference resources is the only one that stands out. Nevertheless, the rest of the book makes up for this, through comprehensive coverage of the topic from various angles.
How you can get this book at a 20% discount
Although this book is available in a variety of places, you can get it at a discounted price if you go to the publisher’s site and use the coupon code DSML at the checkout. The book is already reasonably priced (around $30 for the printed version) but why not get it at a lower price? After all, this is a book with evergreen content, something you’d like to refer to again and again, maybe even share with your team when building your own APIs. Check it out!
Zacharias Voulgaris, PhD
Passionate data scientist with a foxy approach to technology, particularly related to A.I.