A.I. and ML are often used interchangeably, while many people consider one to be a subset of the other (which one is the bigger set depends on who you ask). However, things may not be as clear-cut as they seem, since the communities of these two fields are not all that related, and there is a sort of rivalry among the hard-core members of each. Why is that, though, if A.I. and ML are so similar to each other, enough to confuse even data scientists? First of all, let’s start with some definitions. A.I. is the group of methods, algorithms, and processes that bring about computer systems that emulate human intelligence, even if the intelligence they usually exhibit is quite different from our own. These systems often take the form of self-sufficient machines, such as robots, as well as agent programs that roam the Internet or cyberspace in general. ML, on the other hand, is the group of methods, algorithms, and processes that bring about computer systems that solve some data analytics problem in an efficient manner, through some training procedure (the “learning” part of machine learning). The latter can be done with the help of some specific outcomes (aka targets) or without them. The training can also take the form of feedback on the system’s predictions, which is like on-the-job training of sorts. Clearly, there is a close link between ML and data science, since ML systems are designed for this sort of problem. A.I. systems, on the other hand, may tackle different kinds of problems too (e.g. finding the optimal route given some restrictions). So, there is a part of A.I. that is leveraged in data science and a part of A.I. that has nothing to do with our craft. The part of A.I. that is used in data science has a large intersection with ML, mainly through network-based systems, such as ANNs. Lately, Deep Learning networks, which are specialized and more sophisticated kinds of ANNs, have become quite popular and are also part of that intersection between A.I.
and ML. Many people who work in A.I. consider it more of a science than ML, and they are right in a way. Most ML methods are heuristics-based and don’t have much theory behind them, while the ones tied to Statistics (statistical/ML hybrids) are heavily constrained by the assumptions of statistical theory. A.I. methods are generally data-driven too, but they also relate to processes found in nature, so they are not out of the blue. Nevertheless, a data scientist who is professional and pragmatic doesn’t put too much emphasis on the differences between A.I. and ML methods, since he cares more about how they can be applied to solve the problems at hand. So, even if these two families of methods are not the same, nor is one a subset of the other, they are both very useful, if not essential, in practical data science.
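As a minimal illustration of the “training with targets” idea mentioned above, here is a toy sketch in plain Python (the data points and the linear model are invented purely for illustration; real ML systems use far more elaborate models and training procedures):

```python
# Toy supervised learning: fit a line y = w*x + b to example pairs
# via closed-form least squares. The ys are the "targets" that the
# training procedure uses to adjust the model's parameters.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]  # targets roughly following y = 2x

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
# Slope and intercept that minimize the squared prediction error
w = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
    sum((x - mean_x) ** 2 for x in xs)
b = mean_y - w * mean_x

print(round(w, 2), round(b, 2))  # slope near 2, intercept near 0
```

Remove the targets (the `ys`) and the same data can only be explored in an unsupervised fashion, e.g. by clustering the `xs`; that is the essential difference between the two training modes described above.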
I understand that making predictions about these things is quite risky, but it’s good to take a stance on the things that matter, instead of playing it safe, like many tech “experts” out there do. Of course, it’s easier to parrot the widely accepted views on every hot topic, gathering “likes” and positive comments, but no one ever offered anything useful to the whole by being lukewarm.
First of all, I’m not making a case against cryptocurrencies as a possibility. In fact, I find them potentially immensely useful, especially in a country where the conventional currency is plagued by inflation and by all the idiotic people managing the economy around it. Cryptocurrencies can be a viable alternative to the official currency, should they be used instead of a problematic fiat currency. The reality of cryptocurrencies, however, is very distant from this idealistic scenario. In fact, I’ve yet to encounter one cryptocurrency that is actually used as a currency of sorts. Most of them are some form of speculative investment, like a stock, but without any inherent value. Let that sink in for a bit: cryptocurrencies themselves have no value whatsoever. Someone may argue that conventional currencies have no inherent value either, and that’s a valid point. However, conventional currencies’ value doesn’t fluctuate wildly over time, since there are mechanisms to keep it somewhat stable. Naturally, there are exceptions, but even an unstable currency is generally more stable than the average cryptocurrency out there. The reason is simple: people who handle cryptocurrencies do so with one particular aim: to make money off them. They don’t care if these currencies disappear tomorrow, as long as they cash in first. It doesn’t take a financial genius to understand that this sort of ecosystem is not sustainable. The other reason is a bit more subtle, yet equally important. Most cryptocurrencies require someone to constantly work for them (a process known as mining), or to provide some sort of infrastructure that’s not cheap to maintain. This translates into a running cost, which may not seem like much individually, but collectively it is a lot, enough to make the whole system unsustainable.
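The mining cost just described can be illustrated with a toy proof-of-work loop (a simplified sketch, not the actual Bitcoin protocol; the data string and difficulty values here are arbitrary):

```python
import hashlib

def mine(block_data: str, difficulty: int) -> int:
    """Find a nonce whose SHA-256 hash of (data + nonce) starts
    with `difficulty` zero hex digits -- a toy proof-of-work."""
    target = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if digest.startswith(target):
            return nonce
        nonce += 1

# Each extra zero digit multiplies the expected work by 16.
easy = mine("some transactions", 2)
harder = mine("some transactions", 3)
```

Every increase in difficulty multiplies the expected number of hash attempts, which is why the difficulty ramp translates directly into electricity and hardware costs for the network as a whole.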
This is particularly true in cases like bitcoin, where the computational problems that need to be solved to maintain the blockchain behind the cryptocurrency get progressively more challenging, and therefore more expensive. Once enough people realize that, the fascination with these cryptocurrencies may wane, especially if some regulating mechanism comes into place. Artificial Intelligence, on the other hand, is a completely different animal. Even the most basic applications of it add value to whoever invests in them, be it someone tackling big data problems, or someone who just wants to optimize their technical infrastructure. Through a vast variety of ways, A.I. manages to add value to the people using it, particularly if they have developed it sufficiently, thereby automating certain expensive processes. That’s why people are amazed by it and spend hours speculating on how it can bring about numerous benefits to the world. Even if there are some inevitable pitfalls in this technology, if it is handled maturely, it can be of great benefit to the whole. Besides, as a scientific field it existed and flourished on its own, long before the futurists used it for promoting their ideology, or before it became mainstream. Hopefully it won’t be long before the cryptocurrency craze subsides and the people who waste their time and energy on it focus their efforts on something more sustainable, something that adds value to its environment rather than draining resources and time. Perhaps this could be A.I., or some other similar technology. Whatever the case, the cryptocurrencies that are around today have an expiration date, whether people are willing to accept that or not... After investigating this topic quite a bit, as I was looking into A.I. stuff, I decided to create a video on it. To make it more complete, I included other methods too, such as Statistics-based and heuristics-based ones.
Despite the excessive amount of content I put into this project (the script was over 4000 words), I managed to keep the video at a manageable length (a bit less than half an hour). Check it out on Safari when you have some time!

Overview

Recently I had a couple of very insightful conversations with some people, over drinks or coffee. We talked about A.I. systems and how they can pose a threat to society. The funny thing is that none of these people were A.I. experts, yet they had a very mature perspective on the topic. This led me to believe that if non-experts have such concerns about A.I., then perhaps it’s not as niche a topic as it seemed. BTW, the dangers they pinpointed had nothing to do with robots taking over the world through some Hollywood-like scenario; they were far more subtle, just like A.I. itself. Also, they are not about how A.I. can hurt us sometime in the future, but about how its dangers have already started to manifest. So, I thought about this topic some more, going beyond the generic and quite vague warnings that some individuals have shared with the world in interviews. The main dangers I’ve identified through this quest are the following:
Interestingly, all of these have more to do with us, as people, than with the adaptive code that powers these artificial mental processes we call A.I.

Over-reliance on A.I.

Let’s start with the most obvious pitfall: over-reliance on this new tech. In a way, this is already happening to some extent, since many of us use A.I. even without realizing it and have come to depend on it. Pretty much every system that runs on a smartphone and makes the device “smart” is something to watch out for. From virtual assistants to adaptive home screens to social chatbots, these are A.I. systems that we may get so used to that we won’t be able to do without them. Personally, I don’t use any of these, but as the various operating systems evolve, they may not leave users a choice when it comes to the use of A.I. in them.

Degradation of Soft Skills

Soft skills may be something many people talk about, and even more have come to value, especially in the workplace. However, with A.I. becoming more and more of a smooth interface for us (e.g. with customer service bots), we may not be as motivated to cultivate these skills. This inevitably leads to their degradation, along with the atrophy of related mental faculties, such as creativity and intuition. After all, if an A.I. can provide us with viable solutions to problems, how can we feel the need to think outside the box in order to find them? And if an A.I. can make connecting with others online very easy, why would someone opt for face-to-face connections instead (unless their job dictates that)?

Bugs in Automated Processes

Automated processes may seem enticing through the abstraction they offer, but they are far from perfect. Even the most refined A.I. system may have some hidden issues under the hood, among its numerous hidden layers. Just because it can automate a process doesn’t mean that there are no hidden biases in its functionality, or some (noticeably) wrong conclusions from time to time.
This is natural, since every system is bound to fail at times. The problem is that if an A.I. system fails, we may not be able to correct it, while in some cases even perceiving its bug may be a hard task, let alone proving it to others.

Lack of Direct Experience of the World (VR and AR)

This is probably a bit futuristic, since if you live in a city outside the tech bubble (e.g. the West Coast of the US), there are still plenty of opportunities for direct experience. However, as technologies like virtual reality (VR) and augmented reality (AR) become cheaper and more commercially viable, they are bound to become the go-to interface for the world, e.g. through “tourism” apps or virtual “museums.” Although these technologies would be useful, particularly for people without easy access to the rest of the world, there is no doubt that they are bound to be abused, resulting in some serious social problems and bringing about further societal fragmentation.

Blind Faith in A.I. Tech

This is probably the worst danger of A.I., which may seem similar to the first one mentioned, though it is more subtle and more sinister. The idea is that some people become very passionate about the merits of A.I. and quite defensive about their views. Their stance on the matter is eerily similar to that of some religious zealots, though the “prophets” of these A.I. movements may seem level-headed and detached. However, even they often fail to hide their borderline obsession with their ideology, whereby A.I. is deified. It’s one thing to speculate about a future society where A.I. may have an administrative role in managing resources, and a completely different thing to believe that A.I. will enter our lives and solve all our problems, like some nurturing alien god of sorts.

An Intelligent Approach to All This

Not all is doom and gloom, however. Identifying the dangers of A.I. is a good first step towards dealing with them.
An intelligent way to do that is first to take responsibility for the whole matter. It’s not A.I.’s fault that these dangers come about. Just like every technology we’ve developed, A.I. can be used in different ways. If cars cause thousands of people to die every year, it’s not the cars’ fault. Also, just like the car was built to enrich our lives, A.I.’s development has similar motives. So, if we see it as an auxiliary technology that can help us make certain processes more efficient, rather than a panacea, we have a good chance of co-existing with it, without risking our individual and social integrity. Although it’s been over two weeks since I finished working on the Data Visualization video and about a month since I completed the Deep Learning one, both of them only just became available on Safari (a subscription-based platform for various educational material). So, if you are up for some food for thought on DL and DV, check them out when you have a moment: the Deep Learning vid and the Data Visualization vid. Note that these are both overview videos and, although in the Data Viz one I include several references to libraries in Python and Julia for creating various plots, the videos are fairly high-level. These are not in-depth tutorials on the topics. Once I decide to take a break from all the book-writing these days, I’ll probably make another video, either on A.I. or on a more conventional DS topic. So, stay tuned... People talk a lot these days about how self-driving cars will solve all of our logistics and transportation related problems when they finally hit the roads. The thing is that the problems they are trying to solve are not that simple, nor is their adoption going to be as easy as these idealistic people think. Although there is nothing wrong with dreaming of a better future, free of traffic and avoidable accidents, it’s also important to look at this matter from a more realistic point of view. First of all, the idea of the self-driving car needs to be re-examined.
The idea of a completely autonomous car is a long way from materializing, even if there are A.I. systems out there that can navigate a car effectively over large distances. Considering that these A.I. drivers will become the norm in the foreseeable future is quite unrealistic, however. The reason is simple economics. These systems are going to be very expensive, so they will naturally appeal only to a small part of the population. Also, as they gradually become more affordable, they will push down the price of conventional vehicles, making the latter more appealing. This is dynamic systems 101, something that apparently many of these visionaries of self-driving cars are not that familiar with, just like they don’t understand people that well. If Joe and Jane find that this new self-driving car costs 50% more than the car they’ve been dreaming of for the past five years, because that particular make of car has been around forever and that model has been heavily advertised ever since they can remember, they will probably go with the conventional car, even if the self-driving car is objectively a better choice in general. However, if A.I. systems in cars were to adopt an auxiliary role, much like Elon Musk envisions for his Tesla vehicles, then they have a chance. After all, not many people are willing to give up control of their cars just yet. This is evident when you talk with competent drivers who have been outside the US. These people take a strong interest in stick-shift cars, since these give them more control over the car, making them feel better about their role as drivers. Also, stick-shift cars are more economical, require less maintenance in terms of the transmission (e.g. no transmission fluids), and are generally quite reliable (as much as their automatic counterparts). Unless of course you never learn how to use the clutch, which is another matter! If self-driving cars are self-driving only at certain times, when the driver chooses (e.g.
in the case of a long road trip, or a mundane commute over I-90), then they can definitely add value. However, if they are entirely self-sufficient, with no potential input from the human in the driver’s seat, then they are less likely to gain people’s trust, apart from those prejudiced towards their inherent value. Whatever the case, it is interesting to see how this new trend will evolve and what kind of data it will bring about for data science professionals to analyze! Sometime in October, one of the Foxy Data Science readers contacted me with a question/suggestion about this topic. As I hadn’t really thought about it much, I decided to look into it and write a blog post about it. I’m not an expert in AEI, but I believe I know enough about A.I. in general and about the business world to venture an insightful view on the matter. At the very least, it can trigger some interesting contemplation in you. Artificial Emotional Intelligence is a kind of A.I. that emulates the EQ aspects of our mental process. In other words, it involves machines that know (to some fairly limited extent) how to exhibit qualities that fall in the intersection of intelligence and emotional maturity, aka EQ. By the way, I do not believe that EQ is more important than IQ, nor that it is any less important. Both are equally useful, and neither can be a substitute for SQ (moral intelligence), which is a truly superior kind of intelligence. This, however, could be the topic of another blog post… Considering the possibility of computers, and machines in general, emulating empathy and other traits under the EQ umbrella seems a bit futuristic. However, there are already A.I. systems that do just that. Not only that, but some of them are quite successful, particularly in psychology roles, even more so than their human counterparts (there is some interesting research by USC on this). Could this be the end of EQ-based professions?
Probably not, though these professionals may want to start offering something more than just listening and nodding, if they are to stand out from their AEI competition. Naturally, psychology is much more than helping someone vent about their issues and showing them that there are more constructive ways of dealing with their problems, something that AEIs may be able to do equally well. That’s why this whole AEI business may be an incentive for these professionals to expand their profession and turn their sessions into something more, something AEIs may not be able to mimic (for the time being). Art therapists, for example, seem to do just that, combining the benefits of conventional psychology with those of an art form (usually music, painting, or dance). AEIs may be nothing more than a novelty now, but they very poignantly point to the possibility of new forms of A.I. that the original pioneers of the field may not have thought of. Movies like “Her” may be science fiction, but for how long? These are interesting things to think about, since A.I., just like natural intelligence, can take many forms, not just the ones we have been most inclined to investigate so far. Surely, Deep Learning may still be the most relevant kind of A.I. for data science, but it doesn’t hurt to consider other ways that a machine can benefit the world through A.I. After all, there is much more to life than predicting a hand-written digit with high accuracy. Maybe in the years to come there will be A.I.s that can look at your handwriting and not only understand it, but also figure out whether you are going through a difficult time in your life and require solace and comfort. We definitely live in interesting times! People like to argue, especially about things they can reason about. However, just because you can justify that your view has merit, through some practical examples or logical reasoning, this doesn’t make alternative views invalid.
If there are several programming languages in data science, perhaps an oversimplification like “X is the best language for data science because Y” doesn’t hold much water. Let’s examine why. Although it is possible to rule out certain languages (e.g. Assembly or C) as optimal for data science, this doesn’t mean that the problem has a clear-cut solution. Also, the assumption that a single programming language can cover all the use cases of a data science professional is quite an unjustifiable one. Some data scientists use two or three programming languages, sometimes in combination, getting the best of each, for optimal overall performance. Also, data science is all about solving a business problem in a scientific manner. Just because, say, Dr. Smith prefers to use language X over Y, it doesn’t mean that you have to follow her example. Maybe she used language X during her PhD and didn’t have time to learn another language, or she attained mastery of that language, so she feels more comfortable doing her data science work with it. She may be a successful data scientist, but following her programming habits won’t necessarily make you a great data scientist. Moreover, with new languages, and new packages in the existing languages, coming out all the time, asking which language is best is like asking which basketball team performs best. Definitely not something particularly stable! Besides, it’s often the case that a particular project requires special handling, so what is a top performer now may not be the best option for that particular case. In addition, the almost religious attitude towards programming languages that many people have (not just data scientists) is by itself problematic. If a potential employer sees you arguing about how your language of choice is the best and that you are not open to considering alternatives, he may not be so eager to hire you, since this kind of attitude creates disharmony and difficulty in collaboration among the members of a team.
Besides, most companies nowadays rarely ask for a specific language in their candidate requirements. As long as you can do the task that’s required of you, they don’t really care much about your programming background. Of course, companies that have already invested in a particular language and have all their code in that language may not be so flexible, but that shouldn’t be the principal factor in your decision about which language to learn. Finally, when it comes to deep learning, many modern frameworks, like Apache’s MXNet, have APIs for a variety of programming languages. So if your A.I. guru friend tries to convince you to learn language X because it’s the best deep learning language, take that suggestion with a pinch of salt! The important thing is that, whatever language you decide to learn for data science, you make sure you learn it well. Familiarize yourself with its packages, use it to solve various problems, and learn the best strategies for debugging code written in that language. If you do that, you can still make good use of it for your data science projects, even if the majority of people prefer this or that language instead. That’s a question that many people ask themselves and professionals in the data analytics field. However, they get different answers depending on whom they ask. Naturally, the A.I. professional will tell you “of course,” since A.I. methods are much better than conventional machine learning ones, and the field is booming lately. The data scientist may have a more restrained approach, as she is more likely to look at the matter scientifically, expressing some cautiousness about how influential A.I. professionals will be in the data science field. As someone who is in both A.I. and Data Science, perhaps I can offer a more balanced perspective. First of all, an A.I. professional is a specialist in A.I.
methods, and if we are thinking about how this person can do a data scientist’s job, we are looking at someone who focuses on data analytics, rather than some other part of A.I. (e.g. robotics, theoretical A.I., etc.). Also, when we are examining a data science professional, we are looking at someone who is not in A.I. and who uses mostly conventional data science methods for the data analytics problems he tackles. In my latest book, I outlined the importance of A.I., how influential it is in the data science field, and its role for the data scientist. I even encouraged people to keep up-to-date with the developments in A.I., as I predicted it will have an important role to play in the years to come. However, I did not urge anyone to drop what they are doing and focus on A.I. methods alone. If someone is already in the field, that’s great, since they have already developed the mindset of the data scientist and mastered some of the tools, so by studying A.I. methods for data analytics, they are expanding their skill-set. That’s different from becoming an A.I. specialist, though. A.I. specialists may be great at tackling Kaggle competitions, where the data is in a pretty clean and structured form (or at least mostly structured). However, this doesn’t automatically make them adept at handling all kinds of data, like a data scientist does. It’s really hard to make predictions about things involving people and their work, as the market is a chaotic system. However, I can attempt an educated guess about what is most likely to happen, if things continue evolving the way they do. As A.I. becomes more versatile and more robust in tackling data analytics problems, it is bound to dominate over other data science techniques. So, if you are happy using SVMs or random forests, for example, you may want to rethink your toolkit! Yet, it is unlikely that A.I.
will fully automate the data science process, much like statistics has not become obsolete just because there are several statistical programming environments out there (e.g. Statistica, R, SAS, etc.). Statistics is, and is bound to remain, useful, because it is much more than its techniques. The same goes for data science. Even if all the conventional methods used by a data scientist become obsolete, giving way to A.I. ones, people will continue asking questions about the data, forming hypotheses, analyzing problems so that they can be modeled as data science ones, etc. Of course, people will still communicate with the stakeholders of the projects, create visuals, give presentations, etc. So, even if the A.I. professional is bound to be an asset to an organization, he is most likely going to be part of a data science team, working side-by-side with a data scientist. As for the latter, she will be more knowledgeable about A.I. methods and will spend more time on other parts of her job, rather than doing feature selection and building a series of models, since that’s something that will be automated by an A.I. system. Therefore, unless a major breakthrough happens in the next few years, I’d recommend being a bit skeptical about the A.I. paradigm shift that many evangelists talk about, as if it’s the coming of a new Messiah. It would be nice if everything suddenly became easy and smooth thanks to A.I., but I wouldn’t uninstall my data science software just yet... With all the hype around A.I. lately, many people have jumped on the A.I. bandwagon without realizing that what they are producing is not always related to A.I., and that their false promises can only get them so far. That’s not to say that modern processes in data science that leverage alternative approaches to analyzing data, without relying on a predefined data representation system, are not A.I. Far from it.
However, there is a lot of jazz about knowledge representation systems (KRS), such as those applied in Natural Language Processing (NLP), that are merely transformations of text data into a quantitative format. Calling that A.I. is like calling a sedan a 4-by-4 monster truck! Knowledge representation is useful in many ways, as it is an often necessary component of Natural Language Understanding (NLU) and other NLP-related systems. For example, the NLTK package in Python has a process in place that categorizes a given text into a series of parts of speech (PoS), by labeling each word with the most appropriate PoS tag. That’s useful, but it’s not exactly A.I. technology. Similar frameworks providing some kind of labeling of text data fall under the same umbrella. In fact, without someone processing their output and building some kind of model based on it, such a labeling is utterly useless. It’s like dough: without additional processing (e.g. baking), it’s bound to be something you’d probably not serve at a dinner party as-is (though many kids may be quite content eating it in this form). People managing data-driven products, however, are not kids. They expect some kind of value from the processing of the text-based data streams (which sometimes come at a cost) and a positive ROI. It’s quite unlikely that serving them some half-baked data, produced by applying a knowledge representation system to the given data, is going to make them content. Maybe they are fooled once into believing that this is A.I. at work, but it’s probably going to be a one-time thing. This is especially true if they have a data scientist on board, who knows a thing or two about text analytics. A.I. systems are automated processes that make an in-depth transformation of the data they are fed, yielding something of value at the end.
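To see how modest such a “quantitative format” can be, here is a toy bag-of-words transformation in plain Python (a deliberately simple sketch using no NLP library; NLTK’s PoS tagger is more elaborate, but the point about representation versus modeling is the same):

```python
from collections import Counter

def bag_of_words(text: str) -> dict:
    """Turn raw text into a simple quantitative format:
    a word -> count mapping. No model, no learning involved."""
    words = text.lower().split()
    return dict(Counter(words))

vec = bag_of_words("the cat sat on the mat")
print(vec)  # {'the': 2, 'cat': 1, 'sat': 1, 'on': 1, 'mat': 1}
```

On its own, this mapping is just a representation of the text; only a model built on top of outputs like this could justifiably be called A.I.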
They usually require a lot of sophisticated processes in the back-end, such as the generation of a large number of meta-features, gradually refining the original features into something that encapsulates the information in them, and then using the end result to make predictions of some kind. When it comes to text data, this could be some new text that mimics the style of the original, or some better representation of the data using a compact feature set. All this is done through computationally heavy processes that often employ GPUs. So, saying that a knowledge representation system that can run on an average computer, without any additional computing power, is an A.I. system is inaccurate and misleading. In the best-case scenario, its results will later be found interesting but practically useless. After all, A.I. systems are robust because they drill into the data in ways that no human can, and usually cannot even fully comprehend. So, if you hear someone claim that they have developed some new A.I. system that can handle raw text data without the use of some non-parametric model, they are probably trying to sell you snake oil. This is to be expected in times when new technologies are available yet not fully understood, and charlatans trying to take advantage of that fact promote products convoluted enough to masquerade as this new tech, without actually offering any real value to the user. The answer to this situation is to better understand the field through methodical study (it doesn’t have to be too time-consuming) of reliable sources, and through the consultation of A.I. professionals and data scientists with an NLP focus. Once you are armed with this understanding, no KRS charlatan can take advantage of you, since you’ll be able to see through their lies.
Zacharias Voulgaris, PhD. Passionate data scientist with a foxy approach to technology, particularly related to A.I.
December 2022