Just like week, during a business trip to London, I started working on this video, on my spare time, and now it's already online! In this 40 minute video, comprising of 3 clips, I explore the topic of Optimization, through a series of questions spanning across 5 categories. Whether you are an aspiring A.I. expert or a data scientist, you can learn a lot of useful things from this test of sorts and with the right mindset, even enjoy the whole process! You can find it on the O'Reilly platform, where you need to have an account (even a trial one will do) to watch it in its entirety. Cheers!
Dimensionality reduction has been a standard methodology to deal with datasets that have a lot of features, more than a typical model can handle effectively. Reducing the number of features can also save time and storage space, while when it comes to sensitive data it can be a big plus as it enables anonymity in the people involved. What’s more, in some cases, a reduced dimensionality dataset can be more effective as there is less noise in it. However, conventional dimensionality reduction methods don’t always do the trick due to the inherent limitations they have. For example, PCA only considers linear relationships among the variables and a linear combination of features, as a solution.
Of course, other people are not sitting idle when it comes to this issue. There are several dimensionality reduction options that are being pursued, the most interesting of which is autoencoders. This AI-based method involves a data-driven approach to figuring out the nature of the data and creating new variables that can represent the underlying signal, by minimizing the error. The issue with this is that it often requires a lot of data and some specialized know-how in order to configure optimally. Also, this whole process may be fairly slow, due to the large number of computations involved.
An alternative approach has to do with feature fusion in a non-AI way. The idea is to maintain transparency to the extent this is possible, while at the same time optimize the whole process in terms of speed. The use of multiple operators, some linear and some non-linear, is essential, while the option of dropping useless features is also very useful. Naturally, this whole process would be more effective in the presence of a target variable, but it should be able to work without it, for better applicability. Whatever the case, the use of a metric able to handle non-linear correlations is paramount since the conventional correlation metric used leaves a lot to be desired.
Based on all this, it’s clear that the dimensionality reduction area is still capable of enhancements. Despite the great work that has been done already, there is still room for new methods that can address the limitations the existing methods have, which aren’t going away any time soon. Perhaps it would be best to explore this methodology of data engineering more, instead of focusing the latest and greatest system, which although intriguing, may sacrifice too much (e.g. transparency) in the name of accuracy, a trade-off that may no longer be cost-effective. Something to think about...
In an interview I recently watched, Elon Musk put forward the case of a utility (objective) function for a hypothetical advanced A.I. (basically an AGI) and how special attention must be given to such a task to avoid undesirable results. So, he suggested we use some utility function some person had recommended (probably an A.I. expert), namely that of maximizing “freedom of action for everyone,” something that’s quite reasonable and perhaps even profound if you think about it. However, if you think more about it, it becomes evident that it’s a terrible, terrible idea!
First of all, I mean no disrespect to Elon Musk. I think many of the things he’s created are great, even if some of his ideas are somewhat extreme. So even if he is not a role model of mine, I admire him as a tech entrepreneur and find that he has a lot to offer to the world through his businesses and his ideas for a better world. Except of course his idea for a utility function; that would be catastrophic, though I’m sure that in his mind it’s a brilliant solution to the utility function problem.
For starters, freedom is a very abstract concept even if it’s made more specific by the term “of action” to clarify it. How do you measure freedom of action? How would an A.I. understand this concept, especially if it never gets to experience it? Then, would maximum freedom be a good thing necessarily? Isn’t that a form of anarchy in a way? These are things that need to be addressed before asking an A.I. engineer to implement such a function for this hypothetical A.I. So, unless we figure this out, we cannot be sure that this A.I. will be benign, even if its creators have the best intentions in the world for it.
For example, an A.I. that makes use of this utility function may accelerate the depletion of natural resources of this planet (and any other planet it has access to), in order to ensure that everyone, even some random criminal on the streets or an inmate in a high security prison, has as much freedom of action as possible. Do you see where I’m going with this? Perhaps I’d better stop here before this whole post turns into some dystopian scenario or something.
The utility function problem is a difficult one and in all fairness Elon Musk is not someone knowledgeable enough in A.I. to be able to provide a bullet-proof solution to it. He may know a lot about the topic but I doubt he’s ever created an A.I. system from scratch. And unless you are close to the metal about these things, any ideas you have about how things should be regarding the high-level aspects of such complex systems is just an opinion on the matter, not a serious candidate for a solution to the problem at hand. The latter would be something that has legs and right now it seems that Mr. Musk’s suggestion is floating in the clouds just like many futurists when they talk about A.I. Perhaps that’s why many people don’t take Elon Musk’s warnings about A.I. very seriously, although I believe that’s one of the things he’s got right.
Despite the inevitable risks such an endeavor has, I’ll venture to make a suggestion of my own for a utility function, namely one that evolves over time. In other words, I propose a narrow A.I. whose sole purpose is to optimize the utility function of the AGI, perhaps in a Reinforcement Learning fashion, based on the feedback it receives from other people, while it starts with a utility function that’s as risk-free as possible (based on some simulations we run before we deploy it to the AGI). Some core heuristics may be in place to ensure a large enough diversity of signals that this A.I. will take into account, coordinating the various objectives / values that the AGI will have to uphold. Besides, it would be naive to assume that a human being, no matter how knowledgeable, can be in a position to come up with a utility function that can apply to some creature more intelligent than all the people in the world, forever.
If our own evolution has taught as anything is that there are no absolutes in nature and that we evolve to become better and adjust our values according to the circumstances we face and the challenges we wish to overcome. Why should an AGI be any different, considering that it’s created in our own image?
Trinary Logic is not something new. It’s been around for decades, though it was more of a mathematical / high-level framework. I should know, as I did my Masters thesis on this subject and how it applies to GIS. I even wrote code implementing the corresponding model I came up with, though in today’s programming world it seems like legacy code... Anyway, bottom line is that Trinary Logic is useful and could have a place in modern Information Systems, including data analytics projects. The question is, could it be applicable to A.I. too?
The answer is, as usual, “it depends.” Trinary Logic on its own is quite limited and unless you are familiar with its 700+ gates, it may be like any novel idea: interesting but not exactly something worth delving into. After all, just like any system of reasoning, Trinary Logic is meaningless without an in-depth understanding of its key contribution to the thorny issue we always tackle through reasoning: handling uncertainty effectively.
Uncertainty, oftentimes modeled as noise or randomness (depending on who you ask), is everywhere. Since we cannot eliminate it without damaging the signal too, we find ways to cope with it. Trinary Logic offers an interesting way of doing that through the 3rd value of its variables, namely the “indifferent” state. Something can be True, False, or Indifferent, the latter being something in-between. These are the states of those intermediate values in the membership functions of fuzzy variables, in Fuzzy Logic. The latter is a well-known and quite established A.I. framework with lots of applications in data science. Do you see where I’m going with this?
So Trinary Logic is a framework for reasoning, much like Fuzzy Logic, but the latter is an A.I. framework too, so Trinary Logic is A.I. also, right? Well, no. Trinary Logic is a mathematical construct, so unless it is applied to A.I. programmatically, and as a well-defined process, it is yet another concept that can’t even fetch an academic publication! But if it were to manifest as a heuristic of sorts and add value to a process in the A.I. sphere, things would be different.
Enter the Trinary Curve, a heuristic (or meta-heuristic, depending on how you use it) that encapsulates Trinary Logic in a simple yet not simplistic way, turning an input signal into something that an A.I. agent can understand and work with. Namely, it can engineer a new variable in the [-1, 1] interval (notice the closed brackets in this case), that enables the corresponding module to have the in-between state of uncertainty more evident. As a result, the A.I. agent is allowed to be unsure about something and examine it more closely, given the right architecture, instead of working with what it has and hope for the best. Note that the Trinary Curve can be customized, while its output can be normalized to a different interval (always closed) if needed. The Trinary Curve is differentiatable throughout the space it is defined, while it’s easy to use programmatically (at least in Julia).
Perhaps the Trinary Curve is a novelty and an A.I. system can evolve adequately without it. However, it is something worth considering, instead of just experimenting with the countable parameters of existing A.I. systems solely. After all, Trinary Logic is compatible with existing A.I. frameworks so if it’s not utilized, it’s primarily because of some people’s unwillingness to think outside the box, and that’s something that doesn’t have any uncertainty about it...
This week I'm away, as I prepare for my talk at the Consumer Identity World EU 2018 conference in Amsterdam (the same conference takes place in a couple of other places, but I'll be attending just the one in Europe). So, if you are in the Dutch capital, feel free to check it out. More information on my talk here. Cheers!
Dichotomy: a binary separation of a set into two mutually exclusive subsets
Data Science: the interdisciplinary field for analyzing data, building models, and bringing about insights and/or data products, which add value to an organization. Data science makes use of various frameworks and methodologies, including (but not limited) to Stats, ML, and A.I.
After getting these pesky definitions out of the way, in an effort to mitigate the chances of misunderstandings, let’s get to the gist of this fairly controversial topic. For starters, all this information here is for educational purposes and shouldn’t be taken as gospel since in data science there is plenty of room for experimentation and someone adept in it doesn’t need to abide to this taxonomy or any rules deriving from it.
The inaccurate dichotomy issues in data science, however, can be quite problematic for newcomers to the field as well as for managers involved in data related processes. After all, in order to learn about this field a considerable amount of time is required, something that is not within the temporal budget of most people involved in data science, particularly those who are starting off now. So, let’s get some misconceptions out of the way so that your understanding of the field is not contaminated by the garbage that roams the web, especially the social media, when it comes to data science.
Namely, there are (mis-)infographics out there that state that Stats and ML are mutually exclusive, or that there is no overlap between non-AI methods and ML. In other words, ML is part of AI, something that is considered blasphemy in the ML community. The reason is simple: ML as a field was developed independently of AI and has its own applications. AI can greatly facilitate ML through its various network-based models (among other systems), but ML stands on its own. After all, many ML models are not AI related, even if AI can be used to improve them in various ways. So, there is an overlap between ML and AI, but there are non-AI models that are under the ML umbrella.
Same goes with Statistics. This proud sub-field of Mathematics has been the main framework for data analytics for a long time before ML started to appear, revolting against the model-based approach dictated by Stats. However, things aren’t that clear-cut. Even if the majority of Stats models are model-based, there are also models that are hybrid, having elements of Stats and ML. Take Bayesian Networks for example, or some variants of the Naive Bayes model. Although these models are inherently Statistical, they have enough elements of ML that they can be considered ML models too. In other words, they lie on the nexus of the two sets of methods.
What about Stats and AI? Well, Variational AutoEncoders (VAEs) are an AI-based model for dimensionality reduction and data generation. So, there is no doubt that it lies within the AI set. However, if you look under the hood you’ll see that it makes use of Stats for the figuring out what the data generated by it would be like. Specifically, it makes use of distributions, a fundamentally statistical concept, for the understanding and the generation of the data involved. So, it wouldn’t be far-fetched to put VAEs in the Stats set too.
From all this I hope it becomes clear that the taxonomy of data science models isn’t that rigid as it may seem. If there was a time when this rigid separation of models made sense, this time is now gone as hybrid systems are becoming more and more popular, while at the same time the ML field expands in various directions outside AI. So, I’d recommend you take those (mis-)infographics with a pinch of salt. After all, most likely they were created by some overworked employee (perhaps an intern) with a limited understanding of data science.
I've been writing a lot about A.I. lately and AGI has been a recurring topic lately. Although the possibility of this technology becoming a reality is still a bit futuristic, we can still ponder on the possibility and explore how such an A.I. system would affect us. Hence this fiction book I wrote in the past few months and published this week on Amazon (Kindle version only). Feel free to check it out when you have a minute.
The book is dedicated to researchers of A.I. Safety.
Lately I worked on a new series of videos, this time on Optimization. This A.I. methodology is a very popular one these days, one that adds a lot of value to both data science and other processes where resources are handled. Specifically, I talk about:
* Optimization in general (including its key applications)
* Particle Swarm Optimization
* Genetic Algorithms
* Simulated Annealing
* Optimization ensembles
* some auxiliary material that supplements these topics
You can find this video series on Safari, along with my other A.I. videos. Cheers!
When designing an A.I. system these days it seems that people focus on one thing mainly: efficiency. However, even though there is no doubt about the value of such a trait, there are other factors to consider when building such a system, so that it is not only practical but also safe and useful in other projects. Namely, in order for AGI to one day become feasible, we need to start building A.I. systems that fulfill a certain set of requirements.
This is the Achilles heal of most modern A.I. Systems and a key A.I. Safety concern. However, it’s not an insolvable problem as many A.I. researchers (particularly those bold enough to think outside the black box of Deep Learning systems) have tackled this matter and some have proposed some solutions for shedding some light on the outputs of that DL network that crunches the data behind those cat pictures it is asked to process. Unfortunately, this transparency element they add is geared more towards image data since it’s easier to comprehend and interpret, when it takes the form of complex meta-features in the various layers of a DL network. Still, it is possible to have transparency in alternative A.I. systems that use a simpler architecture, perhaps non-network based.
It goes without saying that a system needs to be autonomous, even in its training, if it is to be considered intelligent. Although humans will need to play an important role in its training by providing this A.I. with data that makes sense, as well as some general directions (e.g. the terminal goal and some instrumental goals perhaps), the A.I. system needs to be able to figure out its own parameters automatically, using the data at hand. Otherwise, its effectiveness will be limited to the know-how of the “expert” involved in it, who may or may not have an in-depth understanding of the field or how data science works.
For an A.I. system to be effective, it has to be scalable, i.e. able to be deployed on a large computer network, be it in a cluster or the cloud. Otherwise, that system is bound to be of very limited scope and therefore its usefulness will be quite limited. For an A.I. system to scale well, however, its various processes need to be parallelizable, something that requires a certain design. DL networks are like that but not all A.I. systems are as easy to parallelize and scale.
This is an important aspect of our own thinking and one that hasn’t been implemented enough in A.I. systems, partly because of methodological limitations and partly because it’s not as easy for most A.I. people to wrap their heads around. In essence, it is the most down-to-earth form of intuition and what allows lateral thinking. An A.I. system having this attribute would be able to think like a human would and therefore be more easily understood and more relateable. It’s possible that this will mitigate the risks of the rigid rule-based thinking that many A.I. systems now have, even if it is concealed in complex architectures.
Of course we shouldn’t neglect efficiency in this whole design. An A.I. system has to be efficient in both its application and its training. If it takes a whole data center in order to train, that’s not efficient, not even if it is feasible for some people having access to such computational resources. An efficient A.I. system should be able to perform even in a small computer cluster, even if its effectiveness will be more limited in relation to the same system having access to a larger amount of resources.
Putting It All Together
Although A.I. systems today are fascinating and to some extent inspiring in their potential, they could be better. Namely, if we were to design them with the aforementioned principles in mind, they’d be more tasteful, if you catch my drift. Perhaps, such systems will not only be useful and practical but also safer and easier to relate with, making their integration in our society more natural and mutually beneficial.
Although lately I've been writing about the infeasibility of AGI in our current time and how an AGI can pose great threats, it is still useful to consider what would happen if an AGI actually existed and how it would see and interact with our world. Hence this novel, which through the first-person perspective of an AGI system, explores how the advent of such a technology could have noticeable consequences to our world, transcending even its creator's expectations. After all, the difference between an AGI with our level of intelligence and a super-intelligent AGI is not as large, though for the purpose of the plot of this novel, it has been shown to take place over a period of several months.
In any case, if you are into science fiction and wish to contemplate on the matter of AGI and A.I. Safety, this novel may be for you. Feel free to check it out on Amazon (currently only in Kindle format). Thanks!
Zacharias Voulgaris, PhD
Passionate data scientist with a foxy approach to technology, particularly related to A.I.