Toward the Magic Recipe to Make AI Systems “Intelligent”

In 1956, a small group of scientists led by John McCarthy gathered for a Summer Research Project at Dartmouth College and established Artificial Intelligence (AI) as a scientific discipline whose goal was “to proceed on the basis of the conjecture that every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it.”

Since then, AI has been and still should be the endeavor of designing machines that are intelligent in the same way humans are. In other words, an AI system shall emulate or reproduce some or all of the capabilities of the human mind.


  • “Do you think that there will ever be a machine that will think like human beings and be more intelligent than human beings?”
  • “Do you think that computers can discover new ideas and new relationships by themselves?”

What are your answers?

Those two questions were put to Nobel Prize-winning physicist Richard Feynman on September 26, 1985.

Having listened to Richard Feynman’s responses, I have the surprising impression that computer systems have not evolved much since 1985, and that his responses remain fundamentally well founded in 2022.

So what have we not done for over 37 years?

We are still far away from artificial intelligence (so using AI to market products is truly a misuse and an abuse of the two words “artificial intelligence”).

Expert systems, a once very popular kind of application software developed in Prolog and Lisp that encoded the knowledge of human experts as rules and facts in order to make decisions, failed in the 1990s.

But neural networks took off in 2012, although the perceptron has existed since the late 1950s!

That was when Alex Krizhevsky and Ilya Sutskever, two Ph.D. students at the University of Toronto, and their professor Geoff Hinton developed AlexNet, which won the 2012 ImageNet competition. AlexNet was a novel neural network architecture with five convolutional layers and three fully connected layers, and it pioneered the use of GPUs to train such networks.

Since 2012, machine learning and deep learning systems have mastered a breadth of challenging applications, in particular because of the availability of huge computing power. The domains range from computer vision for self-driving cars, to speech recognition with Apple Siri, to natural language processing systems such as OpenAI’s GPT that can write books the way humans do.

But it took us a lot of time to get where we are today. Here is my brief and incomplete history:

In a recent blog post, Google robotics researcher Eric Jang notes that:

“Like the parable of the blind men and the elephant, computer scientists have come up with different abstract frameworks to describe what it would take to make our machines smarter: equivariance algebra, causal inference, disentangled representations, Bayesian uncertainty, hybrid symbolic-learning systems, explainable predictions, to name a few.”

and he also suggests:

“I’d like to throw in another take on the elephant: the aforementioned properties of generalization we seek can be understood as nothing more than the structure of human language.”

So how intelligent are our present systems? Let’s dig a little deeper into the parable of the blind men and the elephant, starting with the example of OpenAI’s GPT…

We would probably all agree with Gary Marcus, who writes in his paper “The Next Decade in AI: Four Steps Towards Robust Artificial Intelligence” that:

“The trouble is that GPT-2’s solution is just an approximation to knowledge, and not substitute for knowledge itself. In particular, what it acquires is an approximation to the statistics of how words co-occur with one another in large corpora—rather than a clean representation of concepts per se. To put it in a slogan, it is a model of word usage, not a model of ideas, with the former being used as an approximation to the latter. Such approximations are something like shadows to a complex three-dimensional world”
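To make the idea of “a model of word usage” concrete, here is a minimal sketch of a bigram counter that predicts the next word purely from how often words co-occur. It is a toy illustration only and not how GPT-2 works internally (GPT-2 is a large neural network); the corpus and the names `train_bigrams` and `predict_next` are invented for this example.

```python
from collections import Counter, defaultdict


def train_bigrams(corpus):
    """Count how often each word is followed by each other word."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Toy corpus, invented for illustration.
corpus = [
    "the cat sat on the mat",
    "the cat chased the mouse",
    "the dog sat on the rug",
]
counts = train_bigrams(corpus)

print(predict_next(counts, "sat"))    # most frequent word after "sat"
print(predict_next(counts, "the"))    # most frequent word after "the"
print(predict_next(counts, "zebra"))  # never seen, so no prediction
```

The counter can continue a sentence plausibly, yet it holds no notion of what a cat or a mat is. It only stores statistics of usage, which is the kind of approximation the quote describes.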

But maybe before that, we need to define “intelligence” as proposed by François Chollet in his paper “On the Measure of Intelligence”:

“To make deliberate progress towards more intelligent and more human-like artificial systems, we need to be following an appropriate feedback signal: we need to be able to define and evaluate intelligence in a way that enables comparisons between two systems, as well as comparisons with humans.”

Here is François being interviewed by Lex Fridman:

Professor Judea Pearl, who championed probabilistic reasoning and causal inference in AI systems, has said many times that present deep learning systems amount to “curve fitting”:

“That sounds like sacrilege, to say that all the impressive achievements of deep learning amount to just fitting a curve to data. From the point of view of the mathematical hierarchy, no matter how skillfully you manipulate the data and what you read into the data when you manipulate it, it’s still a curve-fitting exercise, albeit complex and nontrivial. …To build truly intelligent machines, teach them cause and effect.”

Stanford University organized a seminar, “Beyond Curve Fitting: Causation, Counterfactuals, and Imagination-based AI”, built on the premise that views “regarding different aspects of the machine learning toolbox, however, are not a matter of speculation or personal taste, but a product of mathematical analyses concerning the intrinsic limitations of data-centric systems that are not guided by explicit models of reality. Such systems may excel in learning highly complex functions connecting input X to an output Y, but are unable to reason about cause and effect relations or environment changes, be they due to external actions or acts of imagination. Nor can they provide explanations for novel eventualities, or guarantee safety and fairness.”
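The gap between fitting a function from X to Y and reasoning about cause and effect can be sketched with a toy example. Everything here is an assumption made for illustration and is not taken from the seminar: a hidden variable Z drives both X and Y, X has no effect on Y, and `fit_line`, `observe` and `intervene` are hypothetical helper names.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x


def observe(z):
    """Assumed toy world: Z causes both X and Y; X does not cause Y."""
    x = z        # X := Z
    y = 2 * z    # Y := 2 * Z
    return x, y


def intervene(z, x_forced):
    """Set X by hand (an external action); Y still depends only on Z."""
    y = 2 * z
    return x_forced, y


# Curve fitting on observational data finds a perfect X -> Y relationship.
zs = [0, 1, 2, 3, 4]
observations = [observe(z) for z in zs]
xs = [x for x, _ in observations]
ys = [y for _, y in observations]
slope, intercept = fit_line(xs, ys)

# The fitted curve predicts a large Y if X is pushed to 10 ...
predicted = slope * 10 + intercept
# ... but forcing X to 10 in this world (with Z = 1) leaves Y unchanged.
_, actual = intervene(z=1, x_forced=10)

print("fitted slope:", slope, "intercept:", intercept)
print("curve predicts Y =", predicted, "but intervention yields Y =", actual)
```

The fitted line is an excellent predictor as long as the data keep coming from the same process. It fails the moment someone acts on X, because the data alone never revealed that Z, not X, drives Y.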

Note that one of the research areas of Yoshua Bengio, deep learning pioneer and Turing Award recipient, has been causal learning:

Toward causal representation learning: The two fields of machine learning and graphical causality arose and developed separately. However, there is now cross-pollination and increasing interest in both fields to benefit from the advances of the other. In the present paper, we review fundamental concepts of causal inference and relate them to crucial open problems of machine learning, including transfer and generalization, thereby assaying how causality can contribute to modern machine learning research. This also applies in the opposite direction: we note that most work in causality starts from the premise that the causal variables are given. A central problem for AI and causality is, thus, causal representation learning, the discovery of high-level causal variables from low-level observations. Finally, we delineate some implications of causality for machine learning and propose key research areas at the intersection of both communities.

Jeff Dean, the tech lead of the Google Research and Health teams and probably one of the most famous Silicon Valley engineers, believes that three ingredients will bring AI closer within reach:

  • A single model trained on millions of tasks, i.e., massive multi-task learning
  • That model will be multimodal, i.e., it will accept multiple types of inputs
  • And it will have a sparse architecture

And he calls that model architecture Pathways:

We still have to wait for the implementation of Pathways – although most of the three ingredients that Mr. Dean described are already in place in the most advanced NLP systems.
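The third ingredient, sparsity, is perhaps the least intuitive of the three. A minimal sketch of the general idea follows: for each input, a gate scores a set of “experts” and only the top few are actually run. This is a generic top-k routing toy and not the Pathways design, whose details are not described here; the experts, the hand-written gate and the names `top_k_route` and `sparse_forward` are all hypothetical.

```python
def top_k_route(scores, k):
    """Return the indices of the k highest gate scores, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def sparse_forward(x, experts, gate, k=2):
    """Run only the k best-scoring experts and mix their outputs."""
    scores = gate(x)
    active = top_k_route(scores, k)
    total = sum(scores[i] for i in active)
    # Weighted average over the active experts; the others are never called.
    output = sum(scores[i] / total * experts[i](x) for i in active)
    return output, active


# Hypothetical experts: in a real system each would be a sub-network.
experts = [
    lambda x: x + 1,
    lambda x: x * 2,
    lambda x: x ** 2,
    lambda x: -x,
]


def gate(x):
    """Hand-written gate for illustration; a real gate would be learned."""
    if x >= 0:
        return [1.0, 3.0, 0.5, 0.5]
    return [0.5, 0.5, 1.0, 3.0]


out_pos, active_pos = sparse_forward(3, experts, gate, k=2)
out_neg, active_neg = sparse_forward(-2, experts, gate, k=2)
print(out_pos, active_pos)
print(out_neg, active_neg)
```

Only two of the four experts do any work for a given input, so the cost per input stays roughly constant even if many more experts are added. That is the appeal of a sparse architecture for a single model meant to cover a very large number of tasks.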

In the meantime, when I read the pioneering work “Understanding the World Through Action” by robotics professor Sergey Levine, I become a little more optimistic:

“…The question of how to design learning-enabled systems that match the flexibility and generality of human reasoning remains out of reach. This has prompted considerable discussion about what the ‘missing ingredient’ in modern machine learning might be, with a number of hypotheses put forward as the big question that the field must resolve. Is the missing ingredient causal reasoning, inductive bias, better algorithms for self-supervised or unsupervised learning, or something else entirely?…

And the answer is…

…Self-supervised reinforcement learning combined with offline RL could enable scalable representation learning. Insofar that learned models are useful, it is because they allow us to make decisions that bring about the desired outcome in the world. Therefore, self-supervised training with the goal of bringing about any possible outcome should provide such models with the requisite understanding of how the world works. Self-supervised RL objectives, such as those in goal-conditioned RL, have a close relationship with model learning, and fulfilling such objectives is likely to require policies to gain a functional and causal understanding of the environment in which they are situated. However, for such techniques to be useful, it must be possible to apply them at scale to real-world datasets. Offline RL can play this role because it enables using large, diverse previously collected datasets. Putting these pieces together may lead to a new class of algorithms that can understand the world through action, leading to methods that are truly scalable and automated.”

Here is a lecture on “Understanding the World Through Action” by Professor Levine, who believes that offline RL will significantly push the decision-making abilities of RL in robotics into a new world :)

So what is the conclusion?

To conclude, I will mention a recent blog post by Yoshua Bengio, “Superintelligence: Futurology vs. Science”, which makes good reading after Richard Feynman’s thoughts from 37 years ago:

“Beyond the hype, there are qualitative things that we can say with a high degree of confidence:

  • There is no reason to believe that we won’t be able to build AIs at least as smart as we are. Our brains are complex machines whose workings are becoming increasingly better understood. We are living proof that some level of intelligence is possible.
  • Since humans sometimes suffer from cognitive biases that hinder their reasoning that may have helped our ancestors in the evolutionary process leading to homo sapiens, it is reasonable to assume that we will be able to build AIs without as many of these flaws (e.g., the need for social status, ego, or belonging to a group, with the unquestioning acceptance of group beliefs). In addition, they will have access to more data and memory. Therefore, we can confidently say that it will be possible to build AIs that are smarter than us. 
  • Still, it is far from certain that we will be able to build AIs wildly more intelligent than us as the article claims. All kinds of computational phenomena run into an exponential wall of difficulty (the infamous NP-hardness of computing) and we have yet to discover the limits of intelligence.
  • The more the science of intelligence (both human and artificial) advances, the more it holds the potential for great benefits and dangers to society. There will likely be an increase in applications of AI that could greatly advance science and technology in general, but the power of a tool is a double-edged sword. As the article in La Presse mentions, it is essential to put in place laws, regulations and social norms to avoid, or at least reduce, the misuse of these tools.
  • To prevent humans blinded by their desire for power, money or revenge from exploiting these tools to the detriment of other humans, we will undoubtedly need to change the laws and introduce compassion in machines (as the article suggests), but also reinforce inherent human compassion.
  • Since we don’t really know how fast technological advances in AI or elsewhere (e.g., biotechnology) will come, it’s best to get on with the task of better regulating these kinds of powerful tools right away. In fact, there are already harmful uses of AI, whether voluntarily as in the military (killer drones that can recognize someone’s face and shoot the person) or involuntarily as with AI systems that make biased decisions and discriminate against women or racialized people, for example. Computing in general is very poorly regulated, and this must be changed. We must regulate these new technologies, just as we did for aeronautics or chemistry, for example, to protect people and society.
  • Furthermore, applications of AI that are clearly beneficial to society should be encouraged, whether it be in health, in the fight against climate change, against injustice or in increasing access to knowledge and education.  In all these areas, governments have a key role to play in directing the forces of AI research and entrepreneurship towards those applications that are beneficial to society but where the desire to make a profit is not always sufficient to stimulate the needed investments.”

Note: The picture above shows boats in front of the Golden Gate Bridge in San Francisco.

Copyright © 2005-2022 by Serge-Paul Carrasco. All rights reserved.