At the start of the new year 2022, it is worth imagining the future of artificial-intelligence algorithms: beyond the mainstream deep learning models, where might the potential breakthroughs come from?
Recently I came across a Reddit thread listing ten such directions. Drawing on the references given there, along with related introductions from Wikipedia, the Jizhi Encyclopedia, and other sources, I give a brief introduction to each of the ten: the Thousand Brains Theory, the Free Energy Principle, the Tsetlin machine, Hierarchical Temporal Memory, and more.
Hierarchical Temporal Memory
Hierarchical Temporal Memory (HTM) is a biologically constrained machine-intelligence technology developed by Numenta. HTM was first described in the 2004 book On Intelligence, co-authored by Jeff Hawkins and Sandra Blakeslee, and is currently used mainly for anomaly detection in streaming data. The technology is grounded in neuroscience, specifically the physiology and interactions of pyramidal neurons in the neocortex of mammals (especially humans).
At the core of HTM are learning algorithms that can store, learn, infer, and recall high-order sequences. Unlike most other machine-learning methods, HTM continuously learns time-based patterns in unlabeled data, in an unsupervised way. HTM is robust to noise and has high capacity (it can learn multiple patterns at the same time). Applied in computing, HTM is well suited to prediction, anomaly detection, classification, and ultimately sensorimotor applications.
HTM has been tested and implemented in software, both in Numenta's example applications and in several commercial applications from Numenta's partners. (The above is translated from Wikipedia.)
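To give a feel for two of HTM's ingredients, sparse distributed representations (SDRs) compared by overlap, and sequence prediction, here is a toy Python sketch. It is only an illustration under our own simplifications, not Numenta's actual spatial pooler or temporal memory algorithms (for those, see the NuPIC / htm.core implementations); the function names and the first-order transition table are ours.

```python
import numpy as np

rng = np.random.default_rng(0)
N, ACTIVE = 2048, 40   # SDR size and number of active bits (~2% sparsity, as in HTM)

def random_sdr():
    """A sparse distributed representation: a small set of active bit indices."""
    return frozenset(rng.choice(N, size=ACTIVE, replace=False))

def overlap(a, b):
    """SDR similarity = number of shared active bits; robust to noise."""
    return len(a & b)

# Toy first-order sequence memory: remember which SDR followed which.
# (Real HTM temporal memory learns high-order context with per-cell state.)
A, B, C = random_sdr(), random_sdr(), random_sdr()
transitions = {A: B, B: C}

def predict(sdr, min_overlap=20):
    """Return the stored successor of the best-matching stored SDR, if any."""
    best = max(transitions, key=lambda s: overlap(s, sdr))
    return transitions[best] if overlap(best, sdr) >= min_overlap else None

# A noisy version of A (10 of its 40 bits replaced) still retrieves B.
noisy_A = frozenset(list(A)[:30] + list(random_sdr())[:10])
print(overlap(predict(noisy_A), B))  # 40: the prediction matches B exactly
```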
Hyperdimensional Computing
Hyperdimensional computing (HDC) is an emerging computing approach inspired by patterns of neural activity in the human brain. This distinctive kind of computing allows AI systems to retain memories and to process new information in light of previously encountered data or scenarios. To model neural activity patterns, an HDC system uses a rich algebra that defines a set of rules for building, binding, and bundling different hypervectors. Hypervectors are holographic, 10,000-dimensional (pseudo)random vectors with independent, identically distributed components. Using these hypervectors, HDC can create powerful computing systems for complex cognitive tasks such as object detection, language recognition, voice and video classification, time-series analysis, text classification, analysis, and reasoning. (From: "A hyperdimensional computing system that performs all core computations in memory".) A collection of hyperdimensional-computing resources: https://github.com/HyperdimensionalComputing/collection
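To make the hypervector algebra concrete, here is a minimal numpy sketch (our own illustration, not tied to any particular HDC library) that encodes a tiny key-value record with 10,000-dimensional bipolar hypervectors and then queries it by unbinding:

```python
import numpy as np

rng = np.random.default_rng(0)
D = 10_000  # hypervector dimensionality

def random_hv():
    """Random bipolar hypervector with i.i.d. components in {-1, +1}."""
    return rng.choice([-1, 1], size=D)

def bind(a, b):
    """Binding (element-wise product): yields a vector dissimilar to both inputs."""
    return a * b

def bundle(*hvs):
    """Bundling (element-wise majority): yields a vector similar to every input."""
    return np.sign(np.sum(hvs, axis=0))

def similarity(a, b):
    """Normalized dot product: ~0 for unrelated hypervectors, ~1 for identical ones."""
    return np.dot(a, b) / D

# Encode a tiny record {colour: red, shape: square} as one hypervector.
colour, red, shape, square = (random_hv() for _ in range(4))
record = bundle(bind(colour, red), bind(shape, square))

# Query: unbind the 'colour' role; the result is close to 'red'.
query = bind(record, colour)      # binding is its own inverse for bipolar vectors
print(similarity(query, red))     # high (about 0.5 here, far above chance)
print(similarity(query, square))  # near 0
```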
Spiking Neural Networks
Spiking neural networks (SNNs) are third-generation neural-network models whose simulated neurons are closer to biological reality and which additionally take the timing of information into account. The idea is that neurons in an SNN are not activated on every propagation cycle (as they are in a typical multi-layer perceptron); instead, a neuron fires only when its membrane potential reaches a threshold value. When a neuron fires, it generates a signal that travels to other neurons, raising or lowering their membrane potentials.
In spiking neural networks, a neuron's current activation level (modeled by some differential equation) is usually taken as its state. An incoming spike drives this value up for a period of time, after which it gradually decays. Many coding schemes exist for interpreting the resulting output spike trains as a real number, taking into account both spike frequency and inter-spike intervals.
(Excerpted from process-z.com and from Zhang Jingcheng's CSDN blog post on spiking neural networks.)
The overview Spiking Neural Networks (Simons Institute for the Theory of Computing) gives a good account of the field's research history, main contributors, current status and open problems, and related datasets.
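As an illustration of the "membrane potential rises with input, then decays" dynamics described above, here is a minimal sketch of a leaky integrate-and-fire (LIF) neuron, the standard textbook SNN neuron model. All parameter values are arbitrary choices of ours:

```python
import numpy as np

dt       = 1.0    # time step (ms)
tau_m    = 20.0   # membrane time constant (ms)
v_rest   = 0.0    # resting potential
v_thresh = 1.0    # firing threshold
v_reset  = 0.0    # potential after a spike

T = 200
# Constant input current switched on between t = 50 and t = 150.
input_current = np.where((np.arange(T) > 50) & (np.arange(T) < 150), 0.08, 0.0)

v, spikes = v_rest, []
for t in range(T):
    # Euler step of dv/dt = (-(v - v_rest) + I*tau_m) / tau_m:
    # input pushes v up, the leak pulls it back toward rest.
    v += dt * (-(v - v_rest) + input_current[t] * tau_m) / tau_m
    if v >= v_thresh:      # threshold crossing -> emit a spike
        spikes.append(t)
        v = v_reset        # reset after firing
print("spike times:", spikes)  # regular spikes while the input is on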
Associative Memories via Predictive Coding
The idea comes from the paper Associative Memories via Predictive Coding (https://arxiv.org/abs/2109.08063). Associative memory refers to the way neurons in the human brain store, associate, and retrieve information. Because of its importance in human intelligence, computational models of associative memory have been developed for decades. These include auto-associative memories, which store data points and retrieve a stored point when presented with a noisy or partial variant of it, and hetero-associative memories, which can store and recall multimodal data.
In the paper above, the authors propose a novel neural model of associative memory based on a hierarchical generative network that receives external stimuli through sensory neurons. The model is trained with predictive coding, an error-driven learning algorithm inspired by information processing in the cortex.
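The core inference loop of predictive coding, adjusting latent causes until the top-down prediction matches the input, can be sketched in a few lines. This toy is our own construction, far simpler than the paper's hierarchical model: it retrieves a stored pattern from a corrupted cue by gradient descent on the prediction error.

```python
import numpy as np

rng = np.random.default_rng(1)

# Store a few random bipolar patterns as rows of the generative weight matrix.
d, k = 64, 3
memories = rng.choice([-1.0, 1.0], size=(k, d))

def retrieve(cue, steps=200, lr=0.05):
    """Associative retrieval as predictive-coding-style inference: iteratively
    adjust latent causes z so the top-down prediction z @ memories minimizes
    the prediction error against the (noisy) cue."""
    z = np.zeros(k)
    for _ in range(steps):
        error = cue - z @ memories        # bottom-up prediction error
        z += lr * (memories @ error) / d  # move latents to explain the error
    return np.sign(z @ memories)          # final top-down prediction

# Corrupt ~15% of the first memory and retrieve it from the noisy cue.
cue = memories[0].copy()
flip = rng.choice(d, size=10, replace=False)
cue[flip] *= -1
recalled = retrieve(cue)
print((recalled == memories[0]).mean())  # fraction of bits recovered (typically 1.0)
```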
Fractal AI
Fractal AI derives new mathematical tools by modeling information with structures similar to cellular automata rather than with smooth functions. These tools are claimed to form a new foundation for stochastic calculus.
The accompanying paper (see https://github.com/Guillemdb/FractalAI) introduces a new agent, Fractal Monte Carlo (FMC), derived from first principles of Fractal AI (see https://github.com/Guillemdb/FractalAI/blob/master/introductiontofai.md). FMC plays Atari 2600 games under OpenAI Gym more efficiently than comparable techniques such as Monte Carlo tree search. The authors also present a more advanced Swarm Wave implementation, likewise derived from Fractal AI principles, which makes it possible to solve Markov decision processes when a perfect model of the environment is available.
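The swarm-and-clone mechanics behind FMC can be sketched compactly. The following is a heavily simplified sketch of the idea, not the authors' implementation: it assumes a hypothetical cloneable environment model `env_step(state, action) -> (state, reward)`, numeric vector states, and a two-action space, and the constants and exact virtual-reward formula are ours.

```python
import numpy as np

def fmc_action(env_step, state, n_walkers=32, horizon=15, seed=0):
    """Sketch of the Fractal Monte Carlo swarm idea: walkers explore random
    action sequences; each walker's 'virtual reward' multiplies its accumulated
    reward by its distance to a randomly chosen companion (rewarding diversity);
    low-scoring walkers clone to high-scoring ones."""
    rng = np.random.default_rng(seed)
    states = [state] * n_walkers
    first_action = rng.integers(0, 2, size=n_walkers)  # remember each walker's first move
    actions = first_action.copy()
    rewards = np.zeros(n_walkers)
    for t in range(horizon):
        for i in range(n_walkers):
            states[i], r = env_step(states[i], actions[i])
            rewards[i] += r
        # Diversity term: distance to a random companion walker.
        companions = rng.permutation(n_walkers)
        dist = np.array([np.linalg.norm(np.asarray(states[i]) - np.asarray(states[j]))
                         for i, j in enumerate(companions)])
        virtual_reward = (rewards - rewards.min() + 1e-8) * (dist + 1e-8)
        # Cloning: a walker jumps to a better-scoring companion with some probability.
        clone_prob = np.clip((virtual_reward[companions] - virtual_reward)
                             / (virtual_reward + 1e-8), 0, 1)
        for i, j in enumerate(companions):
            if rng.random() < clone_prob[i]:
                states[i], rewards[i], first_action[i] = states[j], rewards[j], first_action[j]
        actions = rng.integers(0, 2, size=n_walkers)   # fresh random actions next step
    # Act with the first action favoured by the surviving swarm.
    return np.bincount(first_action, minlength=2).argmax()
```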
The Thousand Brains Theory
How does the human brain work? The Thousand Brains Theory of intelligence is one of the most influential theories to emerge in recent years. In November 2021, Bill Gates announced his five must-read books of 2021; first on the list was A Thousand Brains: A New Theory of Intelligence by Jeff Hawkins. The key paper behind the book is "A Framework for Intelligence and Cortical Function Based on Grid Cells in the Neocortex", published in 2018. The theory describes a general framework for understanding what the neocortex does and how it works. The neocortex does not learn just one model of the world; rather, every part of it learns complete models of objects and concepts, and long-range connections within the neocortex allow these models to work together to build your overall perception of the world. The paper also predicted a new class of neurons, "displacement cells", which interact with cortical grid cells to represent the positions of objects relative to one another.
In June 2021, Jeff Hawkins spoke at the Beijing Zhiyuan (BAAI) Conference, delivering a talk entitled "The Thousand Brains Theory: a roadmap for creating machine intelligence". A detailed write-up of the talk: "Truly achieving more human-like intelligence! Jeff Hawkins: a roadmap for creating machine intelligence".
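As a toy illustration of the theory's "voting" mechanism (our own simplification, not Numenta's model): imagine each cortical column maintaining a belief about which object it is sensing based only on its local sensor patch, with lateral connections combining the column votes into one consensus belief.

```python
import numpy as np

objects = ["cup", "bowl", "plate"]

# Per-column likelihoods P(local observation | object) from three sensor patches.
column_beliefs = np.array([
    [0.60, 0.30, 0.10],   # column 1: its patch looks most like a cup
    [0.40, 0.40, 0.20],   # column 2: ambiguous between cup and bowl
    [0.50, 0.25, 0.25],   # column 3: weak evidence for a cup
])

# Voting = multiply the independent votes and renormalize (a product of experts).
consensus = column_beliefs.prod(axis=0)
consensus /= consensus.sum()
print(dict(zip(objects, consensus.round(3))))  # "cup" wins decisively
```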
Free Energy Principle
The free energy principle is a formal statement that explains how biological and non-biological systems maintain themselves in non-equilibrium steady states by restricting themselves to a limited number of states. It says that such systems minimize a free-energy functional of their internal states, which encodes beliefs about hidden states in their environment. This implicit minimization of free energy is formally related to variational Bayesian methods. The principle was first introduced by Karl Friston as an explanation of embodied perception in neuroscience, where it is also known as "active inference".
The free energy principle describes what it takes for a given system to persist: using a Markov blanket, the system tries to minimize the difference between its model of the world and what its senses and associated perceptions report. This difference can be described as "surprise", and it is reduced by continually updating the system's model of the world. The principle thus builds on the Bayesian view of the brain as an "inference machine". Friston added a second route to minimization: action. By actively changing the world to bring it closer to the expected state, a system can also minimize its free energy. Friston holds that this is a principle underlying all biological responses, and that it applies to mental disorders and to artificial intelligence alike; AI implementations based on active inference have shown advantages over other methods. (From the Jizhi Encyclopedia entry "What is the Free Energy Principle", Jizhi Club.)
The free energy principle is widely acknowledged to be a very obscure concept. Here is a worked example of "active inference" based on the principle: Learn by example: Active Inference in the brain - 1 (Kaggle).
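As a complement to that notebook, here is a minimal numeric sketch (our own toy setup) of the central identity: for a fixed observation, the belief q(s) that minimizes variational free energy is exactly the Bayesian posterior, so minimizing free energy also bounds "surprise".

```python
import numpy as np

# Toy discrete world: hidden state s in {rain, sun}; observation o in {wet, dry}.
prior      = np.array([0.5, 0.5])    # p(s)
likelihood = np.array([[0.9, 0.1],   # p(o | s = rain)
                       [0.2, 0.8]])  # p(o | s = sun)

def free_energy(q, o):
    """F = KL(q(s) || p(s)) - E_q[log p(o|s)]  >=  -log p(o)  (the 'surprise')."""
    return np.sum(q * (np.log(q) - np.log(prior))) - np.sum(q * np.log(likelihood[:, o]))

o = 0  # we observe "wet"
# Sweep over beliefs q(rain); the minimum of F sits at the exact posterior.
qs = np.linspace(0.01, 0.99, 99)
F  = [free_energy(np.array([q, 1 - q]), o) for q in qs]
best_q    = qs[np.argmin(F)]
posterior = prior[0] * likelihood[0, o] / (prior @ likelihood[:, o])
print(best_q, posterior)  # ~0.82 and ~0.818: minimizing F recovers Bayes' rule
```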
Tsetlin Machine
In the 1950s, the Soviet mathematician Mikhail Tsetlin proposed the learning automaton. An automaton of this kind learns by adjusting the probabilities of its own state-transition function from the training data, and it encodes information in its own state. Unlike neural networks, such automata naturally give a temporal, sequential encoding of data and offer good interpretability. (From: "Future trends in intelligence: automata and neural networks", Zhihu.)
However, Rishad Shafik, a senior lecturer at Newcastle University, observes that "learning automata are almost impossible to implement in hardware, because of the enormous number of states to accommodate". Ole-Christoffer Granmo, an AI professor at the University of Agder in Norway, found a way to reduce the complexity of learning automata by combining them with classical game theory and Boolean algebra. He implemented the simplified learning automaton in software and named it the "Tsetlin machine" after the field's founder. (From: "The power-consumption gap between Tsetlin machines and neural networks", Elecfans.)
This is the website on Tsetlin machines maintained by Ole-Christoffer Granmo of the University of Agder: Home - An Introduction to Tsetlin Machines.
His book in progress, An Introduction to Tsetlin Machines, is being published there. Chapter 1 is currently available for download: https://tsetlinmachine.org/wp-content/uploads/2021/09/TsetlinMachineBookChapter1-4.pdf
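The underlying building block is easy to state in code. Below is a minimal sketch of the classic two-action Tsetlin automaton, the component that Tsetlin machines compose; the state count and the toy bandit environment are our own choices.

```python
import random

class TsetlinAutomaton:
    """Two-action Tsetlin automaton: 2*n states encode both the chosen action
    and the confidence in it. Reward moves the state deeper into the current
    action's half; penalty moves it toward the boundary and eventually flips
    the action."""
    def __init__(self, n=6):
        self.n = n
        self.state = random.choice([n, n + 1])   # start at the boundary

    def action(self):
        return 0 if self.state <= self.n else 1  # left half -> action 0

    def reward(self):
        self.state += -1 if self.action() == 0 else 1
        self.state = max(1, min(2 * self.n, self.state))  # saturate at the ends

    def penalize(self):
        self.state += 1 if self.action() == 0 else -1

# Toy environment: action 1 is rewarded 90% of the time, action 0 only 20%.
random.seed(0)
ta = TsetlinAutomaton()
for _ in range(200):
    p = 0.9 if ta.action() == 1 else 0.2
    ta.reward() if random.random() < p else ta.penalize()
print(ta.action())  # converges to action 1 with high probability
```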
Hyperbolic Neural Networks
For details, see the NeurIPS 2018 paper Hyperbolic Neural Networks (https://arxiv.org/abs/1805.09112), which generalizes deep neural models to non-Euclidean domains by constructing hyperbolic geometric spaces. The advantage of hyperbolic space is that its tree-like shape is well suited to visualizing large-scale taxonomic data and to embedding complex networks, and its performance on hierarchical, taxonomic, or inheritance-structured data far exceeds that of Euclidean space. When the data has an implicit hierarchical structure, disjoint subtrees cluster cleanly in the embedding space.
The paper's main contribution is to bridge hyperbolic and Euclidean space in the context of neural networks and deep learning, generalizing basic operators, multinomial logistic regression, feed-forward networks, RNNs, and GRUs to the Poincaré model of hyperbolic geometry in a principled way.
In addition, readers may want to look at the hyperlib library: https://github.com/nalexai/hyperlib
The library implements common neural-network components in hyperbolic space (using the Poincaré model). It uses TensorFlow as its backend and, according to the authors, can easily be used together with Keras. It aims to help data scientists, machine-learning engineers, researchers, and others implement hyperbolic neural networks.
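To show what "working in the Poincaré model" means in practice, here are the two standard Poincaré-ball operations (with curvature c = 1) in plain numpy; this is a standalone sketch, deliberately independent of hyperlib so that we do not assume its API.

```python
import numpy as np

def mobius_add(x, y):
    """Mobius addition, the hyperbolic analogue of vector addition."""
    xy  = np.dot(x, y)
    nx2 = np.dot(x, x)
    ny2 = np.dot(y, y)
    num = (1 + 2 * xy + ny2) * x + (1 - nx2) * y
    return num / (1 + 2 * xy + nx2 * ny2)

def poincare_dist(x, y):
    """Geodesic distance in the Poincare ball; it grows rapidly near the
    boundary, which gives the model its tree-like, hierarchy-friendly geometry."""
    diff2 = np.dot(x - y, x - y)
    denom = (1 - np.dot(x, x)) * (1 - np.dot(y, y))
    return np.arccosh(1 + 2 * diff2 / denom)

a = np.array([0.1, 0.0])
b = np.array([0.8, 0.1])  # much closer to the boundary of the unit ball
print(poincare_dist(a, b))              # ~2.0, versus Euclidean distance ~0.71
print(np.linalg.norm(mobius_add(a, b))) # the result stays inside the unit ball
```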
Complex-Valued Neural Networks
Complex-valued neural networks are neural networks that process information on the complex plane: their state variables, connection weights, and activation functions are all complex-valued. They can be regarded as a generalization of real-valued neural networks, yet they differ from them and have richer properties. Complex-valued neural networks can handle complex-valued as well as real-valued problems, and in some respects are more powerful than their real-valued counterparts; for example, the XOR problem, which no single real-valued neuron can solve, is easily solved by a single complex-valued neuron. (Excerpt: Xie Dong, "Research on the Dynamic Behavior of Several Kinds of Complex-Valued Neural Networks", doctoral dissertation, Hunan University, 2017.)
The paper Deep Complex Networks (https://arxiv.org/abs/1705.09792) designs the key atomic components for complex-valued deep neural networks and applies them to convolutional feed-forward networks and convolutional LSTMs.
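A complex-valued layer ultimately reduces to ordinary real arithmetic, which is how the Deep Complex Networks paper realizes it. Here is a minimal numpy sketch of one complex dense layer with the paper's CReLU activation; the layer sizes and data are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

def complex_dense(z, W, b):
    """One complex-valued dense layer, computed with real arithmetic:
    (A + iB)(x + iy) = (Ax - By) + i(Bx + Ay)."""
    A, B = W.real, W.imag
    x, y = z.real, z.imag
    return (A @ x - B @ y) + 1j * (B @ x + A @ y) + b

def crelu(z):
    """CReLU: apply ReLU separately to the real and imaginary parts."""
    return np.maximum(z.real, 0) + 1j * np.maximum(z.imag, 0)

# Random complex weights, bias, and input for a 4 -> 3 layer.
W = rng.normal(size=(3, 4)) + 1j * rng.normal(size=(3, 4))
b = rng.normal(size=3) + 1j * rng.normal(size=3)
z = rng.normal(size=4) + 1j * rng.normal(size=4)
print(crelu(complex_dense(z, W, b)))  # one forward pass on the complex plane
```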
————————————————
Copyright notice: this is an original article by CSDN blogger Chang Zheng, licensed under the CC 4.0 BY-SA agreement; please include the original source link and this notice when reproducing it. Original link: https://blog.csdn.net/lionkingcz/article/details/122288707 (via the Zhiyuan Community).