# Learning as a phenomenon occurring in a critical state

^{a}Institute Computational Physics for Engineering Materials, Eidgenössische Technische Hochschule, Schafmattstrasse 6, 8093 Zürich, Switzerland; and ^{b}Department of Information Engineering and Consorzio Nazionale Interuniversitario per le Scienze Fisiche della Materia, Second University of Naples, 81031 Aversa (CE), Italy

Edited* by H. Eugene Stanley, Boston University, Boston, MA, and approved January 13, 2010 (received for review October 23, 2009)

## Abstract

Recent physiological measurements have provided clear evidence about scale-free avalanche brain activity and EEG spectra, feeding the classical enigma of how such a chaotic system can ever learn or respond in a controlled and reproducible way. Models for learning, like neural networks or perceptrons, have traditionally avoided strong fluctuations. Conversely, we propose that brain activity having features typical of systems at a critical point represents a crucial ingredient for learning. We present here a study that provides unique insights toward the understanding of the problem. Our model is able to reproduce quantitatively the experimentally observed critical state of the brain and, at the same time, learns and remembers logical rules including the exclusive OR, which has posed difficulties to several previous attempts. We implement the model on a network with topological properties close to the functionality network in real brains. Learning occurs via plastic adaptation of synaptic strengths and exhibits universal features. We find that the learning performance and the average time required to learn are controlled by the strength of plastic adaptation, in a way independent of the specific task assigned to the system. Even complex rules can be learned provided that the plastic adaptation is sufficiently slow.

Spontaneous activity is an important property of the cerebral cortex that can have a crucial role in information processing and storage. Recently it has been shown that spontaneous activity can take a unique spatiotemporal form, neuronal avalanches, which can involve from a few to a very large number of neurons. These bursts of firing neurons were first observed (1, 2) in organotypic cultures from coronal slices of rat cortex, where the size and duration of neuronal avalanches follow power law distributions with very stable exponents. Power law behavior is the typical feature of a system acting in a critical state (3), where large fluctuations are present and the response does not have a characteristic size. The same critical behavior, namely, the same power law exponents, has also recently been measured in vivo from superficial layers of cortex in anesthetized rats during early postnatal development (4) and in awake adult rhesus monkeys (5), by using microelectrode array recordings. These results confirm that spontaneous cortical activity settles into a critical state where the spatiotemporal organization of avalanches is scale invariant. Moreover, the investigation of the spontaneous activity of dissociated neurons from different networks, such as rat hippocampal neurons (6), rat embryos (7), or leech ganglia (6), has also confirmed the robustness of this scaling behavior. In all these cases, the emergence of power law distributions has been interpreted in terms of self-organized criticality (SOC) (8). The term SOC usually refers to a mechanism of slow energy accumulation and fast energy redistribution driving the system toward a critical state, where the avalanche extensions and durations do not have a characteristic size.

The understanding of the fundamental relations between electrophysiological activity and brain organization with respect to performing even simple tasks is a long-standing and fascinating question. A number of theoretical models (9) have been proposed for learning, from the simple perceptron (10) to attractor neural networks (11) of artificial two-state neurons (12). In these models the state of the “brain” is the snapshot of the ensemble of the individual states of all neurons, which explores phase space following an appropriate dynamics and eventually recovers memories. The ability of the brain to self-organize connections in an efficient way is a crucial ingredient in biologically plausible models. The breakthrough of Hebbian plasticity, postulating synapse strengthening for correlated activity at the pre- and postsynaptic neuron and synapse weakening for decorrelated activity, triggered the development of algorithms for neuronal learning and memory, such as “reinforcement learning” (13) or error back-propagation (14), leading to the learning of the exclusive OR (XOR) rule. Recent results have shown that extremal dynamics, where only the neuron with the largest input fires, and uniform negative feedback are sufficient ingredients to learn the following task: to identify the right connection between an input and an output node (15, 16). Similarly, low-activity probabilistic firing rules, where again a single neuron fires at each step of the iteration, together with a uniform negative feedback plastic adaptation acting on time scales slower than the neuron firing time scale, enable learning the XOR rule without error back-propagation (17). Both results suggest that the system learns from its mistakes; namely, depression rather than enhancement of synaptic strength is the crucial mechanism for learning. However, in both studies a single neuron fires at each step of the evolution, which does not fully agree with recent experimental discoveries. Cooperative effects leading to self-organization and learning are completely neglected in the aforementioned approaches.

Operating at a critical level, far from an uncorrelated subcritical or a too correlated supercritical regime, may optimize information management and transmission in real brains (1, 18–20), as recently confirmed by experiments (21). To confirm this point, a recent study of visual perceptual learning has evidenced that training to a specific task induces dynamic changes in the functional connectivity able to “sculpt” the spontaneous activity of the resting human brain and to act as a form of “system memory” (22). It is therefore tempting to investigate the role that critical behavior plays in the most important task of neuronal networks, namely, learning and memory. The emergence of a critical state with the same critical behavior found experimentally has been recently reproduced by a neuronal network model on the basis of SOC ideas (23, 24). The model implements several physiological properties of real neurons: a continuous membrane potential, firing at threshold, synaptic plasticity, and pruning. Extensive numerical studies on regular, small world, and scale-free networks have shown that indeed the system exhibits a robust critical behavior. The distributions of avalanche size and duration scale with exponents independent of model parameters and in excellent agreement with experimental data (Fig. 1). More precisely, the distribution of avalanche sizes, measured experimentally in terms of either number of active electrodes or summed local field potentials in a microelectrode array (1, 2), decreases with an exponent -1.5, whereas the distribution of avalanche temporal durations decreases with an exponent close to -2.0. A critical avalanche activity also has been found on fully connected (25) and random networks (26). Moreover, the temporal signal for electrical activity and the power spectrum of the resulting time series have been compared with EEG data (23, 24). 
The spectrum exhibits a power law behavior *P*(*f*) ∼ *f*^{-0.8}, with an exponent in good agreement with EEG medical data (27) and physiological signal spectra for other brain-controlled activities (28). This model therefore seems to capture many of the essential ingredients of spontaneous activity, as measured in cortical networks.

Here we study the learning performance of a neuronal network acting in a critical state. The response of the system to external stimuli is therefore scale-free; i.e., no characteristic size in the number of firing neurons exists. The approach reproduces closely the physiological mechanisms of neuronal behavior and is implemented on a plausible network having topological properties similar to the brain functionality network. Neuronal activity is a collective process where all neurons at threshold can fire and self-organize an efficient path for information transmission. Plastic adaptation is introduced via a nonuniform negative feedback procedure with no error back-propagation.

## The Model

We consider *N* neurons positioned at random in a two-dimensional space. Each neuron is characterized by the potential *v*_{i}. Connections among neurons are established by assigning to each neuron *i* a random outgoing connectivity degree *k*_{outi}. The distribution of the number of out-connections is chosen in agreement with the experimentally measured properties of the functionality network (29) in human adults. Functional magnetic resonance imaging has indeed shown that this network has universal scale-free properties; namely, its degree distribution exhibits a power law scaling behavior, independent of the different tasks performed by the patient. We adopt this distribution for the number of presynaptic terminals of each neuron, over a finite range of possible values bounded below by a minimum out-degree *k*_{min}, as in experimental data. Two neurons are then connected according to a distance-dependent probability *p*(*r*) ∝ *e*^{-*r*/*r*_{0}}, where *r* is their spatial distance (30) and *r*_{0} a typical edge length. To each synaptic connection we then assign an initial random strength *g*_{ij}, where *g*_{ij} ≠ *g*_{ji}, and an excitatory or inhibitory character, with a fraction *p*_{in} of inhibitory synapses. An example of such a network is shown in Fig. 2.
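The construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `build_network`, the exponent `gamma`, the bounds `k_min` and `k_max`, and the values chosen for `r0` and `p_in` are placeholders rather than values taken from the text, and sampling targets with weight *e*^{-r/r0} is one simple way to realize a distance-dependent connection probability.

```python
import math
import random


def build_network(n=200, k_min=2, k_max=20, gamma=2.0, r0=0.2, p_in=0.1, seed=0):
    """Return (positions, synapses); synapses[(i, j)] = (g_ij, sign).

    All default parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Neurons placed at random in the unit square.
    pos = [(rng.random(), rng.random()) for _ in range(n)]
    # Power-law distributed out-degree on [k_min, k_max].
    degrees = list(range(k_min, k_max + 1))
    degree_weights = [k ** -gamma for k in degrees]
    synapses = {}
    for i in range(n):
        k_out = rng.choices(degrees, weights=degree_weights)[0]
        others = [j for j in range(n) if j != i]
        # Closer neurons are more likely targets: weight exp(-r / r0).
        w = [math.exp(-math.dist(pos[i], pos[j]) / r0) for j in others]
        targets = set()
        while len(targets) < k_out:
            targets.add(rng.choices(others, weights=w)[0])
        for j in targets:
            sign = -1 if rng.random() < p_in else 1   # inhibitory with probability p_in
            synapses[(i, j)] = (rng.uniform(0.5, 1.0), sign)  # directed: g_ij != g_ji
    return pos, synapses
```

Because each directed pair (i, j) is drawn independently, the strengths *g*_{ij} and *g*_{ji} differ whenever both synapses exist, as required in the text.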

The firing dynamics implies that, whenever at time *t* the value of the potential at a site *i* is at or above a certain threshold, *v*_{i} ≥ *v*_{max} = 6.0, approximately equal to -55 mV for real neurons, the neuron sends action potentials leading to the production of an amount of neurotransmitter proportional to *v*_{i}. As a consequence, the total charge released by a neuron is proportional to the number of synaptic connections, *q*_{i} ∝ *v*_{i}*k*_{outi}. Each connected neuron receives charge in proportion to the strength of the synapse *g*_{ij}:

$$v_j(t+1) = v_j(t) \pm \frac{q_i(t)}{k_{\mathrm{in}j}}\,\frac{g_{ij}(t)}{\sum_k g_{ik}(t)} \tag{1}$$

where *k*_{inj} is the in-degree of neuron *j* and the sum is extended to all outgoing connections of *i*. In Eq. **1** it is assumed that the received charge is distributed over the surface of the soma of the postsynaptic neuron, proportional to the number of ingoing terminals *k*_{inj}. The plus or minus sign in Eq. **1** is for excitatory or inhibitory synapses, respectively. After firing, a neuron is set to a zero resting potential and in a refractory state lasting *t*_{ref} = 1 time step, during which it is unable to receive or transmit any charge. We wish to stress that the unit time step in Eq. **1** does not correspond to a real time scale; it is simply the time unit for charge propagation from one neuron to the connected ones. In a real system this time could vary and be as large as 100 ms for longer firing periods. The synaptic strengths initially have a random value *g*_{ij} ∈ [0.5, 1.0], whereas the neuron potentials are uniformly distributed random numbers between *v*_{max} - 1 and *v*_{max}. Moreover, a small random fraction (10%) of neurons are chosen to be boundary sites, with a potential fixed to zero, playing the role of sinks for the charge.
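As an illustration of these firing rules, the following Python sketch propagates a single avalanche. It rests on assumptions that should be read as such: Eq. **1** is taken to have the form *v*_{j} ← *v*_{j} ± (*q*_{i}/*k*_{inj}) *g*_{ij}/Σ_{k}*g*_{ik}, the proportionality constant in *q*_{i} ∝ *v*_{i}*k*_{outi} is set to 1, and the function name, data layout, and `max_steps` guard are hypothetical.

```python
def propagate_avalanche(v, out_syn, k_in, boundary, v_max=6.0, max_steps=10_000):
    """Propagate one avalanche in place and return the number of firings.

    v        : list of membrane potentials
    out_syn  : dict i -> list of (j, g_ij, sign), sign = +1 or -1
    k_in     : list of in-degrees
    boundary : set of sink neurons whose potential stays at zero
    """
    size = 0
    active = [i for i in range(len(v)) if v[i] >= v_max and i not in boundary]
    for _ in range(max_steps):          # guard against runaway toy networks
        if not active:
            break
        incoming = [0.0] * len(v)
        for i in active:
            syn = [(j, g, s) for j, g, s in out_syn.get(i, []) if g > 0]
            total_g = sum(g for _, g, _ in syn)
            q = v[i] * len(syn)         # released charge, constant set to 1
            for j, g, s in syn:
                incoming[j] += s * (q / k_in[j]) * g / total_g
            v[i] = 0.0                  # reset to resting potential
            size += 1
        fired = set(active)
        for j in range(len(v)):
            if j in boundary:
                v[j] = 0.0              # sinks absorb the charge
            elif j not in fired:        # refractory neurons receive nothing
                v[j] += incoming[j]
        active = [j for j in range(len(v))
                  if v[j] >= v_max and j not in boundary and j not in fired]
    return size
```

For a three-neuron excitatory chain `0 → 1 → 2` with the first neuron at threshold, the call returns an avalanche of size 3 and leaves all potentials at zero.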

In order to start activity we identify input neurons at which the imposed signal is applied and the output neuron at which the response is monitored. These nodes are randomly placed inside the network under the condition that they are not boundary sites and they are mutually separated on the network by *k*_{d} nodes. *k*_{d} represents the chemical distance on the network and plays the role of the number of hidden layers in a perceptron. We test the ability of the network to learn different rules: AND, OR, XOR, and a random rule RAN that associates to all possible combinations of binary states at three inputs a random binary output. More precisely, the AND, OR, and XOR rules are made of three input–output relations (we disregard the double zero input, which is a trivial test leading to zero output), whereas the RAN rule with three input sites implies a sequence of seven input–output relations. A single learning step requires the application of the entire sequence of states at the input neurons, monitoring the state of the output neuron. For each rule the binary value 1 is identified with the output neuron firing, namely, the neuron membrane potential at a value greater or equal to *v*_{max} at some time during the activity. Conversely, the binary state 0 at the output neuron corresponds to the physiological state of a real neuron that has been depolarized by incoming ions but fails to reach the firing threshold membrane potential during the entire avalanche propagation. Once the input sites are stimulated, their activity may bring to threshold other neurons and therefore lead to avalanches of firings. We impose no restriction on the number of firing neurons in the propagation and let the avalanche evolve to its end according to Eq. **1**. If at the end of the avalanche the propagation of charge did not reach the output neuron, we consider that the state of the system was unable to respond to the given stimulus and, as a consequence, to learn. 
We therefore increase uniformly the potential of all neurons by units of a small quantity *β* = 0.01, until the configuration reaches a state where the output neuron is first perturbed. We then compare the state of the output neuron with the desired output. Namely, we follow the evolution in phase space of the initial state of the system and verify if the nonergodic dynamics has led to an attractor associated with the right answer.
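The input–output relations that make up one learning step can be enumerated explicitly. The sketch below builds them for each rule, skipping the all-zero input as described above; the function name `rule_table` and the use of a seeded generator for the RAN rule are assumptions made for illustration.

```python
import itertools
import random


def rule_table(name, n_inputs=2, seed=0):
    """Map each non-zero binary input tuple to the desired binary output."""
    rng = random.Random(seed)
    funcs = {
        "AND": all,
        "OR": any,
        "XOR": lambda bits: sum(bits) % 2 == 1,
    }
    table = {}
    for bits in itertools.product((0, 1), repeat=n_inputs):
        if not any(bits):
            continue                      # all-zero input is the trivial test
        if name == "RAN":
            table[bits] = rng.randint(0, 1)   # random binary output
        else:
            table[bits] = int(funcs[name](bits))
    return table
```

With two inputs, AND, OR, and XOR each yield three relations, and `rule_table("RAN", n_inputs=3)` yields seven, matching the counts quoted in the text.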

### Plastic Adaptation.

Plastic adaptation is applied to the system according to a nonuniform negative feedback algorithm. Namely, if the output neuron is in the correct state according to the rule, we keep the value of synaptic strengths. Conversely, if the response is wrong, we modify the strengths of those synapses involved in the information propagation by ± *α*/*d*_{k}, where *d*_{k} is the chemical distance of the presynaptic neuron from the output neuron. Here α represents the ensemble of all possible physiological factors influencing synaptic plasticity. The sign of the adjustment depends on the mistake made by the system: If the output neuron fails to be in a firing state, we increase the used synapses by a small additive quantity proportional to α. Synaptic strengths are instead decreased if the expected output 0 is not fulfilled. Once the strength of a synapse is below an assigned small value *g*_{t} = 10^{-4}, we remove it, i.e., set its strength equal to zero, which corresponds to the so-called *pruning*. This ingredient is very important because the crucial role of selective weakening and elimination of unneeded connections in adult learning has been recognized for decades (31, 32). The synapses involved in the signal propagation and responsible for the wrong answer are therefore not adapted uniformly but in inverse proportion to the chemical distance from the output site. Namely, synapses directly connected to the output neuron receive the strongest adaptation ± *α*. This adaptation rule is intended to mimic the feedback to the wrong answer triggered locally at the output site and propagating backward toward the input sites. This could be the case, for instance, for some hormones strongly interfering with learning and memory, such as dopamine suppressing long term depression (33) or adrenal hormones enhancing long term depression (34). Moreover, a recently discovered class of messenger molecules, such as nitric oxide, has been found to have an important role in plastic adaptation (35). For all these agents, released at the output neuron, the concentration is reduced with the distance from the origin. In our neuronal network simulation this nonuniform adaptation has a crucial role because it prevents, in case of successive wrong positive answers, synapses directly connected to the input sites from decreasing excessively, which would hinder any further signal transmission. This plastic adaptation is a non-Hebbian form of plasticity and can be interpreted as a subtractive form of synaptic scaling (36), where synapses are changed by an amount independent of their strength. The procedure mimics the performance of a *good critic* who does not tell the system which neurons should have fired or not. However, it tells more than just “right” or “wrong”; it expresses an evaluation of the type of error. Finally, we wish to stress that this model naturally sets the system in a critical state, and therefore the study of the response of the system in a subcritical or supercritical state requires the introduction of additional parameters. We can however suppose that in both cases learning becomes a more difficult task. For instance, in a subcritical state, the size of neuronal avalanches being smaller, it would be more complex to generate a firing state in the output site. Conversely, in a supercritical state it would be more difficult to generate a nonfiring state in the output site.
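The feedback rule can be summarized in a short Python sketch. It is a minimal reading of the description above, not the authors' code: the function name `adapt_synapses`, the dictionary layout, and the default value of `alpha` are hypothetical, while the ± *α*/*d*_{k} update and the pruning threshold *g*_{t} = 10^{-4} follow the text.

```python
def adapt_synapses(g, used, d_k, output_fired, desired, alpha=0.005, g_t=1e-4):
    """Apply nonuniform negative feedback to the synapses used in an avalanche.

    g            : dict (i, j) -> synaptic strength, modified in place
    used         : iterable of (i, j) synapses that carried the signal
    d_k          : dict presynaptic neuron i -> chemical distance from output (>= 1)
    output_fired : True if the output neuron reached threshold
    desired      : expected binary output (0 or 1)
    """
    if int(output_fired) == desired:
        return g                          # right answer: strengths are kept
    # Missed firing -> strengthen; unwanted firing -> weaken.
    sign = 1.0 if desired == 1 else -1.0
    for i, j in used:
        if g.get((i, j), 0.0) == 0.0:
            continue                      # already pruned or absent
        g[(i, j)] += sign * alpha / d_k[i]
        if g[(i, j)] < g_t:
            g[(i, j)] = 0.0               # pruning
    return g
```

Synapses whose presynaptic neuron is at distance 1 from the output change by the full ± *α*, and the correction decays as 1/*d*_{k} toward the input sites, so repeated wrong positive answers deplete the synapses near the inputs more slowly.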

## Results and Discussion

We analyze the ability of the system to learn the different rules. Fig. 3 shows the fraction of configurations learning the XOR rule versus the number of learning steps for different values of the plastic adaptation strength α. We notice that the larger the value of α, the sooner the system starts to learn the rule; however, the final percentage of learning configurations is lower. The final rate of success increases as the strength of plastic adaptation decreases. This result is because of the highly nonlinear dynamics of the model, where firing activity is an all or none event controlled by the threshold. Very slow plastic adaptation allows one then to tune finely the role of the neurons involved in the propagation and eventually to recover the right answer. Moreover, very slow plastic adaptation also makes the system more stable with respect to noise, because too strong synaptic changes may perturb excessively the evolution hampering the recovery of the right answer. The dependence of the learning success on the plasticity strength is found consistently for different values of the parameters *k*_{d}, *k*_{min}, and *p*_{in}, where a higher percentage of success is observed in systems with no inhibitory synapses. Moreover, the dependence on the plastic adaptation α is a common feature of all tested rules. Data indicate that the easiest rule to learn is OR, where a 100% percentage of success can be obtained. AND and XOR present similar difficulties and both lead to a percentage of final success around 80%, whereas the most difficult rule to learn is the RAN rule with three inputs where only 50% of final success is obtained. This different performance is mainly because of the higher number of inputs, because the system has to organize a more complex path of connections leading to the output site.

The most striking result is that all rules give a higher percentage of success for weaker plastic adaptation. Indeed this result is in agreement with recent experimental findings on visual perceptual learning, where better performances are measured when minimal changes in the functional network occur as a result of learning (22). We characterize the learning ability of a system for different rules by the average learning time, i.e., the average number of times a rule must be applied to obtain the right answer, and the asymptotic percentage of learning configurations. This is determined as the percentage of learning configurations at the end of the teaching routine, namely, after 10^{6} applications of the rule. Fig. 4 shows that the average learning time scales as *τ* ∝ 1/*α* for all rules and independently of parameter values. Because some configurations never learn and do not contribute to the average learning time, we also evaluate the median learning time that exhibits the same scaling behavior as the average learning time. The asymptotic percentage of success increases by decreasing α as a very slow power law, ∝ *α*^{-0.05}. Because this quantity has a finite upper bound equal to unity, this scaling suggests that in a finite, even if very long, time any configuration could learn the rule by applying an extremely slow plastic adaptation. It is interesting to notice that a larger fraction of systems with no inhibitory synapses finds the right answer and the average learning time for these systems is slightly shorter. We understand this result by considering that for only excitatory synapses the system more easily selects a path of strong enough synapses connecting inputs and output sites and giving the right answer. Conversely, the presence of inhibitory synapses may lead to frustration in the system because not all local interactions contribute in the optimal way to provide the right answer and the system has to find alternative paths. 
We check this scaling behavior by appropriately rescaling the axes in Fig. 3. The curves corresponding to different α values indeed all collapse onto a unique scaling function. Similar collapse is observed for the OR, AND, and RAN rules and for different parameters *k*_{d}, *k*_{min}, and *p*_{in}. In fact, two different cases of distributions of inhibitory synapses, one in which they are chosen randomly among all synapses and the other where certain randomly chosen neurons have all outgoing synapses inhibitory, provide equivalent results. The learning dynamics shows therefore universal properties, independent of the details of the system or the specific task assigned.
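One way to write the collapse, offered here only as a sketch: if the learning time scales as *τ* ∝ 1/*α* and the asymptotic success as *α*^{-0.05}, a scaling form consistent with both is the one below. The symbols *P*(*t*, *α*), for the fraction of configurations that have learned after *t* steps, and *F*, for the scaling function, are introduced for illustration; the exact rescaling applied to Fig. 3 is not spelled out in the text, so this form is an assumption.

```latex
P(t,\alpha) \simeq \alpha^{-0.05}\, F(\alpha t),
\qquad F(x) \to \text{const} \quad \text{as } x \to \infty
```

Under this assumption, plotting *P* *α*^{0.05} against *αt* would place the curves for different α on a single function *F*.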

The learning behavior is sensitive to the number of neurons involved in the propagation of the signal and therefore depends on the distance between input and output neurons and the level of connectivity in the system. We then investigate the effect of the parameters *k*_{d} and *k*_{min} on the performance of the system. Fig. 5 shows the percentage of configurations learning the XOR rule for different minimum values of the neuron out degree. Systems with larger *k*_{min} have a larger average number of synapses per neuron, producing a more branched network. The presence of several alternative paths facilitates information transmission from the inputs to the output site. However, the participation of more branched synaptic paths in the learning process may delay the time the system first gives the right answer. As expected the performance of the system improves as the minimum out-connectivity degree increases, with the asymptotic percentage of success scaling as . The dependence of the learning performance on the level of connectivity is confirmed by the analysis of systems with different numbers of neurons *N*, the same out-degree distribution, and the same set of parameters. We verify that larger systems exhibit better performances. In larger systems, in fact, the number of hubs, i.e., highly connected neurons, increases improving the overall level of connectivity. Indeed, the existence of complex patterns of activation has been recently recognized as very important in linking together large scale networks in visual perceptual learning (22).

On the other hand, also the chemical distance between the input and output sites has a very important role, as the number of hidden layers in a perceptron. Indeed, as *k*_{d} becomes larger (Fig. 5), the length of each branch in a path involved in the learning process increases. As a consequence, the system needs a higher number of tests to first give the right answer and a lower fraction of configurations learns the rule after the same number of steps. The percentage of learning configurations after 10^{6} applications is found, as expected, to decrease as , and similar behavior is detected for the OR, AND, and RAN rules.

### Learning Stability and Memory.

The existence of systems that are unable to learn, even after many learning steps, raises intriguing questions about the learning dynamics. We ask what happens when a second chance is given to the configurations that fail to give the right answer. We then restart the learning routine after imposing a small change in the initial configuration of voltages. This small perturbation leads to about 25% more configurations learning the rule. The initial state of the system can therefore influence the ability to learn, especially for complex rules such as XOR or RAN. On the other hand, the analysis of the out-degree distribution in configurations that did and did not give the right answer indicates that “dumb” configurations tend to have fewer highly connected nodes than the “smart” ones. Namely, repeatedly giving wrong positive answers leads to pruning of several synapses, which affects in particular the highly connected neurons that have a crucial role in identifying the right synaptic learning path. Finally, we test the ability of the configurations that do learn to remember the right answer once the initial configuration is changed. The memory performance of the system is expected to depend on the intensity of the variation imposed, namely, on the extension of the basin of attraction of states leading to the right answer. The system is able to recover the right answer in more than 50% of the configurations if a very small perturbation (of the order of 10^{-3}) is applied to all neurons or else a larger one (of the order of 10^{-2}) to 10% of neurons. The system has a different memory ability depending on the rule: Almost all configurations remember OR, whereas typically 80% remember AND and at most 70% the XOR rule.

## Conclusions

In conclusion, we investigate the learning ability of a model able to reproduce the critical avalanche activity as observed for spontaneous activity in in vitro and in vivo cortical networks. The ingredients of the model are close to most functional and topological properties of real neuronal networks. The implemented learning dynamics is a cooperative mechanism where all neurons contribute to select the right answer and negative feedback is provided in a nonuniform way. Despite the complexity of the model and the high number of degrees of freedom involved at each step of the iteration, the system can learn successfully even complex rules such as XOR or a random rule with three inputs. In fact, because the system acts in a critical state, the response to a given input can be highly flexible, adapting more easily to different rules. The analysis of the dependence of the performance of the system on the average connectivity confirms that learning is a truly collective process, where a high number of neurons may be involved and the system learns more efficiently if more branched paths are possible. The role of the plastic adaptation strength, considered as a constant parameter in most studies, provides a striking result: The neuronal network has a “universal” learning dynamics; even complex rules can be learned provided that the plastic adaptation is sufficiently slow. This important requirement for plastic adaptation is confirmed by recent experimental results (22) showing that the learning performance, in humans trained to a specific visual task, improves when minimal changes occur in the functionality network. Stronger modifications of the network do not necessarily lead to better results.

## Footnotes

^{1}To whom correspondence should be addressed. E-mail: dearcangelis{at}na.infn.it.

Author contributions: L.d.A. and H.J.H. designed research; L.d.A. performed research; L.d.A. analyzed data; and L.d.A. and H.J.H. wrote the paper.

The authors declare no conflict of interest.

*This Direct Submission article had a prearranged editor.

## References

1. Beggs JM, Plenz D
2. Beggs JM, Plenz D
3. Stanley HE
4. Gireesh ED, Plenz D
5. Petermann T, et al.
6. Mazzoni A, et al.
7.
8. Bak P
9. Amit D
10.
11. Hopfield J
12.
13. Barto AG, Sutton RS, Anderson CW
14.
15.
16.
17.
18. Turing AM
19.
20.
21. Shew WL, Yang H, Petermann T, Roy R, Plenz D
22. Lewis CM, Baldassarre A, Committeri G, Romani GL, Corbetta M
23.
24.
25.
26.
27.
28.
29.
30. Roerig B, Chen B
31. Young JZ
32. Changeux JP
33. Otmakhova NA, Lisman JE
34. Coussens CM, Kerr DS, Abraham WC
35. Reyes-Harde M, Empson R, Potter BVL, Galione A, Stanton PK
36.

## Article Classifications

- Physical Sciences
- Applied Physical Sciences