Weighted Generative Network Models
Historical Background
[ Leave blank for now ]
Weighted Networks
Connectivity in real brains is characterised not just by the presence or absence of connections, but by the strength or weight of those connections. From a structural perspective, these weights are given by the number of white matter fibres connecting two brain regions, as estimated through tractography techniques applied to diffusion imaging data. From a functional perspective, weights can capture the magnitude of statistical dependencies between the activities of different brain regions. In both cases, the weight of a connection encodes the capacity for information flow between nodes, with stronger connections generally enabling more efficient communication. These differences in communication efficiency and influence between nodes are an important detail in understanding brain functioning. As such, it is natural to search for an extension of the standard binary GNM which can account for not just the presence and absence of connections, but additionally their magnitude.
Whilst binary networks are represented using an adjacency matrix containing only values of \(0\) and \(1\) indicating the absence and presence of connections respectively, weighted networks are represented using a weight matrix \(W\) where \(W_{ij}\) captures the strength of the connection between nodes \(i\) and \(j\). As discussed in the Networks and Graphs section, the weight matrix preserves all the structural information of the adjacency matrix whilst providing additional detail about connection strengths. Many network measures can be adapted from binary to weighted versions by incorporating these connection strengths into the calculations.
Weight Optimisation by Loss Minimisation
One perspective for understanding weightes within a network is a normative perspective. Normative methodologies within neuroscience attempt to make predictions about an aspect of biology by hypothesising that the system has been optimised for some objective, subject to biological constraints. For example, considerations of wiring economy and the metabolic cost associated with neural communication imply that brain networks ought to minimise the presence of long-range, high-weight connection. Alternatively, we may argue that the brain has been optimised for efficient information flow, and so ought to have a low characteristic path lenght. Different objectives correspond to different theories about the underlying organisational principles for neural systems.
These hypotheses can be formalised using a loss function, \(L(W)\), which we take the weights to minimise. Below, we discuss various choices for which loss function to use. Once we have specified a loss function, we must then optimise the weights of our network according to this loss function. We minimise the loss using a method known as gradient descent. Gradient descent finds local minima of the loss function by repeatedly updating the weights in the direction of steepest descent for the loss function. The gradient \(\frac{\partial L}{\partial W_{ij}}\) indicates the rate of change of in the loss function with respect to the weight \(W_{ij}\). To minimise the loss, weights are updated in the opposite direction: \(W_{ij} \leftarrow W_{ij} - \alpha \frac{\partial L}{\partial W_{ij}}\), where \(\alpha\) is the learning rate that controls the size of each update step. Larger values of \(\alpha\) result in greater changes to the weights. By repeatedly performing such updates, we can find weight configurations which achieve a low loss and are therefore more plausible according to our hypothesis.
Although the wGNM uses gradient descent to optimise the weights within the network, this should not be understood as articulating a theory of how weight changes are effected within the developing brain. Instead, wGNMs give a normative theory of what principles organise networks; the choice of gradient descent as an optimisation method is secondary. Indeed, the gradient descent optimisation employed in weighted GNMs should be understood as a computational method for finding weight configurations that satisfy particular normative principles, rather than as a model of the actual biological processes that modify synaptic strengths during brain development. The value of these models lies in their ability to generate networks whose final properties resemble those of found in real brain networks, not in the biological plausibility of their intermediate computational steps.
The Weighted Generative Network Model
The weighted generative network model extends the binary generative network model by incorporating both binary updates that grow the network topology and weighted updates that adjust the strengths of existing connections. The wGNM model interleaves binary and weighted updates; at each iteration, some number of binary updates are performed, followed by some number of weighted updates. In the simplest case, the ratio of binary to weighted updates is 1:1. During binary updates, new edges are added to the network according to the same mechanism as in the binary generative network model, using distance and affinity transforms to determine wiring probabilities. During weighted updates, the strengths of existing connections are modified through gradient descent on a specified loss function. After the gradient descent update, weights are clipped between minimum and maximum values. In the simplest case, \(W_{\text{lower}} = 0\) to ensure that weights remain non-negative, while \(W_{\text{upper}} = \infty\), allowing for unbounded positive weights.
Input: Seed adjacency matrix \(A_{ij}\), seed weight matrix \(W_{ij}\), binary parameters, weighted parameters
For number of iterations do:
- For number of binary updates per iteration do:
- Compute wiring probabilities $P_{ij}$ as in the binary GNM.
- Sample a new edge $(i,j)$ from the wiring probabilities $P_{ij}$.
- Add the new edge to adjacency matrix: \(A_{ij} \gets 1\) and \(A_{ji} \gets 1\).
- Initialise corresponding weights within the weight matrix: \(W_{ij} \gets 1\) and \(W_{ji} \gets 1\)
- For number of weighted updates per iteration do:
- Compute gradients of the loss function: \(\frac{\partial L}{\partial W_{ij}}\) for all \(i,j\) where \(A_{ij} = 1\)
- Update weights via gradient descent $$ W_{ij} \gets W_{ij} - \alpha \frac{\partial L}{\partial W_{ij}}, $$ (or \(+\) if maximising the loss)
- Optionally clip the weights between bounds: \(W_{ij} \gets \text{clip}(W_{ij}, W_{\text{lower}}, W_{\text{upper}})\)
End for
Return: Final adjacency matrix \(A_{ij}\) and weight matrix \(W_{ij}\)
Weight Optimisation Criteria
Several different weight optimisation criteria are available for guiding the evolution of connection strengths in weighted generative network models. However, the toolbox also provide a way to easily implement and test additional criteria. The crtieria implemented are often parametrised by a parameter, \(\omega\).
The simplest criterion implemented within the toolbox is the weight criteria, \(L(W) = \sum_{ij} W_{ij}^\omega\). For \(\omega = 1\), this is simply the total weight present in the network; minimisation of this criterion therefore corresponds to a principle of wiring efficiency whereby stronger connections are discouraged. Weighted distance augments this by penalising strong connections more the larger the distance they span is. The loss in this case is given by \(L(W) = \sum_{ij} \left( W_{ij} D_{ij} \right)^\omega\). The biological motivation for minimising weighted distance lies in the energetic costs of building and maintaining neural connections: longer connections require more resources and are therefore subject to stronger evolutionary pressure for efficiency. Normalised weight and normalised weighted distance provide further varients on this core principle.
A more sophisticated set of criteria is built upon communicability. Recall that the number of paths of length \(n\) between nodes \(i\) and \(j\) is simply \([A^n]_{ij}\), where \(A^n\) is the \(n\)-th power of the adjacency matrix \(A\). In a binary network, communicability can be understood as the sum of all possible paths between \(i\) and \(j\), with decreasing weighting assigned to paths of longer length. Specifically, a path of length \(n\) recieves a weighting of \(1/n!\) where \(n! = n \times (n-1) \times \dots \times 2 \times 1\) is the factorial of \(n\). In this manner communicability can be seen as a measure of the total information flow between nodes via any path, with lower information flow through longer paths. It is computed as $$ C_{ij} = [A^0]_{ij} + \frac{1}{1!} A_{ij} + \frac{1}{2!} [A^2]_{ij} + \frac{1}{3!} [A^3]_{ij} = \exp( A )_{ij}, $$ where \(\exp(A)\) is the matrix exponential of the adjacency matrix \(A\).
For a weighted network, we must adjust the definition of communicability. Communicability is still the matrix exponential, but now of the normalised weight matrix rather than the adjacency matrix. Specifically, we first normalising the weight matrix by the node strengths, \(S^{-1/2}W S^{-1/2} = W_{ij}/\sqrt{s_i s_j}\), where \(S\) is a diagonal matrix with \(S_{ii} = s_i\). By normalising like so, we include in our notion of information flow between nodes that information is less likely to travel down one connection if there are many other strong connections out of the same node. We then apply the matrix exponential to this normalised weight matrix, $$ C_{ij} = \exp( S^{-1/2} W S^{-1/2} )_{ij}. $$ The communicability loss is then just the sum of all communicabilities in the network, raised to the power of \(\omega\), $$ L(W) = \sum_{ij} C_{ij}^\omega. $$ The basic communicability loss can be extended in various ways, including weighting the communicabilities by the distance between nodes and normalising the total communicability by the maximum communicability in the network. See the table below.
| Criterion | Loss function, $L(W)$ | Description |
|---|---|---|
| Weight | \(\sum_{ij} W_{ij}^\omega\) | Minimises the sum of connection weights raised to power \(\omega\) |
| Normalised Weight | \(\sum_{ij} \left( \frac{W_{ij}}{\max_{ab} W_{ab}} \right)^\omega\) | Weight criterion normalised by maximum weight in the network |
| Weighted Distance | \(\sum_{ij} (W_{ij} D_{ij})^\omega\) | Penalises strong connections between distant nodes |
| Normalised Weighted Distance | \(\sum_{ij} \left( \frac{W_{ij} D_{ij}}{\max_{ab} W_{ab} D_{ab}} \right)^\omega\) | Weighted distance criterion normalised by maximum weighted distance |
| Communicability | \(\sum_{ij} C_{ij}^\omega\) | Minimises total communicability between all node pairs |
| Normalised Communicability | \(\sum_{ij} \left(\frac{C_{ij}}{\max_{ab} C_{ab}}\right)^\omega\) | Communicability criterion normalised by maximum communicability |
| Distance-Weighted Communicability | \(\sum_{ij} (C_{ij} D_{ij})^\omega\) | Combines communicability with spatial constraints |
| Normalised Distance-Weighted Communicability | \(\sum_{ij} \left(\frac{D_{ij} C_{ij}}{\max_{ab} D_{ab} C_{ab}}\right)^\omega\) | Distance-weighted communicability normalised by maximum value |