-% In this manuscript, after presenting the domains related to our subject, we built a theory of convolutions on graphs, with a view to using them in CNNs on graph domains. On Euclidean domains, convolutional layers take advantage of the translational equivariance of the convolution. Therefore, our construction on graph domains depends on a set of transformations of the vertex set to which the resulting operation is also equivariant. More precisely, we wanted to define a class of convolutional operators that is exactly the class of linear operators equivariant to this set of transformations. We demonstrated that this characterization holds provided this set has the algebraic structure of a group, a groupoid or a path groupoid. In particular, we proved that this amounts to searching for Cayley subgraphs. We also saw that the possible abelianity of these structures is linked to the property that the convolution is locally supported. Then, we studied neural networks intended for graph domains. We adopted an approach based on graph representations of the propagation between layers of neurons. We proved that if the local receptive fields of the neurons are intertwined and their inputs have a graph structure, then that structure can be used to define the propagation. We also discovered that the linear part of a layer can be expressed by an operator that involves three operands: the input signal, the weight kernel and the weight sharing scheme. We called it \emph{neural contraction}, in reference to the term \emph{tensor contraction}. We showed that it is associative, commutative, and generic in the sense that it can represent any kind of layer. We used this representation to study the influence of symmetries present in the structure of the data. We conducted experiments to learn how the weights are shared in addition to learning the weights themselves. In the case of convolutions, this amounts in a sense to trying to learn the set of transformations to which they are equivariant. We saw that this approach attains performances similar to other state-of-the-art models. Then we ran experiments with a CNN whose convolution is based on a set of translations on graphs, which defines how the weights are shared. We proposed an algorithm to find these translations. We defined downscaling and data augmentation from them, and used a residual network architecture. We showed that this model recovers the performances of CNNs without being fed the underlying structure of images, and that it attains strong performances on graph signal datasets.
+In this manuscript, after presenting the relevant fields of research, we developed a theory of convolution for graph signals and proposed new models that extend deep learning to graph domains.
-% In conclusion, we proposed a novel layer representation for extending CNN architectures to input domains other than those for which they were intended. As pointed out in the introduction, this participates in making them more generic, and thus applicable to a broader range of real-world problems. In the process, we also advanced our understanding of convolutions. We hope that the reader enjoyed reading this manuscript and that it gave them ideas and shed new light. Let us strive for a continuous effort to collectively advance the boundaries of human knowledge, at our scale and beyond!
+In \chapref{chap:2}, we formulated two constructions of convolutions of signals defined on a vertex set $V$, based on a group $\Gamma$ acting on $V$. The $\varphi$-convolution can be employed when $\Gamma$ and $V$ are in one-to-one correspondence via an equivariant map $\varphi$, while the $\M$-convolution is a more convenient formulation that can be employed when $\Gamma$ is abelian. We proved that the characterization by equivariance to $\Gamma$, inherited from group convolutions, holds. Then we introduced two properties that bind these convolutions to the edge set $E$: edge-constraint (EC) and locality-preservation (LP). With a view to describing the operators used in deep learning, we proposed formulations with kernels of smaller support, and proved that the weight sharing property holds. We demonstrated that the existence of convolutions on a graph can be characterized by the existence of Cayley subgraphs. For some graphs, the Cayley subgraphs might not be representative of their topology. Therefore, we suggested a few strategies and extended the previous results to convolutions based on groupoids rather than groups. We constructed two types of groupoids, from partial transformations and from paths, and were able to extend the results, albeit with some limitations: with the first type of groupoid, this almost amounted to partitioning the graph into Cayley subgraphs, whereas with the second, it included degenerate cases.
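For concreteness, one natural way to write the first construction (a sketch only; the exact convention of \chapref{chap:2} may differ) is to transport the group convolution on $\Gamma$ to $V$ through a bijective equivariant map $\varphi : \Gamma \to V$: for signals $s, w \in \mathbb{R}^V$,
\[
\big( (w \ast_\varphi s) \circ \varphi \big)(g) \;=\; \sum_{h \in \Gamma} (w \circ \varphi)(h) \, (s \circ \varphi)(h^{-1}g), \qquad g \in \Gamma,
\]
so that equivariance to the action of $\Gamma$ on $V$ is inherited directly from the equivariance of the group convolution.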
-In this manuscript, we proposed a novel layer representation for extending CNN architectures to input domains other than those for which they were intended. This representation is based on a ternary operation that we called \emph{neural contraction}, from which we derived new models: \emph{Monte-Carlo Neural Networks}, \emph{Graph Contraction Networks}, \emph{Translation-Convolutional Neural Networks}; and a new technique: \emph{graph dropout}. We also showed how to represent related models from the literature with neural contractions. As pointed out in the introduction, this work participates in making CNNs more generic, and thus applicable to a broader range of real-world problems. In the process, we also advanced our understanding of convolutions, providing a thorough description with a set of expressions, mathematical results and theorems about how to extend them to graph domains while preserving key properties, and how to characterize them.
+In \chapref{chap:3}, we proposed a novel layer representation for extending CNN architectures to other input domains. This representation is based on a ternary operation that we called \emph{neural contraction}, from which we derived new models: \emph{Monte-Carlo Neural Networks} (MCNN), \emph{Graph Contraction Networks} (GCT), \emph{Translation-Convolutional Neural Networks} (TCNN); and a new technique: \emph{graph dropout}. We also showed how to represent related models from the literature with neural contractions. The MCNN is a first idea exploiting the neural contraction: roughly speaking, it randomizes the structure being leveraged, then averages. We therefore did not expect much from it, but it fared well on a text categorization task. The GCT is based on the idea of learning how the weights are shared while learning the weights themselves. It set new state-of-the-art performances on semi-supervised classification of nodes in citation networks, although only by a small margin that is not statistically significant. We also observed that graph dropout significantly improved the results of alternative models on this type of task. The TCNN relies on constructing a convolution based on graph translations. It set a new state of the art on the classification of scrambled images by a large margin, and performs well on an fMRI dataset structured by a graph resembling a grid graph.
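For concreteness, the linear part of such a layer can be sketched as a single ternary contraction. The indexing below is only one possible convention and is not necessarily the one of \chapref{chap:3}: a scheme $S$ indexed by (kernel weight, input neuron, output neuron), a kernel $W$ indexed by (kernel weight, input channel, output channel), and batched input signals $X$.
\begin{verbatim}
import numpy as np

def neural_contraction(X, S, W):
    # X: (batch, n_in, c_in)   input signals
    # S: (k, n_in, n_out)      weight sharing scheme (allocation of kernel weights)
    # W: (k, c_in, c_out)      weight kernel
    # Y[b, j, q] = sum over i, w, p of S[w, i, j] * X[b, i, p] * W[w, p, q]
    return np.einsum('wij,bip,wpq->bjq', S, X, W)
\end{verbatim}
With a one-hot $S$ encoding a circulant sharing scheme, this recovers an ordinary (circular) convolutional layer, while a dense learnable $S$ corresponds to learning how the weights are shared, which is the idea behind the GCT.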
+We tested these models on two types of tasks: supervised classification of graph signals and semi-supervised classification of nodes. The first task is historically the one that gave visibility to CNNs. However, we do not know of a dataset for the first task that comes with a truly unusual graph structure; and when such a graph exists, it might not be the best one to describe the underlying structure. This is why, in practice, experiments for the supervised task are done with graphs that resemble grid graphs to some extent. This is not the case for the semi-supervised task.
-\paragraph{\h{0}}
-We hope that the reader had pleasure reading this manuscript and that it gave them ideas and shed new light. Let us strive for a continuous effort to collectively advance the boundaries of human knowledge, at our scale and beyond!
+In the end, both tasks can be abstracted into a more general one. For example, let us consider a dataset represented by a matrix $X$ of shape $B \times N$, where $B$ is the number of instances and $N$ is the number of features. The linear part of the GCN layer formulation $Y = A X \Theta$ exploits both a graph on the rows (which, as we saw, can be learned with the GCT) and a complete graph on the columns. Thus, this layer amounts to two half-layers, a sparse one and a dense one. This idea can be generalized to a dataset represented by a tensor $X$ of rank~$r$. The formulation would then be $Y = g(X, A_1, A_2, \ldots, A_r)$, where $g$~realizes the tensor contractions between $X$ and each $A_i$ along the corresponding rank. In that case, each $A_i$ can be seen as a learnable normalized adjacency matrix corresponding to a graph structure on the $i\powth$ rank. This idea gives rise to a class of neural networks that could be called \emph{multiary neural networks}.
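A minimal sketch of what such a multiary layer could compute, under the assumption (for illustration only) that each $A_i$ is stored with its rows indexing the current $i\powth$ mode and its columns indexing the new one:
\begin{verbatim}
import numpy as np

def multiary_layer(X, mats):
    # X: rank-r data tensor; mats[i]: matrix of shape (current size of mode i,
    # new size of mode i), e.g. a learnable normalized adjacency for that mode.
    Y = X
    for i, A in enumerate(mats):
        # contract mode i of Y with the rows of A,
        # then put the resulting new mode back at position i
        Y = np.moveaxis(np.tensordot(Y, A, axes=(i, 0)), -1, i)
    return Y
\end{verbatim}
For $r = 2$, with a symmetric normalized adjacency $A_1 = \hat{A}$ on the rows and a dense $A_2 = \Theta$ on the columns, this reduces to $\hat{A} X \Theta$, i.e. the linear part of the GCN layer above.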
+
+As pointed out in the introduction, this work participates in making CNNs more generic, and thus applicable to a broader range of real-world problems. In the process, we also advanced our understanding of convolutions, providing a thorough description with a set of expressions, mathematical results and theorems about how to extend them to graph domains while preserving key properties, and how to characterize them. We hope that the reader enjoyed reading this manuscript and that it gave them ideas and shed new light!%. Let us strive for a continuous effort to collectively advance the boundaries of human knowledge, at our scale and beyond!