Special Issue “Information Geometry for Data Science”
Data, in its many forms and across various disciplines, is becoming an essential source for research in the 21st century. In fact, data-driven knowledge extraction nowadays constitutes one of the core paradigms for scientific discovery. This paradigm is supported by the many successes of universal architectures and algorithms, such as deep neural networks, which can explain observed data and, at the same time, generalise extremely well to unobserved new data. Thus, such systems are capable of revealing the intrinsic structure of the data, an important step within the process of knowledge extraction.
Structure is coupled with geometry at various levels. At the lowest level, spatially or temporally extended data, such as images or audio recordings, often exhibit complex geometric features which encode their underlying structure. At the next level of description, we can interpret each data point as a structureless point in a high- or infinite-dimensional vector space. Here, structure emerges when we consider a collection of such data points, which can then be modelled in terms of a manifold, leading to the notion of a data manifold, or in terms of a distribution of data points. In information geometry, one typically considers such a distribution as a single point in the set of probability measures on the (measurable) space of data points. With this, we enter the next level of description. Again, a collection of points, each of them being a distribution of data points, forms a geometric object, referred to as a statistical model. Traditionally, information geometry has been concerned with the identification of natural geometric structures on such models, the Fisher-Rao metric and the Amari-Chentsov tensor being important instances.
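To make the Fisher-Rao metric concrete, the following sketch computes it for the simplest possible statistical model, the one-parameter Bernoulli family. This is an illustrative example only, not drawn from the issue itself; the function names are our own.

```python
import numpy as np  # imported for consistency with the later sketch; not strictly needed here

# Illustrative example: the Fisher-Rao metric on the Bernoulli family
#   p(x | theta) = theta^x * (1 - theta)^(1 - x),   x in {0, 1}.
# For a one-parameter model the metric is the scalar Fisher information,
# the expected squared score:
#   I(theta) = E[(d/dtheta log p(X | theta))^2] = 1 / (theta * (1 - theta)).

def score(x, theta):
    # d/dtheta log p(x | theta) for the Bernoulli family
    return x / theta - (1 - x) / (1 - theta)

def fisher_information(theta):
    # Exact expectation over the two outcomes x = 1 and x = 0
    return theta * score(1, theta) ** 2 + (1 - theta) * score(0, theta) ** 2

theta = 0.3
print(fisher_information(theta))    # matches the closed form 1 / (0.3 * 0.7)
print(1.0 / (theta * (1 - theta)))
```

Note how the metric blows up as theta approaches 0 or 1: near-deterministic distributions are "far apart" in the Fisher-Rao geometry, even when their parameters are close in the Euclidean sense.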
Given a set of observed data points, the so-called empirical distribution, it is natural to search for a data distribution from the statistical model that optimally explains the empirical distribution and, at the same time, allows us to predict new data. Such a search within the statistical model is referred to as a learning process, a process intensively studied in statistics and machine learning. It has been demonstrated that the geometry of the statistical model has a great impact on the quality of learning. One instance of this is given by the natural gradient method, which improves learning simply by utilising the natural geometry induced by the Fisher-Rao metric. The general geometric perspective of information geometry has already had a great influence on machine learning and is expected to further influence the general field of data science.
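A minimal sketch of the natural gradient method, continuing the Bernoulli example: the ordinary likelihood gradient is preconditioned by the inverse Fisher information before each ascent step. The data, step size, and function names are illustrative assumptions, not part of the issue description.

```python
import numpy as np

# Sketch of natural gradient ascent on the average log-likelihood of a
# one-parameter Bernoulli model. All names here are our own illustration.

def avg_log_likelihood_grad(theta, data):
    # Gradient of (1/n) * sum_i log p(x_i | theta)
    return np.mean(data / theta - (1 - data) / (1 - theta))

def fisher_information(theta):
    # Fisher-Rao metric of the Bernoulli family (a 1x1 "matrix")
    return 1.0 / (theta * (1 - theta))

data = np.array([1, 0, 1, 1, 0, 1, 1, 0])  # empirical distribution, mean 0.625
theta = 0.5
for _ in range(100):
    grad = avg_log_likelihood_grad(theta, data)
    # Natural gradient step: rescale the ordinary gradient by the
    # inverse Fisher information
    theta += 0.1 * grad / fisher_information(theta)

print(round(theta, 3))  # converges to the maximum-likelihood estimate 0.625
```

For this model the natural gradient simplifies to theta += 0.1 * (mean(data) - theta), a well-conditioned update whose speed does not degrade near the boundary of the parameter space, which is exactly the benefit the Fisher-Rao geometry provides.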
A learning process is, by its very nature, data-driven and therefore stochastic. It takes another level of description to interpret that process as a deterministic evolution of a distribution on the given parametrised model. This evolution is typically described in terms of a Kolmogorov equation, a partial differential equation, which can also be studied with the help of Riemannian geometry, where, this time, the Riemannian metric is naturally chosen to be the Otto metric. Here, geometry again yields important insights into the learning process within a statistical model.
The aim of this issue is to highlight recent developments within information geometry that are relevant for data science at any level of the outlined hierarchy of description. Bridging between these levels is the subject of the field of optimal transport and Wasserstein geometry. Their consistent integration within information geometry is expected to contribute to the foundations of data science.
Topics of interest include but are not limited to the following themes:
- Machine Learning
- Wasserstein Geometry
- Computational Information Geometry
- Natural Gradient Method
- Canonical Divergences
- Geometry of Deep Neural Networks
- Algebraic Statistics
- Causal Inference
- Dimensionality Reduction
- Manifold Learning
Nihat Ay has co-authored a mathematics book on information geometry.
The book provides a comprehensive introduction and a novel mathematical foundation of the field of information geometry, with complete proofs and detailed background material on measure theory, Riemannian geometry, and Banach space theory. Parametrised measure models are defined as fundamental geometric objects, which can be either finite- or infinite-dimensional. Based on these models, canonical tensor fields are introduced and further studied, including the Fisher metric and the Amari-Chentsov tensor, and embeddings of statistical manifolds are investigated.
This novel foundation then leads to application highlights, such as generalizations and extensions of the classical uniqueness result of Chentsov or the Cramér-Rao inequality. Additionally, several new application fields of information geometry are highlighted, for instance hierarchical and graphical models, complexity theory, population genetics, or Markov Chain Monte Carlo.
The book will be of interest to mathematicians who are interested in geometry, information theory, or the foundations of statistics, to statisticians as well as to scientists interested in the mathematical foundations of complex systems.
Nihat Ay serves as the Editor-in-Chief of the Springer journal Information Geometry.
This journal, the first to be dedicated to the interdisciplinary field of information geometry:
- Embraces the challenge of uncovering and synthesizing mathematical foundations of information science;
- Offers a platform for intellectual engagements with overlapping interests and diverse backgrounds in mathematical science;
- Balances both theoretical and computational approaches, with ample attention to applications;
- Covers investigations of core concepts and invariance principles, such as the Fisher–Rao metric, dual connections, divergence functions, exponential and mixture geodesics, information projections, and many more areas.
The journal engages its readership in geometrizing the science of information. It connects diverse branches of mathematical sciences that deal with probability, entropy, measurement, inference, and related concepts. Coverage includes original work and synthesis exploring the foundation and application of information geometry in both mathematical and computational aspects.