Workshop: Sparse Inference on Complex Networks
We are pleased to invite you to join us for a workshop on “Sparse Inference on Complex Networks”, held at Università della Svizzera italiana on June 26-27, 2023. The workshop starts at 9:00 on Monday 26 June, and finishes after lunch on Tuesday 27 June. Participants are encouraged to arrive on Sunday evening and schedule their departure after 14:00 on Tuesday.
The workshop will focus on three key themes:
- Causal Regularization
- Random Graphical Models
- Diversification Processes
The study of network structures has become increasingly important in fields such as economics, epidemiology, ecology, and biology. However, analyzing complex networks remains a challenge, even with big data. This workshop will explore recent advances in sparse inference on complex networks and delve into the themes mentioned above.
The workshop will be held at the Institute of Computing, Faculty of Informatics, Università della Svizzera italiana, Via la Santa 1, Lugano.
Organizers: Prof. Dr. Ernst Wit and Dr. Francisco Richter.
Monday, 26 June
- 9:15-9:30 – Introduction and Welcome
- 9:30-10:45 – Theme 1A: Causal Regularization
- The Causal Chamber: Validating causal inference algorithms with data from well-understood physical mechanisms – Juan Gamella
- Causal regularization – Ernst Wit
- 10:45-11:15 – Coffee
- 11:15-12:30 – Theme 1B: Causal Regularization
- Some ideas for extending causal regularization to generalized linear models – Alice Polinelli
- Multi-environment causal regularization – Philip Kennerberg
- 12:30-14:00 – Lunch break
- 14:00-15:30 – Theme 2: Diversification modelling
- The expected degree distribution in duplication divergence models – Tiffany Yin Yuan Lo
- Inferring the drivers of diversification – Francisco Richter
- Bayesian time-windows inference in temporal networks – Giona Casiraghi
- 15:30-16:00 – Coffee
- 16:00-17:20 – Theme 3: Networks
- The social components of innovation: from data analysis to mathematical modelling – Gabriele Di Bona
- Causal Discovery with Missing Data in a Multicentric Clinical Study – Alessio Zanga
- Multiple Graphical Horseshoe estimator for modeling correlated precision matrices – Claudio Busatto
- 17:20-17:36 – Poster lightning talks (2 mins each)
- Conditional Gaussian Graphical Models for Functional Variables – Rita Fici
- Stochastic Gradient Relational Event Additive Model – Edoardo Filippi-Mazzola
- Comparing Healthcare Fundraising Campaigns using Psychophysiological Data: a Network-Based Approach – Spyros Balafas
- Causal Networks in Finance – Katerina Rigana
- Evaluating the Goodness of Fit of Relational Event Models via Weighted Sums of Martingale Residuals – Martina Boschi
- Latent Space models to detect complex patterns in mass spectrometry imaging of thyroid nodules – Giulia Capitoli
- Nested case-control sampling for baseline hazard estimation in relational event models: preliminary ideas on theoretical formulation and a simulation study – Melania Lembo
- Latent event history models for quasi-reaction systems – Matteo Framba
- 17:37-19:30 – Drinks and Poster Session
- Participants will have time to network and discuss their work with others
Tuesday, 27 June
- 9:00-9:15 – Coffee
- 9:15-10:45 – Theme 3A: Graphical Models
- Random graphical modelling of cross-country cultural heterogeneity – Veronica Vinciotti
- Recent advances in Bayesian Graphical Models with Varying Structure – Francesco Stingo
- Learning Block Structured Graphs in Gaussian Graphical Models – Alessandro Colombi
- 10:45-11:15 – Coffee
- 11:15-12:30 – Theme 3B: Graphical Models
- Analysing Google Search Trends Data with Dynamic Bayesian Networks – Marco Scutari
- Bayesian learning of network structures from interventional experimental data – Stefano Peluso
- Causal inference from categorical Bayesian networks – Federico Castelletti
- 12:30-14:00 – Lunch break and departures
Recent Advances in Bayesian Graphical Models with Varying Structure
We introduce Bayesian Gaussian graphical models with covariates (GGMx), a class of multivariate Gaussian distributions with a covariate-dependent sparse precision matrix. We propose a general construction of a functional mapping from the covariate space to the cone of sparse positive definite matrices that encompasses many existing graphical models for heterogeneous settings. The flexible formulation of GGMx allows both the strength and the sparsity pattern of the precision matrix (and hence the graph structure) to change with the covariates. Extensive simulations and a case study in cancer genomics demonstrate the utility of the proposed models. Joint work with Yang Ni and Veerabhadran Baladandayuthapani.
Analysing Google Search Trends Data with Dynamic Bayesian Networks
Google search data provides a wealth of information that would be difficult to collect from other sources. In particular, it contains long-term multivariate time series data on topics that would normally not be investigated together. This makes it ideal for learning dynamic Bayesian network models that connect different domains and that can be interpreted as causal models under both Pearl’s and Granger’s causality. As an example, I will illustrate recent infodemiology work I carried out for L’Oréal to explore the causal effects and the feedback loops between 12 skin diseases and mental illnesses. There are no available clinical trials for such a large and varied set of conditions, so we chose to rely on Google Search keyword frequencies as a proxy: they are screened and categorised by Google’s advanced NLP deep-learning models, which ensures data quality and consistency. (This work is currently submitted and under peer review.)
The Causal Chamber: Validating causal inference algorithms with data from well-understood physical mechanisms
A fundamental difficulty in the field of causal inference is the absence of good validation datasets collected from real systems or phenomena. This is partly due to there being few incentives to collect and publish data from real systems that are already well understood, although such systems would be the ideal test-bed for a large spectrum of causal inference algorithms. To address this problem, we have constructed two physical devices that allow measuring and manipulating different variables of simple but well-understood physical phenomena. The devices enable the inexpensive collection of large amounts of multivariate observational and interventional data, which, together with the presence of a justified causal ground truth, make them suitable for validating a wide range of causal inference algorithms. I will present the devices, their ground truths, and some of the data we have already collected. Since this is ongoing work meant to serve the community, your input is very welcome, and this will be the main purpose of my talk. What kind of data would be useful to you?
Random graphical modelling of cross-country cultural heterogeneity
Cultural values vary significantly around the world, as evidenced by numerous quantitative surveys. This variation can be seen both in the marginal distributions of cultural traits for the different countries and in the dependency structure between these traits. The latter has only recently been taken into consideration and has been shown to play a key role in locating countries within a cultural spectrum. As the interest is to discover the dependencies between the cultural traits that are specific to each country, as well as structural similarities between countries, we propose a computational approach for the joint inference of graphical models from the different environments. To this end, a random graph generative model is introduced. I will present a formulation of the model with a latent space that captures relatedness across the different countries at the structural level, and a second formulation where potential drivers, such as geographical distance between countries, are directly included in the generative model. In addition, the model allows for the inclusion of external covariates at both the node and edge levels, further adapting to the richness and complexity of high-dimensional data from diverse application areas.
The expected degree distribution in duplication divergence models
Tiffany Yin Yuan Lo
We study the degree distribution of a randomly chosen vertex in a duplication–divergence graph, paying particular attention to what happens when a non-trivial proportion of the vertices have large degrees, establishing a central limit theorem for the logarithm of the degree distribution. Our approach, as in Jordan (2018) and Hermann and Pfaffelhuber (2016), relies heavily on the analysis of related birth–catastrophe processes. This is joint work with A. D. Barbour.
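A duplication–divergence graph of the general kind studied here can be simulated in a few lines. The sketch below uses one common variant (each new vertex copies the neighbourhood of a uniformly chosen existing vertex, retaining each copied edge independently with probability p); the parameters and the exact retention rule are illustrative assumptions, not the specific model analysed in the talk:

```python
import random

def duplication_divergence(n, p, seed=0):
    """Grow a duplication-divergence graph on n vertices.

    Start from a single edge; each new vertex picks an existing
    vertex uniformly at random and copies each of its edges
    independently with retention probability p.
    """
    rng = random.Random(seed)
    adj = {0: {1}, 1: {0}}  # adjacency sets, undirected
    for v in range(2, n):
        target = rng.randrange(v)
        adj[v] = {u for u in adj[target] if rng.random() < p}
        for u in adj[v]:
            adj[u].add(v)  # keep the graph symmetric
    return adj

adj = duplication_divergence(1000, 0.3)
degrees = [len(nb) for nb in adj.values()]
```

Note that a new vertex may retain no edges at all, so isolated vertices can appear; how such vertices are handled is one of the modelling choices that distinguishes the variants in the literature.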
Some ideas for extending causal regularization to generalized linear models
Causal regularization was originally introduced in the context of a linear structural equation model. As such, its derivation and technical details were bound up with the specific linear context. Nevertheless, the idea of using causal invariance to improve out-of-sample prediction stability can be extended to other contexts. In this talk, we present an extension of the linear structural equation model to the case of observations from an exponential family by means of a copula approach. Then we will show what type of causal invariance holds for a generalized linear model by means of the Pearson residuals. Using this invariance, we define an extension of causal regularization. Finally, we will show how this regularization helps to reduce the out-of-sample prediction error.
The social components of innovation: from data analysis to mathematical modelling
Gabriele Di Bona
Novelties drive societal progress, yet we lack a comprehensive understanding of the factors that generate them. Recent evidence suggests that innovation emerges from the balance between exploiting past discoveries and exploring new possibilities, the so-called “adjacent possible”. In this talk, I will hence develop a general framework to investigate how people navigate the seemingly infinite space of possibilities. After defining what is a novelty and how we measure the rate of discovery, I will go over some basic models of innovation that reproduce the empirical findings, based on extractions from urn and random walks on networks. Then, I will explore the role of social interactions in enhancing discoveries. In particular, I consider agents that extend their adjacent possible through the social links of a complex network, capitalizing on opportunities from peers. The addition of a social dimension reveals that an individual’s discovery potential is influenced by their position in the network. Finally, I will show some results combining the mechanisms described to create a data-driven model of music exploration on online platforms.
Causal regularization
In recent decades, several data-analytic approaches to causality have been introduced, such as propensity score matching, the PC algorithm, and Causal Dantzig. Although such methods were originally hailed for their interpretational appeal, here we study the identification of causal-like models from in-sample data that provide out-of-sample risk guarantees when predicting a target variable from a set of covariates. Whereas ordinary least squares provides the best in-sample risk with limited out-of-sample guarantees, causal models have the best out-of-sample guarantees by sacrificing in-sample risk performance. We introduce causal regularization by defining a trade-off between these properties. As the regularization increases, causal regularization provides estimators whose risk is more stable at the cost of increasing their overall in-sample risk. The increased risk stability is shown to result in out-of-sample risk guarantees. We provide finite-sample risk bounds for all models and prove the adequacy of cross-validation for attaining these bounds.
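As a rough illustration of the trade-off described above, here is a sketch of the closely related anchor-regression estimator of Rothenhäusler et al., which also interpolates between ordinary least squares and a shift-stable solution via a regularization parameter. The transform, variable names, and synthetic data are assumptions for illustration, not the authors' exact estimator:

```python
import numpy as np

def anchor_regression(X, y, A, gamma):
    """Anchor-regression-style estimator: gamma = 1 recovers OLS,
    larger gamma trades in-sample risk for stability under
    distribution shifts generated along the anchor directions A."""
    # projection onto the column space of the anchors
    P = A @ np.linalg.pinv(A.T @ A) @ A.T
    # anchor transform: leaves the anchor-orthogonal part untouched
    W = np.eye(len(y)) - (1 - np.sqrt(gamma)) * P
    beta, *_ = np.linalg.lstsq(W @ X, W @ y, rcond=None)
    return beta

rng = np.random.default_rng(0)
A = rng.normal(size=(200, 2))                      # anchor variables
X = rng.normal(size=(200, 3)) + A @ rng.normal(size=(2, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=200)

b_ols = anchor_regression(X, y, A, gamma=1.0)      # plain OLS
b_reg = anchor_regression(X, y, A, gamma=10.0)     # more shift-stable
```

Increasing gamma penalises the part of the residual explained by the anchors, which is the mechanism by which risk stability is bought at the price of in-sample fit.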
Multi-environment causal regularization
In this talk I will introduce some of our new results on causal regularization in a multi-environment setting. This will cover two papers, “Convergence properties of multi-environment causal regularization” (paper A) and “Optimal multi-environment causal regularization” (paper B). In both papers we start from a linear structural equation model (SEM) for our target and covariates, over the observational and several shifted environments. In paper A we define our causal regularizer as the argmin solution of the worst risk among all weighted (with weights from the unit sphere) environments dominated by some given weighted environment. We solve this argmin equation explicitly and study the plug-in estimator of the solution. We then show concentration-of-measure results and find bounds for the conditional variance (and, more generally, so-called q-variances) and for the determinant of the covariance matrix.
In paper B we change the setting slightly: instead of fixing the weights, we study what happens when we consider all weighted environments that are dominated by any weighted environment, with weights from the unit sphere. We show that for every argument β, we have a worst-risk decomposition that is somewhat analogous to the one in paper A, but that now actually depends on β. We then define our causal regularizer as the argmin solution for this worst risk over all β. We show that there is at least one such solution (but there can be several) and we consider the corresponding plug-in estimators from this family of solutions. The main result is that any such estimator is consistent in the almost sure sense. A practical obstacle, however, is that the argmin solutions are not in closed form, since they involve solving polynomials of general degree. We therefore study another family of approximate plug-in estimators, which solve these polynomials by the bisection method, and show that these estimators are also consistent.
Causal Discovery with Missing Data in a Multicentric Clinical Study
Causal inference for testing clinical hypotheses from observational data presents many difficulties because the underlying data-generating model and the associated causal graph are not usually available. Furthermore, observational data may contain missing values, which impact the recovery of the causal graph by causal discovery algorithms: a crucial issue often ignored in clinical studies. In this work, we use data from a multi-centric study on endometrial cancer to analyze the impact of different missingness mechanisms on the recovered causal graph. This is achieved by extending state-of-the-art causal discovery algorithms to exploit expert knowledge without sacrificing theoretical soundness. We validate the recovered graph with expert physicians, showing that our approach finds clinically-relevant solutions. Finally, we discuss the goodness of fit of our graph and its consistency from a clinical decision-making perspective using graphical separation to validate causal pathways.
Causal inference from categorical Bayesian networks
We consider a collection of categorical variables whose joint distribution encodes a set of conditional independencies that can be represented through a Directed Acyclic Graph (DAG). Focusing on one variable in the system, we are interested in evaluating the causal effect following a hypothetical intervention on another variable. This effect crucially depends on the underlying DAG structure, which is typically unknown and accordingly must be inferred from the available data. We propose a Bayesian methodology which combines structure learning of DAGs and causal effect estimation.
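The intervention step underlying such causal-effect evaluation can be illustrated on a toy binary DAG via the truncated factorization; the structure (Z → X, Z → Y, X → Y) and all probabilities below are invented for illustration and are not taken from the talk:

```python
# Toy DAG: Z -> X, Z -> Y, X -> Y, all variables binary.
pZ = {0: 0.6, 1: 0.4}                                    # P(Z = z)
pY_given_XZ = {(0, 0): 0.1, (0, 1): 0.5,
               (1, 0): 0.4, (1, 1): 0.9}                 # P(Y=1 | X=x, Z=z)

def p_y1_do_x(x):
    """P(Y=1 | do(X=x)) by adjusting for Z, which blocks the
    back-door path X <- Z -> Y:  sum_z P(Y=1|x,z) P(z)."""
    return sum(pY_given_XZ[(x, z)] * pZ[z] for z in (0, 1))

effect = p_y1_do_x(1) - p_y1_do_x(0)   # average causal effect: 0.60 - 0.26 = 0.34
```

In the talk's setting the DAG itself is unknown, so this adjustment would be averaged over the posterior on DAG structures rather than computed for a single known graph.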
Longitudinal Network Models, Multigraphs, and Implications to Estimation and Testing
Consider longitudinal networks whose edges turn on and off according to a discrete-time Markov chain with exponential-family transition probabilities. We characterize when their joint distributions are also exponential families with the same parameter, and show that the permutation-uniform subclass of these chains permits interpretation as an independent, identically distributed sequence on the same state space. We apply these ideas to temporal exponential random graph models, for which permutation uniformity is well suited, and discuss mean-parameter convergence, dyadic independence, and exchangeability. The framework facilitates applying standard tools to longitudinal-network Markov chains, from either asymptotics or single-observation exponential random graph models. The latter are often in log-linear form, allowing us to frame the problem of testing model fit through an exact conditional test whose p-value can be approximated efficiently in networks of both small and moderately large sizes. An important extension of this theory is to latent-variable blockmodels, an application which will be briefly discussed. This talk is based on joint work with William K. Schwartz, Hemanshu Kaul, Despina Stasi, Elizabeth Gross, Debdeep Pati, and Vishesh Karwa.
Bayesian time-windows inference in temporal networks
Recent years have witnessed a surge in the development of methods for analysing temporal networks. For many applications, though, we rely on methods which cannot account for time-stamped edges. These require aggregating temporal data into static networks leading to a considerable loss of temporal information. A common strategy to mitigate this issue is to divide the overall timespan of the temporal network into distinct time windows of size $\delta$, thereby transforming the temporal network into a sequence of static ones. This methodology allows an approximation of the system’s evolution while preserving a degree of temporal granularity. Yet, the optimal selection of the time-window size, $\delta$, remains an open question. To address this, our study presents a novel Bayesian configuration model tailored for temporal networks. Leveraging the minimum description length principle, we propose a non-parametric method for determining the optimal time window size for temporal network aggregation. Furthermore, thanks to the inherent simplicity of configuration models, our approach scales efficiently to large networks.
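The δ-window aggregation that the abstract takes as its starting point can be sketched as follows (the event format and window alignment are illustrative assumptions; the talk's contribution is choosing δ, not this aggregation step):

```python
from collections import defaultdict

def aggregate_windows(events, delta):
    """Partition time-stamped edges (t, u, v) into consecutive
    windows of width delta, returning one static edge set per
    window (a lossy approximation of the temporal network)."""
    if not events:
        return []
    t0 = min(t for t, _, _ in events)       # align windows to first event
    windows = defaultdict(set)
    for t, u, v in events:
        windows[int((t - t0) // delta)].add(frozenset((u, v)))
    return [windows[i] for i in range(max(windows) + 1)]

events = [(0.1, 'a', 'b'), (0.4, 'b', 'c'), (1.2, 'a', 'c'), (2.7, 'a', 'b')]
snapshots = aggregate_windows(events, delta=1.0)
# three windows: {a-b, b-c}, {a-c}, {a-b}
```

A smaller δ preserves more temporal detail but yields sparser, noisier snapshots; the Bayesian configuration model in the talk formalises this trade-off via the minimum description length principle.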
Learning Block Structured Graphs in Gaussian Graphical Models
Motivated by the analysis of spectrometric data, we introduce a Gaussian graphical model for learning the dependence structure among frequency bands of the infrared absorbance spectrum. The spectra are modeled as continuous functional data through a B-spline basis expansion, and a Gaussian graphical model is assumed as a prior specification for the smoothing coefficients. The structure of the graph is unknown and is learned from the data. Inference is carried out to simultaneously smooth the curves and estimate the conditional independence structure between portions of the functional domain. To improve the interpretability of such relationships, we introduce a prior distribution that imposes a block structure on the adjacency matrix of the graph. Conditionally on this choice, the structure of the precision matrix is constrained by the graph through a G-Wishart prior distribution. Finally, we develop a Double Reversible Jump Markov chain Monte Carlo algorithm that avoids any G-Wishart normalizing-constant calculation when comparing graphical models. The novel algorithm looks for block-structured graphs, hence proposing moves that add or remove not just a single link but an entire group of them. The proposed model is applied to the analysis of infrared absorbance spectra of strawberry purees.
Bayesian learning of network structures from interventional experimental data
Directed Acyclic Graphs (DAGs) provide an effective framework for learning causal relationships among variables given multivariate observations. Under pure observational data, DAGs encoding the same conditional independencies cannot be distinguished and are collected into Markov equivalence classes. In many contexts, however, observational measurements are supplemented by interventional data that improve DAG identifiability and enhance causal effect estimation. We propose a Bayesian framework for multivariate data partially generated after stochastic interventions. To this end, we introduce an effective prior elicitation procedure leading to a closed-form expression for the DAG marginal likelihood and guaranteeing score equivalence among DAGs that are Markov equivalent post intervention. Under the Gaussian setting we show, in terms of posterior ratio consistency, that the true network will be asymptotically recovered, regardless of the specific distribution of the intervened variables and of the relative asymptotic dominance between observational and interventional measurements. We validate our theoretical results in simulation, and we implement a Markov chain Monte Carlo sampler for posterior inference on the space of DAGs on both synthetic and biological protein expression data.
Multiple Graphical Horseshoe estimator for modeling correlated precision matrices
We develop a novel full-Bayesian approach for multiple correlated precision matrices, called multiple Graphical Horseshoe (mGHS). The proposed approach relies on a novel multivariate shrinkage prior based on the Horseshoe prior that borrows strength and shares sparsity.
Abstract Submission and Registration
We invite attendees to present their work at the workshop, either as a poster or as an oral presentation (a few slots are available). Please use the registration link below.
Registration for the workshop is free and includes lunches and dinner. However, we kindly ask you to indicate which days you will attend during the registration process, so we can make the proper reservations. To register for the workshop, please follow this registration link by 5th June.
We look forward to your participation in this exciting workshop!