Upcoming CI Seminars

CI Seminar

Thursday 11 February 2025, 12:00 - 13:30 CET

Speaker: Jack Jewson, Monash University

Location: USI East Campus, Room D5.01

Exact Sampling of Gibbs Measures with Estimated Losses

Abstract:

In recent years, the shortcomings of Bayesian posteriors as inferential devices have received increased attention. A popular strategy for fixing them has been to instead target a Gibbs measure based on losses that connect a parameter of interest to observed data. However, existing theory for such inference procedures assumes these losses are analytically available, while in many situations these losses must be stochastically estimated using pseudo-observations. In such cases, we show that when standard Markov chain Monte Carlo algorithms are used to produce posterior samples, the resulting posterior exhibits a strong dependence on the number of pseudo-observations: unless the number of pseudo-observations diverges sufficiently fast, the resulting posterior will concentrate very slowly. However, we show that in certain situations it is feasible to remove this dependence entirely using modified versions of piecewise deterministic Markov process (PDMP) samplers, and we formally and empirically show that these samplers produce posterior draws that have no dependence on the number of pseudo-observations used to estimate the loss within the Gibbs measure. We apply our results to three Gibbs measures that have been proposed to deal with intractable likelihoods and model misspecification by Chérief-Abdellatif and Alquier (2020) and Alquier and Gerber (2024).

Speaker Biography:

Jack is a Senior Lecturer in the Department of Econometrics and Business Statistics at Monash University, Melbourne, Australia. Jack's research focuses on methodological and computational challenges in modern Bayesian Inference. Particular areas of interest include Model Misspecification, Robustness, Inference with Loss Functions, and Variable/Model Selection.

Past CI Seminars

CI Seminar

Thursday 30 January 2025, 12:00 - 13:30 CET

Speaker: Vincent Rivoirard, University Paris Dauphine

Location: USI East Campus, Room D5.01

PCA for point processes

Abstract:

We introduce a novel statistical framework for the analysis of replicated point processes that allows for the study of point pattern variability at a population level. By treating point process realizations as random measures, we adopt a functional analysis perspective and propose a form of functional Principal Component Analysis (fPCA) for point processes. The originality of our method is to base our analysis on the cumulative mass functions of the random measures, which gives us a direct and interpretable analysis. Key theoretical contributions include establishing a Karhunen-Loève expansion for the random measures and a Mercer Theorem for covariance measures. We establish convergence in a strong sense, and introduce the concept of principal measures, which can be seen as latent processes governing the dynamics of the observed point patterns. We propose an easy-to-implement estimation strategy of eigenelements for which parametric rates are achieved. We fully characterize the solutions of our approach to Poisson and Hawkes processes and validate our methodology via simulations and diverse applications in seismology, single-cell biology and neurosciences, demonstrating its versatility and effectiveness.

Speaker Biography:

Vincent Rivoirard has been Professor at the University Paris Dauphine since 2010 after having been Associate Professor at the University of Paris Sud Orsay between 2003 and 2010. He defended his thesis in statistics in 2002 under the supervision of Dominique Picard. His research interests cover non-parametric and high-dimensional statistics for Bayesian and frequentist estimation. He is interested in statistical applications in neuroscience, genetics and biology. He was Director of Ceremade between November 1, 2016 and December 31, 2022.

CI Lunch Seminar

Friday 6 December 2024, 11:00 - 12:00 CET

Speaker: Hélène Ruffieux, University of Cambridge

Location: USI East Campus, Room D5.01

A Bayesian functional factor model for high-dimensional curves

Abstract:

The increasing availability of longitudinal measurements is set to yield important scientific discoveries in domains such as healthcare, medicine, economics and social sciences. While functional data analysis is an active area of research, methods for modelling complex multivariate functional dependencies remain limited. Motivated by a COVID-19 study conducted at Addenbrooke’s and Papworth Hospitals in Cambridge, which will serve as an illustrative thread throughout this talk, we propose a Bayesian approach for representing high-dimensional curves, combining latent factor modelling and functional principal component analysis (FPCA). This approach captures correlations across variables (e.g., biomarkers) and time, by positing that subsets of variables contribute to a small number of FPCA expansions (e.g., representing latent disease processes) through variable-specific loadings. Subject variability is modelled using a small number of functional principal components, each characterised by a smoothly varying temporal function. We develop a variational inference algorithm, with analytical updates, that couples efficiency and principled parameter uncertainty quantification, and we introduce a model selection procedure for learning the number of factors. Extensive numerical experiments illustrate the ability of the approach to (i) accurately estimate variable-specific loadings, FPCA latent functions and subject-specific component scores, and (ii) scale to high-dimensional datasets (e.g., with panels of 20,000 genes measured longitudinally for a few hundred subjects). Through the COVID-19 study, we illustrate how our framework helps disentangle disease heterogeneity. It clarifies which biomarkers coordinate over time, pointing to key biological pathways, and further enables prediction of molecular trajectories at the subject level, towards targeted interventions and personalised treatments.

This is joint work with Salima Jaoua and Daniel Temko.

Speaker Biography:

Hélène Ruffieux is a Senior Research Fellow at the MRC Biostatistics Unit of the University of Cambridge. She holds a PhD in Mathematics from EPFL. Her research concerns the development of Bayesian methods and their application to open problems in biomedicine, with a focus on scalable hierarchical modelling approaches for variable selection, latent structure discovery and network estimation, in high-dimensional or temporal data settings.

CI Seminar

Tuesday 21 February 2023, 10:30 - 11:30 CET

Speaker: Albert-Jan N. Yzelman, Huawei Research Zurich

Location: USI East Campus

Algebraic and Humble Programming

Abstract:

We first recall the concept of a "humble programmer" and contrast it to that of a "hero programmer". Classically, the former focuses on achieving high productivity, while the latter focuses on extracting peak performance on a given system. As the complexity of programming novel architectures increases, and given the trend of producing heterogeneous systems that combine multiple complex architectures, humble programming models that automatically achieve scalable and high performance regardless of system architecture details are becoming a necessity. This talk introduces the free and open-source Algebraic Programming (ALP) paradigm as one such candidate solution. With ALP, programmers must annotate their code with algebraic information, which the compiler then exploits 1) for optimization, including auto-parallelisation, 2) to detect programmer errors, and 3) to disallow expressions that do not scale. Beyond ALP/GraphBLAS, which provides sparse linear algebraic programming, ALP/Dense covers dense linear algebra while ALP/Pregel simulates vertex-centric programming on top of ALP/GraphBLAS. We demonstrate the ALP paradigm’s efficiency and scalability for diverse workloads from numerical linear algebra, graph algorithms, machine learning, and beyond.

Speaker Biography:

Albert-Jan N. Yzelman is a Research Scientist and Expert at the Computing Systems Laboratory in the Huawei Zürich Research Center, and previously held Senior and Principal research positions at Huawei Paris. He obtained his doctorate from Utrecht University, and has held post-doctoral positions at KU Leuven and the Intel ExaScience Labs. His research interests center around paradigms for irregular and parallel computing, focusing on easy to use, yet high-performance, scalable, as well as portable programming principles and associated system design.

CI Lunch Seminar

Tuesday 21 June, 12:30 - 14:00 CET

Speaker: Andreas Wächter, Northwestern University

Location: USI East Campus D5.01 or Online on Microsoft Teams

The ARPA-E Grid Optimization Competition

Abstract:

In recent years, the US Advanced Research Projects Agency-Energy (ARPA-E) has been organizing the “Grid Optimization Competition”. To participate, teams from academia and industry submitted computer program implementations of specialized algorithms for solving large nonconvex AC Security-Constrained Optimal Power Flow (SCOPF) problems. The performance of the solvers was tested and ranked independently by the organizers, using large-scale realistic instances. The goal of SCOPF is the determination of the most cost-efficient operation of an electrical power grid in such a way that it can withstand contingencies in the form of outages of any of its components. Mathematically, this is an extremely large-scale two-stage nonlinear and nonconvex optimization problem. In this presentation, we describe the setup of the competition, the computational challenges caused by the nonlinearity of the AC power flow equations, as well as the solution approaches of some of the teams, including our own GO-SNIP team that placed second in the first challenge.

Speaker Biography:

Andreas Wächter is a Professor in the Department of Industrial Engineering and Management Sciences at Northwestern University. His research interests include the design, analysis, implementation, and application of numerical algorithms for nonlinear continuous and mixed-integer optimization, scientific computing, power systems, and sustainability. He obtained his master's degree in Mathematics at the University of Cologne, Germany, and his Ph.D. in Chemical Engineering at Carnegie Mellon University. Before joining Northwestern University in 2011, he was a Research Staff Member in the Department of Mathematical Sciences at IBM Research in Yorktown Heights, NY. He is a recipient of the 2011 Wilkinson Prize for Numerical Software and the 2009 INFORMS Computing Society Prize for his work on the open-source optimization package Ipopt.

CI Lunch Seminar

Tuesday 7 June, 12:00 - 13:30 CET

Speaker: Cornelius Fritz, University of Munich

Location: USI East Campus D3.01 - Free Pizza!

Temporal Exponential Random Graph Models for signed networks and Relational Event Models for spurious events

Abstract:

Substantive research in the Social Sciences regularly investigates signed networks, where edges between actors are either positive or negative. For instance, schoolchildren can be friends or rivals, just as countries can cooperate or fight each other. This research often builds on structural balance theory, one of the earliest and most prominent network theories, making signed networks one of the most frequently studied matters in social network analysis. While the theorization and description of signed networks have thus made significant progress, the inferential study of tie formation within them remains limited in the absence of appropriate statistical models. We fill this gap by proposing the Signed Exponential Random Graph Model (SERGM), extending the well-known Temporal Exponential Random Graph Model (TERGM) to networks where ties are not binary but negative or positive if a tie exists. Our empirical application uses the SERGM to analyze cooperation and conflict between countries within the international state system.

At the same time, relational event models are an increasingly popular model for studying relational structures; hence the reliability of large-scale event data collection becomes more and more important. Automated or human-coded events often suffer from non-negligible false-discovery rates in event identification. Moreover, most sensor data is primarily based on actors' spatial proximity for predefined time windows; hence, the observed events could relate either to a social relationship or to random co-location. Both examples imply spurious events that may bias estimates and inference. We propose the Relational Event Model for Spurious Events (REMSE), an extension to existing approaches for interaction data. The model provides a flexible solution for modeling data while controlling for spurious events. We apply this model to combat events from the Syrian civil war.

Speaker Biography:

Cornelius Fritz is a Ph.D. student in statistics under the supervision of Göran Kauermann. In this context, his research mainly revolves around analyzing dynamic networks to answer questions posed within substantive sciences, e.g., Political Science and Sociology, through novel data analysis techniques that combine statistical and machine learning thinking. He is also an active member of the CODAG (COVID-19 Data Analysis Group), where he focuses on analyzing the interplay of mobility patterns and COVID-19 infections.

CI Lunch Seminar

Thursday 17 March, 12:15 - 13:30 CET

Speaker: Ernst Wit

Location: USI East Campus D5.01 - Free Pizza!

2 Data Science case studies on the COVID-19 pandemic

Abstract:

Although the COVID-19 pandemic is far from over and in fact is resurging in many places across the world, including Switzerland, our response to the pandemic has clearly changed dramatically – as can be seen from this in-person lunch seminar! Which of our responses to the pandemic is more sensible, and did we get it wrong then or now, or both? We focus on 2 case studies.

Knowing the infection fatality ratio (IFR) is of crucial importance for evidence-based epidemic management: for immediate planning; for balancing the life years saved against the life years lost due to the consequences of management; and for evaluating the ethical issues associated with the tacit willingness to pay substantially more for life years lost to the epidemic than for those lost to other diseases. We focus on the situation at the beginning of the pandemic, to see if we could have known how deadly the disease was.

One preferred management technique of many countries has been the lockdown. It was claimed that this was the only way to get the pandemic under control. Many Asian and Pacific countries have introduced severe lockdowns at great economic cost. However, is it really true that only lockdowns could bring the situation under control? We focus our second case study on the situation during the second wave.

CI Seminar with Lars Ruthotto

Wednesday 23 February, 16:30 - 18:00 CET

Location: USI East Campus D1.14

Neural-network approaches for high-dimensional optimal control problems

Abstract: We consider neural network approaches to solving high-dimensional optimal control problems with deterministic and randomly perturbed dynamics. The training process simultaneously approximates the value function of the control problem and identifies the part of the state space likely to be visited by optimal trajectories. The latter is important to avoid the curse of dimensionality associated with solving these problems globally (that is, for all states). We, therefore, consider our approach an approximate semi-global method.

Our network design and the training problem leverage insights from optimal control theory. We approximate the value function of the control problem using a neural network and use the Pontryagin maximum principle (PMP) to express the optimal control (and therefore the sampling) in terms of the value function. Our training loss consists of a weighted sum of the objective functional of the control problem and penalty terms that enforce the Hamilton-Jacobi-Bellman equations along the sampled trajectories. Importantly, training is unsupervised in that it does not require solutions of the control problem.

Our approach reduces to the method of characteristics when the dynamics is deterministic. Hence, it can be seen as a generalization of recent approaches for solving high-dimensional deterministic control problems. In our numerical experiments, we compare our method to existing solvers for a more general class of semilinear PDEs. Using a two-dimensional toy problem, we demonstrate the importance of the stochastic PMP to inform the sampling. For a 100-dimensional benchmark problem, we demonstrate that our approach improves accuracy and time-to-solution.

Bio: Lars Ruthotto is an applied mathematician developing computational methods for machine learning and inverse problems. He is an Associate Professor in the Department of Mathematics and the Department of Computer Science at Emory University and a member of Emory’s Scientific Computing Group. He leads the Emory REU/RET site for Computational Mathematics for Data Science. Prior to joining Emory, he was a postdoc at the University of British Columbia and held PhD positions at the University of Lübeck and the University of Münster.

CI Seminar with Lorenzo Pacchiardi and Ritabrata Dutta

Tuesday 14 December, 1:00pm - 2:30pm CET

Room SI-004 (black building) USI West Campus, or online on Microsoft Teams

Probabilistic weather prediction with generative neural networks trained with scoring rules.

Speaker: Lorenzo Pacchiardi

Numerical weather prediction makes use of physics-based models to provide forecasts. Usually, the model is run repeatedly with perturbed initial conditions and model parameters to provide a probabilistic forecast. The forecast performance is assessed using functions of probability distributions and observed values, called scoring rules. With specific scoring rules, both calibration and sharpness of the probabilistic prediction can be assessed at the same time; additionally, the true data generating process minimizes the expectation of the scoring rule over observation values.
Recently, there have been attempts to develop data-driven weather forecast tools based on neural networks. Most approaches, however, provide deterministic predictions; some instead obtain a probabilistic distribution via neural network ensembles or by exploiting generative adversarial networks. In comparative studies, however, the probabilistic prediction is often assessed with scoring rules. Here, we propose to train generative conditional neural networks by using the scoring rules as the training objective. By doing this, we directly optimize for the measure of performance which interests practitioners. In our training objective, the prediction for each time step is conditioned on a suitable observation window. Additionally, the spatial structure is considered by forecasting the weather for each location according to the previous state in a localized region around it.

Spatial copula with temporal compound Poisson for High-resolution Probabilistic Precipitation Prediction

Speaker: Ritabrata Dutta

The accurate prediction of precipitation is important to allow for reliable warnings of flood or drought risk in a changing climate. However, making trustworthy predictions of precipitation at a local scale is one of the most difficult challenges for today's weather and climate models. This is because important features, such as individual clouds and high-resolution topography, cannot be resolved explicitly within simulations due to the significant computational cost of high-resolution simulations.
Climate models are typically run at ~50-100 km resolution which is insufficient to represent local precipitation events in satisfying detail.
Here, we develop a Bayesian method to make probabilistic precipitation predictions based on features that climate models can resolve well and that is not highly sensitive to the approximations used in individual models. To predict, we will use a temporal compound Poisson distribution with a tail modelled using a generalised extreme value distribution, whose parameters will be temporally dependent on the output of climate models at a location. Further, the spatial dependency of the rainfall prediction will be modelled using a spatial copula. We will use the output of Earth System models at coarse resolution (~5 kilometre) as input and train the statistical models towards precipitation observations over Wales at ~10 kilometre resolution. We illustrate the prediction performance of our model by training over 5 years of the data up to 31st December 1999 and predicting precipitation for 20 years afterwards for Cardiff and Wales.

CI Lunch Seminar

Wednesday 17 November, 12:30 - 13:30 CET

Room D2.19 USI East Campus - Free Pizza!

Computational methods for a sustainable Swiss power grid

Timothy Holt, PhD Student, CI

Aggressive climate goals and the renunciation of nuclear power are threatening the reliability of Swiss and European electricity grids. Together with distributed energy generation and storage, peer-to-peer energy trading will be one of the key technologies enabling grid stability with the increasing adoption of intermittent renewable generation. Such trading increases the frequency and improves the fidelity of price signals in the market, reducing risk for consumers and producers, and delivering better information to grid controllers. This technology is enabled by price prediction models employed by market participants, which require cluster-scale compute power. Given the data-intensity of solving these models, parallel execution on many-core HPC systems causes memory bottlenecks, severely impacting computational throughput. In this research we investigate techniques to maximize the throughput of massively parallel model solution on HPC systems. A procedure to find the optimal parallelization technique, and the tools to execute the application using this optimal parallelization while maximizing cache locality will be presented.

Towards scalable and robust approximate learning methods for small data problems

Edoardo Vecchi, PhD Student, CI

Classification problems in the small data regime (with small data statistic T and relatively large feature space dimension D) impose challenges for the common machine learning (ML) and deep learning (DL) tools. The standard learning methods from these areas tend to show a lack of robustness when applied to datasets with significantly fewer data points than dimensions and quickly reach the overfitting bound, thus leading to poor performance beyond the training set. To tackle this issue, we briefly discuss the SPA algorithm and its extensions, which address the small data challenge by simultaneously solving the joint optimization formulation of optimal discretization, feature selection and Bayesian prediction problems.

CI Lunch Seminar with LUT Finland

Monday 18 October, 12:00 - 13:30 CET

Room D5.01 USI East Campus

Gaussian likelihoods for ‘intractable’ situations

Prof. Heikki Haario, LUT (Finland)

Various modelling situations – including chaotic dynamics, synchronization, stochastic differential equations, random patterns such as produced by the Turing reaction-diffusion systems, or the Cahn-Hilliard equation – share the analogy that a fixed model parameter corresponds to a family of solutions rather than a fixed deterministic one. This may be due to extreme sensitivity with respect to the initial values, randomized or unknown initial values, or the explicit stochasticity of the system. As a result, standard methods based on directly measuring the distance between model output and data are no longer available. We discuss an approach that allows a unified construction of likelihoods for such ‘intractable’ situations. The starting point is the Donsker theorem stating that the empirical cumulative distribution function of i.i.d. scalar samples tends to a Gaussian distribution. The approach can also be extended to weakly dependent and vector-valued data. Several cases from the above list are presented as examples.

On recently developed non-Gaussian priors and sampling methods with application to industrial tomography

Prof. Lassi Roininen, LUT (Finland)

We consider two sets of new priors for Bayesian inversion and machine learning. The first one is based on mixture-of-experts models with Gaussian processes. The target is to estimate the number of experts and their parameters, and to perform state estimation. For sampling, we use SMC^2. For non-Gaussian priors, we continue the discussion on Cauchy priors and the generalisation to high-order Cauchy fields and further generalisation to alpha-stable fields. For sampling, we use a selection of modern MCMC tools. Finally, we apply some of the methods and models to an industrial tomography problem on estimating log internal structure, measured at sawmills, based on X-ray, RGB camera and laser scanning.

CI Seminar with Michela Taufer

Wednesday 13 October, 4:30pm - 5:30pm CET

Room D1.13 USI East Campus or Online on Microsoft Teams

AI4IO: A SUITE OF AI-BASED TOOLS FOR IO-AWARE HPC RESOURCE MANAGEMENT

High performance computing (HPC) is undergoing many changes at the system level. While scientific applications can reach petaflops or more in computing performance, potentially resulting in larger data generation rates and more frequent checkpointing, the data movement to the parallel file system remains costly due to constraints imposed by HPC centers on the IO bandwidth. In other words, the bandwidth to file systems is outpaced by the rate of data generation; the associated IO contention increases job runtime and delays execution. This situation is aggravated by the fact that when users submit their jobs to an HPC system, they rely on resource managers and job schedulers to monitor and manage the computing resources (i.e., nodes). Both resource managers and job schedulers remain blind to the impact of IO contention on the overall simulation performance.

In this talk we discuss how Artificial Intelligence (AI) can augment HPC systems to prevent and mitigate IO contention while dealing with IO bandwidth constraints. Our solution, called Analytics for IO (AI4IO), consists of a suite of AI-based tools that enable IO-awareness on HPC systems. Specifically, we present two AI4IO tools: PRIONN and CanarIO. PRIONN automates predictions about user-submitted job resource usage, including per-job IO bandwidth; CanarIO detects, in real-time, the presence of IO contention on HPC systems and predicts which jobs are affected by that contention (e.g., because of their frequent checkpointing). By working in concert, PRIONN and CanarIO predict the a priori knowledge necessary to prevent and mitigate IO contention with IO-aware scheduling. We integrate AI4IO in the Flux scheduler and show how AI4IO produces improvements in simulation performance: we observe up to 6.2% improvement in makespan of HPC job workloads, which amounts to more than 18,000 node-hours saved per week on a production-size cluster. Our work is the first step to implementing IO-aware scheduling on production HPC systems.

About the speaker

Michela Taufer is an ACM Distinguished Scientist and holds the Jack Dongarra Professorship in High Performance Computing in the Department of Electrical Engineering and Computer Science at the University of Tennessee Knoxville (UTK). She earned her undergraduate degrees in Computer Engineering from the University of Padova (Italy) and her doctoral degree in Computer Science from the Swiss Federal Institute of Technology or ETH (Switzerland). From 2003 to 2004 she was a La Jolla Interfaces in Science Training Program (LJIS) Postdoctoral Fellow at the University of California San Diego (UCSD) and The Scripps Research Institute (TSRI), where she worked on interdisciplinary projects in computer systems and computational chemistry.

Michela has a long history of interdisciplinary work with scientists. Her research interests include scientific applications on heterogeneous platforms (i.e., multi-core platforms and accelerators); performance analysis, modeling and optimization; Artificial Intelligence (AI) for cyberinfrastructures (CI); AI integration into scientific workflows, computer simulations, and data analytics. She has been serving as the principal investigator of several NSF collaborative projects. She also has significant experience in mentoring a diverse population of students on interdisciplinary research. Michela's training expertise includes efforts to spread high-performance computing participation in undergraduate education and research as well as efforts to increase the interest and participation of diverse populations in interdisciplinary studies.