# Upcoming CI Seminars

## CI Lunch Seminar

Tuesday 21 June, 12:30 - 14:00 CET

Speaker: Andreas Wächter, Northwestern University

Location: USI East Campus D5.01 or Online on Microsoft Teams

### The ARPA-E Grid Optimization Competition

**Abstract**:

**Speaker Biography:**

# Past CI Seminars

## CI Lunch Seminar

Tuesday 7 June, 12:00 - 13:30 CET

Speaker: Cornelius Fritz, University of Munich

Location: USI East Campus D3.01 - Free Pizza!

### Temporal Exponential Random Graph Models for signed networks and Relational Event Models for spurious events

**Abstract**:

At the same time, relational event models are an increasingly popular model for studying relational structures; hence the reliability of large-scale event data collection becomes more and more important. Automated or human-coded events often suffer from non-negligible false-discovery rates in event identification. And most sensor data is primarily based on actors' spatial proximity for predefined time windows; hence, the observed events could relate either to a social relationship or random co-location. Both examples imply spurious events that may bias estimates and inference. We propose the Relational Event Model for Spurious Events (REMSE), an extension to existing approaches for interaction data. The model provides a flexible solution for modeling data while controlling for spurious events. We apply this model to combat events from the Syrian civil war.

**Speaker Biography:**

## CI Lunch Seminar

Thursday 17 March, 12:15 - 13:30 CET

Speaker: Ernst Wit

Location: USI East Campus D5.01 - Free Pizza!

### 2 Data Science case studies on the COVID-19 pandemic

**Abstract**:

Although the Covid-19 pandemic is far from over and in fact is resurging in many places across the world, including Switzerland, our response to the pandemic has clearly changed dramatically – as can be seen from this in-person lunch seminar! Which of our responses to the pandemic is more sensible, and did we get it wrong then or now, or both? We focus on two case studies.

Knowing the infection fatality ratio (IFR) is of crucial importance for evidence-based epidemic management: for immediate planning; for balancing the life years saved against the life years lost due to the consequences of management; and for evaluating the ethical issues associated with the tacit willingness to pay substantially more for life years lost to the epidemic than for those lost to other diseases. We focus on the situation at the beginning of the pandemic, to see if we could have known how deadly the disease was.

One preferred management technique of many countries has been the lockdown. It was claimed that this was the only way to get the pandemic under control. Many Asian and Pacific countries have introduced severe lockdowns at great economic cost. However, is it really true that only lockdowns could bring the situation under control? We focus our second case study on the situation during the second wave.

## CI Seminar with Lars Ruthotto

Wednesday 23 February, 16:30 - 18:00 CET

Location: USI East Campus D1.14

### Neural-network approaches for high-dimensional optimal control problems

**Abstract**: We consider neural network approaches to solving high-dimensional optimal control problems with deterministic and randomly perturbed dynamics. The training process simultaneously approximates the value function of the control problem and identifies the part of the state space likely to be visited by optimal trajectories. The latter is important to avoid the curse of dimensionality associated with solving these problems globally (that is, for all states). We, therefore, consider our approach an approximate semi-global method.

**Bio**: Lars Ruthotto is an applied mathematician developing computational methods for machine learning and inverse problems. He is an Associate Professor in the Department of Mathematics and the Department of Computer Science at Emory University and a member of Emory’s Scientific Computing Group. He leads the Emory REU/RET site for Computational Mathematics for Data Science. Prior to joining Emory, he was a postdoc at the University of British Columbia and held PhD positions at the University of Lübeck and the University of Münster.

## CI Seminar with Lorenzo Pacchiardi and Ritabrata Dutta

Tuesday 14 December, 1:00pm - 2:30pm CET

Room SI-004 (black building) USI West Campus, or online on Microsoft Teams

### Probabilistic weather prediction with generative neural networks trained with scoring rules.

Speaker: Lorenzo Pacchiardi

Numerical weather prediction makes use of physics-based models to provide forecasts. Usually, the model is run repeatedly with perturbed initial conditions and model parameters to provide a probabilistic forecast. The forecast performance is assessed using functions of probability distributions and observed values, called scoring rules. With specific scoring rules, both calibration and sharpness of the probabilistic prediction can be assessed at the same time; additionally, the true data generating process minimizes the expectation of the scoring rule over observation values.

Recently, there have been attempts to develop data-driven weather forecast tools based on neural networks. Most approaches, however, provide deterministic predictions; some instead obtain a probabilistic distribution via neural network ensembles or by exploiting generative adversarial networks. In comparative studies, however, the probabilistic prediction is often assessed with scoring rules. Here, we propose to train generative conditional neural networks by using the scoring rules as the training objective. By doing this, we directly optimize for the measure of performance which interests practitioners. In our training objective, the prediction for each time step is conditioned on a suitable observation window. Additionally, the spatial structure is considered by forecasting the weather for each location according to the previous state in a localized region around it.
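As a rough illustration of what a scoring rule computes, the sketch below estimates the continuous ranked probability score (CRPS) from an ensemble of forecast samples. The abstract does not say which scoring rules the speaker uses; CRPS is chosen here only as a common example, and the function name `crps_ensemble` and the toy numbers are hypothetical. Lower values indicate a better forecast.

```python
def crps_ensemble(samples, observation):
    """Sample-based estimate of the CRPS for a scalar observation.

    Uses the standard ensemble form
        mean_i |x_i - y|  -  0.5 * mean_{i,j} |x_i - x_j|.
    """
    m = len(samples)
    if m == 0:
        raise ValueError("need at least one ensemble member")
    # Accuracy term: mean absolute error of the members against the observation.
    accuracy = sum(abs(x - observation) for x in samples) / m
    # Spread term: mean absolute difference over all ordered pairs of members.
    spread = sum(abs(xi - xj) for xi in samples for xj in samples) / (m * m)
    return accuracy - 0.5 * spread


# Toy comparison: an ensemble centred on the observation versus a biased one.
centred = crps_ensemble([9.0, 10.0, 11.0], observation=10.0)
biased = crps_ensemble([14.0, 15.0, 16.0], observation=10.0)
print(centred < biased)
```

Both ensembles have the same spread, so the difference in score comes entirely from how far the members sit from the observed value. In a training setting, a differentiable version of such a score would be averaged over many forecast and observation pairs and minimised.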

### Spatial copula with temporal compound Poisson for High-resolution Probabilistic Precipitation Prediction

Speaker: Ritabrata Dutta

The accurate prediction of precipitation is important to allow for reliable warnings of flood or drought risk in a changing climate. However, making trustworthy predictions of precipitation at a local scale is one of the most difficult challenges for today's weather and climate models. This is because important features, such as individual clouds and high-resolution topography, cannot be resolved explicitly within simulations due to the significant computational cost of high-resolution simulations.

Climate models are typically run at ~50-100 km resolution which is insufficient to represent local precipitation events in satisfying detail.

Here, we develop a Bayesian method to make probabilistic precipitation predictions based on features that climate models can resolve well and that is not highly sensitive to the approximations used in individual models. To predict, we will use a temporal compound Poisson distribution with a tail modelled using a generalised extreme value distribution, whose parameters will be temporally dependent on the output of climate models at a location. Further, the spatial dependency of the rainfall prediction will be modelled using a spatial copula. We will use the output of Earth System models at coarse resolution (~5 kilometre) as input and train the statistical models towards precipitation observations over Wales at ~10 kilometre resolution. We illustrate the prediction performance of our model by training over 5 years of the data up to 31st December 1999 and predicting precipitation for 20 years afterwards for Cardiff and Wales.

## CI Lunch Seminar

Wednesday 17 November, 12:30 - 13:30 CET

Room D2.19 USI East Campus - Free Pizza!

### Computational methods for a sustainable Swiss power grid

Timothy Holt, PhD Student, CI

Aggressive climate goals and the renunciation of nuclear power are threatening the reliability of Swiss and European electricity grids. Together with distributed energy generation and storage, peer-to-peer energy trading will be one of the key technologies enabling grid stability with the increasing adoption of intermittent renewable generation. Such trading increases the frequency and improves the fidelity of price signals in the market, reducing risk for consumers and producers, and delivering better information to grid controllers. This technology is enabled by price prediction models employed by market participants, which require cluster-scale compute power. Given the data-intensity of solving these models, parallel execution on many-core HPC systems causes memory bottlenecks, severely impacting computational throughput. In this research we investigate techniques to maximize the throughput of massively parallel model solution on HPC systems. A procedure to find the optimal parallelization technique, and the tools to execute the application using this optimal parallelization while maximizing cache locality will be presented.

### Towards scalable and robust approximate learning methods for small data problems

Edoardo Vecchi, PhD Student, CI

Classification problems in the small data regime (with small data statistic T and relatively large feature space dimension D) pose challenges for the common machine learning (ML) and deep learning (DL) tools. The standard learning methods from these areas tend to show a lack of robustness when applied to datasets with significantly fewer data points than dimensions and quickly reach the overfitting bound, thus leading to poor performance beyond the training set. To tackle this issue, we briefly discuss the SPA algorithm and its extensions, which address the small data challenge by simultaneously solving the joint optimization formulation of optimal discretization, feature selection and Bayesian prediction problems.

## CI Lunch Seminar with LUT Finland

Monday 18 October, 12:00 - 13:30 CET

Room D5.01 USI East Campus

*Gaussian likelihoods for ‘intractable’ situations*

Prof. Heikki Haario, LUT (Finland)

Various modelling situations – including chaotic dynamics, synchronization, stochastic differential equations, random patterns such as produced by the Turing reaction-diffusion systems, or the Cahn-Hilliard equation – share the analogy that a fixed model parameter corresponds to a family of solutions rather than a fixed deterministic one. This may be due to extreme sensitivity with respect to the initial values, randomized or unknown initial values, or the explicit stochasticity of the system. As a result, standard methods based on directly measuring the distance between model output and data are no longer available. We discuss an approach that allows a unified construction of likelihoods for such ‘intractable’ situations. The starting point is the Donsker theorem stating that the cumulative distribution function of i.i.d. scalar samples tends to a Gaussian distribution. The approach can also be extended to weakly dependent and vector-valued data. Several cases from the above list are presented as examples.
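The Gaussian limit mentioned in the abstract can be checked numerically in its simplest, pointwise form: for i.i.d. samples with CDF F, the scaled error sqrt(n) (F_n(x) - F(x)) of the empirical CDF F_n is approximately normal with mean 0 and variance F(x) (1 - F(x)). The sketch below is only an illustration of that fact under assumed settings (uniform samples, the evaluation point 0.3, and hypothetical helper names); it is not the speaker's likelihood construction.

```python
import math
import random


def ecdf(samples, x):
    """Empirical CDF: fraction of samples less than or equal to x."""
    return sum(1 for s in samples if s <= x) / len(samples)


def scaled_ecdf_errors(n, x, replicates, seed=0):
    """Draw `replicates` Uniform(0, 1) samples of size n and return
    sqrt(n) * (F_n(x) - F(x)) for each one, where F(x) = x on [0, 1]."""
    rng = random.Random(seed)
    errors = []
    for _ in range(replicates):
        samples = [rng.random() for _ in range(n)]
        errors.append(math.sqrt(n) * (ecdf(samples, x) - x))
    return errors


errors = scaled_ecdf_errors(n=200, x=0.3, replicates=2000)
mean = sum(errors) / len(errors)
variance = sum((e - mean) ** 2 for e in errors) / (len(errors) - 1)
# Theory predicts mean 0 and variance F(x) * (1 - F(x)) = 0.3 * 0.7 = 0.21.
print(f"mean={mean:.3f}, variance={variance:.3f}")
```

Evaluating the empirical CDF at several points at once gives a vector that is approximately jointly Gaussian, which is what makes a Gaussian likelihood for such summary statistics plausible.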

*On recently developed non-Gaussian priors and sampling methods with application to industrial tomography*

Prof. Lassi Roininen, LUT (Finland)

We consider two sets of new priors for Bayesian inversion and machine learning. The first one is based on mixture of experts models with Gaussian processes. The target is to estimate the number of experts and their parameters, and to perform state estimation. For sampling, we use SMC^2. For non-Gaussian priors, we continue the discussion on Cauchy priors and the generalisation to high-order Cauchy fields and further generalisation to alpha-stable fields. For sampling, we use a selection of modern MCMC tools. Finally, we apply some of the methods and models to an industrial tomography problem on estimating log internal structure, measured at sawmills, based on X-ray, RGB camera and laser scanning.

## CI Seminar with Michela Taufer

Wednesday 13 October, 4:30pm - 5:30pm CET

Room D1.13 USI East Campus or Online on Microsoft Teams

**AI4IO: A SUITE OF AI-BASED TOOLS FOR IO-AWARE HPC RESOURCE MANAGEMENT**

High performance computing (HPC) is undergoing many changes at the system level. While scientific applications can reach petaflops or more in computing performance, potentially resulting in larger data generation rates and more frequent checkpointing, the data movement to the parallel file system remains costly due to constraints imposed by HPC centers on the IO bandwidth. In other words, the bandwidth to file systems is outpaced by the rate of data generation; the associated IO contention increases job runtime and delays execution. This situation is aggravated by the fact that when users submit their jobs to an HPC system, they rely on resource managers and job schedulers to monitor and manage the computing resources (i.e., nodes). Both resource managers and job schedulers remain blind to the impact of IO contention on the overall simulation performance.

In this talk we discuss how Artificial Intelligence (AI) can augment HPC systems to prevent and mitigate IO contention while dealing with IO bandwidth constraints. Our solution, called Analytics for IO (AI4IO), consists of a suite of AI-based tools that enable IO-awareness on HPC systems. Specifically, we present two AI4IO tools: PRIONN and CanarIO. PRIONN automates predictions about user-submitted job resource usage, including per-job IO bandwidth; CanarIO detects, in real-time, the presence of IO contention on HPC systems and predicts which jobs are affected by that contention (e.g., because of their frequent checkpointing). By working in concert, PRIONN and CanarIO predict the a priori knowledge necessary to prevent and mitigate IO contention with IO-aware scheduling. We integrate AI4IO in the Flux scheduler and show how AI4IO produces improvements in simulation performance: we observe up to 6.2% improvement in makespan of HPC job workloads, which amounts to more than 18,000 node-hours saved per week on a production-size cluster. Our work is the first step to implementing IO-aware scheduling on production HPC systems.

*About the speaker*

Michela Taufer is an ACM Distinguished Scientist and holds the Jack Dongarra Professorship in High Performance Computing in the Department of Electrical Engineering and Computer Science at the University of Tennessee Knoxville (UTK). She earned her undergraduate degrees in Computer Engineering from the University of Padova (Italy) and her doctoral degree in Computer Science from the Swiss Federal Institute of Technology or ETH (Switzerland). From 2003 to 2004 she was a La Jolla Interfaces in Science Training Program (LJIS) Postdoctoral Fellow at the University of California San Diego (UCSD) and The Scripps Research Institute (TSRI), where she worked on interdisciplinary projects in computer systems and computational chemistry.

Michela has a long history of interdisciplinary work with scientists. Her research interests include scientific applications on heterogeneous platforms (i.e., multi-core platforms and accelerators); performance analysis, modeling and optimization; Artificial Intelligence (AI) for cyberinfrastructures (CI); AI integration into scientific workflows, computer simulations, and data analytics. She has been serving as the principal investigator of several NSF collaborative projects. She also has significant experience in mentoring a diverse population of students on interdisciplinary research. Michela's training expertise includes efforts to spread high-performance computing participation in undergraduate education and research as well as efforts to increase the interest and participation of diverse populations in interdisciplinary studies.