
**Nelly Litvak, University of Twente**

The tendency of nodes in a network to be connected to nodes of similarly large or small degree, called network assortativity, degree mixing, or degree-degree dependency, is an important characterization of the topology of the network, influencing many processes on the network, including network stability, attacks on P2P networks, and epidemics. In order to evaluate the degree mixing in networks, it is natural to compare the actual graph data to a corresponding configuration model (CM), because the CM preserves the degree distribution but introduces a completely neutral random wiring between the nodes. However, it has been found in the literature that the CM usually exhibits negative degree-degree correlations, so-called structural correlations, which arise under the assumption that the graph is simple. In this talk we analyse the negative correlations in the directed CM. In particular, we find the connection between the structural correlations and the number of erased edges in the erased configuration model. For the latter, we establish new upper bounds in terms of the size of the graph. This is joint work with Pim van der Hoorn, Remco van der Hofstad and Clara Stegehuis.
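The erased configuration model mentioned above can be sketched in a few lines: out-stubs are paired uniformly at random with in-stubs, and any self-loops or duplicate edges are then erased to obtain a simple directed graph. The following minimal sketch (the degree sequences and function name are illustrative, not from the talk) counts the erased edges:

```python
import random

def erased_directed_cm(out_deg, in_deg, seed=0):
    """Sample a directed configuration model by uniformly pairing
    out-stubs with in-stubs, then erase self-loops and duplicate
    edges to obtain a simple graph (the erased CM)."""
    assert sum(out_deg) == sum(in_deg)
    rng = random.Random(seed)
    out_stubs = [i for i, d in enumerate(out_deg) for _ in range(d)]
    in_stubs = [i for i, d in enumerate(in_deg) for _ in range(d)]
    rng.shuffle(in_stubs)                 # uniform stub matching
    multigraph = list(zip(out_stubs, in_stubs))
    # keep each edge once and drop self-loops
    simple = {(u, v) for u, v in multigraph if u != v}
    erased = len(multigraph) - len(simple)
    return sorted(simple), erased

edges, n_erased = erased_directed_cm([2, 1, 1], [1, 2, 1])
```

The quantity `n_erased` is exactly the number of erased edges whose size, relative to the graph, drives the structural correlations discussed in the abstract.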

**Steffen Lauritzen, Denmark**

We investigate connections between exponential random graph models (ERGMs), bidirected and undirected graphical models, and issues of invariance and exchangeability. Each point of view gives rise to its own class of parametrizations of network models, and we investigate how they are related to each other and to the theory of graph limits and graphons. In particular, we provide insight into possibilities for formulating specific submodels which partially obey properties of the type mentioned.

The lecture is based upon joint work with A. Rinaldo and K. Sadeghi.

**Anders Madsen, HUGIN**

Probabilistic graphical models such as Bayesian networks are well suited for supporting decision making under uncertainty. They provide us with a flexible framework for representing dependence and independence relations of a problem domain using a set of random variables represented as nodes in an acyclic, directed graph and quantifying the strengths of the dependence relations using (conditional) probability distributions.
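The factorization described above — a joint distribution decomposed into conditional probabilities along a directed acyclic graph — can be illustrated with a minimal two-node sketch (the variables and numbers below are hypothetical, not from any of the applications in the talk):

```python
# Hypothetical two-node Bayesian network: Rain -> WetGrass.
# The DAG factorizes the joint as P(R, W) = P(R) * P(W | R).
p_rain = {True: 0.2, False: 0.8}
p_wet_given_rain = {True:  {True: 0.9, False: 0.1},
                    False: {True: 0.1, False: 0.9}}

def joint(rain, wet):
    """Chain-rule factorization along the DAG."""
    return p_rain[rain] * p_wet_given_rain[rain][wet]

# Inference by Bayes' rule: P(Rain = True | WetGrass = True)
evidence = sum(joint(r, True) for r in (True, False))
posterior = joint(True, True) / evidence   # 0.18 / 0.26 ≈ 0.692
```

Real tools such as HUGIN perform this kind of conditioning efficiently on much larger graphs; the sketch only shows the underlying chain-rule arithmetic.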

In this presentation, we will discuss how Bayesian networks have been applied to capture and represent uncertainty in three different application domains: 1) Bayesian networks applied for analysis of massive data streams in finance and the automotive industry; 2) Bayesian networks applied for operationalization of ecosystem services; and 3) Bayesian networks applied for health monitoring and life-long capability management for self-sustaining manufacturing systems. For each application example, we will demonstrate the use of Bayesian networks and discuss some of the challenges involved in the real-world application of Bayesian networks.

**Göran Kauermann, Ludwig-Maximilians-Universität München**

We give an introduction to statistical models for network data. Starting from ‘simple’ models, we present Exponential Random Graph Models (ERGMs) as well as the Stochastic Actor Model (SAM). A particular focus is placed on estimation and interpretation and on the behavior of the models for large network data.
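An ERGM assigns each graph a probability proportional to the exponential of a weighted sum of network statistics. As a minimal sketch (parameter values and the choice of statistics — edge and triangle counts — are illustrative), the normalizing constant can be computed exactly on very small graphs by brute-force enumeration, which also hints at why estimation becomes infeasible for large networks:

```python
import itertools, math

def ergm_probs(n, theta_edge, theta_tri):
    """Exact ERGM probabilities P(G) ∝ exp(theta_edge * #edges +
    theta_tri * #triangles), via brute-force enumeration of all
    undirected graphs on n nodes (feasible only for tiny n)."""
    pairs = list(itertools.combinations(range(n), 2))
    weights = {}
    for bits in itertools.product([0, 1], repeat=len(pairs)):
        edges = {p for p, b in zip(pairs, bits) if b}
        m = len(edges)
        tri = sum(1 for a, b, c in itertools.combinations(range(n), 3)
                  if {(a, b), (a, c), (b, c)} <= edges)
        weights[frozenset(edges)] = math.exp(theta_edge * m + theta_tri * tri)
    z = sum(weights.values())       # normalizing constant: 2^(n choose 2) terms
    return {g: w / z for g, w in weights.items()}

probs = ergm_probs(3, theta_edge=-1.0, theta_tri=2.0)
```

The normalizing constant sums over 2^(n choose 2) graphs, which is exactly the quantity that must be approximated by simulation once n grows beyond toy size.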

Though ERGMs (as well as SAMs) have become something of a standard in statistical network data analysis, the models are not yet completely consolidated. In fact, ERGMs suffer from several practical as well as technical problems. For instance, the models suffer from instability and nearly infeasible estimation if the network has more than a couple of hundred actors (nodes). Moreover, models for valued or multi-edge networks are missing or in their infancy.

The talk intends to show weaknesses and problems of available models in order to sketch a blueprint where research is required and innovative new ideas are necessary. The COSTNET initiative will thereby be the perfect forum to work on some of these problems.

**Sonja Petrovic, Illinois Institute of Technology**

The ubiquity of network data in the world around us does not imply that the statistical modeling and fitting techniques have been able to catch up with the demand. This talk will discuss some of the basic modeling questions that every statistician knows are fundamental, some of the recent advances toward answering them, and the challenges that remain. The specific focus of the talk will be on goodness of fit testing for random graph models.

Recent joint work with Despina Stasi and Elizabeth Gross developed a new testing framework for graphs that is based on combinatorics of hypergraphs. More broadly, the talk will summarize a few lines of research that are intimately connected to discrete mathematics and computer science, where sampling algorithms, hypergraph degree sequences, and polytopes play a crucial role in the general family of statistical models for networks called exponential random graph models.

**Alan Whitmore, e-Therapeutics**

Drug discovery is hard, and it is characterised by failure rather than success. It is this failure that accounts for the high cost of pharmaceutical development. There has been a hundredfold decline in the productivity of pharmaceutical R&D since 1950, despite major advances in biological science and technology and the expenditure of hundreds of billions of dollars. This is very bad for patients, for drug companies, and for the health economies of the world.

A principal reason appears to be a failure to engage with the true complexity of biological systems, alongside incomplete consideration of the realities of chemical biology. I shall offer up Network Pharmacology as an alternative paradigm for drug discovery: one that embraces the complexity of pathophysiology and chemical biology, and which has already started to demonstrate its potential in the search for new therapies.

I shall argue that, by leveraging network science alongside modern chemoinformatic techniques, it is possible to provide a new framework for finding novel medicines that takes the best from the current system and places it in the context of biological complexity.

**Gesine Reinert, University of Oxford**

Community detection has been a main topic in the analysis of networks. While there exist a range of powerful and flexible methods for dividing a network into a specified number of communities, it is an open question how to determine exactly how many communities one should use. We answer this question based on a combination of methods from Bayesian analysis and statistical physics. We demonstrate the approach on a range of real-world examples with known community structure, finding that it is able to determine the number of communities correctly in every case. This is joint work with Mark Newman (University of Michigan).

**Mathisca de Gunst, Department of Mathematics, Vrije Universiteit Amsterdam**

In my talk I will present an example of the estimation of connectivity in the brain based on co-registered EEG-fMRI data. I will introduce a hierarchical Gaussian graphical model in which the two different data modalities are integrated in a novel way. I will discuss a procedure for estimation of the unknown parameters of this model and will illustrate its performance on the basis of a simulation study. Results of the application of model and method to experimental data will also be shown.

**Tom A.B. Snijders, University of Groningen, University of Oxford**

For the statistical analysis of network panel data, even with as few as two waves, it is very fruitful to use models that assume a continuous-time Markov network process, observed only at the moments of observation for the panel. This is analogous to the use of continuous-time models for classical (non-network) panel data proposed by Bergstrom, Singer, and others. For network data such an approach was already proposed by Coleman in 1964. The advantage of this approach is that it provides a simple way to represent the feedback that is inherent in network dynamics, and the model can be defined by just specifying the conditional probability of a tie change, given the current state of the network.

This approach is used in the Stochastic Actor-Oriented Model of Snijders (2001) and in the Longitudinal Exponential Random Graph Model of Snijders & Koskinen (2013). The first of these is actor-oriented, i.e., tie changes are modelled as actors choosing which of their outgoing tie variables to toggle; the second is tie-oriented, i.e., tie changes are modelled as toggles of single tie variables. Both are generalized linear models for the (unobserved) continuous-time process, with all the practical modelling flexibility of such models. Estimation for panel data is more involved, requiring a simulation approach. Estimators have been developed along several lines, including Method of Moments, Generalized Method of Moments, Maximum Likelihood, and Bayesian, and are available in the R package RSiena. This package is widely applied in empirical social network studies in the social sciences.

This presentation treats the basic definition of the model and some of its extensions, e.g., for multivariate networks. Some open problems, from a mathematical and from an applied perspective, will be mentioned.
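The actor-oriented dynamics can be sketched as follows: actors receive change opportunities at exponentially distributed times, and at each opportunity the actor toggles one outgoing tie, chosen with multinomial-logit probabilities derived from an objective function. The sketch below is a heavily simplified didactic illustration (parameter names, the two-statistic objective function, and all values are assumptions, not the full SAOM of Snijders, 2001):

```python
import math, random

def simulate_tie_dynamics(n, beta_density, beta_recip, rate, t_end, seed=0):
    """Toy actor-oriented network dynamics: at exponential change
    opportunities, actor i toggles one outgoing tie i -> j, chosen
    with logit probabilities from a density + reciprocity objective."""
    rng = random.Random(seed)
    x = [[0] * n for _ in range(n)]      # adjacency matrix, no self-ties
    t = 0.0
    while True:
        t += rng.expovariate(n * rate)   # waiting time to next opportunity
        if t > t_end:
            break
        i = rng.randrange(n)             # actor receiving the opportunity
        def gain(j):
            # change in objective function if tie i -> j is toggled
            delta = (1 - x[i][j]) - x[i][j]          # +1 or -1
            return beta_density * delta + beta_recip * delta * x[j][i]
        js = [j for j in range(n) if j != i]
        w = [math.exp(gain(j)) for j in js]          # multinomial logit
        r = rng.random() * sum(w)
        for j, wj in zip(js, w):
            r -= wj
            if r <= 0:
                x[i][j] = 1 - x[i][j]
                break
    return x

net = simulate_tie_dynamics(n=10, beta_density=-1.0, beta_recip=1.5,
                            rate=2.0, t_end=5.0)
```

A negative density parameter and positive reciprocity parameter, as in this toy run, push the process toward sparse but mutually reciprocated ties; the real model supports many more statistics and the simulation-based estimators mentioned above.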

**References**

- Ruth M. Ripley, Tom A.B. Snijders, Zsófia Boda, András Vörös, and Paulina Preciado, 2016. Manual for SIENA version 4.0. Oxford: University of Oxford, Department of Statistics; Nuffield College. http://www.stats.ox.ac.uk/siena/
- Tom A.B. Snijders, 2001. The statistical evaluation of social network dynamics. Sociological Methodology, 31, 361-395.
- Tom A.B. Snijders and Johan Koskinen, 2013. “Longitudinal Models”. Chapter 11 (pp. 130-140) in D. Lusher, J. Koskinen, and G. Robins, Exponential Random Graph Models for Social Networks, Cambridge: Cambridge University Press.
- Tom A.B. Snijders, Gerhard G. van de Bunt, and Christian E.G. Steglich, 2010. Introduction to actor-based models for network dynamics. Social Networks, 32, 44–60.

**Konstantin Avrachenkov, INRIA**

One of the basic questions arising in the analysis of social networks is the estimation of averages of network characteristics. For instance, one would like to know how young a given social network is, or how many friends an average network member has, or what proportion of a population supports a given political party. The answers to all the above questions can be mathematically formulated as the solutions to a problem of estimating an average of a function defined on the network nodes. An efficient way to do function estimation on large networks is to use random walk based estimators. In particular, random walk based estimators make it possible to deal with the severe constraints imposed by a limit on the Application Programming Interface request rate in online social networks. We can then appeal to the extensive theory of Markov chains to perform the error analysis of such estimators. In this presentation we overview and compare a number of random walk based estimators. We discuss well-known approaches such as Metropolis-Hastings MCMC and Respondent-Driven Sampling, as well as new approaches based on adaptation and reinforcement learning.
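The basic idea behind such estimators can be sketched in a few lines: a simple random walk visits nodes with stationary probability proportional to their degree, so reweighting each visit by the inverse degree yields a consistent estimate of a node-function average. The toy graph and function below are illustrative assumptions, not from the talk:

```python
import random

def rw_estimate(adj, f, steps, seed=0):
    """Estimate the average of f over nodes with a simple random walk.
    The walk's stationary distribution is proportional to degree, so
    each visit is reweighted by 1/degree (importance-sampling
    correction), as in Respondent-Driven-Sampling-type estimators."""
    rng = random.Random(seed)
    v = rng.choice(list(adj))
    num = den = 0.0
    for _ in range(steps):
        v = rng.choice(adj[v])          # one API-friendly neighbour query
        num += f(v) / len(adj[v])
        den += 1.0 / len(adj[v])
    return num / den

# Hypothetical toy graph (undirected, each edge listed both ways):
adj = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1], 3: [0]}
# Estimate the average degree; the exact value here is (3+2+2+1)/4 = 2.
avg_deg = rw_estimate(adj, lambda v: len(adj[v]), steps=20000)
```

Only one neighbour list is queried per step, which is what makes such estimators compatible with the API rate limits mentioned above; error analysis then follows from Markov chain theory.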

**Kimmo Soramäki, Financial Network Analytics**

Regulators globally are increasingly relying on stress tests to assess whether the banking system is sufficiently capitalized. The upcoming IFRS 9 impairment calculations will also necessitate forward-looking stress tests. While there are many types of stress tests, many rely on correlations. Historically calculated correlations, however, tend to break down in large stress events. The presentation will discuss scenario development and the construction of relevant subjective correlation structures to accompany the scenarios, and showcase visual correlation analysis for a China hard landing and a US real-estate crisis.

Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Noncommercial-Share Alike 3.0 Unported