Enhancing Multi-Edge Models with Zero-Inflation (2024)

\externaldocument

SUPPLEMENTARY_sparse_multi_graph\titlealternativeEmpirical Networks are Sparse: Enhancing Multi-Edge Models with Zero-Inflation\authoralternativeGiona Casiraghi, Georges Andres\wwwhttp://www.sg.ethz.ch

Giona Casiraghi***Corresponding author, gcasiraghi@ethz.ch   Georges AndresChair of Systems Design, ETH ZΓΌrich, Weinbergstrasse 56/58, 8092 ZΓΌrich, Switzerland

Abstract

Real-world networks are sparse.As we show in this article, even when a large number of interactions is observed most node pairs remain disconnected.We demonstrate that classical multi-edge network models, such as the G⁒(N,p)𝐺𝑁𝑝G(N,p)italic_G ( italic_N , italic_p ), configuration models, and stochastic block models, fail to accurately capture this phenomenon.To mitigate this issue, zero-inflation must be integrated into these traditional models.Through zero-inflation, we incorporate a mechanism that accounts for the excess number of zeroes (disconnected pairs) observed in empirical data.By performing an analysis on all the datasets from the Sociopatterns repository, we illustrate how zero-inflated models more accurately reflect the sparsity and heavy-tailed edge count distributions observed in empirical data.Our findings underscore that failing to account for these ubiquitous properties in real-world networks inadvertently leads to biased models which do not accurately represent complex systems and their dynamics.

Keywords: sparsity, zero-inflation, multi-edges, SBM, configuration models, statistical modeling, complex networks

1 Introduction

Networks are foundational for understanding complex systems.Accurate modelling of these networks can significantly impact various domains, such as optimising distribution systems[3], understanding the spread of diseases[10], and analysing social behaviours[19].However, a critical challenge lies in developing models that correctly capture the complex patterns observed in real-world networks.

Consider the interactions between students in a high-school.Students from different classes have limited opportunities to meet due to distinct schedules and social circles.When they don’t share common spaces or schedules, they don’t interact.However, when they do meet, the frequency of their interactions can vary independently of their initial chance of meeting.This dynamic, characteristic of many real-world networks, often results in a phenomenon known as β€œzero-inflation” in the distribution of multi-edges.The number of observed zeroesβ€”i.e., the number of disconnected node pairsβ€”exceeds what we would expect, given the large number of potential interactions, as illustrated in Fig.1(a).

In other words, empirical networks are typically sparse[16, 12, 25]:only a limited number of node pairs have multiple interactions (multi-edges), while the majority remains disconnected.Traditional network models, such as the G⁒(N,p)𝐺𝑁𝑝G(N,p)italic_G ( italic_N , italic_p )[15, 20], configuration models[8, 18, 6], and stochastic block models[23, 33], have been instrumental in advancing our understanding of complex networks.However, these models fall short in representing the inherent sparsity observed in real multi-edge network data.They usually assume a proportional relationship between the growth of edges and the number of connected node pairs, which is not always accurate.

Our article emphasises the necessity of incorporating zero-inflation[26] into network models.To bridge this gap, we systematically extend a broad family of multi-edge network models to effectively address the challenges posed by sparse empirical networks.

Recently, there has been an increased focus on the issue of zero-inflation in network data.Krivitsky [24] briefly discussed a zero-inflated version of exponential random graph models for multi-edge networks that accounts for sparsity by modelling dyad-wise distributions.Choi and Ni [7] highlighted the challenges posed by zero-inflation when modelling sparse networks and proposed methods to address this issue.Similarly, Ebrahimi etal. [14] and Motalebi etal. [28] explored zero-inflated and hurdle models to better capture the inherent sparsity in social and biological networks.Furthermore, Dong etal. [13] and Motalebi etal. [29] specifically focused on adapting stochastic block models to account for excess zeroes, underscoring the importance of accurately modelling sparsity for realistic network analysis.Collectively, these works emphasise the necessity of incorporating zero-inflation into network models to enhance their applicability to real-world, sparse network data.

This work aims to achieve two main objectives.First, we highlight the ubiquity of zero-inflation in real-world multi-edge network data by analysing all datasets from the Sociopatterns repository[27].Second, we demonstrate how zero-inflation can be integrated into traditional multi-edge network models, providing a more accurate representation of the sparse nature of empirical networks.Adopting zero-inflated models in network science holds promise for improving the analysis of intrinsically sparse structures, such as higher-order networks[21, 35] and hypergraphs[11].

Enhancing Multi-Edge Models with Zero-Inflation (1)
Enhancing Multi-Edge Models with Zero-Inflation (2)

1.1 Multi-edge network models

Multi-edge network models serve as generative stochastic frameworks that not only capture the presence of edges between nodes but also quantify the number of edges observed for each node pair.These models are crucial for understanding the complex structures and dynamics of real-world networks.

Broadly, multi-edge network models fall into three main categories: micro-canonical[32, 18], Poisson[31, 23], and Hypergeometric[6, 4].Each category offers a different level of constraint and flexibility, catering to various types of network data.

Micro-canonical models are the most constrained, defining an ensemble of networks that share specific fixed attributes, such as degree sequences or intra-block edge counts, and assigning equal probability to each network within the ensemble.In contrast, Poisson and Hypergeometric models operate within more flexible sample spaces, preserving these attributes only in expectation.The critical difference between Poisson and Hypergeometric models lies in their independence assumptions: the former treat node pairs as independent, whereas the latter do not[4].This distinction is crucial, as it influences the applicability and performance of the models on different types of network data.

In this article, we focus on Poisson network models.The independence assumption in Poisson models simplifies the introduction of zero-inflation, which is our primary aim.

Poisson multi-edge network models rely on Poisson count processes to describe phenomena such as edge formation and node interactions.This is the case for classic extensions of the G⁒(N,p)𝐺𝑁𝑝G(N,p)italic_G ( italic_N , italic_p ) model to multi-edge networks[15, 20], the Chung-Lu configuration model[8, 31], the classical stochastic block model and the degree-corrected stochastic block model[23], and even count-ERGM models[24].

In a Poisson count process, the probability of observing n𝑛nitalic_n events in a fixed time interval is governed by a Poisson distribution with parameter Ξ»πœ†\lambdaitalic_Ξ»:

Pr⁑(X=n)=Ξ»nn!⁒eβˆ’Ξ».Pr𝑋𝑛superscriptπœ†π‘›π‘›superscriptπ‘’πœ†\Pr(X=n)=\frac{\lambda^{n}}{n!}e^{-\lambda}\,.roman_Pr ( italic_X = italic_n ) = divide start_ARG italic_Ξ» start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT end_ARG start_ARG italic_n ! end_ARG italic_e start_POSTSUPERSCRIPT - italic_Ξ» end_POSTSUPERSCRIPT .(1)

This Poisson model assumes that the distribution of the random variable X𝑋Xitalic_X will be centred around its mean Ξ»πœ†\lambdaitalic_Ξ», with a low probability of zero occurrences, especially as Ξ»πœ†\lambdaitalic_Ξ» increases.

Poisson network models are defined as n2superscript𝑛2n^{2}italic_n start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT independent Poisson count processes, one for every (directed) pair of nodes in the network:

Pr⁑(𝒒)=∏i⁒jPr⁑(Ai⁒j)=∏i⁒jΞ»i⁒jAi⁒jAi⁒j!⁒exp⁑(βˆ’Ξ»i⁒j),Pr𝒒subscriptproduct𝑖𝑗Prsubscript𝐴𝑖𝑗subscriptproduct𝑖𝑗superscriptsubscriptπœ†π‘–π‘—subscript𝐴𝑖𝑗subscript𝐴𝑖𝑗subscriptπœ†π‘–π‘—\Pr(\mathcal{G})=\prod_{ij}\Pr(A_{ij})=\prod_{ij}\frac{\lambda_{ij}^{A_{ij}}}{%A_{ij}!}\exp(-\lambda_{ij}),roman_Pr ( caligraphic_G ) = ∏ start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT roman_Pr ( italic_A start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT ) = ∏ start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT divide start_ARG italic_Ξ» start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_A start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT end_POSTSUPERSCRIPT end_ARG start_ARG italic_A start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT ! end_ARG roman_exp ( - italic_Ξ» start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT ) ,(2)

where Ai⁒jsubscript𝐴𝑖𝑗A_{ij}italic_A start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT denotes the edge count between i𝑖iitalic_i and j𝑗jitalic_j and Ξ»i⁒jsubscriptπœ†π‘–π‘—\lambda_{ij}italic_Ξ» start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT are parameters that can be functions of node attributes, edge attributes, or block memberships, among others.Appropriately specifying these parameters is key for capturing the heterogeneous nature of interactions within the network.

Several well-known network models are special cases of this general Poisson framework.For example, the G⁒(N,p)𝐺𝑁𝑝G(N,p)italic_G ( italic_N , italic_p ) model for multi-edge networks corresponds to a Poisson network model where Ξ»i⁒j=psubscriptπœ†π‘–π‘—π‘\lambda_{ij}=pitalic_Ξ» start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT = italic_p for all i,j𝑖𝑗i,jitalic_i , italic_j.The Stochastic Block Model is a generalisation that introduces B𝐡Bitalic_B different Ξ»bsubscriptπœ†π‘\lambda_{b}italic_Ξ» start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT parameters, one for each block in the network[23].The Chung-Lu Configuration Model, foundational to many concepts in network science like modularity[30] and degree-correction[23], also fits within this framework, featuring node-dependent parameters Ξ»i⁒j=ΞΈi⁒θjsubscriptπœ†π‘–π‘—subscriptπœƒπ‘–subscriptπœƒπ‘—\lambda_{ij}=\theta_{i}\theta_{j}italic_Ξ» start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT = italic_ΞΈ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_ΞΈ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT[8, 31].Further, the Degree-Corrected Stochastic Block Model combines aspects of both the Chung-Lu and Stochastic Block Models, with Ξ»i⁒j=ΞΈi⁒θj⁒λbsubscriptπœ†π‘–π‘—subscriptπœƒπ‘–subscriptπœƒπ‘—subscriptπœ†π‘\lambda_{ij}=\theta_{i}\theta_{j}\lambda_{b}italic_Ξ» start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT = italic_ΞΈ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_ΞΈ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT italic_Ξ» start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT[23].

1.2 Modelling zero-inflation

While Poisson network models are standard tools in multiple disciplines[23, 24, 31], they struggle to accurately represent sparse multi-edge networks[26, 7, 13, 28, 14, 29].The expected number of connected pairs M𝑀Mitalic_M according to a Poisson network model is given by:

𝔼⁒[M]=βˆ‘i⁒j(1βˆ’exp⁑(βˆ’Ξ»i⁒j)).𝔼delimited-[]𝑀subscript𝑖𝑗1subscriptπœ†π‘–π‘—\mathbb{E}[M]=\sum_{ij}\left(1-\exp\left(-\lambda_{ij}\right)\right).roman_𝔼 [ italic_M ] = βˆ‘ start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT ( 1 - roman_exp ( - italic_Ξ» start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT ) ) .(3)

However, βˆ‘i⁒jΞ»i⁒jsubscript𝑖𝑗subscriptπœ†π‘–π‘—\sum_{ij}\lambda_{ij}βˆ‘ start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT italic_Ξ» start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT gives the expected number of multi-edges mπ‘šmitalic_m in the network.Consequently, 𝔼⁒[M]𝔼delimited-[]𝑀\mathbb{E}[M]roman_𝔼 [ italic_M ] saturates exponentially with 𝔼⁒[m]𝔼delimited-[]π‘š\mathbb{E}[m]roman_𝔼 [ italic_m ] to a fully-connected network.Hence, with a larger observed mπ‘šmitalic_m, the model tends to yield fully-connected networks.

Figure1(b) compares the sparsity of empirical multi-edge networks with the predictions of Poisson network models.We use interaction networks from the Sociopatterns repository as an example[27] , where different datasets record contacts between individuals over short time frames.With increasing observation time and data collection, the number of multi-edges mπ‘šmitalic_m grows rapidly.Yet, the growth of the number of connected node pairs M𝑀Mitalic_M is considerably slower.This suggests that individuals predominantly interact within their existing social circles, limiting the number of new interactions over short periods.The dashed line in the plot shows the predicted number of connected pairs according to Eq.3.As discussed above, with an increasing number of multi-edges mπ‘šmitalic_m, the number of connected pairs quickly saturates to a fully connected network, deviating from the empirical data.

The observed count distribution, instead, shows a strong bimodality, characterised by peaks at zero and around some Ξ»^^πœ†\hat{\lambda}over^ start_ARG italic_Ξ» end_ARG value.Even the well-known example of Zachary’s Karate Club network exhibits this bimodal distribution, as illustrated in Fig.1(a).Zachary’s Karate Club represents the social interactions within a karate club.When examining the distribution of interactions in this network, we observe a significant number of disconnected node pairs, even when considering the existence of smaller groups, exemplifying the zero-inflation phenomenon.

Zero-inflated models[26] have been developed to mitigate this issue.These models are a mixture of a binary process for generating zeros and a Poisson count process for generating the counts.The probability mass function is given by:

Pr⁑(X=n)=(1βˆ’q)⁒δ0⁒(n)+q⁒λnn!⁒eβˆ’Ξ».Pr𝑋𝑛1π‘žsubscript𝛿0π‘›π‘žsuperscriptπœ†π‘›π‘›superscriptπ‘’πœ†\Pr(X=n)=(1-q)\delta_{0}(n)+q\frac{\lambda^{n}}{n!}e^{-\lambda}\,.roman_Pr ( italic_X = italic_n ) = ( 1 - italic_q ) italic_Ξ΄ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ( italic_n ) + italic_q divide start_ARG italic_Ξ» start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT end_ARG start_ARG italic_n ! end_ARG italic_e start_POSTSUPERSCRIPT - italic_Ξ» end_POSTSUPERSCRIPT .(4)

Here, q∈[0,1]π‘ž01q\in[0,1]italic_q ∈ [ 0 , 1 ] is the mixture weight, and Ξ΄0⁒(n)subscript𝛿0𝑛\delta_{0}(n)italic_Ξ΄ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ( italic_n ) is the Kronecker delta function, which is 1 when n=0𝑛0n=0italic_n = 0 and 0 otherwise.The term (1βˆ’q)1π‘ž(1-q)( 1 - italic_q ) accounts for the excess zeros that are not explained by the Poisson process, providing a more accurate model for data like the one in Fig.1(a).

In addition to zero-inflated models, hurdle models provide another approach to address count data with excess zeroes[17].While zero-inflated models combine a binary process for zero counts with a Poisson process for positive counts, hurdle models treat zeros and positive counts as outcomes of two distinct processes.Specifically, a hurdle model first uses a binary process to determine whether an interaction occurs (i.e., whether the count is zero or positive), and then applies a truncated Poisson process for positive counts.This separation ensures that zeros are generated differently than positive counts, which can offer advantages in terms of model identifiability and interpretability[29].However, this assumption implies that sparsity is only generated by the hurdle process, and not by low interaction rates.Such a strict assumption may not always hold true, especially in cases where modelling the interplay between low interaction rates and zero-inflation is critical[17].

In this article, we choose to focus only on zero-inflation as a way to model sparsity in complex networks.In the following section, we detail how zero-inflation is incorporated in Poisson network models and how to perform the inference of the parameters from data.Nevertheless, most of the results shown apply in a similar way to hurdle models[17].

Parameter Estimation

Throughout this article, we consistently employ a variation on the method of moments for parameter inference.This approach involves setting the first moments of the respective models to observed quantitiesβ€”such as the number of edges, number of links, and degree sequencesβ€”from a given network.This consistency in methodology facilitates a coherent comparison across different models and allows for a clear understanding of their individual and comparative characteristics.

Often, though, we would like to obtain the maximum likelihood estimate (MLE) of the model parameters.Unfortunately, MLE presents significant challenges for zero-inflated models, a complexity underscored by the existing literature[34, 2, 9].The inherent difficulty arises particularly in the context of network models, given the increased dimensionality.The primary reason for this complexity is that a maximum likelihood estimate of the mixture probability in Eq.4 involves a logarithm of sums, a form that does not lend itself to simplification or easy manipulation with standard tools.Consequently, optimising the log-likelihood function usually requires a numerical approach.In these cases, the estimates obtained by the methods-of-moments serve as initial values for the numerical MLE optimisation.

2 Zero-Inflating Poisson Network Models

2.1 Zero-Inflated G⁒(N,p)𝐺𝑁𝑝G(N,p)italic_G ( italic_N , italic_p ) Model (zi-G⁒(N,p)𝐺𝑁𝑝G(N,p)italic_G ( italic_N , italic_p ))

The G⁒(N,p)𝐺𝑁𝑝G(N,p)italic_G ( italic_N , italic_p ) model is one of the foundational generative models for networks and among the simplest.It is characterised solely by a parameter p𝑝pitalic_p, which determines the expected number of edges in a network realisation[20].In this model, interactions between different node pairs are assumed to be independent and identically distributed.

For the multi-edge variant of the G⁒(N,p)𝐺𝑁𝑝G(N,p)italic_G ( italic_N , italic_p ) model, we set Ξ»i⁒j=psubscriptπœ†π‘–π‘—π‘\lambda_{ij}=pitalic_Ξ» start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT = italic_p in Eq.2 for all node pairs.The expected number of connected node pairsβ€”henceforth denoted as linksβ€”in a network realisation is then given by:

𝔼⁒(M|p)=βˆ‘i⁒j(1βˆ’eβˆ’p)=N2βˆ’N2⁒eβˆ’p,𝔼conditional𝑀𝑝subscript𝑖𝑗1superscript𝑒𝑝superscript𝑁2superscript𝑁2superscript𝑒𝑝\mathbb{E}(M|p)=\sum_{ij}(1-e^{-p})=N^{2}-N^{2}e^{-p}\,,roman_𝔼 ( italic_M | italic_p ) = βˆ‘ start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT ( 1 - italic_e start_POSTSUPERSCRIPT - italic_p end_POSTSUPERSCRIPT ) = italic_N start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT - italic_N start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT italic_e start_POSTSUPERSCRIPT - italic_p end_POSTSUPERSCRIPT ,(5)

where N𝑁Nitalic_N represents the number of nodes, considering a loopy directed network.This model reveals its limitation in representing sparse multi-edge networks, as 𝔼⁒(M|p)𝔼conditional𝑀𝑝\mathbb{E}(M|p)roman_𝔼 ( italic_M | italic_p ) approaches N2superscript𝑁2N^{2}italic_N start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT with increasing p𝑝pitalic_p.

To better model sparse networks, we integrate zero-inflation into the edge probabilities using Eq.4.The zero-inflated G⁒(N,p)𝐺𝑁𝑝G(N,p)italic_G ( italic_N , italic_p ) model is defined by:

Pr⁑(𝒒|p,q)=∏i⁒j((1βˆ’q)⁒δ0⁒(Ai⁒j)+q⁒pAi⁒jAi⁒j!⁒eβˆ’p).Prconditionalπ’’π‘π‘žsubscriptproduct𝑖𝑗1π‘žsubscript𝛿0subscriptπ΄π‘–π‘—π‘žsuperscript𝑝subscript𝐴𝑖𝑗subscript𝐴𝑖𝑗superscript𝑒𝑝\Pr(\mathcal{G}|p,q)=\prod_{ij}\left((1-q)\delta_{0}(A_{ij})+q\frac{p^{A_{ij}}%}{A_{ij}!}e^{-p}\right).roman_Pr ( caligraphic_G | italic_p , italic_q ) = ∏ start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT ( ( 1 - italic_q ) italic_Ξ΄ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ( italic_A start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT ) + italic_q divide start_ARG italic_p start_POSTSUPERSCRIPT italic_A start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT end_POSTSUPERSCRIPT end_ARG start_ARG italic_A start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT ! end_ARG italic_e start_POSTSUPERSCRIPT - italic_p end_POSTSUPERSCRIPT ) .(6)

Incorporating zero-inflation, we calculate the expected number of interactions (edges) 𝔼⁒(m|p,q)𝔼conditionalπ‘šπ‘π‘ž\mathbb{E}(m|p,q)roman_𝔼 ( italic_m | italic_p , italic_q ) and the expected number of links 𝔼⁒(M|p,q)𝔼conditionalπ‘€π‘π‘ž\mathbb{E}(M|p,q)roman_𝔼 ( italic_M | italic_p , italic_q ) as follows:

𝔼⁒(m|p,q)𝔼conditionalπ‘šπ‘π‘ž\displaystyle\mathbb{E}(m|p,q)roman_𝔼 ( italic_m | italic_p , italic_q )=q⁒p⁒N2,absentπ‘žπ‘superscript𝑁2\displaystyle=qpN^{2}\,,= italic_q italic_p italic_N start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ,(7)
𝔼⁒(M|p,q)𝔼conditionalπ‘€π‘π‘ž\displaystyle\mathbb{E}(M|p,q)roman_𝔼 ( italic_M | italic_p , italic_q )=q⁒N2βˆ’N2⁒q⁒eβˆ’p.absentπ‘žsuperscript𝑁2superscript𝑁2π‘žsuperscript𝑒𝑝\displaystyle=qN^{2}-N^{2}qe^{-p}\,.= italic_q italic_N start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT - italic_N start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT italic_q italic_e start_POSTSUPERSCRIPT - italic_p end_POSTSUPERSCRIPT .(8)

To estimate the parameters p𝑝pitalic_p and qπ‘žqitalic_q from observed data, we use the method of moments, matching these expected values to the observed counts of interactions and links, m^^π‘š\widehat{m}over^ start_ARG italic_m end_ARG and M^^𝑀\widehat{M}over^ start_ARG italic_M end_ARG, respectively:

𝔼⁒(m|p,q):=m^,𝔼⁒(M|p,q):=M^.formulae-sequenceassign𝔼conditionalπ‘šπ‘π‘ž^π‘šassign𝔼conditionalπ‘€π‘π‘ž^𝑀\mathbb{E}(m|p,q):=\widehat{m}\,,\qquad\mathbb{E}(M|p,q):=\widehat{M}\,.roman_𝔼 ( italic_m | italic_p , italic_q ) := over^ start_ARG italic_m end_ARG , roman_𝔼 ( italic_M | italic_p , italic_q ) := over^ start_ARG italic_M end_ARG .(9)

Solving for p𝑝pitalic_p and qπ‘žqitalic_q, we derive:

p^=m^N2⁒q,q^=m^⁒M^N2⁒(m^+M^⁒𝒲⁒[βˆ’m^⁒eβˆ’m^M^M^]),formulae-sequence^𝑝^π‘šsuperscript𝑁2π‘ž^π‘ž^π‘š^𝑀superscript𝑁2^π‘š^𝑀𝒲delimited-[]^π‘šsuperscript𝑒^π‘š^𝑀^𝑀\widehat{p}=\frac{\widehat{m}}{N^{2}q}\,,\qquad\widehat{q}=\frac{\widehat{m}%\widehat{M}}{N^{2}(\widehat{m}+\widehat{M}\mathcal{W}\left[-\frac{\widehat{m}e%^{-\frac{\widehat{m}}{\widehat{M}}}}{\widehat{M}}\right])}\,,over^ start_ARG italic_p end_ARG = divide start_ARG over^ start_ARG italic_m end_ARG end_ARG start_ARG italic_N start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT italic_q end_ARG , over^ start_ARG italic_q end_ARG = divide start_ARG over^ start_ARG italic_m end_ARG over^ start_ARG italic_M end_ARG end_ARG start_ARG italic_N start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( over^ start_ARG italic_m end_ARG + over^ start_ARG italic_M end_ARG caligraphic_W [ - divide start_ARG over^ start_ARG italic_m end_ARG italic_e start_POSTSUPERSCRIPT - divide start_ARG over^ start_ARG italic_m end_ARG end_ARG start_ARG over^ start_ARG italic_M end_ARG end_ARG end_POSTSUPERSCRIPT end_ARG start_ARG over^ start_ARG italic_M end_ARG end_ARG ] ) end_ARG ,(10)

where 𝒲⁒[z]𝒲delimited-[]𝑧\mathcal{W}[z]caligraphic_W [ italic_z ] represents Lambert’s W function, solving w⁒ew=z𝑀superscript𝑒𝑀𝑧we^{w}=zitalic_w italic_e start_POSTSUPERSCRIPT italic_w end_POSTSUPERSCRIPT = italic_z[26].

2.2 Zero-Inflated Stochastic Block Model (zi-SBM)

Building on the G⁒(N,p)𝐺𝑁𝑝G(N,p)italic_G ( italic_N , italic_p ) model, the Stochastic Block Model (SBM) introduces a richer representation of network structures by distinguishing distinct blocks in the network, each characterised by a unique edge probability[22, 1].This model captures community structures within networks.Nodes are assigned to one of B𝐡Bitalic_B different groups, and the probability of interactions between nodes i𝑖iitalic_i and j𝑗jitalic_j in groups bisubscript𝑏𝑖b_{i}italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT and bjsubscript𝑏𝑗b_{j}italic_b start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT, respectively, is determined by a block-specific parameter Ξ»bi⁒bjsubscriptπœ†subscript𝑏𝑖subscript𝑏𝑗\lambda_{b_{i}b_{j}}italic_Ξ» start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT.

To accommodate potential sparsity of interactions within different blocks, we detail the zero-inflated version of the SBM, denoted as zi-SBM.The model is defined by the following probability distribution:

Pr⁑(𝒒|𝝀,𝒒)=∏i⁒j((1βˆ’qbi⁒bj)⁒δ0⁒(Ai⁒j)+qbi⁒bj⁒λbi⁒bjAi⁒jAi⁒j!⁒exp⁑(βˆ’Ξ»bi⁒bj)),Prconditional𝒒𝝀𝒒subscriptproduct𝑖𝑗1subscriptπ‘žsubscript𝑏𝑖subscript𝑏𝑗subscript𝛿0subscript𝐴𝑖𝑗subscriptπ‘žsubscript𝑏𝑖subscript𝑏𝑗superscriptsubscriptπœ†subscript𝑏𝑖subscript𝑏𝑗subscript𝐴𝑖𝑗subscript𝐴𝑖𝑗subscriptπœ†subscript𝑏𝑖subscript𝑏𝑗\Pr(\mathcal{G}|\boldsymbol{\lambda},\boldsymbol{q})=\prod_{ij}\left((1-q_{b_{%i}b_{j}})\delta_{0}(A_{ij})+q_{b_{i}b_{j}}\frac{\lambda_{b_{i}b_{j}}^{A_{ij}}}%{A_{ij}!}\exp(-\lambda_{b_{i}b_{j}})\right),roman_Pr ( caligraphic_G | bold_italic_Ξ» , bold_italic_q ) = ∏ start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT ( ( 1 - italic_q start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT ) italic_Ξ΄ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ( italic_A start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT ) + italic_q start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT divide start_ARG italic_Ξ» start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_A start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT end_POSTSUPERSCRIPT end_ARG start_ARG italic_A start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT ! end_ARG roman_exp ( - italic_Ξ» start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT ) ) ,(11)

where each block (b,d)𝑏𝑑(b,d)( italic_b , italic_d ) is associated with a unique mixture parameter qb⁒dsubscriptπ‘žπ‘π‘‘q_{bd}italic_q start_POSTSUBSCRIPT italic_b italic_d end_POSTSUBSCRIPT, allowing for block-specific zero-inflation.

The expected number of interactions mb⁒dsubscriptπ‘šπ‘π‘‘m_{bd}italic_m start_POSTSUBSCRIPT italic_b italic_d end_POSTSUBSCRIPT and links Mb⁒dsubscript𝑀𝑏𝑑M_{bd}italic_M start_POSTSUBSCRIPT italic_b italic_d end_POSTSUBSCRIPT in the zi-SBM are given by:

𝔼⁒(mb⁒d|𝝀,𝒒)𝔼conditionalsubscriptπ‘šπ‘π‘‘π€π’’\displaystyle\mathbb{E}(m_{bd}|\boldsymbol{\lambda},\boldsymbol{q})roman_𝔼 ( italic_m start_POSTSUBSCRIPT italic_b italic_d end_POSTSUBSCRIPT | bold_italic_Ξ» , bold_italic_q )=qb⁒d⁒λb⁒d⁒Nb⁒Nd,absentsubscriptπ‘žπ‘π‘‘subscriptπœ†π‘π‘‘subscript𝑁𝑏subscript𝑁𝑑\displaystyle=q_{bd}\lambda_{bd}N_{b}N_{d},= italic_q start_POSTSUBSCRIPT italic_b italic_d end_POSTSUBSCRIPT italic_Ξ» start_POSTSUBSCRIPT italic_b italic_d end_POSTSUBSCRIPT italic_N start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT italic_N start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT ,(12)
𝔼⁒(M|𝝀,𝒒)𝔼conditional𝑀𝝀𝒒\displaystyle\mathbb{E}(M|\boldsymbol{\lambda},\boldsymbol{q})roman_𝔼 ( italic_M | bold_italic_Ξ» , bold_italic_q )=βˆ‘b⁒dNb⁒Nd⁒qb⁒d⁒(1βˆ’eβˆ’Ξ»b⁒d),absentsubscript𝑏𝑑subscript𝑁𝑏subscript𝑁𝑑subscriptπ‘žπ‘π‘‘1superscript𝑒subscriptπœ†π‘π‘‘\displaystyle=\sum_{bd}N_{b}N_{d}q_{bd}(1-e^{-\lambda_{bd}}),= βˆ‘ start_POSTSUBSCRIPT italic_b italic_d end_POSTSUBSCRIPT italic_N start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT italic_N start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT italic_q start_POSTSUBSCRIPT italic_b italic_d end_POSTSUBSCRIPT ( 1 - italic_e start_POSTSUPERSCRIPT - italic_Ξ» start_POSTSUBSCRIPT italic_b italic_d end_POSTSUBSCRIPT end_POSTSUPERSCRIPT ) ,(13)

where Nbsubscript𝑁𝑏N_{b}italic_N start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT denotes the number of nodes in group b𝑏bitalic_b.

For parameter inference, we equate the first moments of the distribution to the observed values m^b⁒dsubscript^π‘šπ‘π‘‘\widehat{m}_{bd}over^ start_ARG italic_m end_ARG start_POSTSUBSCRIPT italic_b italic_d end_POSTSUBSCRIPT and M^b⁒dsubscript^𝑀𝑏𝑑\widehat{M}_{bd}over^ start_ARG italic_M end_ARG start_POSTSUBSCRIPT italic_b italic_d end_POSTSUBSCRIPT:

𝔼⁒(mb⁒d|Ξ»b⁒d,qb⁒d):=m^b⁒d,𝔼⁒(Mb⁒d|Ξ»b⁒d,qb⁒d):=M^b⁒d.formulae-sequenceassign𝔼conditionalsubscriptπ‘šπ‘π‘‘subscriptπœ†π‘π‘‘subscriptπ‘žπ‘π‘‘subscript^π‘šπ‘π‘‘assign𝔼conditionalsubscript𝑀𝑏𝑑subscriptπœ†π‘π‘‘subscriptπ‘žπ‘π‘‘subscript^𝑀𝑏𝑑\mathbb{E}(m_{bd}|\lambda_{bd},q_{bd}):=\widehat{m}_{bd},\qquad\mathbb{E}(M_{%bd}|\lambda_{bd},q_{bd}):=\widehat{M}_{bd}.roman_𝔼 ( italic_m start_POSTSUBSCRIPT italic_b italic_d end_POSTSUBSCRIPT | italic_Ξ» start_POSTSUBSCRIPT italic_b italic_d end_POSTSUBSCRIPT , italic_q start_POSTSUBSCRIPT italic_b italic_d end_POSTSUBSCRIPT ) := over^ start_ARG italic_m end_ARG start_POSTSUBSCRIPT italic_b italic_d end_POSTSUBSCRIPT , roman_𝔼 ( italic_M start_POSTSUBSCRIPT italic_b italic_d end_POSTSUBSCRIPT | italic_Ξ» start_POSTSUBSCRIPT italic_b italic_d end_POSTSUBSCRIPT , italic_q start_POSTSUBSCRIPT italic_b italic_d end_POSTSUBSCRIPT ) := over^ start_ARG italic_M end_ARG start_POSTSUBSCRIPT italic_b italic_d end_POSTSUBSCRIPT .(14)

Solving Eq.14, we find:

Ξ»^b⁒d=m^b⁒dqb⁒d⁒Nb⁒Nd.subscript^πœ†π‘π‘‘subscript^π‘šπ‘π‘‘subscriptπ‘žπ‘π‘‘subscript𝑁𝑏subscript𝑁𝑑\widehat{\lambda}_{bd}=\frac{\widehat{m}_{bd}}{q_{bd}N_{b}N_{d}}.over^ start_ARG italic_Ξ» end_ARG start_POSTSUBSCRIPT italic_b italic_d end_POSTSUBSCRIPT = divide start_ARG over^ start_ARG italic_m end_ARG start_POSTSUBSCRIPT italic_b italic_d end_POSTSUBSCRIPT end_ARG start_ARG italic_q start_POSTSUBSCRIPT italic_b italic_d end_POSTSUBSCRIPT italic_N start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT italic_N start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT end_ARG .(15)

A closed-form solution for q^b⁒dsubscript^π‘žπ‘π‘‘\widehat{q}_{bd}over^ start_ARG italic_q end_ARG start_POSTSUBSCRIPT italic_b italic_d end_POSTSUBSCRIPT is not readily available.Nonetheless, the values of q^b⁒dsubscript^π‘žπ‘π‘‘\widehat{q}_{bd}over^ start_ARG italic_q end_ARG start_POSTSUBSCRIPT italic_b italic_d end_POSTSUBSCRIPT can be determined by numerically solving the set of B2superscript𝐡2B^{2}italic_B start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT independent equations given in Eq.14.

Dong etal. [13] proposed a variational-EM algorithm for parameter estimation in the special case of a multilayer zero-inflated stochastic block model.This algorithm efficiently estimates the community labels and model parameters, handling the sparsity and correlations in multilayer networks.The method proposed demonstrates effectiveness in capturing complex interaction patterns through extensive simulations and real-world case studies.

2.3 Zero-Inflated Configuration Model (zi-CLCM)

While both the G⁒(N,p)𝐺𝑁𝑝G(N,p)italic_G ( italic_N , italic_p ) and SBM models offer valuable insights, they fall short in encoding node heterogeneities.Configuration models fill this gap by introducing a parameterisation that accounts for degree heterogeneities[18].In the framework of Poisson models, the Chung-Lu configuration model (CLCM) achieves this by expressing the general parameters Ξ»i⁒jsubscriptπœ†π‘–π‘—\lambda_{ij}italic_Ξ» start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT as ΞΈiout⁒θjinsubscriptsuperscriptπœƒout𝑖subscriptsuperscriptπœƒin𝑗\theta^{\text{out}}_{i}\theta^{\text{in}}_{j}italic_ΞΈ start_POSTSUPERSCRIPT out end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_ΞΈ start_POSTSUPERSCRIPT in end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT, a product node-parameters[8, 31].For undirected networks, we have 𝜽out=𝜽in=𝜽superscript𝜽outsuperscript𝜽in𝜽\boldsymbol{\theta^{\text{out}}}=\boldsymbol{\theta^{\text{in}}}=\boldsymbol{\theta}bold_italic_ΞΈ start_POSTSUPERSCRIPT out end_POSTSUPERSCRIPT = bold_italic_ΞΈ start_POSTSUPERSCRIPT in end_POSTSUPERSCRIPT = bold_italic_ΞΈ.

Incorporating zero-inflation into the CLCM results in the Zero-Inflated Chung-Lu Configuration Model (zi-CLCM).The model is described by the probability distribution:

Pr⁑(𝒒|𝜽,q)=∏i⁒j((1βˆ’q)⁒δ0⁒(Ai⁒j)+q⁒(ΞΈiout⁒θjin)Ai⁒jAi⁒j!⁒exp⁑(βˆ’ΞΈiout⁒θjin)).Prconditionalπ’’πœ½π‘žsubscriptproduct𝑖𝑗1π‘žsubscript𝛿0subscriptπ΄π‘–π‘—π‘žsuperscriptsubscriptsuperscriptπœƒout𝑖subscriptsuperscriptπœƒin𝑗subscript𝐴𝑖𝑗subscript𝐴𝑖𝑗subscriptsuperscriptπœƒout𝑖subscriptsuperscriptπœƒin𝑗\Pr(\mathcal{G}|\boldsymbol{\theta},q)=\prod_{ij}\left((1-q)\,\delta_{0}(A_{ij%})+q\,\frac{(\theta^{\text{out}}_{i}\theta^{\text{in}}_{j})^{A_{ij}}}{A_{ij}!}%\exp(-\theta^{\text{out}}_{i}\theta^{\text{in}}_{j})\right).roman_Pr ( caligraphic_G | bold_italic_ΞΈ , italic_q ) = ∏ start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT ( ( 1 - italic_q ) italic_Ξ΄ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ( italic_A start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT ) + italic_q divide start_ARG ( italic_ΞΈ start_POSTSUPERSCRIPT out end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_ΞΈ start_POSTSUPERSCRIPT in end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT italic_A start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT end_POSTSUPERSCRIPT end_ARG start_ARG italic_A start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT ! end_ARG roman_exp ( - italic_ΞΈ start_POSTSUPERSCRIPT out end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_ΞΈ start_POSTSUPERSCRIPT in end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) ) .(16)

It is important to note that the model parameters appear always in pairs.This means that they are defined modulo constants.This characteristic requires fixing a constraint that ensures a unique solution to the inference problem.A common approach is to fix the expected number of interactions in the network:

𝔼⁒(m|𝜽,q)=qβ’βˆ‘i⁒jΞΈiout⁒θjin.𝔼conditionalπ‘šπœ½π‘žπ‘žsubscript𝑖𝑗subscriptsuperscriptπœƒout𝑖subscriptsuperscriptπœƒin𝑗\mathbb{E}(m|\boldsymbol{\theta},q)=q\,\sum_{ij}\theta^{\text{out}}_{i}\theta^%{\text{in}}_{j}\,.roman_𝔼 ( italic_m | bold_italic_ΞΈ , italic_q ) = italic_q βˆ‘ start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT italic_ΞΈ start_POSTSUPERSCRIPT out end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_ΞΈ start_POSTSUPERSCRIPT in end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT .(17)

To infer the parameters of the network model qπ‘žqitalic_q, 𝜽outsuperscript𝜽out\boldsymbol{\theta^{\text{out}}}bold_italic_ΞΈ start_POSTSUPERSCRIPT out end_POSTSUPERSCRIPT, and 𝜽insuperscript𝜽in\boldsymbol{\theta^{\text{in}}}bold_italic_ΞΈ start_POSTSUPERSCRIPT in end_POSTSUPERSCRIPT, we set four different moments to observed values:

𝔼⁒(m|𝜽,q):=m^,𝔼⁒(M|𝜽,q):=M^,βˆ€i:𝔼⁒(kiout|𝜽,q):=kiout^,βˆ€i:𝔼⁒(kiin|𝜽,q):=kiin^,:formulae-sequenceassign𝔼conditionalπ‘šπœ½π‘ž^π‘šassign𝔼conditionalπ‘€πœ½π‘ž^𝑀for-all𝑖assign𝔼conditionalsubscriptsuperscriptπ‘˜outπ‘–πœ½π‘ž^subscriptsuperscriptπ‘˜out𝑖for-all𝑖:assign𝔼conditionalsubscriptsuperscriptπ‘˜inπ‘–πœ½π‘ž^subscriptsuperscriptπ‘˜in𝑖\mathbb{E}(m|\boldsymbol{\theta},q):=\widehat{m},\quad\mathbb{E}(M|\boldsymbol%{\theta},q):=\widehat{M},\quad\forall i:\;\mathbb{E}(k^{\text{out}}_{i}|%\boldsymbol{\theta},q):=\widehat{k^{\text{out}}_{i}},\quad\forall i:\;\mathbb{%E}(k^{\text{in}}_{i}|\boldsymbol{\theta},q):=\widehat{k^{\text{in}}_{i}},roman_𝔼 ( italic_m | bold_italic_ΞΈ , italic_q ) := over^ start_ARG italic_m end_ARG , roman_𝔼 ( italic_M | bold_italic_ΞΈ , italic_q ) := over^ start_ARG italic_M end_ARG , βˆ€ italic_i : roman_𝔼 ( italic_k start_POSTSUPERSCRIPT out end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT | bold_italic_ΞΈ , italic_q ) := over^ start_ARG italic_k start_POSTSUPERSCRIPT out end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_ARG , βˆ€ italic_i : roman_𝔼 ( italic_k start_POSTSUPERSCRIPT in end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT | bold_italic_ΞΈ , italic_q ) := over^ start_ARG italic_k start_POSTSUPERSCRIPT in end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_ARG ,(18)

where m^^π‘š\widehat{m}over^ start_ARG italic_m end_ARG denotes the number of interactions in an observed network, M^^𝑀\widehat{M}over^ start_ARG italic_M end_ARG represents its number of links, and kiout^^subscriptsuperscriptπ‘˜out𝑖\widehat{k^{\text{out}}_{i}}over^ start_ARG italic_k start_POSTSUPERSCRIPT out end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_ARG and kiin^^subscriptsuperscriptπ‘˜in𝑖\widehat{k^{\text{in}}_{i}}over^ start_ARG italic_k start_POSTSUPERSCRIPT in end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_ARG the observed out- and in-degrees of i𝑖iitalic_i.

The expected degree sequences and the number of links from the model are given by:

𝔼⁒(kiout|𝜽,q)=q⁒θioutβ’βˆ‘jΞΈjin,𝔼⁒(kiin|𝜽,q)=q⁒θiinβ’βˆ‘jΞΈjout,formulae-sequence𝔼conditionalsubscriptsuperscriptπ‘˜outπ‘–πœ½π‘žπ‘žsubscriptsuperscriptπœƒout𝑖subscript𝑗subscriptsuperscriptπœƒin𝑗𝔼conditionalsubscriptsuperscriptπ‘˜inπ‘–πœ½π‘žπ‘žsubscriptsuperscriptπœƒin𝑖subscript𝑗subscriptsuperscriptπœƒout𝑗\mathbb{E}(k^{\text{out}}_{i}|\boldsymbol{\theta},q)=q\,\theta^{\text{out}}_{i%}\sum_{j}\theta^{\text{in}}_{j},\qquad\mathbb{E}(k^{\text{in}}_{i}|\boldsymbol%{\theta},q)=q\,\theta^{\text{in}}_{i}\sum_{j}\theta^{\text{out}}_{j},roman_𝔼 ( italic_k start_POSTSUPERSCRIPT out end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT | bold_italic_ΞΈ , italic_q ) = italic_q italic_ΞΈ start_POSTSUPERSCRIPT out end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT βˆ‘ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT italic_ΞΈ start_POSTSUPERSCRIPT in end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT , roman_𝔼 ( italic_k start_POSTSUPERSCRIPT in end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT | bold_italic_ΞΈ , italic_q ) = italic_q italic_ΞΈ start_POSTSUPERSCRIPT in end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT βˆ‘ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT italic_ΞΈ start_POSTSUPERSCRIPT out end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ,(19)
𝔼⁒(M|𝜽,q)=q⁒N2βˆ’qβ’βˆ‘i⁒jeβˆ’ΞΈiout⁒θjin.𝔼conditionalπ‘€πœ½π‘žπ‘žsuperscript𝑁2π‘žsubscript𝑖𝑗superscript𝑒subscriptsuperscriptπœƒout𝑖subscriptsuperscriptπœƒin𝑗\mathbb{E}(M|\boldsymbol{\theta},q)=q\,N^{2}-q\,\sum_{ij}e^{-\theta^{\text{out%}}_{i}\theta^{\text{in}}_{j}}\,.roman_𝔼 ( italic_M | bold_italic_ΞΈ , italic_q ) = italic_q italic_N start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT - italic_q βˆ‘ start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT italic_e start_POSTSUPERSCRIPT - italic_ΞΈ start_POSTSUPERSCRIPT out end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_ΞΈ start_POSTSUPERSCRIPT in end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUPERSCRIPT .(20)

Fixing the expected number of interactions to m^^π‘š\widehat{m}over^ start_ARG italic_m end_ARG, we obtain a constraint for the L1-norm of the parameter vectors 𝜽outsuperscript𝜽out\boldsymbol{\theta^{\text{out}}}bold_italic_ΞΈ start_POSTSUPERSCRIPT out end_POSTSUPERSCRIPT and 𝜽insuperscript𝜽in\boldsymbol{\theta^{\text{in}}}bold_italic_ΞΈ start_POSTSUPERSCRIPT in end_POSTSUPERSCRIPT:

βˆ‘iΞΈi⋆=m^q.subscript𝑖subscriptsuperscriptπœƒβ‹†π‘–^π‘šπ‘ž\sum_{i}\theta^{\star}_{i}=\sqrt{\frac{\widehat{m}}{q}}.βˆ‘ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_ΞΈ start_POSTSUPERSCRIPT ⋆ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT = square-root start_ARG divide start_ARG over^ start_ARG italic_m end_ARG end_ARG start_ARG italic_q end_ARG end_ARG .(21)

This constraint ties the expected number of interactions in the network to the node-specific parameters, allowing their inference.

From Eq.21 and Eq.19, we derive expressions for ΞΈi⋆^^subscriptsuperscriptπœƒβ‹†π‘–\widehat{\theta^{\star}_{i}}over^ start_ARG italic_ΞΈ start_POSTSUPERSCRIPT ⋆ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_ARG as functions of qπ‘žqitalic_q and the observed degrees kiout^^subscriptsuperscriptπ‘˜out𝑖\widehat{k^{\text{out}}_{i}}over^ start_ARG italic_k start_POSTSUPERSCRIPT out end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_ARG and kiin^^subscriptsuperscriptπ‘˜in𝑖\widehat{k^{\text{in}}_{i}}over^ start_ARG italic_k start_POSTSUPERSCRIPT in end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_ARG:

ΞΈi⋆^=ki⋆^m^⁒q.^subscriptsuperscriptπœƒβ‹†π‘–^subscriptsuperscriptπ‘˜β‹†π‘–^π‘šπ‘ž\widehat{\theta^{\star}_{i}}=\frac{\widehat{k^{\star}_{i}}}{\sqrt{\widehat{m}q%}}\,.over^ start_ARG italic_ΞΈ start_POSTSUPERSCRIPT ⋆ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_ARG = divide start_ARG over^ start_ARG italic_k start_POSTSUPERSCRIPT ⋆ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_ARG end_ARG start_ARG square-root start_ARG over^ start_ARG italic_m end_ARG italic_q end_ARG end_ARG .(22)

Lastly, substituting Eq.22 into Eq.20 leads to an explicit equation for q^^π‘ž\widehat{q}over^ start_ARG italic_q end_ARG:

q^⁒(N2βˆ’βˆ‘i⁒jeβˆ’kiout^⁒kjin^m^⁒q^)=M^.^π‘žsuperscript𝑁2subscript𝑖𝑗superscript𝑒^subscriptsuperscriptπ‘˜out𝑖^subscriptsuperscriptπ‘˜in𝑗^π‘š^π‘ž^𝑀\widehat{q}\,\left(N^{2}-\sum_{ij}e^{-\frac{\widehat{k^{\text{out}}_{i}}%\widehat{k^{\text{in}}_{j}}}{\widehat{m}\widehat{q}}}\right)=\widehat{M}.over^ start_ARG italic_q end_ARG ( italic_N start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT - βˆ‘ start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT italic_e start_POSTSUPERSCRIPT - divide start_ARG over^ start_ARG italic_k start_POSTSUPERSCRIPT out end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_ARG over^ start_ARG italic_k start_POSTSUPERSCRIPT in end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_ARG end_ARG start_ARG over^ start_ARG italic_m end_ARG over^ start_ARG italic_q end_ARG end_ARG end_POSTSUPERSCRIPT ) = over^ start_ARG italic_M end_ARG .(23)

Solving numerically for q^^π‘ž\widehat{q}over^ start_ARG italic_q end_ARG allows the estimation of all model parameters.

2.4 Zero-Inflated Degree-Corrected Stochastic Block Model (zi-DCSBM)

We finally discuss the Zero-Inflated Degree Corrected Stochastic Block Model (zi-DCSBM).As for the standard degree-corrected stochastic block model[23], it incorporates both node heterogeneity and block structure into a single comprehensive model, combining the features of the zi-CLCM and the zi-SBM.The introduction of block-specific zero-inflation parameters qb⁒dsubscriptπ‘žπ‘π‘‘q_{bd}italic_q start_POSTSUBSCRIPT italic_b italic_d end_POSTSUBSCRIPT, constituting the vector 𝒒𝒒\boldsymbol{q}bold_italic_q, allows the model to account for varying levels of zero-inflation across different blocks.By parametrising Eq.2 with Ξ»i⁒j=ΞΈiout⁒θjin⁒λbi⁒bjsubscriptπœ†π‘–π‘—subscriptsuperscriptπœƒout𝑖subscriptsuperscriptπœƒin𝑗subscriptπœ†subscript𝑏𝑖subscript𝑏𝑗\lambda_{ij}=\theta^{\text{out}}_{i}\theta^{\text{in}}_{j}\lambda_{b_{i}b_{j}}italic_Ξ» start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT = italic_ΞΈ start_POSTSUPERSCRIPT out end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_ΞΈ start_POSTSUPERSCRIPT in end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT italic_Ξ» start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT, we derive the following probability distribution:

Pr⁑(𝒒|𝜽,𝝀,𝒒)=∏i⁒j((1βˆ’qbi⁒bj)⁒δ0⁒(Ai⁒j)+qbi⁒bj⁒(ΞΈiout⁒θjin⁒λbi⁒bj)Ai⁒jAi⁒j!⁒exp⁑(βˆ’ΞΈiout⁒θjin⁒λbi⁒bj)).Prconditionalπ’’πœ½π€π’’subscriptproduct𝑖𝑗1subscriptπ‘žsubscript𝑏𝑖subscript𝑏𝑗subscript𝛿0subscript𝐴𝑖𝑗subscriptπ‘žsubscript𝑏𝑖subscript𝑏𝑗superscriptsubscriptsuperscriptπœƒout𝑖subscriptsuperscriptπœƒin𝑗subscriptπœ†subscript𝑏𝑖subscript𝑏𝑗subscript𝐴𝑖𝑗subscript𝐴𝑖𝑗subscriptsuperscriptπœƒout𝑖subscriptsuperscriptπœƒin𝑗subscriptπœ†subscript𝑏𝑖subscript𝑏𝑗\Pr(\mathcal{G}|\boldsymbol{\theta},\boldsymbol{\lambda},\boldsymbol{q})=\prod%_{ij}\left((1-q_{b_{i}b_{j}})\,\delta_{0}(A_{ij})+q_{b_{i}b_{j}}\,\frac{(%\theta^{\text{out}}_{i}\theta^{\text{in}}_{j}\lambda_{b_{i}b_{j}})^{A_{ij}}}{A%_{ij}!}\exp(-\theta^{\text{out}}_{i}\theta^{\text{in}}_{j}\lambda_{b_{i}b_{j}}%)\right).roman_Pr ( caligraphic_G | bold_italic_ΞΈ , bold_italic_Ξ» , bold_italic_q ) = ∏ start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT ( ( 1 - italic_q start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT ) italic_Ξ΄ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ( italic_A start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT ) + italic_q start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT divide start_ARG ( italic_ΞΈ start_POSTSUPERSCRIPT out end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_ΞΈ start_POSTSUPERSCRIPT in end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT italic_Ξ» start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT italic_A start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT end_POSTSUPERSCRIPT end_ARG start_ARG italic_A start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT ! end_ARG roman_exp ( - italic_ΞΈ start_POSTSUPERSCRIPT out end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_ΞΈ start_POSTSUPERSCRIPT in end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT italic_Ξ» start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT ) ) .(24)

In order to infer the model parameters, we employ the method of moments, setting the first moments of the model to their observed counterparts in a given network.The expected values of the number of interactions, node degrees, and links for each block are defined as follows:

𝔼⁒(mb⁒d|𝜽,𝝀,𝒒)=qb⁒d⁒λb⁒dβ’βˆ‘i∈bΞΈioutβ’βˆ‘j∈dΞΈjin,𝔼conditionalsubscriptπ‘šπ‘π‘‘πœ½π€π’’subscriptπ‘žπ‘π‘‘subscriptπœ†π‘π‘‘subscript𝑖𝑏subscriptsuperscriptπœƒout𝑖subscript𝑗𝑑subscriptsuperscriptπœƒin𝑗\mathbb{E}(m_{bd}|\boldsymbol{\theta},\boldsymbol{\lambda},\boldsymbol{q})=q_{%bd}\lambda_{bd}\sum_{i\in b}\theta^{\text{out}}_{i}\sum_{j\in d}\theta^{\text{%in}}_{j},roman_𝔼 ( italic_m start_POSTSUBSCRIPT italic_b italic_d end_POSTSUBSCRIPT | bold_italic_ΞΈ , bold_italic_Ξ» , bold_italic_q ) = italic_q start_POSTSUBSCRIPT italic_b italic_d end_POSTSUBSCRIPT italic_Ξ» start_POSTSUBSCRIPT italic_b italic_d end_POSTSUBSCRIPT βˆ‘ start_POSTSUBSCRIPT italic_i ∈ italic_b end_POSTSUBSCRIPT italic_ΞΈ start_POSTSUPERSCRIPT out end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT βˆ‘ start_POSTSUBSCRIPT italic_j ∈ italic_d end_POSTSUBSCRIPT italic_ΞΈ start_POSTSUPERSCRIPT in end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ,(25)
𝔼⁒(kiout|𝜽,𝒒)=ΞΈioutβ’βˆ‘dqbi⁒d⁒λbi⁒dβ’βˆ‘j∈dΞΈjin,𝔼⁒(kiin|𝜽,𝒒)=ΞΈiinβ’βˆ‘dqbi⁒d⁒λbi⁒dβ’βˆ‘j∈dΞΈjout,formulae-sequence𝔼conditionalsubscriptsuperscriptπ‘˜outπ‘–πœ½π’’subscriptsuperscriptπœƒout𝑖subscript𝑑subscriptπ‘žsubscript𝑏𝑖𝑑subscriptπœ†subscript𝑏𝑖𝑑subscript𝑗𝑑subscriptsuperscriptπœƒin𝑗𝔼conditionalsubscriptsuperscriptπ‘˜inπ‘–πœ½π’’subscriptsuperscriptπœƒin𝑖subscript𝑑subscriptπ‘žsubscript𝑏𝑖𝑑subscriptπœ†subscript𝑏𝑖𝑑subscript𝑗𝑑subscriptsuperscriptπœƒout𝑗\mathbb{E}(k^{\text{out}}_{i}|\boldsymbol{\theta},\boldsymbol{q})=\theta^{%\text{out}}_{i}\sum_{d}q_{b_{i}d}\lambda_{b_{i}d}\sum_{j\in d}\theta^{\text{in%}}_{j},\qquad\mathbb{E}(k^{\text{in}}_{i}|\boldsymbol{\theta},\boldsymbol{q})=%\theta^{\text{in}}_{i}\sum_{d}q_{b_{i}d}\lambda_{b_{i}d}\sum_{j\in d}\theta^{%\text{out}}_{j},roman_𝔼 ( italic_k start_POSTSUPERSCRIPT out end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT | bold_italic_ΞΈ , bold_italic_q ) = italic_ΞΈ start_POSTSUPERSCRIPT out end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT βˆ‘ start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT italic_q start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT italic_Ξ» start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT βˆ‘ start_POSTSUBSCRIPT italic_j ∈ italic_d end_POSTSUBSCRIPT italic_ΞΈ start_POSTSUPERSCRIPT in end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT , roman_𝔼 ( italic_k start_POSTSUPERSCRIPT in end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT | bold_italic_ΞΈ , bold_italic_q ) = italic_ΞΈ start_POSTSUPERSCRIPT in end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT βˆ‘ start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT italic_q start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT italic_Ξ» start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT βˆ‘ start_POSTSUBSCRIPT italic_j ∈ italic_d end_POSTSUBSCRIPT italic_ΞΈ start_POSTSUPERSCRIPT out end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ,(26)
𝔼⁒(Mb⁒d|𝜽,𝝀,𝒒)=qb⁒d⁒(Nb⁒Ndβˆ’βˆ‘i∈b,j∈deβˆ’Ξ»b⁒d⁒θiout⁒θjin),𝔼conditionalsubscriptπ‘€π‘π‘‘πœ½π€π’’subscriptπ‘žπ‘π‘‘subscript𝑁𝑏subscript𝑁𝑑subscriptformulae-sequence𝑖𝑏𝑗𝑑superscript𝑒subscriptπœ†π‘π‘‘subscriptsuperscriptπœƒout𝑖subscriptsuperscriptπœƒin𝑗\mathbb{E}(M_{bd}|\boldsymbol{\theta},\boldsymbol{\lambda},\boldsymbol{q})=q_{%bd}\left(N_{b}N_{d}-\sum_{i\in b,j\in d}e^{-\lambda_{bd}\theta^{\text{out}}_{i%}\theta^{\text{in}}_{j}}\right),roman_𝔼 ( italic_M start_POSTSUBSCRIPT italic_b italic_d end_POSTSUBSCRIPT | bold_italic_ΞΈ , bold_italic_Ξ» , bold_italic_q ) = italic_q start_POSTSUBSCRIPT italic_b italic_d end_POSTSUBSCRIPT ( italic_N start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT italic_N start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT - βˆ‘ start_POSTSUBSCRIPT italic_i ∈ italic_b , italic_j ∈ italic_d end_POSTSUBSCRIPT italic_e start_POSTSUPERSCRIPT - italic_Ξ» start_POSTSUBSCRIPT italic_b italic_d end_POSTSUBSCRIPT italic_ΞΈ start_POSTSUPERSCRIPT out end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_ΞΈ start_POSTSUPERSCRIPT in end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUPERSCRIPT ) ,(27)

where Nbsubscript𝑁𝑏N_{b}italic_N start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT denotes the number of nodes in group b𝑏bitalic_b.

Just like in the CLCM and zi-CLCM, the parameters 𝜽outsuperscript𝜽out\boldsymbol{\theta^{\text{out}}}bold_italic_ΞΈ start_POSTSUPERSCRIPT out end_POSTSUPERSCRIPT and 𝜽insuperscript𝜽in\boldsymbol{\theta^{\text{in}}}bold_italic_ΞΈ start_POSTSUPERSCRIPT in end_POSTSUPERSCRIPT in the DCSBM and zi-DCSBM are defined modulo constants, necessitating the establishment of a constraint to ensure a unique solution to the inference problem.A common approach is to constrain the L1-norm of the node-specific parameters, i.e., βˆ‘iΞΈi⋆⁒δb⁒(bi)=1subscript𝑖subscriptsuperscriptπœƒβ‹†π‘–subscript𝛿𝑏subscript𝑏𝑖1\sum_{i}\theta^{\star}_{i}\delta_{b}(b_{i})=1βˆ‘ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_ΞΈ start_POSTSUPERSCRIPT ⋆ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_Ξ΄ start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT ( italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) = 1.By incorporating this constraint into Eqs.25 and26, we obtain:

𝔼⁒(mb⁒d|𝜽,𝝀,𝒒)=qb⁒d⁒λb⁒d,𝔼⁒(ki⋆|𝜽,𝒒)=ΞΈiβ‹†β’βˆ‘dqbi⁒d⁒λbi⁒d.formulae-sequence𝔼conditionalsubscriptπ‘šπ‘π‘‘πœ½π€π’’subscriptπ‘žπ‘π‘‘subscriptπœ†π‘π‘‘π”Όconditionalsubscriptsuperscriptπ‘˜β‹†π‘–πœ½π’’subscriptsuperscriptπœƒβ‹†π‘–subscript𝑑subscriptπ‘žsubscript𝑏𝑖𝑑subscriptπœ†subscript𝑏𝑖𝑑\mathbb{E}(m_{bd}|\boldsymbol{\theta},\boldsymbol{\lambda},\boldsymbol{q})=q_{%bd}\lambda_{bd},\quad\mathbb{E}(k^{\star}_{i}|\boldsymbol{\theta},\boldsymbol{%q})=\theta^{\star}_{i}\sum_{d}q_{b_{i}d}\lambda_{b_{i}d}.roman_𝔼 ( italic_m start_POSTSUBSCRIPT italic_b italic_d end_POSTSUBSCRIPT | bold_italic_ΞΈ , bold_italic_Ξ» , bold_italic_q ) = italic_q start_POSTSUBSCRIPT italic_b italic_d end_POSTSUBSCRIPT italic_Ξ» start_POSTSUBSCRIPT italic_b italic_d end_POSTSUBSCRIPT , roman_𝔼 ( italic_k start_POSTSUPERSCRIPT ⋆ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT | bold_italic_ΞΈ , bold_italic_q ) = italic_ΞΈ start_POSTSUPERSCRIPT ⋆ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT βˆ‘ start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT italic_q start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT italic_Ξ» start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT .(28)

Solving the set of equations defined by Eqs.25 and26 for 𝝀𝝀\boldsymbol{\lambda}bold_italic_Ξ», 𝜽outsuperscript𝜽out\boldsymbol{\theta^{\text{out}}}bold_italic_ΞΈ start_POSTSUPERSCRIPT out end_POSTSUPERSCRIPT, and 𝜽insuperscript𝜽in\boldsymbol{\theta^{\text{in}}}bold_italic_ΞΈ start_POSTSUPERSCRIPT in end_POSTSUPERSCRIPT yields expressions for their estimators that depend on the mixture parameters 𝒒𝒒\boldsymbol{q}bold_italic_q and observed quantities:

Ξ»^b⁒d=m^b⁒dqb⁒d,ΞΈi⋆=ki⋆^ΞΊ^bi⋆,formulae-sequencesubscript^πœ†π‘π‘‘subscript^π‘šπ‘π‘‘subscriptπ‘žπ‘π‘‘subscriptsuperscriptπœƒβ‹†π‘–^subscriptsuperscriptπ‘˜β‹†π‘–subscriptsuperscript^πœ…β‹†subscript𝑏𝑖\widehat{\lambda}_{bd}=\frac{\widehat{m}_{bd}}{q_{bd}},\quad\theta^{\star}_{i}%=\frac{\widehat{k^{\star}_{i}}}{\widehat{\kappa}^{\star}_{b_{i}}},over^ start_ARG italic_Ξ» end_ARG start_POSTSUBSCRIPT italic_b italic_d end_POSTSUBSCRIPT = divide start_ARG over^ start_ARG italic_m end_ARG start_POSTSUBSCRIPT italic_b italic_d end_POSTSUBSCRIPT end_ARG start_ARG italic_q start_POSTSUBSCRIPT italic_b italic_d end_POSTSUBSCRIPT end_ARG , italic_ΞΈ start_POSTSUPERSCRIPT ⋆ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT = divide start_ARG over^ start_ARG italic_k start_POSTSUPERSCRIPT ⋆ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_ARG end_ARG start_ARG over^ start_ARG italic_ΞΊ end_ARG start_POSTSUPERSCRIPT ⋆ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT end_ARG ,(29)

where ΞΊbiout=βˆ‘dmbi⁒dsubscriptsuperscriptπœ…outsubscript𝑏𝑖subscript𝑑subscriptπ‘šsubscript𝑏𝑖𝑑\kappa^{\text{out}}_{b_{i}}=\sum_{d}m_{b_{i}d}italic_ΞΊ start_POSTSUPERSCRIPT out end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT = βˆ‘ start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT italic_m start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT and ΞΊbiin=βˆ‘dmd⁒bisubscriptsuperscriptπœ…insubscript𝑏𝑖subscript𝑑subscriptπ‘šπ‘‘subscript𝑏𝑖\kappa^{\text{in}}_{b_{i}}=\sum_{d}m_{db_{i}}italic_ΞΊ start_POSTSUPERSCRIPT in end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT = βˆ‘ start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT italic_m start_POSTSUBSCRIPT italic_d italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT denote the out- and in-degree of block bisubscript𝑏𝑖b_{i}italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT, respectively.

Finally, substituting these expressions into Eq.27 provides a set of equations for each qb⁒dsubscriptπ‘žπ‘π‘‘q_{bd}italic_q start_POSTSUBSCRIPT italic_b italic_d end_POSTSUBSCRIPT and the number of links in each block M^b⁒dsubscript^𝑀𝑏𝑑\widehat{M}_{bd}over^ start_ARG italic_M end_ARG start_POSTSUBSCRIPT italic_b italic_d end_POSTSUBSCRIPT:

qb⁒d⁒(Nb⁒Ndβˆ’βˆ‘i∈b,j∈deβˆ’kiout^⁒kjin^ΞΊ^biout⁒κ^bjin⁒m^bi⁒bjqb⁒d)=M^b⁒d.subscriptπ‘žπ‘π‘‘subscript𝑁𝑏subscript𝑁𝑑subscriptformulae-sequence𝑖𝑏𝑗𝑑superscript𝑒^subscriptsuperscriptπ‘˜out𝑖^subscriptsuperscriptπ‘˜in𝑗subscriptsuperscript^πœ…outsubscript𝑏𝑖subscriptsuperscript^πœ…insubscript𝑏𝑗subscript^π‘šsubscript𝑏𝑖subscript𝑏𝑗subscriptπ‘žπ‘π‘‘subscript^𝑀𝑏𝑑q_{bd}\,\left(N_{b}N_{d}-\sum_{i\in b,j\in d}e^{-\frac{\widehat{k^{\text{out}}%_{i}}\widehat{k^{\text{in}}_{j}}}{\widehat{\kappa}^{\text{out}}_{b_{i}}%\widehat{\kappa}^{\text{in}}_{b_{j}}}\frac{\widehat{m}_{b_{i}b_{j}}}{q_{bd}}}%\right)=\widehat{M}_{bd}.italic_q start_POSTSUBSCRIPT italic_b italic_d end_POSTSUBSCRIPT ( italic_N start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT italic_N start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT - βˆ‘ start_POSTSUBSCRIPT italic_i ∈ italic_b , italic_j ∈ italic_d end_POSTSUBSCRIPT italic_e start_POSTSUPERSCRIPT - divide start_ARG over^ start_ARG italic_k start_POSTSUPERSCRIPT out end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_ARG over^ start_ARG italic_k start_POSTSUPERSCRIPT in end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_ARG end_ARG start_ARG over^ start_ARG italic_ΞΊ end_ARG start_POSTSUPERSCRIPT out end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT over^ start_ARG italic_ΞΊ end_ARG start_POSTSUPERSCRIPT in end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT end_ARG divide start_ARG over^ start_ARG italic_m end_ARG start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT end_ARG start_ARG italic_q start_POSTSUBSCRIPT italic_b italic_d end_POSTSUBSCRIPT end_ARG end_POSTSUPERSCRIPT ) = over^ start_ARG italic_M end_ARG start_POSTSUBSCRIPT italic_b italic_d end_POSTSUBSCRIPT .(30)

Numerically solving this set of B2superscript𝐡2B^{2}italic_B start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT independent equations yields estimates for all model parameters.

Motalebi etal. [29] discussed the zi-DCSBM in comparison with a hurdle version of the DCSBM.The probability distribution of the hurdle-DCSBM can be written as

Pr⁑(𝒒|𝜽,𝝀,𝒒)=∏i⁒j{(1βˆ’qbi⁒bj)if⁒Ai⁒j=0,qbi⁒bj⁒(ΞΈiout⁒θjin⁒λbi⁒bj)Ai⁒j⁒exp⁑(βˆ’ΞΈiout⁒θjin⁒λbi⁒bj)Ai⁒j!⁒(1βˆ’exp⁑(βˆ’ΞΈiout⁒θjin⁒λbi⁒bj))if⁒Ai⁒j>0,Prconditionalπ’’πœ½π€π’’subscriptproduct𝑖𝑗cases1subscriptπ‘žsubscript𝑏𝑖subscript𝑏𝑗ifsubscript𝐴𝑖𝑗0otherwisesubscriptπ‘žsubscript𝑏𝑖subscript𝑏𝑗superscriptsubscriptsuperscriptπœƒout𝑖subscriptsuperscriptπœƒin𝑗subscriptπœ†subscript𝑏𝑖subscript𝑏𝑗subscript𝐴𝑖𝑗subscriptsuperscriptπœƒout𝑖subscriptsuperscriptπœƒin𝑗subscriptπœ†subscript𝑏𝑖subscript𝑏𝑗subscript𝐴𝑖𝑗1subscriptsuperscriptπœƒout𝑖subscriptsuperscriptπœƒin𝑗subscriptπœ†subscript𝑏𝑖subscript𝑏𝑗ifsubscript𝐴𝑖𝑗0otherwise\Pr(\mathcal{G}|\boldsymbol{\theta},\boldsymbol{\lambda},\boldsymbol{q})=\prod%_{ij}\begin{cases}(1-q_{b_{i}b_{j}})\quad\text{if }A_{ij}=0\,,\\q_{b_{i}b_{j}}\,\frac{(\theta^{\text{out}}_{i}\theta^{\text{in}}_{j}\lambda_{b%_{i}b_{j}})^{A_{ij}}\exp(-\theta^{\text{out}}_{i}\theta^{\text{in}}_{j}\lambda%_{b_{i}b_{j}})}{A_{ij}!\left(1-\exp(-\theta^{\text{out}}_{i}\theta^{\text{in}}%_{j}\lambda_{b_{i}b_{j}})\right)}\quad\text{if }A_{ij}>0\,,\end{cases}roman_Pr ( caligraphic_G | bold_italic_ΞΈ , bold_italic_Ξ» , bold_italic_q ) = ∏ start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT { start_ROW start_CELL ( 1 - italic_q start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT ) if italic_A start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT = 0 , end_CELL start_CELL end_CELL end_ROW start_ROW start_CELL italic_q start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT divide start_ARG ( italic_ΞΈ start_POSTSUPERSCRIPT out end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_ΞΈ start_POSTSUPERSCRIPT in end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT italic_Ξ» start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT italic_A start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT end_POSTSUPERSCRIPT roman_exp ( - italic_ΞΈ start_POSTSUPERSCRIPT out end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_ΞΈ start_POSTSUPERSCRIPT in end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT italic_Ξ» start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT ) end_ARG start_ARG italic_A start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT ! ( 1 - roman_exp ( - italic_ΞΈ start_POSTSUPERSCRIPT out end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_ΞΈ start_POSTSUPERSCRIPT in end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT italic_Ξ» start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT ) ) end_ARG if italic_A start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT > 0 , end_CELL start_CELL end_CELL end_ROW(31)

where the role of the model parameters is the same as in the case of the zi-DCSBM.The primary difference between these two models lies in their approach to handling sparsity:zero-inflation accounts for excess zeros by introducing a separate zero-generating process,while hurdle models treat zeros and positive counts as generated by two distinct processes.Hurdle models offer advantages in terms of identifiability because they eliminate the ambiguity between zero counts and low interaction rates.However, this means that sparsity, or disconnected pairs, is solely due to the hurdle process and not from inherently low interaction rates, which may limit their appropriateness for many applications[17].

Considerations about community detection

So far, we have presupposed that the partitioning of nodes into distinct groups is given.Nevertheless, community detection can be performed using the zero-inflated model, thereby assigning labels endogenously to the nodes based on interaction and link data.Community detection methodologies can be broadly categorised into two types:those based on quality functions[30] and those rooted in Maximum Likelihood Estimation (MLE) principles[33].

The former can be seamlessly integrated with zero-inflated models.This integration necessitates the definition of a suitable partition quality function, which can then be optimised using the parameter estimation defined herein.A classical example of this approach is modularity optimisation[30], where the partition quality function Q𝑄Qitalic_Q is given by the equation:

Q=1mβ’βˆ‘i⁒j[Ai⁒jβˆ’kiout⁒kjinm]⁒δ⁒(ci,cj),𝑄1π‘šsubscript𝑖𝑗delimited-[]subscript𝐴𝑖𝑗subscriptsuperscriptπ‘˜out𝑖subscriptsuperscriptπ‘˜inπ‘—π‘šπ›Ώsubscript𝑐𝑖subscript𝑐𝑗Q=\frac{1}{m}\sum_{ij}\left[A_{ij}-\frac{k^{\text{out}}_{i}k^{\text{in}}_{j}}{%m}\right]\delta(c_{i},c_{j}),italic_Q = divide start_ARG 1 end_ARG start_ARG italic_m end_ARG βˆ‘ start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT [ italic_A start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT - divide start_ARG italic_k start_POSTSUPERSCRIPT out end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_k start_POSTSUPERSCRIPT in end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_ARG start_ARG italic_m end_ARG ] italic_Ξ΄ ( italic_c start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT , italic_c start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ) ,(32)

where Ai⁒jsubscript𝐴𝑖𝑗A_{ij}italic_A start_POSTSUBSCRIPT italic_i italic_j end_POSTSUBSCRIPT represents the adjacency matrix, kisubscriptπ‘˜π‘–k_{i}italic_k start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT and kjsubscriptπ‘˜π‘—k_{j}italic_k start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT are the degrees of nodes i𝑖iitalic_i and j𝑗jitalic_j, mπ‘šmitalic_m is the total number of edges, cisubscript𝑐𝑖c_{i}italic_c start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT and cjsubscript𝑐𝑗c_{j}italic_c start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT are the communities of nodes i𝑖iitalic_i and j𝑗jitalic_j, and δ𝛿\deltaitalic_Ξ΄ is the Kronecker delta function.In this scenario, one would replace the expectation term kiout⁒kjinmsubscriptsuperscriptπ‘˜out𝑖subscriptsuperscriptπ‘˜inπ‘—π‘š\frac{k^{\text{out}}_{i}k^{\text{in}}_{j}}{m}divide start_ARG italic_k start_POSTSUPERSCRIPT out end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_k start_POSTSUPERSCRIPT in end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_ARG start_ARG italic_m end_ARG within the modularity function originating from the Chung-Lu model with that from the zi-CLCM.However, a close inspection reveals that the expected number of edges between a pair remains consistent across both models, as qbi⁒bj⁒θiout⁒θjin⁒λbi⁒bj=qbi⁒bj⁒kiout⁒kjinm⁒qbi⁒bj=kiout⁒kjinmbi⁒bjsubscriptπ‘žsubscript𝑏𝑖subscript𝑏𝑗subscriptsuperscriptπœƒout𝑖subscriptsuperscriptπœƒin𝑗subscriptπœ†subscript𝑏𝑖subscript𝑏𝑗subscriptπ‘žsubscript𝑏𝑖subscript𝑏𝑗subscriptsuperscriptπ‘˜out𝑖subscriptsuperscriptπ‘˜inπ‘—π‘šsubscriptπ‘žsubscript𝑏𝑖subscript𝑏𝑗subscriptsuperscriptπ‘˜out𝑖subscriptsuperscriptπ‘˜in𝑗subscriptπ‘šsubscript𝑏𝑖subscript𝑏𝑗q_{b_{i}b_{j}}\theta^{\text{out}}_{i}\theta^{\text{in}}_{j}\lambda_{b_{i}b_{j}%}=q_{b_{i}b_{j}}\frac{k^{\text{out}}_{i}k^{\text{in}}_{j}}{mq_{b_{i}b_{j}}}=%\frac{k^{\text{out}}_{i}k^{\text{in}}_{j}}{m_{b_{i}b_{j}}}italic_q start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_ΞΈ start_POSTSUPERSCRIPT out end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_ΞΈ start_POSTSUPERSCRIPT in end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT italic_Ξ» start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT = italic_q start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT divide start_ARG italic_k start_POSTSUPERSCRIPT out end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_k start_POSTSUPERSCRIPT in end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_ARG start_ARG italic_m italic_q start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT end_ARG = divide start_ARG italic_k start_POSTSUPERSCRIPT out end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_k start_POSTSUPERSCRIPT in end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_ARG start_ARG italic_m start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT end_ARG.Consequently, community detection through modularity optimisation in zero-inflated models yields analogous results to scenarios where zero-inflation is not considered.

The latter category of community detection instead, exemplified by methods utilising information theory to assess community quality, requires the computation of the MLE of the model and potentially the integration of the likelihood function to eliminate continuous parameters like θ⋆superscriptπœƒβ‹†\theta^{\star}italic_ΞΈ start_POSTSUPERSCRIPT ⋆ end_POSTSUPERSCRIPT and qπ‘žqitalic_q (see e.g., [33]).Unfortunately, performing these integrals analytically is not possible, because of the structure of the mixture probability.A numerical solution may exist, but the development of such methods and the assessment of their divergence from the standard non-zero-inflated DCSBM fall outside the scope of this article and warrant dedicated exploration in future research.

3 Performance of Zero-Inflated Multi-Edge Models

DatasetTypeN𝑁Nitalic_NM𝑀Mitalic_Mmπ‘šmitalic_md𝑑ditalic_dρ𝜌\rhoitalic_ρkurtosis
HS13High-school Contacts32758181885080.113.541244.05
SFHHConference Interactions4039565702610.120.874109.62
HS12High-school Contacts1802220450470.142.80712.33
WPWorkplace Contacts9275598270.182.35880.08
WP15Workplace Contacts2174274782490.183.34695.74
HS11High-school Contacts1261709285610.223.63725.30
Thiers11High-school Contacts1261709285610.223.63725.30
LyonSchoolPrimary School Contacts24283171257730.294.31237.41
HT09Conference Interactions1132196208180.353.291771.89
HOHospital Contacts751139324240.4111.68152
KHHousehold Contacts47504326430.4730.2038.38
BBAnimal Interactions1378630951.00808.9110.62

The aim of this section is to investigate the limitations of Poisson multi-edge models in dealing with empirical data and to demonstrate how zero-inflation can address these issues.We benchmark our models using classical network datasets from the Sociopatterns repository[27].These datasets, which report contacts among individuals over short time frames, typically result in sparse multi-edge networks despite a large number of recorded interactions.

The datasets encompass various social interaction scenarios, including interactions among high-school students, conference attendees, and hospital staff.Each dataset varies in terms of the number of nodes (N𝑁Nitalic_N), unique links (M𝑀Mitalic_M), total multi-edges (mπ‘šmitalic_m), density (d𝑑ditalic_d), and multi-edge density (ρ𝜌\rhoitalic_ρ).Nevertheless, most datasets exhibit low link density, i.e., d=M/(N2)β‰ͺ1𝑑𝑀binomial𝑁2much-less-than1d=M/\binom{N}{2}\ll 1italic_d = italic_M / ( FRACOP start_ARG italic_N end_ARG start_ARG 2 end_ARG ) β‰ͺ 1, and large multi-edge density ρ=m/(N2)πœŒπ‘šbinomial𝑁2\rho=m/\binom{N}{2}italic_ρ = italic_m / ( FRACOP start_ARG italic_N end_ARG start_ARG 2 end_ARG ).This indicates a sparse network structure despite the large number of recorded interactions.Such characteristics make these datasets a prominent example where classical multi-edge models are sub-optimal.As shown in Fig.1(b), naively modelling these datasets would quickly yield fully connected network realisations, in stark contrast with the sparse structure exhibited by the empirical data.

Interestingly, sparsity is often observed together with a β€œheavy-taillness” of the edge count distribution[37].The empirical distribution of edge counts displays a considerable number of outliers, i.e., unexpectedly large edge counts.We quantify this by computing the excess kurtosis of the edge count distribution.

Excess kurtosis denotes the tails’ heaviness relative to a normal distribution.Values close to 0 indicate a distribution with similar tail behaviour to the normal distribution.Positive values signify heavier tails, indicating more extreme outliers, while negative values suggest lighter tails.In other words, large positive excess kurtosis values imply a higher probability of extreme events or outliers compared to a normal distribution.The sample excess kurtosis and other basic statistics of the empirical data analysed are reported in Table1.

3.1 DCSBM versus zi-DCSBM

Enhancing Multi-Edge Models with Zero-Inflation (3)

Enhancing Multi-Edge Models with Zero-Inflation (4)

Enhancing Multi-Edge Models with Zero-Inflation (5)

Enhancing Multi-Edge Models with Zero-Inflation (6)

Enhancing Multi-Edge Models with Zero-Inflation (7)

Among the models considered in this study, degree-corrected stochastic block models (DCSBM) are the most general due to their capability to accommodate both group-level and node-level heterogeneity.Consequently, our comparison focuses on the zero-inflated and classical variants of DCSBM.To maintain simplicity, we opt to infer the blocks in the models utilising the Leiden modularity maximisation algorithm [36].Modularity, as previously mentioned, relies on the expected edge count derived from an underlying null-model, typically a configuration model.Given that classical and zero-inflated model pairs share the same expectation, modularity assumes an identical form for both.Hence, we can utilise identical blocks for both models.While maximum likelihood optimisation offers the potential for superior models overall, the blocks identified for classical and zero-inflated models are not necessarily the same.For the purpose of comparing how DCSBM and zi-DCSBM contend with sparse multi-edge networks, employing the same blocks allows us to better understand the differences between the two models.Therefore, we proceed with modularity-inferred blocks.

Excess Kurtosis

In Fig.2, we plot the percentage of sample excess kurtosis in the edge count distribution captured by the DCSBM (red) and zi-DCSBM (blue).The plot primarily serves as a means to compare the excess kurtosis between the two models.Generally, zero-inflated model variants allow for a shift in part of the weight of the edge count distribution from the centre towards the tail, thereby (i) increasing the kurtosis and (ii) more closely following the empirical data.In the plot, we can see that zero-inflated models mostly yield higher kurtosis where the empirical data has particularly high sample kurtosis.This indicates the presence of more extreme edge counts compared to the standard model variants, better reproducing the empirical data.

Sparsity

When sparsity mainly arises between groups, block modelsβ€”even without zero-inflationβ€”may appear sufficient to capture sparsity.However, they are nevertheless outperformed by their zero-inflated variants.An illustrative example is provided by the HS13 dataset in Fig.3(a).Where the number of multi-edges is large, DCSBM tends to produce numerous connected pairs with low edge counts.This contrasts with the empirical data, where fewer pairs tend to be connected but with larger edge counts.

In Fig.3(a) (top), the empirical distribution of edge counts is depicted as a bar plot alongside the expected edge count distribution of DCSBM (shown in red).It is notable that the bulk of the weight in the distribution of DCSBM is concentrated at small positive counts, thus significantly underrepresenting large counts.Conversely, the zi-DCSBM, fitted with the same blocks, is capable of shifting the distribution (shown in blue in the plot) towards larger counts.

Figure3(a) (bottom) illustrates the cumulative error over the different edge count values.It quantifies the discrepancy between the observed distribution and the expected distribution according to the models.The cumulative error represents the percentage of node pairs with a given edge count in the data compared to those expected in the model.In the case of HS13, for low counts there is a considerable difference between the DCSBM and the zi-DCSBM.This difference does not grow further for larger counts.

Additionally, we can use the chi-squared goodness-of-fit statistic to quantitatively compare the distributions.The chi-squared statistic for the DCSBM is 7199.87199.87199.87199.8, while the chi-squared statistic for the zi-DCSBM is 4125.24125.24125.24125.2.The smaller statistic for the zi-DCSBM shows that the zero-inflated model provides a considerably closer match to the empirical distribution.Nevertheless, the large value of both statistics indicates that both models deviate significantly from the empirical distribution.

In other examples, multi-edges are bundled over a small number of pairs both within and between groups.In these cases, the DCSBM performs particularly poorly, as it greatly underestimates the sparsity of the graph.An example of this is provided by the KH dataset in Fig.3(b).Again, Fig.3(b) (top) shows the empirical distribution of edge counts as a bar plot and the expected distribution of the DCSBM in red.Here, the DCSBM is unable to even approximate the empirical distribution.The zi-DCSBM fitted with the same blocks is instead able to better follow the empirical distribution (in blue in the plot).Such a large difference can be easily seen in Fig.3(b) (bottom).The cumulative error for the DCSBM starts higher and grows much faster than in the zi-DCSBM case.

These results can be confirmed quantitatively by computing the chi-squared goodness-of-fit statistics.In the case of the DCSBM, we get 1442.71442.71442.71442.7.The zi-DCSBM gives 175.91175.91175.91175.91, nearly an order of magnitude smaller than the non zero-inflated model.This supports the qualitative assessment obtained from Fig.3(b).In the supplementary information A, we provide as supplementary figures the equivalents of Figs.3(a) and3(b) for the remaining 10 Sociopatterns datasets.

The KH dataset allows for further investigation of the role of zero-inflation on network models.On the left of Fig.4, the empirical network is visualised as a multi-edge graph.The β€˜lens’ plots in the centre of the figure show two realisations from the DCSBM (bottom) and zi-DCSBM (top).It is easy to visually glean what is quantified in Fig.3.The DCSBM yields much denser realisations, i.e., with a higher fraction of connected pairs, compared to both its zero-inflated variant and the empirical data.The adjacency matrices on the right side of Fig.4 further confirm this.The adjacency matrix of the DCSBM realisation is much denser than either its zero-inflated counterpart or the empirical one.

Finally, the structure of a network has a significant impact on the dynamics running on it, influencing processes such as information diffusion and opinion formation.In denser networks, diffusion processes occur more rapidly.In the supplementary information B, we show that the DCSBM greatly overestimates the diffusion speed compared to the empirical data.Conversely, its zero-inflated counterpart consistently mitigates this issue by better preserving the structure of the empirical networks.

Enhancing Multi-Edge Models with Zero-Inflation (8)

4 Discussion

Our study reveals significant limitations of classical multi-edge network models and the potential of zero-inflated models to overcome these challenges.We have shown that empirical multi-edge networks tend to be sparse despite having a large number of edges.This sparsity means that many edges are concentrated on a few node pairs, resulting in edge count distributions with heavy tails.These characteristics pose considerable difficulties for traditional multi-edge network models.They often fail to capture both the sparse nature and the high heterogeneity observed in real-world networks.

To mitigate these limitations, we show how classical multi-edge network models can be extended to incorporate zero-inflation.This involves introducing a mechanism that accounts for the excess number of zeroes (disconnected pairs) observed in empirical data.Zero-inflation not only helps in reproducing network sparsity but also improves the modelling of heavy-tailed edge count distributions.It achieves this by reducing the formation of many pairs with low edge counts in favour of a few pairs with high edge counts.This effectively shifts part of the weight of the edge count distribution towards larger counts.Failing to account for this ubiquitous property of real-world networks can lead to significant misrepresentation of network structures and dynamics.

Our results indicate that while zero-inflation significantly enhances the fit of the models to empirical data, there remain areas for improvement.In this article, we have chosen to employ modularity maximisation to infer β€œmodel-independent” node labels.This approach allowed us to evaluate the advantages of zero-inflated models compared to their classical counterparts without the confounding effects of differing block structure.However, developing block inference algorithms tailored for zero-inflated models will yield more accurate representations of network structures.This is particularly important because the blocks derived from modularity maximisation, while useful, do not always capture the full complexity of empirical networks nor can they fully exploit the advantages of zero-inflation.Moreover, zero-inflated Poisson models are limited by the restrictive nature of the Poisson distribution.In particular, Poisson models may not adequately account for over-dispersionβ€”a common feature of count data[37].Employing more general distributions, such as negative binomial or generalised hypergeometric distributions, can better handle over-dispersion and improve model fit[7, 13].

Our findings suggest that zero-inflated models provide a more detailed understanding of network structures, capturing both the sparsity and the extreme events reflected in the data.This aligns with the need for accurate models in various applications, such as optimising distribution systems, understanding disease spread, and analysing social behaviours.In our study, the example networks are so sparse that the necessity of zero-inflation is evident.However, in less extreme cases, determining the need for zero-inflation versus an appropriate choice of block structure becomes essential.To address this, appropriate likelihood-ratio tests and model comparison techniques have been developed for various models[7, 29, 13, 5].These methods can help ascertain the necessity of zero-inflation by comparing the fit and performance of different models on the same dataset.

Our work bridges a critical gap in network modelling by systematically introducing zero-inflation to classical multi-edge models.This offers a more accurate framework for analysing empirical networks, thus enhancing the potential for network science to contribute to a wide range of complex systems.

Aknowledgements

The authors thank Prof. Frank Schweitzer for his support and Dr. Giacomo Vaccario for useful discussions.G.A. acknowledges funding from SNF Grant n.192746.

References

  • Aicher etal. [2015]Aicher, C.; Jacobs, A.Z.; Clauset, A. (2015).Learning latent block structure in weighted networks.Journal of Complex Networks 3, 221–248.
  • Ali [2022]Ali, E. (2022).A simulation-based study of ZIP regression with various zero-inflated submodels.Communications in Statistics - Simulation and Computation 53, 642–657.
  • Amico etal. [2024]Amico, A.; Verginer, L.; Casiraghi, G.; Vaccario, G.; Schweitzer, F. (2024).Adapting to disruptions: Managing supply chain resilience through product rerouting.Science Advances 10, eadj1194.
  • Casiraghi [2019]Casiraghi, G. (2019).The block-constrained configuration model.Applied Network Science 4, 1–22.
  • Casiraghi [2021]Casiraghi, G. (2021).The likelihood-ratio test for multi-edge network models.Journal of Physics: Complexity 2, 035012.
  • Casiraghi and Nanumyan [2021]Casiraghi, G.; Nanumyan, V. (2021).Configuration models as an urn problem.Scientific Reports 11, 13416.
  • Choi and Ni [2023]Choi, J.; Ni, Y. (2023).Model-based Causal Discovery for Zero-Inflated Count Data.Journal of Machine Learning Research 24, 1–32.
  • Chung and Lu [2002]Chung, F.; Lu, L. (2002).Connected Components in Random Graphs with Given Expected Degree Sequences.Annals of Combinatorics 6, 125–145.
  • Consul and Famoye [1992]Consul, P.; Famoye, F. (1992).Generalized poisson regression model.Communications in Statistics - Theory and Methods 21, 89–109.
  • Danon etal. [2011]Danon, L.; Ford, A.P.; House, T.; Jewell, C.P.; Keeling, M.J.; Roberts, G.O.; Ross, J.V.; Vernon, M.C. (2011).Networks and the Epidemiology of Infectious Disease.Interdisciplinary Perspectives on Infectious Diseases 2011, 284909.
  • Darling and Norris [2005]Darling, R. W.R.; Norris, J.R. (2005).Structure of large random hypergraphs.The Annals of Applied Probability 15, 125–152.
  • Decelle etal. [2011]Decelle, A.; Krzakala, F.; Moore, C.; ZdeborovΓ‘, L. (2011).Asymptotic analysis of the stochastic block model for modular networks and its algorithmic applications.Physical Review E 84, 066106.
  • Dong etal. [2019]Dong, H.; Chen, N.; Wang, K. (2019).Modeling and Change Detection for Count-Weighted Multilayer Networks.Technometrics 62, 184–195.
  • Ebrahimi etal. [2021]Ebrahimi, S.; Reisi-Gahrooei, M.; Paynabar, K.; Mankad, S. (2021).Monitoring sparse and attributed networks with online Hurdle models.IISE Transactions 54, 91–104.
  • ErdΕ‘s and RΓ©nyi [1959]ErdΕ‘s, P.; RΓ©nyi, A. (1959).On random graphs. I.Publicationes Mathematicae Debrecen 6, 290–297.
  • Faloutsos etal. [2004]Faloutsos, C.; McCurley, K.S.; Tomkins, A. (2004).Connection subgraphs in social networks.In: SIAM International Conference on Data Mining, Workshop on Link Analysis, Counterterrorism and Security. vol.2.
  • Feng [2021]Feng, C.X. (2021).A comparison of zero-inflated and hurdle models for modeling zero-inflated count data.Journal of Statistical Distributions and Applications 8, 8.
  • Fosdick etal. [2018]Fosdick, B.K.; Larremore, D.B.; Nishimura, J.; Ugander, J. (2018).Configuring Random Graph Models with Fixed Degree Sequences.SIAM Review 60, 315–355.
  • Freeman etal. [2004]Freeman, L.; etal. (2004).The development of social network analysis.A Study in the Sociology of Science 1, 159–167.
  • Gilbert [1959]Gilbert, E.N. (1959).Random Graphs.The Annals of Mathematical Statistics 30, 1141–1144.
  • Gote etal. [2023]Gote, C.; Casiraghi, G.; Schweitzer, F.; Scholtes, I. (2023).Predicting variable-length paths in networked systems using multi-order generative models.Applied Network Science 8, 68.
  • Holland etal. [1983]Holland, P.W.; Laskey, K.B.; Leinhardt, S. (1983).Stochastic blockmodels: First steps.Social Networks 5, 109–137.
  • Karrer and Newman [2011]Karrer, B.; Newman, M. E.J. (2011).Stochastic blockmodels and community structure in networks.Physical Review E 83, 16107.
  • Krivitsky [2012]Krivitsky, P.N. (2012).Exponential-family random graph models for valued networks.Electronic Journal of Statistics 6, 1100–1128.
  • Krivitsky etal. [2011]Krivitsky, P.N.; Handco*ck, M.S.; Morris, M. (2011).Adjusting for network size and composition effects in exponential-family random graph models.Statistical Methodology 8, 319–339.
  • Lambert [1992]Lambert, D. (1992).Zero-Inflated Poisson Regression, with an Application to Defects in Manufacturing.Technometrics 34, 1–14.
  • Mastrandrea etal. [2015]Mastrandrea, R.; Fournet, J.; Barrat, A. (2015).Contact Patterns in a High School: A Comparison between Data Collected Using Wearable Sensors, Contact Diaries and Friendship Surveys.PLOS ONE 10, e0136497.
  • Motalebi etal. [2021a]Motalebi, N.; Owlia, M.S.; Amiri, A.; Fallahnezhad, M.S. (2021a).Monitoring social networks based on Zero-inflated Poisson regression model.Communications in Statistics - Theory and Methods 52, 2099–2115.
  • Motalebi etal. [2021b]Motalebi, N.; Stevens, N.T.; Steiner, S.H. (2021b).Hurdle Blockmodels for Sparse Network Modeling.The American Statistician 75, 383–393.
  • Newman and Girvan [2004]Newman, M. E.J.; Girvan, M. (2004).Finding and evaluating community structure in networks.Physical Review E 69, 026113.
  • Norros and Reittu [2006]Norros, I.; Reittu, H. (2006).On a conditionally Poissonian graph process.Advances in Applied Probability 38, 59–75.
  • Peixoto [2013]Peixoto, T.P. (2013).Parsimonious Module Inference in Large Networks.Physical Review Letters 110, 148701.
  • Peixoto [2017]Peixoto, T.P. (2017).Nonparametric Bayesian inference of the microcanonical stochastic block model.Physical Review E 95, 012317.
  • Sari etal. [2021]Sari, D.N.; Purhadi, P.; Rahayu, S.P.; Irhamah, I. (2021).Estimation and Hypothesis Testing for the Parameters of Multivariate Zero Inflated Generalized Poisson Regression Model.Symmetry 13, 1876.
  • Scholtes [2017]Scholtes, I. (2017).When is a Network a Network?: Multi-Order Graphical Model Selection in Pathways and Temporal Networks.In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. KDD ’17, ACM.
  • Traag etal. [2019]Traag, V.A.; Waltman, L.; van Eck, N.J. (2019).From Louvain to Leiden: guaranteeing well-connected communities.Scientific Reports 9, 5233.
  • Yang etal. [2009]Yang, Z.; Hardin, J.W.; Addy, C.L. (2009).Testing overdispersion in the zero-inflated Poisson model.Journal of Statistical Planning and Inference 139, 3340–3353.
Enhancing Multi-Edge Models with Zero-Inflation (2024)
Top Articles
Latest Posts
Article information

Author: Lilliana Bartoletti

Last Updated:

Views: 5955

Rating: 4.2 / 5 (53 voted)

Reviews: 92% of readers found this page helpful

Author information

Name: Lilliana Bartoletti

Birthday: 1999-11-18

Address: 58866 Tricia Spurs, North Melvinberg, HI 91346-3774

Phone: +50616620367928

Job: Real-Estate Liaison

Hobby: Graffiti, Astronomy, Handball, Magic, Origami, Fashion, Foreign language learning

Introduction: My name is Lilliana Bartoletti, I am a adventurous, pleasant, shiny, beautiful, handsome, zealous, tasty person who loves writing and wants to share my knowledge and understanding with you.