Identifying and Testing Recursive vs. Interdependent Links in Simultaneous Equation Models via the SIRE Package

Abstract:

Simultaneous equation models (SEMs) are composed of relations which either represent unidirectional links, which entail a causal interpretation, or bidirectional links, due to feedback loops, which lead to the notion of interdependence. The issue is of prominent interest in several respects. Investigating the causal structure of a SEM, on the one hand, brings to light the theoretical assumptions behind the model and, on the other hand, pilots the choice of the befitting estimation method and of which policy to implement.
This paper provides an operational method to distinguish causal relations from interdependent ones in SEMs, such as macro-econometric models, models in ecology, biology, demography, and so forth. It is shown that the causal structure of a system crucially rests on the feedback loops, which possibly affect the equations. These loops are associated to the non-null entries of the Hadamard product of matrices encoding the direct and indirect links among the SEM dependent variables. The effectiveness of feedbacks is verified with a Wald test based on the significance of the aforementioned non-null entries.
An R package, SIRE (System of Interdependent/Recursive Equations), provides the operational completion of the methodological and analytic results of the paper. SIRE is applied to a macroeconomic model to illustrate how this type of analysis proves useful in clarifying the nature of the complex relations in SEMs.


Published: Aug. 15, 2019
Received: Oct 8, 2018
Citation: Vacca & Zoia, 2019
Volume: 11/1
Pages: 149 - 169


1 Introduction

As is well known, each equation in a simultaneous equation model (SEM) represents a specific link between a dependent (endogenous) variable and a set of other variables which play an explicative role for the former. These links can reflect either one-way relations between the dependent variables and their explicative variables or two-way relations, ascribable to the presence of feedback loops operating either at a systematic or a stochastic level. SEMs are of recursive type as long as the equations represent unidirectional links. Otherwise, if the equations are bidirectional, the SEM (or part of it) is interdependent. Interdependence is both structurally connected to the presence of current endogenous variables playing an explicative role and a possible by-product of error-term dependencies.

Investigating the nature, causal rather than interdependent, of a SEM is important in several respects. First, the analysis, by unfolding the dynamics among variables, sheds more light on the rationale behind the theoretical assumptions of the model. For instance, in an economic framework, the distinction between interdependent and causal SEMs leads to models which can be traced back to two main streams of economic theory: Neoclassical and Keynesian. Furthermore, the distinction between interdependence and causality is crucial for parameter estimation, given that a set of causal equations can be estimated equation by equation by ordinary least squares (OLS), while simultaneous estimation methods, like three stage least squares (3SLS), are required when interdependence occurs. Given that large SEMs have become increasingly popular, the need for an analytical set-up able to effectively detect and test causality versus interdependence has become more urgent.

Starting from this premise, and building on earlier contributions on the topic, in this paper we devise an operational method to distinguish the causal from the interdependent equations of a SEM.

Other approaches for detecting feedback loops arising in deterministic (error-free) models are based on either graph or system theory. Our methodological proposal goes beyond the aforementioned methods: besides covering both deterministic and error-driven feedback effects, it provides a way of testing the effectiveness of the feedbacks. In addition, it differs in principle from other approaches, such as the one proposed by Granger and the Covariance Structural Analysis (CSA). The former essentially rests on a predictability criterion for defining causality, regardless of the theory behind the model. The latter is meant to find the best parametric approximation of the sample covariance matrix in terms of a given theoretical SEM structure; as such, it does not lead to a causal/interdependent interpretation of the model links like the one developed in our paper.

The feedbacks identified by the method proposed here demand statistical confirmation on the basis of empirical evidence. Lack of significance of one or more of the estimated feedbacks can overturn the nature of the connections among model variables. To this end, a Wald-type test is devised to check whether a given equation is significantly affected by feedback or not. The statistic of this test hinges on the parameter matrices of the model: the matrix associated with the endogenous variables playing an explicative role and the dispersion matrix of the error terms. If an equation is affected by feedback loops, the testing procedure makes it possible to diagnose which endogenous variables are significantly connected in the loop of interest. Indeed, testing the significance of feedbacks also means checking whether the links among variables, suggested by the theory underlying the model, are confirmed by the empirical evidence.

The methodological approach put forth in this paper is implemented in R with the SIRE package. Besides integrating functions usually employed for the estimation of SEMs, the package provides new functions meant to duly split a system of equations into its unidirectional and bidirectional links, and to test their significance. To our knowledge, extant alternative approaches to causality do not offer a similar test.

The paper is structured as follows. The first section provides the methodological set-up devised to single out causal and interdependent relations in a SEM. In the second section, a Wald-type test is worked out to check whether a given equation is affected by feedbacks or not. The third section shows how the method and the R code work for detecting and testing feedback-loops in a macroeconomic model. An Appendix, with proofs of the main theoretical results, completes the paper.

2 Detecting Loops in an Equation System

An equation system is a set of structural equations representing economic theory-driven relations linking the variables relevant to the study at hand.

It is customary to specify an equation system as follows
$$y_t = \Gamma y_t + A z_t + \epsilon_t, \qquad t = 1, \dots, T \tag{1}$$
where $y_t$ is an $L \times 1$ vector of current dependent or endogenous variables, $z_t$ is a $J \times 1$ vector of explicative variables and $\epsilon_t$ is an $L \times 1$ vector of error terms. $T$ is the sample period. $\Gamma$ and $A$ are, respectively, $L \times L$ and $L \times J$ sparse parameter matrices. In particular $\Gamma$, expressing the relations among current endogenous variables, is a hollow matrix to prevent any endogenous variable from explaining itself. Furthermore, it is assumed that $(I - \Gamma)$ is of full rank, meaning that the equations are linearly independent.

Error terms are assumed to be non-systematic, stationary in a wide sense, and uncorrelated over time, that is
$$E(\epsilon_t) = 0_L, \qquad E(\epsilon_t \epsilon_\tau') = \begin{cases} \Sigma_{(L \times L)} & \text{if } t = \tau \\ 0_{(L \times L)} & \text{if } t \neq \tau \end{cases} \tag{2}$$
Actually, the pattern of relations recognizable in an econometric model can be interpreted either in terms of causal or interdependent schemes. A causal relation among variables is an asymmetric, theoretically-grounded and predictive relation which can be ideally understood as a stimulus-response mechanism. The equations of a model form a causal chain when, once they are properly ordered, each current endogenous variable turns out to be, on the one hand, the result of the joint effect of the endogenous variables which precede it in the chain and, on the other hand, a cause of the current endogenous variables which follow it in the chain. A model with equations that form a causal chain is said to be recursive. The following simple equation system provides an example of a recursive model (see Figure 1, left panel)
$$\begin{aligned}
y_{1,t} &= a_1' z_t + \epsilon_{1,t} \\
y_{2,t} &= \gamma_{2,1} y_{1,t} + a_2' z_t + \epsilon_{2,t} \\
y_{3,t} &= \gamma_{3,2} y_{2,t} + \gamma_{3,1} y_{1,t} + a_3' z_t + \epsilon_{3,t} \\
y_{4,t} &= \gamma_{4,3} y_{3,t} + \gamma_{4,1} y_{1,t} + a_4' z_t + \epsilon_{4,t}
\end{aligned} \tag{3}$$
Recursive systems can be easily estimated, equation by equation, using OLS, starting from the top of the chain.
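As a minimal sketch of this equation-by-equation strategy, the code below simulates a three-equation causal chain with uncorrelated errors and estimates each equation with `lm()`. The sample size, the coefficient values and the single exogenous variable `z` are illustrative assumptions and are not taken from the paper.

```r
# Simulate a recursive (causal-chain) system; all coefficient values are illustrative
set.seed(1)
n  <- 5000
z  <- rnorm(n)                       # single exogenous variable (assumption)
e  <- matrix(rnorm(3 * n), n, 3)     # mutually uncorrelated error terms
y1 <- 0.5 * z + e[, 1]
y2 <- 0.8 * y1 + 0.3 * z + e[, 2]
y3 <- -0.4 * y2 + 0.6 * y1 + 0.2 * z + e[, 3]

# Equation-by-equation OLS, starting from the top of the chain
fit1 <- lm(y1 ~ z)
fit2 <- lm(y2 ~ y1 + z)
fit3 <- lm(y3 ~ y2 + y1 + z)
round(coef(fit3), 2)
```

With uncorrelated errors the OLS estimates should be close to the coefficients used in the simulation; this is no longer guaranteed once feedback loops or correlated errors are present.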

When a causal chain exists among blocks of current endogenous variables, a causal order can be established among those blocks of equations. In this case, the current endogenous variables of a block are effects of the variables belonging to the blocks which come before them in the chain, as well as causes of the variables belonging to the blocks which follow the block at stake in the chain. The model is then of block-recursive type. The following simple equation system provides an example of a block-recursive model (see Figure 1, middle panel)
$$\begin{aligned}
y_{1,t} &= \gamma_{1,2} y_{2,t} + a_1' z_t + \epsilon_{1,t} \\
y_{2,t} &= \gamma_{2,1} y_{1,t} + a_2' z_t + \epsilon_{2,t} \\
y_{3,t} &= \gamma_{3,2} y_{2,t} + \gamma_{3,4} y_{4,t} + a_3' z_t + \epsilon_{3,t} \\
y_{4,t} &= \gamma_{4,3} y_{3,t} + \gamma_{4,1} y_{1,t} + a_4' z_t + \epsilon_{4,t}
\end{aligned} \tag{4}$$
Here, the chain is formed by two blocks of variables, $(y_1, y_2)$ and $(y_3, y_4)$, with the variables of the first block explaining those of the second.

Sometimes the composite nature of the connections among variables leads to a closed sequence of dependencies among variables to be ascribed to feedback loops. This type of interaction among endogenous variables is usually called interdependence. Interdependence is structurally connected to the presence of both current endogenous variables on the right-hand side of the model and the correlation between contemporaneous error terms. See the system below as an example in this regard (see Figure 1, right panel)
$$\begin{aligned}
y_{1,t} &= \gamma_{1,2} y_{2,t} + a_1' z_t + \epsilon_{1,t} \\
y_{2,t} &= \gamma_{2,1} y_{1,t} + \gamma_{2,3} y_{3,t} + a_2' z_t + \epsilon_{2,t} \\
y_{3,t} &= \gamma_{3,2} y_{2,t} + \gamma_{3,4} y_{4,t} + a_3' z_t + \epsilon_{3,t} \\
y_{4,t} &= \gamma_{4,3} y_{3,t} + \gamma_{4,1} y_{1,t} + a_4' z_t + \epsilon_{4,t}
\end{aligned} \tag{5}$$

[Figure 1 panels: (left) recursive model; (middle) block-recursive model; (right) interdependent model.]
Figure 1: The three patterns of relations in a simultaneous equation model.

Based on this premise, it is clear that the causal or interdependent features of a model’s equations depend on the pair of matrices Γ and Σ. The former matrix highlights the possible (circular) dependencies or feedbacks among endogenous variables, while the latter features those induced by the stochastic components. In fact, the correlation of error terms associated to an equation-pair may transform the link between the endogenous, explained by these equations, into a relation with feedback.

Moreover, the essential information concerning the causal structure of a model can be obtained from the topological properties of the pair of the mentioned matrices and, at the very end, from the topological properties of the associated binary matrices $\Gamma^b$ and $\Sigma^b$. (The term topological properties refers to those properties of a matrix which depend exclusively on the number and the relative position of its null and non-null elements.) A binary matrix associated to a matrix $G$ is a matrix whose entries are equal to 1 if the corresponding entries of $G$ are non-null, or 0 otherwise. Binary matrices preserve the topological properties of the parent matrices.
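The binary counterpart of a matrix can be obtained in one line of base R. The helper name `binarize` and the numeric entries of `G` are hypothetical and serve only to illustrate the definition above.

```r
# Binary (topological) counterpart of a matrix: 1 where an entry is non-null, 0 otherwise
binarize <- function(G) (G != 0) * 1

# Illustrative 3 x 3 hollow matrix (values are arbitrary)
G <- matrix(c( 0.0, 0.7, 0,
              -1.2, 0.0, 0,
               0.0, 0.4, 0), nrow = 3, byrow = TRUE)
Gb <- binarize(G)
Gb
```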

Following Faliva, matrix $\Gamma$ can be split as follows
$$\Gamma = \tilde{C} + \Psi_0 \tag{6}$$
where $\tilde{C}$ includes the coefficients associated to current endogenous variables involved in feedback loops, and $\Psi_0$ those associated to endogenous variables involved in causal relations.

Matrix $\tilde{C}$ is specified as follows
$$\tilde{C} = C + \Psi_1 \tag{7}$$
where $C$ includes the feedbacks arising in the systematic part of the model and matrix $\Psi_1$ those induced by the correlation of the error terms. Matrices $C$ and $\Psi_1$ are defined as follows
$$C = \Gamma * R, \qquad R = \left\{ \left[ \sum_{r=1}^{L-1} (\Gamma^b)^r \right]^b \right\}' \tag{8}$$
$$\Psi_1 = (\Gamma - C) * \left[ \Sigma^b (I + R) \right]^b \tag{9}$$
where the symbol "$*$" denotes the Hadamard product. (The Hadamard product of two matrices, $A$ and $B$ of the same order, is defined as the matrix of the term-to-term products of the elements of these matrices, that is $(A * B)_{(i,j)} = a_{(i,j)} b_{(i,j)}$. An alternative approach for determining the feedbacks operating at a systematic level in a model is based on graph theory.)

The rationale of (8) hinges on the fact that a direct feedback between variables $y_i$ and $y_j$ corresponds to the simultaneous non-nullity of the entries $\gamma_{i,j}$ and $\gamma_{j,i}$ of the coefficient matrix $\Gamma$. This entails that a direct feedback between these two variables exists if the $(i,j)$-th element of the matrix
$$\Gamma * (\Gamma^b)' \tag{10}$$
is non-null (the element $\gamma_{j,i}$ of $\Gamma$ corresponds to the element $\gamma_{i,j}$ of $\Gamma'$). An indirect feedback between the same variables is instead associated to a bidirectional connection between $y_i$ and $y_j$ established through other variables and equations. In algebraic terms this corresponds to the simultaneous non-nullity of the $(i,j)$-th element of $\Gamma$ and of the $(i,j)$-th element of a positive power of $\Gamma'$. This entails that an indirect feedback exists between the mentioned variables if the $(i,j)$-th element of the following matrix
$$\Gamma * \left\{ \left[ \sum_{r=2}^{L-1} (\Gamma^b)^r \right]^b \right\}' \tag{11}$$
is non-null.

Accordingly, matrix
$$\Psi = \Gamma - C \tag{12}$$
includes the coefficients associated to endogenous variables which, as far as the systematic aspects of the model are concerned, have a causal role. (It is worth mentioning that $\Psi$ is Hadamard-orthogonal to $C$; two matrices $A$ and $B$ are said to be Hadamard-orthogonal if $A * B = 0$. Furthermore, while matrix $C$ is co-spectral to $\Gamma$, i.e., they have the same eigenvalues, matrix $\Psi$ is a hollow-nilpotent matrix, like $\Gamma$; a square matrix $N$ is nilpotent if $N^k = 0$ for some $k < M$, where $M$ is the matrix dimension. A hollow, nilpotent matrix can always be expressed in triangular form.)
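The split in (8) and (12) can be sketched in base R as follows. This is an illustrative re-implementation, not the code of the SIRE package; the helper names (`binarize`, `reach_t`, `decompose_systematic`, `pattern`) are hypothetical. It is applied to the binary patterns of the example systems (3), (4) and (5).

```r
binarize <- function(M) (M != 0) * 1

# R in (8): transposed binary pattern of the sum of the first L - 1 powers of Gamma^b
reach_t <- function(Gb) {
  L <- nrow(Gb)
  P <- diag(L)
  S <- matrix(0, L, L)
  for (r in seq_len(L - 1)) {
    P <- binarize(P %*% Gb)   # pattern of (Gamma^b)^r
    S <- S + P
  }
  t(binarize(S))
}

# C = Gamma * R (feedback part) and Psi = Gamma - C (causal part)
decompose_systematic <- function(Gamma) {
  R <- reach_t(binarize(Gamma))
  C <- Gamma * R
  list(C = C, Psi = Gamma - C, R = R)
}

# Build a binary L x L pattern from (row, column) index pairs
pattern <- function(L, idx) { M <- matrix(0, L, L); M[idx] <- 1; M }

G3 <- pattern(4, rbind(c(2,1), c(3,2), c(3,1), c(4,3), c(4,1)))                  # system (3)
G4 <- pattern(4, rbind(c(1,2), c(2,1), c(3,2), c(3,4), c(4,3), c(4,1)))          # system (4)
G5 <- pattern(4, rbind(c(1,2), c(2,1), c(2,3), c(3,2), c(3,4), c(4,3), c(4,1)))  # system (5)

dec3 <- decompose_systematic(G3)
dec4 <- decompose_systematic(G4)
dec5 <- decompose_systematic(G5)
dec4$C    # feedbacks within the blocks (y1, y2) and (y3, y4)
dec4$Psi  # causal links from the first block to the second
```

In this sketch the recursive system (3) has an empty feedback matrix, the interdependent system (5) has an empty causal matrix, and the block-recursive system (4) has both.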

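In order to show how feedbacks operating in the systematic part of a model can be detected, let us consider as an example the following deterministic model
$$\begin{aligned}
y_{1,t} &= \gamma_{1,5} y_{5,t} + \gamma_{1,7} y_{7,t} + a_1' z_t \\
y_{2,t} &= a_2' z_{2,t} \\
y_{3,t} &= \gamma_{3,11} y_{11,t} + a_3' z_t \\
y_{4,t} &= \gamma_{4,3} y_{3,t} + a_4' z_t \\
y_{5,t} &= \gamma_{5,10} y_{10,t} + a_5' z_t \\
y_{6,t} &= \gamma_{6,5} y_{5,t} + \gamma_{6,9} y_{9,t} + a_6' z_t \\
y_{7,t} &= \gamma_{7,6} y_{6,t} + a_7' z_t \\
y_{8,t} &= \gamma_{8,12} y_{12,t} + a_8' z_t \\
y_{9,t} &= \gamma_{9,7} y_{7,t} + a_9' z_t \\
y_{10,t} &= \gamma_{10,5} y_{5,t} + a_{10}' z_{2,t} \\
y_{11,t} &= \gamma_{11,12} y_{12,t} + a_{11}' z_t \\
y_{12,t} &= \gamma_{12,4} y_{4,t} + \gamma_{12,11} y_{11,t} + a_{12}' z_t \\
y_{13,t} &= \gamma_{13,2} y_{2,t} + \gamma_{13,6} y_{6,t} + a_{13}' z_t
\end{aligned} \tag{13}$$
Matrix $\Gamma^b$ is the $13 \times 13$ binary matrix whose sixteen unit entries are located at the positions
$$(1,5),\ (1,7),\ (3,11),\ (4,3),\ (5,10),\ (6,5),\ (6,9),\ (7,6),\ (8,12),\ (9,7),\ (10,5),\ (11,12),\ (12,4),\ (12,11),\ (13,2),\ (13,6)$$
all the other entries being null. Using (8) and (12), $\Gamma^b$ is split into the following two submatrices, again identified by the positions of their unit entries
$$C^b:\ (3,11),\ (4,3),\ (5,10),\ (6,9),\ (7,6),\ (9,7),\ (10,5),\ (11,12),\ (12,4),\ (12,11); \qquad \Psi^b:\ (1,5),\ (1,7),\ (6,5),\ (8,12),\ (13,2),\ (13,6) \tag{14}$$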
Looking at matrix $C^b$, we see that the simultaneous non-nullity of the $c_{5,10}$, $c_{10,5}$, $c_{11,12}$, and $c_{12,11}$ elements implies the existence of two direct feedbacks: one between the variable-pair $y_5$ and $y_{10}$, and the other between $y_{11}$ and $y_{12}$. The non-nullity of the $c_{3,11}$, $c_{4,3}$, and $c_{12,4}$ elements denotes the existence of indirect feedbacks between the four variables $y_3$, $y_4$, $y_{11}$, and $y_{12}$. Similarly, variables $y_6$, $y_7$, and $y_9$ are connected by an (indirect) feedback as a consequence of the non-nullity of the $c_{6,9}$, $c_{7,6}$, and $c_{9,7}$ elements. Looking at matrix $\Psi^b$ we conclude that variables $y_5$ and $y_7$ have a causal role in the first equation. Variables $y_5$ and $y_{12}$ have the same role in equations six and eight, respectively, while variables $y_2$ and $y_6$ play a causal role in the last equation. The results ensuing from the decomposition of $\Gamma^b$ are depicted in Figure 2.
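The decomposition of model (13) can be cross-checked with a few lines of base R. The sketch below is an independent illustration of (8) and (12), not the SIRE implementation, and the helper names are hypothetical; the index pairs are read off the right-hand sides of the equations in (13).

```r
binarize <- function(M) (M != 0) * 1

# R in (8): transposed binary pattern of the sum of the first L - 1 powers of Gamma^b
reach_t <- function(Gb) {
  L <- nrow(Gb); P <- diag(L); S <- matrix(0, L, L)
  for (r in seq_len(L - 1)) { P <- binarize(P %*% Gb); S <- S + P }
  t(binarize(S))
}

# (row, column) = (explained variable, explicative endogenous variable) in model (13)
idx <- rbind(c(1,5), c(1,7), c(3,11), c(4,3), c(5,10), c(6,5), c(6,9), c(7,6),
             c(8,12), c(9,7), c(10,5), c(11,12), c(12,4), c(12,11), c(13,2), c(13,6))
Gb <- matrix(0, 13, 13)
Gb[idx] <- 1

Cb   <- Gb * reach_t(Gb)   # links involved in feedback loops
Psib <- Gb - Cb            # causal links

which(Cb == 1, arr.ind = TRUE)
which(Psib == 1, arr.ind = TRUE)
```

The positions returned for `Cb` correspond to the two direct feedbacks and the two indirect loops discussed above, while those returned for `Psib` are the causal links of equations one, six, eight and thirteen.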

Figure 2: Interdependent links (in red) and causal links (in black) operating in the model (13).

If the error terms are correlated, the causal structure of a model may no longer match that of its systematic counterpart, since part of the relations that are recursive at the systematic level, namely $\Psi y_t$, may become interdependent as a consequence of the feedback mechanisms induced by the stochastic terms in $\Sigma$. In this case, matrix $\Psi$ turns out to be the sum of two Hadamard-orthogonal matrices, $\Psi_0$ and $\Psi_1$, that is
$$\Psi = \Psi_0 + \Psi_1, \qquad \Psi_0 * \Psi_1 = 0_{(L \times L)} \tag{15}$$
where
$$\Psi_1 = \Psi * F, \qquad F = \left[ \Sigma^b (I + R) \right]^b \tag{16}$$
Here, matrix $\Psi_1$ includes the coefficients associated to the endogenous variables involved in loops induced by disturbances. In fact, it can be proved (see 1. in Appendix) that the matrix $[\Sigma^b (I + R)]^b$ is the binary counterpart of the covariance matrix between the error terms and the endogenous variables given by
$$E(\epsilon_t y_t') = \Sigma \left[ (I - \Gamma)^{-1} \right]' \tag{17}$$
The non-null elements of the above matrix express the effect of the model's left-hand side (LHS) endogenous variables on the right-hand side (RHS) ones, which are induced by the error term correlation.
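The covariance in (17) can be recovered in two steps from the reduced form of model (1). The derivation below is a sketch which assumes, in addition to (2), that the explicative variables $z_t$ are uncorrelated with the current error terms, i.e. $E(\epsilon_t z_t') = 0$; this assumption is made here for illustration and is not spelled out in this form in the text.

$$y_t = (I - \Gamma)^{-1} (A z_t + \epsilon_t)$$

$$E(\epsilon_t y_t') = E(\epsilon_t z_t')\, A' \left[ (I - \Gamma)^{-1} \right]' + E(\epsilon_t \epsilon_t') \left[ (I - \Gamma)^{-1} \right]' = \Sigma \left[ (I - \Gamma)^{-1} \right]'$$

Since the non-null pattern of $[(I - \Gamma)^{-1}]'$ is contained in that of $I + R$, the binary matrix $[\Sigma^b (I + R)]^b$ flags the entries of this covariance that can differ from zero.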

Equations (15) and (16) rest on the logical relations between the concepts of causality and predictability, where the notion of optimal predictor (in mean-square sense) tallies with that of conditional expectation. In fact, given that causal relations are also predictive, but not vice-versa, we can define as causal those relations that are both causal in the deterministic model and predictive in a stochastic context. This means that if the conditional expectations of the relations, which are causal in the deterministic model, namely Ψyt, are not affected by the error terms, then Ψyt turns out to also have a causal role in a stochastic context.

Accordingly, we can say that the stochastic specification is neutral with respect to the underlying systematic causal structure if the following holds
$$E(\Psi y_t + \epsilon_t \mid \Psi y_t) = \Psi y_t + E(\epsilon_t \mid \Psi y_t) = \Psi y_t \tag{18}$$
meaning that
$$E(\epsilon_t \mid \Psi y_t) = 0 \tag{19}$$

Otherwise, the correlation between the error terms and the endogenous variables may affect the conditional expectation of the error term as follows
$$E(\epsilon_t \mid \Psi y_t) = -\Psi_1 y_t \tag{20}$$
which, in turn, implies that
$$E(\Psi y_t + \epsilon_t \mid \Psi y_t) = \Psi y_t - \Psi_1 y_t = \Psi_0 y_t \tag{21}$$
In this case, only the subset $\Psi_0 y_t$ of the original set of causal relations, playing a predictive role, is causal. This, in turn, implies that the overall feedback operating in the system is included in matrix $\tilde{C} = C + \Psi_1$.

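To highlight the role played by the stochastic specification on the model causal structure, let us consider as an example the following specification for matrix $\Sigma^b$: a symmetric $13 \times 13$ binary matrix with unit diagonal entries and unit off-diagonal entries at the positions $(i,j)$ and $(j,i)$ for the pairs
$$\begin{aligned}
&(1,4),\ (1,5),\ (1,8),\ (1,10),\ (1,12),\ (2,4),\ (2,6),\ (2,8),\ (2,9),\ (3,6),\ (3,7),\ (3,11),\ (3,13), \\
&(4,5),\ (4,6),\ (4,8),\ (4,9),\ (4,10),\ (4,12),\ (5,8),\ (5,10),\ (5,12),\ (6,7),\ (6,8),\ (6,9),\ (6,11),\ (6,13), \\
&(7,11),\ (7,13),\ (8,9),\ (8,10),\ (8,12),\ (10,12),\ (11,13)
\end{aligned} \tag{22}$$
Then, matrix $\tilde{C}^b$ is given by
$$\tilde{C}^b = C^b + \Psi_1^b \tag{23}$$
where $C^b$ has unit entries at the positions $(3,11)$, $(4,3)$, $(5,10)$, $(6,9)$, $(7,6)$, $(9,7)$, $(10,5)$, $(11,12)$, $(12,4)$, $(12,11)$ and $\Psi_1^b$ has unit entries at the positions $(1,5)$, $(1,7)$, $(8,12)$, $(13,6)$, while matrix $\Psi_0^b$ collects the remaining unit entries of $\Psi^b$, that is
$$\Psi_0^b = \Psi^b - \Psi_1^b \tag{24}$$
The non-null correlation between the pairs of error terms $\{\epsilon_5, \epsilon_1\}$, $\{\epsilon_6, \epsilon_{13}\}$, $\{\epsilon_7, \epsilon_1\}$ and $\{\epsilon_{12}, \epsilon_8\}$ (see Equation (22)) has transformed the relations among the pairs of variables $\{y_5, y_1\}$, $\{y_6, y_{13}\}$, $\{y_7, y_1\}$, and $\{y_{12}, y_8\}$, which were causal in the deterministic model (13), into interdependent links. Figure 3 shows the effect of the stochastic specification (22) on the feedbacks originally detected in the deterministic model (13).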
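The effect of the stochastic specification can be traced with the base-R sketch below, which applies (16) to the causal part of model (13). It is an illustration under stated assumptions, not the SIRE code: the helper names are hypothetical, and the non-null pattern of the error covariance is taken from the index pairs used to build the `Sigma` matrix in the R session reported later in the paper.

```r
binarize <- function(M) (M != 0) * 1
reach_t <- function(Gb) {
  L <- nrow(Gb); P <- diag(L); S <- matrix(0, L, L)
  for (r in seq_len(L - 1)) { P <- binarize(P %*% Gb); S <- S + P }
  t(binarize(S))
}

L <- 13
Gb <- matrix(0, L, L)
Gb[rbind(c(1,5), c(1,7), c(3,11), c(4,3), c(5,10), c(6,5), c(6,9), c(7,6),
         c(8,12), c(9,7), c(10,5), c(11,12), c(12,4), c(12,11), c(13,2), c(13,6))] <- 1

# Error terms correlated with the j-th one (indices larger than j only)
nb <- list(c(4,5,8,10,12), c(4,6,8,9), c(6,7,11,13), c(5,6,8,9,10,12), c(8,10,12),
           c(7,8,9,11,13), c(11,13), c(9,10,12), integer(0), 12, 13,
           integer(0), integer(0))
Sb <- diag(L)
for (j in seq_len(L)) { Sb[nb[[j]], j] <- 1; Sb[j, nb[[j]]] <- 1 }

R     <- reach_t(Gb)
Cb    <- Gb * R                            # feedbacks of the systematic part
Psib  <- Gb - Cb                           # causal links of the systematic part
Fb    <- binarize(Sb %*% (diag(L) + R))    # F in (16)
Psi1b <- Psib * Fb                         # causal links turned into feedbacks by the errors
Psi0b <- Psib - Psi1b                      # links that remain causal

which(Psi1b == 1, arr.ind = TRUE)
which(Psi0b == 1, arr.ind = TRUE)
```

Under these assumptions `Psi1b` flags the four links named in the text, namely those between $y_1$ and $y_5$, $y_1$ and $y_7$, $y_8$ and $y_{12}$, and $y_{13}$ and $y_6$.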

Figure 3: Interdependent (in red) and causal (in black) links operating in the model (13) when the stochastic specification is as in (22). Dashed red lines with double-headed arrows denote interdependent links induced by the correlation of the error terms.

The flow-chart in Figure 4 shows the different cases, according to the structure of matrices Γ and Σ.

Figure 4: Flow-chart showing the possible outcome of the system decomposition in terms of Γ and Σ.

3 Testing the Significance of Feedback Loops

In the previous section an analytic framework was set up to describe the potential feedbacks operating in a model. In fact, the analysis developed, relying on binary matrices, was meant to be qualitative since it only highlights the feedback set that potentially operates in a model, given the characteristics of its relations and its stochastic specification. Only once the model has been duly estimated can the coefficients of matrix $\tilde{C}$ be properly evaluated. At this point, it proves useful to devise a procedure for testing the significance of the estimated loops. To this end, let us observe that, once the matrix including all the feedbacks operating in the model
$$C + \Psi_1 = C + (\Psi * F), \qquad F = \Sigma \left[ (I - \Gamma)^{-1} \right]' \tag{25}$$
has been properly estimated, a test for the effective functioning of feedback loops can be established, based on the significance of its non-null entries. Any given equation, say the $j$-th one, turns out to be involved in feedback loops with other equations of the model whenever the $j$-th row of the above matrix is not a null vector. Should the $(j,i)$-th entry of this matrix be non-null, then a feedback between the $j$-th and the $i$-th equation would be expected to exist (see (55) in the Appendix). Actually, it can be proved (see 2. in Appendix) that, in light of the identity
$$C + (\Psi * F) = (C + \Psi) * F = \Gamma * F \tag{26}$$
a test for the significance of the loops can be based on the examination of the statistical non-nullity of the elements of matrix $\Gamma * F$ which, unlike $\tilde{C}$, does not require the preliminary split of $\Gamma$ into its components, namely the feedback loops $C + \Psi_1$ and the causal links $\Psi_0$.

In this context, it can be proved that the $j$-th row of matrix $\Gamma * F$ measures both the direct effect of the RHS endogenous variables on the $j$-th one and the feedback effect of the latter on the former variables. In fact, the direct effects of the RHS endogenous variables, collected in vector $y_o$, on variable $y_j$ are included in the $j$-th row of matrix $\Gamma$ (excluding its $j$-th element), that is
$$\frac{\partial E(y_j \mid y_o)}{\partial y_o'} = e_j' \Gamma M_j \tag{27}$$
Here, $e_j$ is the $L$-dimensional $j$-th elementary vector and $M_j$ is the $(L \times (L-1))$ selection matrix obtained from the identity matrix by deleting its $j$-th column, that is
$$M_j = \left[ \underset{(L,1)}{e_1}, \dots, \underset{(L,1)}{e_{j-1}}, \underset{(L,1)}{e_{j+1}}, \dots, \underset{(L,1)}{e_L} \right] \tag{28}$$
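The role of the selection matrix in (27) and (28) can be seen on a small numerical example. The $3 \times 3$ matrix `Gamma` below is an arbitrary illustration and `selection_matrix` is a hypothetical helper name.

```r
# M_j: identity matrix of order L with its j-th column deleted, as in (28)
selection_matrix <- function(L, j) diag(L)[, -j, drop = FALSE]

# Arbitrary hollow coefficient matrix for a three-equation system
Gamma <- matrix(c(0.0, 0.5, 0.0,
                  0.3, 0.0, 0.2,
                  0.0, 0.4, 0.0), nrow = 3, byrow = TRUE)

j   <- 2
M_j <- selection_matrix(3, j)
e_j <- diag(3)[, j, drop = FALSE]

# Direct effects of the other endogenous variables on y_j, as in (27):
# the j-th row of Gamma without its j-th element
gamma_j <- t(e_j) %*% Gamma %*% M_j
gamma_j
```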

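The feedback effects of the $y_j$ variable on its explicative endogenous variables, $y_o$, are included in the $j$-th row of matrix $F$ (excluding its $j$-th element), that is
$$\frac{\partial E(y_o \mid y_j)}{\partial y_j} = (M_j' F' e_j) \tag{29}$$
To prove (29), let us focus on the $j$-th equation and consider this equation as the first of the system, with the others in sequence, that is
$$\underset{(1,1)}{y_j} = \underset{(1,L-1)}{\gamma_j'}\, \underset{(L-1,1)}{y_o} + \underset{(1,J)}{a_j'}\, \underset{(J,1)}{z} + \underset{(1,1)}{\epsilon_j} \tag{30}$$
$$\underset{(L-1,1)}{y_o} = \underset{(L-1,1)}{\eta}\, \underset{(1,1)}{y_j} + \underset{(L-1,L-1)}{\Gamma_o}\, \underset{(L-1,1)}{y_o} + \underset{(L-1,J)}{A_o}\, \underset{(J,1)}{z} + \underset{(L-1,1)}{\epsilon_o} \tag{31}$$
$$\begin{pmatrix} \epsilon_j \\ \epsilon_o \end{pmatrix} \sim N_L(0, \Sigma) \quad \text{where} \quad \Sigma = \begin{pmatrix} \sigma_{jj} & \sigma_{jo} \\ \sigma_{oj} & \Sigma_o \end{pmatrix} \tag{32}$$
Looking at the $j$-th equation, it is clear that vector $\gamma_j' = e_j' \Gamma M_j$ measures the direct effect of the (RHS) endogenous variables on $y_j$.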

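In order to determine the feedback effect of $y_j$ on $y_o$, let us rewrite (31) as follows
$$y_o = \eta (\gamma_j' y_o + a_j' z + \epsilon_j) + \Gamma_o y_o + A_o z + \epsilon_o \tag{33}$$
Next, given that, under normality, the following holds
$$\epsilon_o = \frac{\sigma_{oj}}{\sigma_{jj}} \epsilon_j + \zeta_o; \qquad \zeta_o \perp \epsilon_j \tag{34}$$
the set of equations (33) can be conveniently rewritten in the form
$$(I - G) y_o = D z + d \epsilon_j + \zeta_o \tag{35}$$
where
$$G = \eta \gamma_j' + \Gamma_o; \qquad D = \eta a_j' + A_o; \qquad d = \eta + \frac{\sigma_{oj}}{\sigma_{jj}} \tag{36}$$
This, in turn (see 3. in Appendix), entails
$$\frac{\partial E(y_o' \mid \epsilon_j)}{\partial y_j} = \frac{\partial E(y_o' \mid \epsilon_j)}{\partial \epsilon_j} \frac{\partial \epsilon_j}{\partial y_j} = \left[ (I - G)^{-1} d \right]' = \varphi_j' = \frac{1}{\sigma_{jj}} e_j' F M_j \tag{37}$$
Thus, we can conclude that the presence of non-zero elements in the vector
$$\rho_j' = \gamma_j' * \varphi_j' = (e_j' \Gamma M_j) * \left( \frac{1}{\sigma_{jj}} e_j' F M_j \right) = e_j' \frac{1}{\sigma_{jj}} (\Gamma * F) M_j \tag{38}$$
reveals the simultaneous action of both the direct effects of $y_o$ on $y_j$ and the feedback effects of $y_j$ on $y_o$.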

Accordingly, testing the significance of ρj means checking whether the j-th endogenous is involved in feedback loops with other endogenous variables.
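The quantities entering $\rho_j$ can be computed numerically as in the sketch below. The matrices `Gamma` and `Sigma` are arbitrary illustrative values, and $F$ is taken to be $\Sigma [(I - \Gamma)^{-1}]'$, i.e. the covariance matrix between error terms and endogenous variables; this reading of $F$ is an assumption of the sketch. The code also cross-checks the feedback vector $\varphi_j$ against its expression $(I - G)^{-1} d$ obtained from the partitioned system.

```r
# Illustrative three-equation system (values are arbitrary)
Gamma <- matrix(c(0.0, 0.5, 0.0,
                  0.3, 0.0, 0.2,
                  0.0, 0.4, 0.0), nrow = 3, byrow = TRUE)
Sigma <- matrix(c(1.0, 0.3, 0.1,
                  0.3, 2.0, 0.4,
                  0.1, 0.4, 1.5), nrow = 3, byrow = TRUE)
L <- nrow(Gamma)
j <- 1

I_L  <- diag(L)
Fmat <- Sigma %*% solve(I_L - t(Gamma))   # assumed form of F: Sigma [(I - Gamma)^{-1}]'
M_j  <- I_L[, -j, drop = FALSE]
e_j  <- I_L[, j, drop = FALSE]

gamma_j <- t(e_j) %*% Gamma %*% M_j                   # direct effects of y_o on y_j
phi_j   <- (t(e_j) %*% Fmat %*% M_j) / Sigma[j, j]    # feedback effects of y_j on y_o
rho_j   <- gamma_j * phi_j                            # element-wise (Hadamard) product

# Cross-check of phi_j through the partitioned system: phi_j = (I - G)^{-1} d
eta     <- Gamma[-j, j, drop = FALSE]
Gamma_o <- Gamma[-j, -j, drop = FALSE]
G       <- eta %*% gamma_j + Gamma_o
d       <- eta + Sigma[-j, j, drop = FALSE] / Sigma[j, j]
phi_chk <- solve(diag(L - 1) - G, d)

rho_j
```

A null entry of `rho_j` arises whenever either the direct effect or the feedback effect is absent, which is why the test focuses on the non-null entries of this vector.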

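Actually, the statistic of the test can be derived from (38), by deleting from $\gamma_j$ the elements that, according to the exclusion constraints postulated by the economic theory, are null. This leads us to move from the former $\rho_j$ vector to the following compressed vector
$$\tilde{\rho}_j' = \tilde{\gamma}_j' * \tilde{\varphi}_j' = (S_j \gamma_j)' * (S_j \varphi_j)' \tag{39}$$
which has no zero entries. Here $S_j$ is a selection matrix selecting from $\gamma_j$ and $\varphi_j$ the non-null entries. Accordingly, the reference model (30)-(32) can be restated as
$$y_j = \tilde{\gamma}_j' y_r + \tilde{a}_j' z_r + \epsilon_j \tag{40}$$
$$y_r = K z + \tilde{\varphi}_j \epsilon_j + \epsilon_r \tag{41}$$
$$f(\epsilon_j, \epsilon_r) \sim N_L \left( 0, \begin{pmatrix} \sigma_{jj} & 0' \\ 0 & \Omega \end{pmatrix} \right) \tag{42}$$
where $y_r = S_j y_o$, $\tilde{a}_j = S_r a_j$, $z_r = S_r z$ and $S_r$ is the matrix selecting the non-null entries from $a_j$ and the sub-set of predetermined variables playing an explicative role in the $j$-th equation.

Furthermore,
$$K = S_j (I - G)^{-1} D, \qquad \epsilon_r = S_j (I - G)^{-1} \zeta_o, \qquad \Omega = E(\epsilon_r \epsilon_r') \tag{43}$$
Hence, the issue of the nature, unidirectional rather than bidirectional, of the equation at stake can be unfolded by testing a hypothesis in the given form
$$\begin{cases} H_0: \tilde{\rho}_j = 0 \\ H_1: \tilde{\rho}_j \neq 0 \end{cases} \tag{44}$$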

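The Wald test takes the form
$$W = \hat{\tilde{\rho}}_j' \left( \hat{J} \hat{\Psi}^{-1} \hat{J}' \right)^{-1} \hat{\tilde{\rho}}_j \tag{45}$$
where $\hat{\tilde{\rho}}_j$ is the maximum likelihood estimate of $\tilde{\rho}_j$ (see 4. in Appendix), and $\hat{J}$, $\hat{\Psi}$ are, respectively, the Jacobian matrix
$$\hat{J} = \left. \frac{\partial \tilde{\rho}_j(\theta)}{\partial \theta'} \right|_{\theta = \hat{\theta}} \tag{46}$$
and the information matrix
$$\hat{\Psi} = \left. -\frac{\partial^2 l(\theta)}{\partial \theta \, \partial \theta'} \right|_{\theta = \hat{\theta}} \tag{47}$$
evaluated at the maximum likelihood estimate of the parameter vector
$$\theta' = \left[ \tilde{\gamma}_j', \tilde{a}_j', \tilde{\varphi}_j', \operatorname{vec}(K)', \sigma_{jj}, \operatorname{vech}(\Omega)' \right] \tag{48}$$
Under the null hypothesis
$$W \underset{as.}{\sim} \chi^2_k \tag{49}$$
where $k$ is the dimension of $\tilde{\rho}_j$.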
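Once the estimate of $\tilde{\rho}_j$, the Jacobian and the information matrix are available, the statistic in (45) and its asymptotic p-value take a few lines of base R. The function name `wald_test` and the numerical inputs below are hypothetical placeholders; the sketch also assumes that `info_hat` is the information matrix of the whole sample, so that its inverse approximates the covariance matrix of the parameter estimates.

```r
# Wald statistic of (45) with its asymptotic chi-square p-value, see (49)
wald_test <- function(rho_hat, J_hat, info_hat) {
  avar <- J_hat %*% solve(info_hat) %*% t(J_hat)          # J Psi^{-1} J'
  W    <- drop(t(rho_hat) %*% solve(avar) %*% rho_hat)
  k    <- length(rho_hat)
  c(statistic = W, df = k, p.value = pchisq(W, df = k, lower.tail = FALSE))
}

# Placeholder inputs (not estimates from the paper)
rho_hat  <- c(0.5, -0.2)
J_hat    <- diag(2)
info_hat <- diag(c(100, 400))

res <- wald_test(rho_hat, J_hat, info_hat)
res
```

A small p-value leads to the rejection of the null hypothesis in (44), that is, to the conclusion that the equation at stake is involved in a significant feedback loop.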

If the Wald test provides evidence that the $j$-th equation is involved in a statistically significant feedback loop with other equations of the model, it is worth singling out the variables that are primarily responsible for the feedback at hand. They can be identified by checking the significance of each non-null element of $\hat{\tilde{\rho}}_j$. Under the null that the $i$-th element of $\tilde{\rho}_j$ is zero, the Wald statistic for testing the significance of the loop bridging the $i$-th and $j$-th endogenous variables turns out to be
$$W = (e_i' \hat{\tilde{\rho}}_j) \left[ e_i' \left( \hat{J} \hat{\Psi}^{-1} \hat{J}' \right)^{-1} e_i \right] (\hat{\tilde{\rho}}_j' e_i) \underset{as.}{\sim} \chi^2_1 \tag{50}$$
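The element-wise statistic can be sketched along the same lines. The code below follows the expression in (50) as printed, with hypothetical function name and placeholder inputs; when the estimated covariance matrix $\hat{J} \hat{\Psi}^{-1} \hat{J}'$ is diagonal, as in this example, the result coincides with the squared ratio between an estimate and its standard error.

```r
# Wald statistic of (50) for the i-th element of the estimated vector
wald_element <- function(rho_hat, J_hat, info_hat, i) {
  V_inv <- solve(J_hat %*% solve(info_hat) %*% t(J_hat))   # (J Psi^{-1} J')^{-1}
  W_i   <- rho_hat[i] * V_inv[i, i] * rho_hat[i]
  c(statistic = W_i, p.value = pchisq(W_i, df = 1, lower.tail = FALSE))
}

# Placeholder inputs (not estimates from the paper)
rho_hat  <- c(0.5, -0.2)
J_hat    <- diag(2)
info_hat <- diag(c(100, 400))

w1 <- wald_element(rho_hat, J_hat, info_hat, 1)
w2 <- wald_element(rho_hat, J_hat, info_hat, 2)
rbind(w1, w2)
```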

4 Detecting and testing causal and interdependent links in a model with SIRE

Investigating potential feedbacks with SIRE

The analysis developed in the previous sections allows the identification of the potential feedbacks operating in a model. By assuming the stochastic specification of the model as known, the investigation can be carried out by using binary matrices Γb and Σb without a preliminary estimation of the model. The causal structure, which emerges from this analysis, is implied by the theory underlying the model and mirrored by the topological properties of matrices Γ and Σ. It is also important to point out that the feedback loops thus detected are only potential, because their effectiveness must find confirmation in empirical evidence.

We start by loading the SIRE package.

> install.packages("SIRE")
> library(SIRE)

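The function causal_decompose() is devised for decomposing the matrix Γ. If the structure of Σ is assumed as known by the user, the function takes as arguments, among others, the list of model equations (eq.system) and the matrix Σ supplied by the user (sigma.in), and provides as output the matrices resulting from the decomposition together with a graph object (dec.graph) representing the links of the system.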

Furthermore, if the error terms are assumed to be spherical, then the SIRE package simply splits Γ in two sub-matrices Cb and Ψb, reflecting the interdependent and causal relations operating in the system at a deterministic level.

With regard to the system (13), the corresponding code is

> eq.system <- list(
+              eq1 = y1 ~ y5 + y7, eq2 = y2 ~ z,
+              eq3 = y3 ~ y11, eq4 = y4 ~ y3,
+              eq5 = y5 ~ y10, eq6 = y6 ~ y5 + y9,
+              eq7 = y7 ~ y6, eq8 = y8 ~ y12,
+              eq9 = y9 ~ y7, eq10 = y10 ~ y5,
+              eq11 = y11 ~ y12, eq12 = y12 ~ y4 + y11,
+              eq13 = y13 ~ y2 + y6)
> #fictitious Sigma matrix
> Sigma <- diag(length(eq.system))
> #function call
> decompose.A <- causal_decompose(eq.system , sigma.in = Sigma)

The output comprises matrices Cb and Ψb given in (14). The graphical representation of the system, given in Figure 2, is obtained with the tkplot() function of the R package igraph.

> tkplot(decompose.A$dec.graph)

The following example refers to a matrix Σb specified as in (22)

> # indexes of non-null elements of Sigma
> sigma.idx <- cbind(
+  rbind(rep(1,5),c(4,5,8,10,12)),   #y1
+  rbind(rep(2,4),c(4,6,8,9)),       #y2
+  rbind(rep(3,4),c(6,7,11,13)),     #y3
+  rbind(rep(4,6),c(5,6,8,9,10,12)), #y4
+  rbind(rep(5,3),c(8,10,12)),       #y5
+  rbind(rep(6,5),c(7,8,9,11,13)),   #y6
+  rbind(rep(7,2),c(11,13)),         #y7
+  rbind(rep(8,3),c(9,10,12)),       #y8
+  rbind(rep(10,1),c(12)),           #y10
+  rbind(rep(11,1),c(13)))           #y11
> # fictitious Sigma matrix
> low.tri <- as.matrix(Matrix::sparseMatrix(i = sigma.idx[2,] , j = sigma.idx[1,], x = 1,
+                                           dims = rep(length(eq.system),2)))
> Sigma <- low.tri + t(low.tri) + diag(length(eq.system))
> # function call
> decompose.B <- causal_decompose(eq.system = eq.system,
+                             sigma.in = Sigma)

In this case, the package provides as output matrix Cb and splits matrix Ψb into sub-matrices Ψ1b and Ψ0b, as in (23) and (24). The tkplot() function can still be used to obtain the pictures of the relations among the variables given in Figure 3.

The next section will show how to perform the decomposition with causal_decompose() if the structure of Σ is not known and the goal is to carry out estimation and feedback testing from observed data.

Finding significant feedbacks with SIRE: an application to Italian macroeconomic data

As pointed out in the previous section, the results of a decomposition based on the binary matrices $\Gamma^b$ and $\Sigma^b$ must be considered as preliminary, since they show only the potential links acting in the system. The effectiveness of these links demands a confirmation based on a sound empirical-evidence argument. In fact, the lack of significance of one or more of the feedbacks thus detected can alter the nature of the connections among the endogenous variables found by the preliminary decomposition, which is based only on the topological properties of matrices $\Gamma$ and $\Sigma$. In order to show how effective feedbacks operating in a model can be detected and tested, we have applied the functionalities of SIRE to the Klein model. This model, originally conceived for the US economy, has been recast for the Italian economy. The Italian macroeconomic variables, mirroring the US counterparts, are available at http://dati.istat.it/. The given model is composed of n=60 observations on a quarterly basis and six equations explaining the following endogenous variables: consumption expenses for Italian families [C], added value [CP], private wages from dependent employment [WP], gross investment [I], gross capital stock [K], gross domestic product [GDP]. The model is specified as follows
$$\begin{bmatrix} C_t \\ I_t \\ WP_t \\ GDP_t \\ CP_t \\ K_t \end{bmatrix} = a_0 + \begin{bmatrix} 0 & \gamma_{12} & 0 & 0 & \gamma_{15} & 0 \\ 0 & 0 & 0 & 0 & 0 & \gamma_{26} \\ 0 & \gamma_{32} & 0 & \gamma_{34} & 0 & 0 \\ \gamma_{41} & \gamma_{42} & 0 & 0 & 0 & 0 \\ 0 & 0 & \gamma_{53} & 0 & 0 & 0 \\ 0 & \gamma_{62} & 0 & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} C_t \\ I_t \\ WP_t \\ GDP_t \\ CP_t \\ K_t \end{bmatrix} + \begin{bmatrix} a_{11} & 0 & 0 & 0 \\ a_{21} & 0 & 0 & 0 \\ 0 & 0 & a_{34} & 0 \\ 0 & 0 & a_{44} & 0 \\ 0 & 0 & 0 & a_{55} \\ 0 & a_{62} & 0 & 0 \end{bmatrix} \begin{bmatrix} CP_{t-1} \\ K_{t-1} \\ GDP_{t-1} \\ T_t \end{bmatrix} + \begin{bmatrix} e_C \\ e_I \\ e_{WP} \\ e_{GDP} \\ e_{CP} \\ e_K \end{bmatrix} \tag{51}$$
where $a_0$ is the intercept vector. As equation (51) shows, the set of predetermined variables includes one exogenous variable, taxes [$T_t$], and three lagged endogenous variables, that is: the one-lagged added value [$CP_{t-1}$], the one-lagged gross capital stock [$K_{t-1}$], and the one-lagged gross domestic product [$GDP_{t-1}$]. We first load the data into the R workspace.

> data(macroIT)

The model equations have been estimated with 3SLS by using the R package systemfit. The one-lagged capital stock [$K_{t-1}$], [$T_t$], [$CP_{t-1}$], and [$GDP_{t-1}$] have been employed as instrumental variables. If the user does not specify its structure, matrix Σ is estimated by using the covariance matrix of the structural residuals. The function causal_decompose() can also be employed to estimate both the model via 3SLS and the Σ matrix, and yields three matrices: C, Ψ1, and Ψ0. The first two include the coefficients associated to variables affected by feedback loops, operating either at a deterministic level or induced by error terms; the third contains the coefficients associated to variables playing a causal role in the system.

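This version of causal_decompose() takes as arguments, among others, the data set (data), the list of model equations (eq.system), the option for the estimation of the residual covariance matrix (resid.est) and the formula listing the instrumental variables (instruments).

The output of this function is a list which contains, among other objects, the matrices C, Ψ1, and Ψ0 described above.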

The code below performs the decomposition using the macroIT data

> #system of equations
> eq.system <- list(eq1 <- C ~  CP  + I +  CP_1  ,
+                   eq2 <- I ~ K + CP_1,
+                   eq3 <- WP ~  I + GDP +  GDP_1,
+                   eq4 <- GDP ~ C + I + GDP_1,
+                   eq5 <- CP ~   WP  + T,
+                   eq6 <- K ~ I + K_1)
> #instruments
> instruments <- ~ T +  CP_1 + GDP_1 + K_1
> #decomposition
> dec.macroIT <- causal_decompose(data = macroIT,
+                                 eq.system = eq.system,
+                                 resid.est = "noDfCor",
+                                 instruments = instruments)

Table 1 shows the results of the model estimation. Since some coefficients are not statistically significant (such as the coefficient associated to [I] in the equation explaining [C] and the coefficient associated to [GDP] in the equation explaining [WP]), the model has been re-estimated and the coefficient matrix associated to the explicative endogenous variables decomposed again.

> #system of equations
> eq.system <- list(eq1 <- C ~ CP + CP_1  ,
+                   eq2 <- I ~ K,
+                   eq3 <- WP ~  I +  GDP_1,
+                   eq4 <- GDP ~ C + I + GDP_1,
+                   eq5 <- CP ~   WP  + T,
+                   eq6 <- K  ~ I + K_1)
> #instruments
> instruments <- ~ T +  CP_1 + GDP_1 + K_1
> #decomposition
> dec.macroIT.new <- causal_decompose(data = macroIT,
+                                     eq.system = eq.system,
+                                     resid.est = "noDfCor",
+                                     instruments = instruments)
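
As a rough cross-check of the regressors dropped in the second specification, the sketch below computes approximate z-ratios from the entries of Table 1. It assumes that the figures in parentheses are standard errors and that a standard normal reference distribution is adequate; both are assumptions made here for illustration, and the object name `dropped` is hypothetical.

```r
# Estimates and (assumed) standard errors copied from Table 1
dropped <- data.frame(
  equation  = c("C", "WP", "I"),
  regressor = c("I", "GDP", "CP_1"),
  estimate  = c(0.1, 0.46, 0.005),
  std.error = c(0.19, 0.277, 0.04)
)

# Approximate z-ratio and two-sided p-value under a normal approximation
dropped$z <- dropped$estimate / dropped$std.error
dropped$p.value <- 2 * pnorm(-abs(dropped$z))

print(dropped, digits = 3)
```

Under these assumptions none of the three coefficients reaches the 0.05 level, which agrees with their removal from the re-estimated system.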

The results of the last estimation process are shown in Table 2. Looking at the Theil inequality indexes (Theil, 1961) reported in the last column of the table, we can see that the estimated equations fit the data very well: all Theil indexes are close to zero.
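
To make the reading of the Theil index concrete, the following sketch implements one common bounded form of the inequality coefficient, the ratio of the root mean squared error to the sum of the root mean squares of actual and fitted values. Whether SIRE uses exactly this variant is an assumption; the function name `theil_u()` and the simulated series are hypothetical.

```r
# One common form of Theil's inequality coefficient (bounded between 0 and 1)
theil_u <- function(actual, fitted) {
  stopifnot(length(actual) == length(fitted))
  rmse <- sqrt(mean((actual - fitted)^2))
  rmse / (sqrt(mean(actual^2)) + sqrt(mean(fitted^2)))
}

# Simulated series standing in for an observed variable and its fitted values
set.seed(123)
y     <- 100 + cumsum(rnorm(40))
y_hat <- y + rnorm(40, sd = 0.5)

u_fit     <- theil_u(y, y_hat)  # small value: fitted values track the data
u_perfect <- theil_u(y, y)      # perfect fit
u_worst   <- theil_u(y, -y)     # fitted values with the opposite sign
c(u_fit, u_perfect, u_worst)
```

With this definition a value close to zero signals a close fit, while values approaching one signal a poor one.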

Table 1: Macroeconomic model: preliminary estimates with 3SLS. The last column shows the Theil index for each model equation.
: significant at level α=0.1
*: significant at level α=0.05
**: significant at level α=0.01
***: significant at level α=0.001

| Equation | a_0 | C_t | I_t | WP_t | GDP_t | CP_t | K_t | T_t | GDP_{t-1} | CP_{t-1} | K_{t-1} | Theil |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| C_t | 5.50(11.97) | - | 0.1(0.19) | - | - | 10.1(0.09) | - | - | - | 0.35(0.099) | - | 0.0073 |
| I_t | 10.28(7.627) | - | - | - | - | - | 0.72(0.089) | - | - | 0.005(0.04) | - | 0.0115 |
| WP_t | 320.17(28.87) | - | 1.93(0.453) | - | 0.46(0.277) | - | - | - | 1.04(0.251) | - | - | 0.0288 |
| GDP_t | 4.55(10.732) | 1.02(0.098) | 0.37(0.181) | - | - | - | - | - | 0.31(0.075) | - | - | 0.004 |
| CP_t | 144.49(10.848) | - | - | 0.49(0.045) | - | - | - | 3.62(0.373) | - | - | - | 0.0074 |
| K_t | 5.69(1.599) | - | 0.68(0.073) | - | - | - | - | - | - | - | 0.49(0.053) | 0.0063 |

Table 2: Macroeconomic model: final estimates with 3SLS. The last column shows the Theil index for each model equation.
: significant at level α=0.1
*: significant at level α=0.05
**: significant at level α=0.01
***: significant at level α=0.001

| Equation | a_0 | C_t | I_t | WP_t | GDP_t | CP_t | K_t | T_t | GDP_{t-1} | CP_{t-1} | K_{t-1} | Theil |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| C_t | 10.06(9.17) | - | - | - | - | 1.02(0.115) | - | - | - | 0.39(0.111) | - | 0.0076 |
| I_t | 11.22(2.202) | - | - | - | - | - | 0.73(0.027) | - | - | - | - | 0.0114 |
| WP_t | 299.12(26.387) | - | 1.65(0.444) | - | - | - | - | - | 1.39(0.134) | - | - | 0.0304 |
| GDP_t | 3.04(8.89) | 1.09(0.118) | 0.39(0.112) | - | - | - | - | - | 0.28(0.076) | - | - | 0.0042 |
| CP_t | 142.70(9.554) | - | - | 0.48(0.034) | - | - | - | 3.68(0.299) | - | - | - | 0.0073 |
| K_t | 5.53(1.606) | - | 0.67(0.074) | - | - | - | - | - | - | - | 0.49(0.053) | 0.0062 |

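The estimated covariance matrix of the structural error terms (a symmetric matrix, of which the lower triangle is shown) is given by
\[
\hat{\boldsymbol\Sigma}=\begin{bmatrix}
10.93 & & & & & \\
2.51 & 2.61 & & & & \\
10.75 & 5.04 & 52.31 & & & \\
7.55 & 1.55 & 3.66 & 7.15 & & \\
9.6 & 4.27 & 19.73 & 6.07 & 15.08 & \\
0.43 & 0.68 & 0.53 & 0.09 & 0.68 & 0.81
\end{bmatrix}
\]
while matrices C+Ψ1 and Ψ0 turn out to be
\[
\mathbf C+\boldsymbol\Psi_1=
\begin{bmatrix}
0&0&0&0&0&0\\
0&0&0&0&0&0.73\\
0&0&0&0&0&0\\
0&0&0&0&0&0\\
0&0&0&0&0&0\\
0&0.67&0&0&0&0
\end{bmatrix}
+
\begin{bmatrix}
0&0&0&0&1.02&0\\
0&0&0&0&0&0\\
0&1.65&0&0&0&0\\
1.09&0.39&0&0&0&0\\
0&0&0.48&0&0&0\\
0&0&0&0&0&0
\end{bmatrix}
=
\begin{bmatrix}
0&0&0&0&1.02&0\\
0&0&0&0&0&0.73\\
0&1.65&0&0&0&0\\
1.09&0.39&0&0&0&0\\
0&0&0.48&0&0&0\\
0&0.67&0&0&0&0
\end{bmatrix}
\tag{52}
\]
\[
\boldsymbol\Psi_0=\mathbf 0 \tag{53}
\]
Rows and columns follow the order [C], [I], [WP], [GDP], [CP], [K] of the model equations. The matrix in Equation (52) embodies all the coefficients associated with variables involved in feedback loops, while matrix (53) includes those associated with variables playing a causal role. Looking at (52) we find a direct feedback between variables [I] and [K], while the variables of the pairs [I, WP], [I, GDP], [C, GDP], [CP, C], and [CP, WP] are directly linked (a black arrow connects the variables of each pair) as well as explained by equations with correlated errors. Accordingly, the variables of each pair may be internally connected by feedback loops.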
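
The structure of the first matrix in (52) can be reproduced from the final coefficient estimates with a few lines of base R. The sketch below is not the SIRE implementation. It assumes that the entry in row i and column j of Γ is non-null when variable j is explicative for variable i, and that a link belongs to C when the explained variable reaches back its explicative variable through a chain of at most N-1 links. The object names (`Gamma`, `reach_b`, `C_hat`, `rest`) are hypothetical.

```r
vars <- c("C", "I", "WP", "GDP", "CP", "K")
N <- length(vars)

# Gamma[i, j] != 0 means that variable j is explicative for variable i
Gamma <- matrix(0, N, N, dimnames = list(vars, vars))
Gamma["C", "CP"]  <- 1.02
Gamma["I", "K"]   <- 0.73
Gamma["WP", "I"]  <- 1.65
Gamma["GDP", "C"] <- 1.09
Gamma["GDP", "I"] <- 0.39
Gamma["CP", "WP"] <- 0.48
Gamma["K", "I"]   <- 0.67

# Binary pattern of Gamma and of the chains of length 1, ..., N - 1
Gamma_b <- (Gamma != 0) * 1
reach <- matrix(0, N, N)
power <- diag(N)
for (n in seq_len(N - 1)) {
  power <- power %*% Gamma_b
  reach <- reach + power
}
reach_b <- (reach != 0) * 1
dimnames(reach_b) <- dimnames(Gamma)

# A link j -> i closes a loop when i reaches j, i.e. reach_b[j, i] == 1
C_hat <- Gamma * t(reach_b)
rest  <- Gamma - C_hat

print(C_hat)
print(rest)
```

Only the pair [I], [K] survives in `C_hat`, as in the first matrix of (52), while `rest` collects the five remaining links, which coincide with Ψ1 here because Ψ0 is null in (53).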

The goal of our testing procedure will be to bring out which of these feedbacks, being significant, are truly effective.

Figure 5 depicts the links operating in this model, using the function tkplot() of the igraph package. In this figure, a unidirectional arrow denotes that a variable is explicative for another. If two variables are explicative one for the other, a direct feedback loop exists, depicted as two red arrows going in opposite directions. Instead, a red, dashed, curved, two-headed arrow between two variables indicates the existence of a feedback induced by error correlation.

> tkplot(dec.macroIT.new$dec.graph)
Figure 5: Path diagram of the macroeconomic model. Unidirectional arrows denote that one variable is explicative for another. The two red unidirectional arrows denote the presence of a direct feedback. The red, dashed, curved, double-headed arrows between pairs of variables denote feedback loops induced by error correlation.

Testing for feedback effects

The significance of these loops has been investigated by using the function feedback_ml(), which performs the Wald test given in (50). The 3SLS parameter estimates have been used as preliminary estimates to obtain the maximum likelihood (ML) estimates of the parameters needed to build the test statistic. In particular, in order to reach the global maximum of the log-likelihood, the initial 3SLS parameter estimates have been randomly perturbed a certain number of times. The optimizer chosen for this purpose is included in the Rsolnp package, whose function gosolnp is specially designed for the randomization of starting values. The function feedback_ml() takes the following arguments:

The output of this function is a list containing the following objects:

As an example, let us assume that the interest is in testing the significance of the feedbacks affecting the second equation, explaining the endogenous variable [I]. According to the previous analysis, this variable is connected to [K] by a bidirectional link.

The Wald test for the significance of this feedback is performed by using the function feedback_ml(), specified as follows:

> test.E2=feedback_ml(data = macroIT,
+                     out.decompose = dec.macroIT.new,
+                     eq.id = 2,
+                     lb = min(dec.macroIT.new$Sigma) - 10,
+                     ub = max(dec.macroIT.new$Sigma) + 10,
+                     nrestarts = 10,
+                     nsim = 20000,
+                     seed.in = 1)

By visualizing the estimate of ρ and the Wald statistic

> test.E2$rho.tbl
  Feedback eqn.   rho.est
1             6 0.1641469
> test.E2$wald
         [,1]
[1,] 4.115221

we can see that the existence of a feedback loop between [I] and [K] is confirmed.
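
The Wald statistic can be turned into a p-value with a chi-square reference distribution. The sketch below assumes one degree of freedom, since a single feedback (between [I] and [K]) is under test; this choice is an assumption made here, and the object names are hypothetical. With it, the result is in line with the p-value reported for equation I in Table 3.

```r
# Value of test.E2$wald reported above
wald <- 4.115221

# Assumption: a single feedback is tested, hence one degree of freedom
p.value <- pchisq(wald, df = 1, lower.tail = FALSE)
p.value

# Decision at the 5% level
p.value < 0.05
```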

Table 3 shows the results of the test for all the equations of the model. Looking at the p-values, we conclude that all feedbacks are significant except the one affecting the equation of [CP] and one of those affecting the equation of [GDP]. As for [CP], it is explained by [WP] without a feedback effect from the latter to the former. Regarding [GDP], which is affected by feedback effects, a deeper analysis is required in order to understand which of its two explicative variables, [C] and [I] (if not both), is responsible for them. To this end, we have applied the Wald statistic given in (50), which leads us to conclude that only [C] is involved in a feedback loop with [GDP].

Table 3: Macroeconomic model: tests for feedback effects for the final model. Joint W denotes the Wald statistic used to test the set of feedback loops affecting a given variable (see (45)). Singular W denotes the Wald statistic used to test the feedback effect between two specific variables (see (50)).

| Equation | Feedback Variable | Joint W | p-value | Singular W | p-value |
|---|---|---|---|---|---|
| C | CP | 386.6 | <.001 | - | - |
| I | K | 4.115 | 0.042 | - | - |
| WP | I | 25.55 | <.001 | - | - |
| GDP | C | 95.368 | <0.0001 | 84.315 | <0.0001 |
| | I | | | 0.352 | 0.553 |
| CP | WP | 0.046 | 0.831 | - | - |
| K | I | 19.595 | <0.0001 | - | - |

Finally, the path diagram fully describing the recursive and interdependent relationships in the model is displayed in Figure 6.

Figure 6: Path diagram of the modified macroeconomic model after testing for feedback effects. Black arrows denote causal links (Ψ0), red arrows denote interdependent links (C), and black arrows paired with red dashed arrows denote interdependent links induced by the correlation of the error terms (Ψ1).

5 Discussion

The set of functions worked out in the paper allows a system of simultaneous equations to be split into recursive and/or interdependent subsystems. The user can rely on causal_decompose() in two ways: to assess the presence of interdependent relations given a known correlation structure of the error terms, or to estimate the whole model when empirical data are available.

The significance of the feedback loops operating in the model is tested with a Wald test using the feedback_ml() function. The 3SLS parameter estimates are used as preliminary estimates to obtain the maximum likelihood ones, which are needed to build the test.
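
The idea of perturbing the preliminary estimates and keeping the best optimum can be illustrated with base R. The sketch below does not use Rsolnp and does not reproduce feedback_ml(): the toy objective, the helper `multi_start()`, and its arguments are hypothetical and serve only to show the multi-start logic.

```r
# Toy objective with several local minima (stand-in for a negative log-likelihood)
neg_loglik <- function(theta) sin(3 * theta) + 0.1 * theta^2

multi_start <- function(fn, start, n.restarts = 10, sd = 2, seed = 1) {
  set.seed(seed)
  # First run from the preliminary estimate, then from perturbed copies of it
  starts <- c(start, start + rnorm(n.restarts, sd = sd))
  fits <- lapply(starts, function(s) optim(s, fn, method = "BFGS"))
  values <- vapply(fits, function(f) f$value, numeric(1))
  list(best = fits[[which.min(values)]], values = values)
}

res <- multi_start(neg_loglik, start = 2, n.restarts = 10)
res$best$par    # parameter value at the best optimum found
res$best$value  # objective value at the best optimum found
```

Keeping the run from the unperturbed starting value in the candidate set guarantees that the selected solution is never worse than the one a single optimization would return.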

As for the rationale of our procedure, which rests on a properly devised test, it is worth taking into account the considerable concern raised recently in the statistical community about the use of significance testing (see Wasserstein and Lazar, 2016). In this connection, in order to avoid improper use of p-values and significance-related results, it may be worth addressing the issue of detecting feedback mechanisms in a simultaneous equations model with different approaches. Among them, the construction of confidence intervals and the employment of Bayesian methods look particularly promising for future investigation.

Moving now on more technical notes:

6 Proof of relevant formulas

In this Appendix we provide the proofs of some relevant formulas of the paper.

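  1. Let \(\boldsymbol\Sigma\) and \(\mathbf R\) be defined as in Section 2. Then, the proof that \(\boldsymbol\Sigma^b(\mathbf I+\mathbf R)\) is the binary matrix associated to \(\boldsymbol\Sigma(\mathbf I-\boldsymbol\Gamma)^{-1}\) is based on the following two theorems.

    Theorem 1. If two conformable matrices, \(\mathbf A\) and \(\mathbf B\), are such that \(\mathbf A=\mathbf A\ast\mathbf H\) and \(\mathbf B=\mathbf B\ast\mathbf K\), then the binary matrix associated to \(\mathbf A\mathbf B\) is \((\mathbf H\mathbf K)^b\).

    Theorem 2. If a non-singular matrix \(\mathbf A\) is such that \(\mathbf A=\mathbf A\ast\mathbf H\), where \(\mathbf H\) is a given binary matrix, then \[\mathbf A^{-1}\ast\Big(\mathbf I+\sum_{n=1}^{N-1}\mathbf H^{n}\Big)^b=\mathbf A^{-1}\] where \(N\) is the matrix dimension.

    Now, upon noting that \((\mathbf I-\boldsymbol\Gamma)=(\mathbf I-\boldsymbol\Gamma)\ast(\mathbf I+\boldsymbol\Gamma^b)\), reference to Theorem 2 leads to the conclusion that \[(\mathbf I-\boldsymbol\Gamma)^{-1}=(\mathbf I-\boldsymbol\Gamma)^{-1}\ast(\mathbf I+\mathbf R) \tag{54}\] Next, taking into account that \(\boldsymbol\Sigma^b\) and \((\mathbf I+\mathbf R)\) are the binary counterparts of \(\boldsymbol\Sigma\) and \((\mathbf I-\boldsymbol\Gamma)^{-1}\), reference to Theorem 1 entails the following \[\boldsymbol{\Sigma}(\mathbf I-\boldsymbol\Gamma)^{-1}=[\boldsymbol{\Sigma}(\mathbf I-\boldsymbol\Gamma)^{-1}]\ast [\boldsymbol{\Sigma}^b(\mathbf I+\mathbf R)].\]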

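  2. The proof that \(\mathbf C\) and \(\mathbf F\), defined as in Section 3, satisfy the following relationship \[\mathbf C\ast\mathbf F=\mathbf C \tag{55}\] hinges on a preliminary result given in the following theorem.

    Theorem 3. The matrices \(\mathbf C\) and \(\mathbf I+\mathbf R\) satisfy the following relationship \[\mathbf C^b\ast(\mathbf I+\mathbf R)=\mathbf C^b\]

    Proof

    Taking into account that the Hadamard product is both commutative (\(\mathbf A\ast\mathbf B=\mathbf B\ast\mathbf A\)) and idempotent for binary matrices (\(\mathbf A^b\ast\mathbf A^b=\mathbf A^b\)), and that, \(\boldsymbol\Gamma\) being hollow, \(\boldsymbol\Gamma^b\ast\mathbf I=\mathbf 0\) holds, simple computations yield \[\mathbf C^b\ast(\mathbf I+\mathbf R)=\boldsymbol\Gamma^b\ast\mathbf R\ast(\mathbf I+\mathbf R)=\boldsymbol\Gamma^b\ast\mathbf R\ast\mathbf I+\boldsymbol\Gamma^b\ast\mathbf R\ast\mathbf R=\mathbf C^b \tag{56}\]
    Now, consider the following theorem (where the symbol \(\mathbf A\geq 0\) denotes that all the elements of matrix \(\mathbf A\) are non-negative numbers):

    Theorem 4. Let \(\mathbf B\geq 0\) and \(\mathbf A^b\ast\mathbf B^b=\mathbf A^b\). If \(\mathbf C\geq 0\), then \[\mathbf A^b\ast(\mathbf B+\mathbf C)^b=\mathbf A^b\]

    Given this premise, we can now prove (55). To this end, let us write \(\boldsymbol\Sigma^b\) as follows \[(\mathbf I+\boldsymbol\Delta)=\boldsymbol\Sigma^b \tag{57}\] where \(\boldsymbol\Delta\) is a hollow matrix (\(\boldsymbol\Delta\ast\mathbf I=\mathbf 0\)), and note that, in light of (57) and (54), the binary matrix associated to \(\mathbf F\) is, according to Theorem 1, given by \[\mathbf F^b=[(\mathbf I+\boldsymbol\Delta)(\mathbf I+\mathbf R)]^b\] Next, use of Theorems 3 and 4 yields the following \[\mathbf C^b\ast\mathbf F^b=\mathbf C^b\ast[(\mathbf I+\boldsymbol\Delta)(\mathbf I+\mathbf R)]^b=\mathbf C^b\ast[(\mathbf I+\mathbf R)+\boldsymbol\Delta(\mathbf I+\mathbf R)]^b=\mathbf C^b\] as \((\boldsymbol\Delta(\mathbf I+\mathbf R))^b\geq 0\). This, in turn, entails that \[\mathbf C^b+\boldsymbol\Psi_1^b\ast\mathbf F^b=(\mathbf C^b+\boldsymbol\Psi_1^b)\ast\mathbf F^b=\boldsymbol\Gamma^b\ast\mathbf F^b \tag{58}\] which means that \(\mathbf C+\boldsymbol\Psi_1\ast\mathbf F\) and \(\boldsymbol\Gamma\ast\mathbf F\) have the same topological structure.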

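  3. Proof of (37). Formula (37) can be proved as follows. First, note that the matrix \(\tilde{\boldsymbol\Gamma}\) weighting the current endogenous explicative variables in the model (30), (31) can be expressed as \[\tilde{\boldsymbol\Gamma}=\mathbf P_j\boldsymbol\Gamma\mathbf P_j \tag{59}\] where \(\mathbf P_j\) is a permutation matrix obtained from an identity matrix by interchanging its first row with its \(j\)-th row. Then note that \[\tilde{\boldsymbol\Gamma}=\begin{bmatrix}0 & \boldsymbol\gamma_j'\\ \boldsymbol\eta & \boldsymbol\Gamma_o\end{bmatrix}\] and that \[(\mathbf I-\tilde{\boldsymbol\Gamma})^{-1}=\begin{bmatrix}1+\boldsymbol\gamma_j'\mathbf L^{-1}\boldsymbol\eta & \boldsymbol\gamma_j'\mathbf L^{-1}\\ \mathbf L^{-1}\boldsymbol\eta & \mathbf L^{-1}\end{bmatrix},\] where \[\mathbf L=\mathbf I-\boldsymbol\Gamma_o-\boldsymbol\eta\boldsymbol\gamma_j'=(\mathbf I-\mathbf G).\] Accordingly \[\frac{1}{\sigma_{jj}}\mathbf M_1'(\mathbf I-\tilde{\boldsymbol\Gamma})^{-1}\tilde{\boldsymbol\Sigma}\mathbf e_1=(\mathbf I-\mathbf G)^{-1}\mathbf d=\boldsymbol\varphi_j,\] where \(\mathbf e_1\) is the first elementary vector, \(\tilde{\boldsymbol\Sigma}=\mathbf P_j\boldsymbol\Sigma\mathbf P_j\), \(\mathbf G\) and \(\mathbf d\) are defined as in (32) and (36) respectively, and \(\mathbf M_1\) is the selection matrix obtained from the identity matrix by deleting its first column. Now, taking into account that \((\mathbf I-\tilde{\boldsymbol\Gamma})=\mathbf P_j(\mathbf I-\boldsymbol\Gamma)\mathbf P_j\) holds in light of (59), and that \((\mathbf I-\tilde{\boldsymbol\Gamma})^{-1}=\mathbf P_j(\mathbf I-\boldsymbol\Gamma)^{-1}\mathbf P_j\) proves true, as \(\mathbf P_j\) is both symmetric and orthogonal, some computations yield \[\boldsymbol\varphi_j=\frac{1}{\sigma_{jj}}\mathbf M_1'(\mathbf I-\tilde{\boldsymbol\Gamma})^{-1}\tilde{\boldsymbol\Sigma}\mathbf e_1=\frac{1}{\sigma_{jj}}\mathbf M_1'\mathbf P_j(\mathbf I-\boldsymbol\Gamma)^{-1}\mathbf P_j\tilde{\boldsymbol\Sigma}\mathbf P_j\mathbf P_j\mathbf e_1=\frac{1}{\sigma_{jj}}\mathbf M_j'(\mathbf I-\boldsymbol\Gamma)^{-1}\boldsymbol\Sigma\mathbf e_j=\frac{1}{\sigma_{jj}}\mathbf M_j'\mathbf F\mathbf e_j \tag{60}\]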

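  4. Derivation of the log-likelihood for the model (40)-(42)
    The logarithm of the density in (42) is given by \[\ln f(\epsilon_j,\boldsymbol\epsilon_r)=c-\frac{1}{2}\ln\sigma_{jj}-\frac{1}{2}\ln|\boldsymbol\Omega|-\frac{\epsilon_j^2}{2\sigma_{jj}}-\frac{1}{2}\boldsymbol\epsilon_r'\boldsymbol\Omega^{-1}\boldsymbol\epsilon_r \tag{61}\] where \(c\) is a constant term. Now, upon noting that \[|\mathbf J|=\left|\frac{\partial(\epsilon_j,\boldsymbol\epsilon_r')'}{\partial(y_j,\mathbf y_r')'}\right|=1,\] and assuming to operate with \(N\) observations on the variables of interest, the log-likelihood function can be written as \[l=\sum_{t=1}^{N}l(y_j,\mathbf y_r)=k-\frac{N}{2}\ln\sigma_{jj}-\frac{N}{2}\ln|\boldsymbol\Omega|-\frac{\boldsymbol\alpha'\mathbf H\boldsymbol\alpha}{2\sigma_{jj}}-\frac{1}{2}\mathrm{tr}(\boldsymbol\Xi'\boldsymbol\Omega^{-1}\boldsymbol\Xi\mathbf H) \tag{62}\] where \[\boldsymbol\alpha'=[1,\,-\tilde{\boldsymbol\gamma}_j',\,-\tilde{\mathbf a}_j'\mathbf S_r] \tag{63}\]

    \[\boldsymbol\Xi=[-\tilde{\boldsymbol\varphi}_j,\;\mathbf I+\tilde{\boldsymbol\varphi}_j\tilde{\boldsymbol\gamma}_j',\;\tilde{\boldsymbol\varphi}_j\tilde{\mathbf a}_j'\mathbf S_r-\mathbf K] \tag{64}\]

    \[\boldsymbol\nu'=[y_j,\,\mathbf y_r',\,\mathbf z'],\qquad \mathbf H=\Big(\sum_{t=1}^{N}\boldsymbol\nu_t\boldsymbol\nu_t'\Big) \tag{65}\] and \(k\) is a constant term. Formula (62) can be obtained by noting that, in light of (40), the following holds \[\epsilon_j=y_j-\tilde{\boldsymbol\gamma}_j'\mathbf y_r-\tilde{\mathbf a}_j'\mathbf z_r=[1,\,-\tilde{\boldsymbol\gamma}_j',\,-\tilde{\mathbf a}_j'\mathbf S_r]\begin{bmatrix}y_j\\ \mathbf y_r\\ \mathbf z\end{bmatrix}=\boldsymbol\alpha'\boldsymbol\nu \tag{66}\] and that, according to (41), we have \[\boldsymbol\epsilon_r=\mathbf y_r-\mathbf K\mathbf z-\tilde{\boldsymbol\varphi}_j\epsilon_j=\mathbf y_r-\mathbf K\mathbf z-\tilde{\boldsymbol\varphi}_j(y_j-\tilde{\boldsymbol\gamma}_j'\mathbf y_r-\tilde{\mathbf a}_j'\mathbf S_r\mathbf z)=[-\tilde{\boldsymbol\varphi}_j,\;\mathbf I+\tilde{\boldsymbol\varphi}_j\tilde{\boldsymbol\gamma}_j',\;\tilde{\boldsymbol\varphi}_j\tilde{\mathbf a}_j'\mathbf S_r-\mathbf K]\begin{bmatrix}y_j\\ \mathbf y_r\\ \mathbf z\end{bmatrix}=\boldsymbol\Xi\boldsymbol\nu \tag{67}\] This implies that \[\boldsymbol\epsilon_r'\boldsymbol\Omega^{-1}\boldsymbol\epsilon_r=\mathrm{tr}(\boldsymbol\nu'\boldsymbol\Xi'\boldsymbol\Omega^{-1}\boldsymbol\Xi\boldsymbol\nu)=\mathrm{tr}(\boldsymbol\Xi'\boldsymbol\Omega^{-1}\boldsymbol\Xi\boldsymbol\nu\boldsymbol\nu') \tag{68}\]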
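
Two of the algebraic steps used in items 3 and 4 can be checked numerically. The sketch below uses randomly generated matrices and hypothetical object names. For item 3 it assumes that the partitioned matrix has blocks 0, γj', η and Γo and verifies the stated expression of its inverse; for item 4 it verifies that the sums over the observations of εj² and of εr'Ω⁻¹εr can be written through the moment matrix H, as in (62) and (68).

```r
set.seed(1)

## Item 3: inverse of I - Gamma_t, with Gamma_t = [0, gamma_j'; eta, Gamma_o]
n <- 4
gamma_j <- runif(n, -0.3, 0.3)
eta     <- runif(n, -0.3, 0.3)
Gamma_o <- matrix(runif(n * n, -0.2, 0.2), n, n)
diag(Gamma_o) <- 0
Gamma_t <- rbind(c(0, gamma_j), cbind(eta, Gamma_o))

L     <- diag(n) - Gamma_o - eta %*% t(gamma_j)
L_inv <- solve(L)
block_inv <- rbind(
  c(1 + t(gamma_j) %*% L_inv %*% eta, t(gamma_j) %*% L_inv),
  cbind(L_inv %*% eta, L_inv)
)
direct_inv  <- solve(diag(n + 1) - Gamma_t)
check_block <- all.equal(unname(block_inv), unname(direct_inv))

## Item 4: quadratic forms written through H = sum_t nu_t nu_t'
N <- 50; r <- 3; m <- 6
V     <- matrix(rnorm(N * m), N, m)   # row t holds nu_t'
alpha <- rnorm(m)
Xi    <- matrix(rnorm(r * m), r, m)
A     <- matrix(rnorm(r * r), r, r)
Omega <- crossprod(A) + diag(r)       # positive definite by construction
Omega_inv <- solve(Omega)
H     <- crossprod(V)

eps_j <- V %*% alpha                  # eps_j at time t equals alpha' nu_t
eps_r <- V %*% t(Xi)                  # row t holds (Xi nu_t)'

check_alpha <- all.equal(sum(eps_j^2), drop(t(alpha) %*% H %*% alpha))
check_trace <- all.equal(sum((eps_r %*% Omega_inv) * eps_r),
                         sum(diag(t(Xi) %*% Omega_inv %*% Xi %*% H)))

c(check_block, check_alpha, check_trace)
```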

CRAN packages used

SIRE, igraph, systemfit, Rsolnp

CRAN Task Views implied by cited packages

Econometrics, GraphicalModels, Optimization, Psychometrics

Footnotes

  1. The term topological properties refers to those properties of a matrix which depend exclusively on the number and the relative position of its null and non-null elements (, ).[↩]
  2. A binary matrix associated to a matrix G is a matrix whose entries are equal to 1 if the corresponding entries of G are non-null, or 0 otherwise. Binary matrices preserve the topological properties of the parent matrices.[↩]
  3. The Hadamard product of two matrices, \(\mathbf A\) and \(\mathbf B\) of the same order, is defined as the matrix of the term-to-term products of the elements of these matrices, that is \((\mathbf A\ast\mathbf B)_{(i,j)}=a_{(i,j)}b_{(i,j)}\).[↩]
  4. An alternative approach for determining the feedbacks operating at a systematic level in a model is based on graph theory (see , and , ).[↩]
  5. The element γj,i of Γ corresponds to the element γi,j of Γ[↩]
  6. It is worth mentioning that Ψ is Hadamard-orthogonal to C (two matrices \(\mathbf A\) and \(\mathbf B\) are said to be Hadamard-orthogonal if \(\mathbf A\ast\mathbf B=\mathbf 0\)). Furthermore, while matrix C is co-spectral to Γ (i.e., they have the same eigenvalues), matrix Ψ is a hollow-nilpotent matrix, like Γ (a square matrix \(\mathbf N\) is nilpotent if \(\mathbf N^k=\mathbf 0\) for some \(k<M\), where \(M\) is the matrix dimension). A hollow, nilpotent matrix can always be expressed in triangular form.[↩]

References

P. R. Amestoy. Igraph: Network analysis and visualization. 2017. URL https://CRAN.R-project.org/package=igraph. R package version 1.1.2.
E. Bellino, S. Nerozzi and M. G. Zoia. Introduction to Luigi Pasinetti’s “causality and interdependence ….” Structural Change and Economic Dynamics, 2018. URL https://doi.org/10.1016/j.strueco.2018.09.007.
M. Faliva. Recursiveness vs. Interdependence in econometric models: A comprehensive analysis for the linear case. Journal of the Italian Statistical Society, 1(3): 335–357, 1992.
M. Faliva and M. G. Zoia. Detecting and testing causality in linear econometric models. Journal of the Italian Statistical Society, 3(1): 61–76, 1994.
M. Fiedler. Special matrices and their applications in numerical mathematics: Second edition. Dover Publications, 2013.
M. Gilli. Causal ordering and beyond. International Economic Review, 957–971, 1992. URL https://doi.org/10.2307/2527152.
C. W. Granger. Testing for causality: A personal viewpoint. Journal of Economic Dynamics and Control, 2: 329–352, 1980. URL https://doi.org/10.1016/0165-1889(80)90069-x.
W. H. Greene. Econometric analysis. Pearson Education, 2003.
A. K. Gupta and D. K. Nagar. Matrix variate distributions. Taylor & Francis, 1999. URL https://doi.org/10.1201/9780203749289.
A. Henningsen and J. D. Hamann. Systemfit: Estimating systems of simultaneous equations. 2017. URL https://CRAN.R-project.org/package=systemfit. R package version 1.1-20.
K. G. Jöreskog. Structural analysis of covariance and correlation matrices. Psychometrika, 43(4): 443–477, 1978.
K. G. Jöreskog and H. O. A. Wold. Systems under indirect observation: Causality, structure, prediction. North-Holland Publishing Company, 1982.
L. R. Klein. Economic fluctuations in the United States, 1921-1941. John Wiley & Sons, 1950.
R. B. Marimont. System connectivity and matrix properties. The Bulletin of Mathematical Biophysics, 31(2): 255–274, 1969.
J. Ponstein. Matrices in graph and network theory. Van Gorcum & Comp., 1966.
R. H. Strotz and H. O. A. Wold. Recursive vs. Nonrecursive systems: An attempt at synthesis (part I of a triptych on causal chain systems). Econometrica, 28(2): 417–427, 1960.
H. Theil. Economic forecasts and policy. North-Holland Publishing Company, 1961.
R. L. Wasserstein and N. A. Lazar. The ASA’s statement on p-values: Context, process, and purpose. The American Statistician, 70(2): 129–133, 2016. URL https://doi.org/10.1080/00031305.2016.1154108.
H. O. A. Wold. Econometric model building: Essays on the causal chain approach. North-Holland Publishing Company, 1964.

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Vacca & Zoia, "Identifying and Testing Recursive vs. Interdependent Links in Simultaneous Equation Models via the SIRE Package", The R Journal, 2019

BibTeX citation

@article{RJ-2019-016,
  author = {Vacca, Gianmarco and Zoia, Maria Grazia},
  title = {Identifying and Testing Recursive vs. Interdependent Links in Simultaneous Equation Models via the SIRE Package},
  journal = {The R Journal},
  year = {2019},
  note = {https://rjournal.github.io/},
  volume = {11},
  issue = {1},
  issn = {2073-4859},
  pages = {149-169}
}