Simultaneous equation models (SEMs) are composed of relations which
either represent unidirectional links, which entail a causal
interpretation, or bidirectional links, due to feedback loops, which
lead to the notion of interdependence. The issue is of prominent
interest in several respects. Investigating the causal structure of a
SEM, on the one hand, brings to light the theoretical assumptions
behind the model and, on the other hand, guides the choice of the
appropriate estimation method and of which policy to implement.
This paper provides an operational method to distinguish causal
relations from interdependent ones in SEMs, such as macro-econometric
models, models in ecology, biology, demography, and so forth. It is
shown that the causal structure of a system crucially rests on the
feedback loops, which possibly affect the equations. These loops are
associated with the non-null entries of the Hadamard product of matrices
encoding the direct and indirect links among the SEM dependent
variables. The effectiveness of feedbacks is verified with a Wald test
based on the significance of the aforementioned non-null entries.
An R package, SIRE (System of Interdependent/Recursive Equations),
provides the operational completion of the methodological and analytic
results of the paper. SIRE is applied to a macroeconomic model to
illustrate how this type of analysis proves useful in clarifying the
nature of the complex relations in SEMs.
As is well known, each equation in a simultaneous equation model (SEM) represents a specific link between a dependent (endogenous) variable and a set of other variables which play an explicative role for the former. These links can reflect either one-way relations between the dependent variables and their explicative variables or two-way relations, ascribable to the presence of feedback loops operating at either a systematic or a stochastic level. SEMs are of recursive type as long as the equations represent unidirectional links. Otherwise, if the equations are bidirectional, the SEM (or part of it) is interdependent. Interdependence is structurally connected to the presence of current endogenous variables playing an explicative role, and it can also arise as a by-product of error-term dependencies.
Investigating the nature, causal rather than interdependent, of a SEM is important in several respects. First, the analysis, by unfolding the dynamics among variables, sheds more light on the rationale behind the theoretical assumptions of the model. For instance, in an economic framework, the distinction between interdependent and causal SEMs leads to models which can be traced back to two main streams of economic theory: Neoclassical and Keynesian (Bellino et al. (2018)). Furthermore, the implication of interdependence vs. causality is crucial for undertaking parameter estimation, given that a set of causal equations can be estimated equation by equation by ordinary least squares (OLS), while simultaneous estimation methods, like three-stage least squares (3SLS), are required when interdependence occurs. Given that large SEMs have become increasingly popular, the need for an analytical set-up able to effectively detect and test causality versus interdependence has become more urgent.
Starting from this premise and following Strotz and Wold (1960), Wold (1964), and more recently Faliva (1992) and Faliva and Zoia (1994), in this paper we have devised an operational method to distinguish the causal from the interdependent equations of a SEM.
Other approaches for detecting feedback loops arising in deterministic (error-free) models are based on either graph or system theory (see e.g., Gilli (1992)). Our methodological proposal goes beyond the aforementioned methods since, besides covering both the cases of deterministic and error-driven feedback effects, it provides a way of testing the feedback effectiveness. In addition, it differs in principle from other approaches, such as the one proposed by Granger (see Granger (1980)) and the Covariance Structural Analysis (CSA; Jöreskog (1978)). The former essentially rests on a predictability criterion for defining causality regardless of the theory behind the model. The latter is meant to find the best parametric approximation of the sample covariance matrix in terms of a given theoretical SEM structure; as such, it does not lead to a causal/interdependent interpretation of the model links like the one developed in our paper.
The feedbacks identified by the method proposed here demand statistical confirmation on the basis of empirical evidence. Lack of significance of one or more of the estimated feedbacks can overturn the nature of the connections among model variables. To this end, a Wald-type test is devised to check whether a given equation is significantly affected by feedback or not. The statistic of this test hinges on the parameter matrices of the model: the matrix associated with the endogenous variables playing an explicative role and the dispersion matrix of the error terms. If an equation is affected by feedback loops, the testing procedure makes it possible to diagnose which endogenous variables are significantly connected in the loop of interest. Indeed, testing the significance of feedbacks also means checking whether the links among variables, suggested by the theory underlying the model, are confirmed by empirical evidence.
The methodological approach put forth in this paper is implemented in R with the SIRE package. Besides integrating functions usually employed for the estimation of SEMs, the package provides new functions meant to duly split a system of equations into its unidirectional and bidirectional links, and test their significance. To our knowledge, extant alternative approaches to causality do not offer a similar test.
The paper is structured as follows. The first section provides the methodological set-up devised to single out causal and interdependent relations in a SEM. In the second section, a Wald-type test is worked out to check whether a given equation is affected by feedbacks or not. The third section shows how the method and the R code work for detecting and testing feedback-loops in a macroeconomic model. An Appendix, with proofs of the main theoretical results, completes the paper.
An equation system is a set of structural equations representing economic theory-driven relations linking the variables relevant to the study at hand.
It is customary to specify an equation system as follows
Error terms are assumed to be non-systematic, stationary in a wide
sense, and uncorrelated over time, that is
When a causal chain exists among blocks of current endogenous variables,
a causal order can be established among those blocks of equations. In
this case, the current endogenous variables of a block are effects of
the variables belonging to the blocks which come before them in the
chain, as well as the causes of the variables belonging to blocks which
follow the block at stake in the chain. In this case, the model is of
block-recursive type. The following simple equation system provides an
example of a recursive model (see Figure 1, middle
panel)
Sometimes the composite nature of the connections among variables leads
to a closed sequence of dependencies among variables to be ascribed to
feedback loops. This type of interaction among endogenous variables is
usually called interdependence. Interdependence is structurally
connected to the presence of both current endogenous variables on the
right-hand side of the model and the correlation between contemporaneous
error terms. See the system below as an example in this regard (see
Figure 1, right panel)
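The distinction between a recursive and an interdependent system can be sketched in base R, restricted here to the systematic part of the model (correlations between error terms are ignored). The two three-equation systems and the helper is_recursive() are hypothetical illustrations: they are not the systems of Figure 1 and they are not part of SIRE. The binary matrix G has G[i, j] = 1 when the j-th endogenous variable appears on the right-hand side of equation i; under this convention the system is recursive when no closed path exists, that is, when G is nilpotent.

```r
# Hypothetical helper (not part of SIRE): TRUE when the binary matrix G of
# links among endogenous variables contains no closed path.
is_recursive <- function(G) {
  n <- nrow(G)
  P <- diag(n)
  for (k in seq_len(n)) P <- ((P %*% G) > 0) * 1  # paths of length k
  all(P == 0)  # no path of length n exists only if the graph is acyclic
}

# Causal chain y1 -> y2 -> y3 (G[i, j] = 1 means y_j explains y_i)
G_chain <- matrix(c(0, 0, 0,
                    1, 0, 0,
                    0, 1, 0), nrow = 3, byrow = TRUE)

# Same chain plus a link y3 -> y1, which closes a feedback loop
G_loop <- G_chain
G_loop[1, 3] <- 1

is_recursive(G_chain)
is_recursive(G_loop)
```

The first call flags the chain as recursive, while the second one detects the closed sequence of dependencies introduced by the extra link.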
Based on this premise, it is clear that the causal or interdependent
features of a model’s equations depend on the pair of matrices
Moreover, the essential information concerning the causal structure of a
model can be obtained from the topological properties
Following Faliva (1992), matrix
Matrix
The rationale of (8) hinges on the fact that a direct
feedback between variables
through other variables and equations. In algebraic terms this
corresponds to the simultaneous non-nullity of the
Accordingly, matrix
In order to show how feedbacks operating in the systematic part of a
model can be detected, let us consider as an example the following
deterministic model
Looking at matrix
If the error terms are correlated, the causal structure of a model could
no longer match that of its systematic counterpart since part of the
relations that are recursive at systematic level, namely
Equations (15) and (16) rest on the logical
relations between the concepts of causality and predictability, where
the notion of optimal predictor (in mean-square sense) tallies with that
of conditional expectation. In fact, given that causal relations are
also predictive, but not vice-versa, we can define as causal those
relations that are both causal in the deterministic model and predictive
in a stochastic context. This means that if the conditional expectations
of the relations, which are causal in the deterministic model, namely
Accordingly, we can say that the stochastic specification is neutral
with respect to the underlying systematic causal structure if the
following holds (Faliva (1992))
meaning that
Otherwise, the correlation between the error terms and the endogenous
variables may affect the conditional expectation of the error term as
follows (see Faliva (1992))
To highlight the role played by the stochastic specification on the
model causal structure, let us consider as an example the following
specification for matrix
The flow-chart in Figure 4 shows the different cases,
according to the structure of matrices
In the previous section an analytic framework was set up to describe the
potential feedbacks operating in a model. In fact, the analysis
developed, relying on binary matrices, was meant to be qualitative since
it only highlights the feedback set that potentially operates in a
model, given the characteristics of its relations and its stochastic
specification. Only once the model has been duly estimated, can the
coefficients of matrix
In this context (following Faliva and Zoia (1994)), it can be proved
that the
The feedback effects of the
In order to determine the feedback effect of
Accordingly, testing the significance of
Actually, the statistic of the test can be derived from
(38), by deleting from
Furthermore,
The Wald test takes the form
If the Wald test provides evidence that the
The analysis developed in the previous sections allows the
identification of the potential feedbacks operating in a model. By
assuming the stochastic specification of the model as known, the
investigation can be carried out by using binary matrices
We start by loading the SIRE package.
> install.packages("SIRE")
> library(SIRE)
The function causal_decompose()
is devised for decomposing the matrix
- data: not appropriate to simulated context, set to NULL.
- eq.system: the system of equations.
- resid.est: not appropriate to simulated context, set to NULL.
- instruments: not appropriate to simulated context, set to NULL.
- sigma.in: the binary matrix

and provides the following output:

- eq.system: the system of equations given as input.
- gamma: the binary matrix
- sigma: the binary matrix
- C: the binary matrix of the coefficients associated to the endogenous variables involved in interdependent mechanisms operating at a systematic level.
- Psi1: the binary matrix of the coefficients associated to the endogenous variables involved in interdependent mechanisms induced by error correlation (if Sigma is not diagonal).
- Psi0: the binary matrix of the coefficients associated to the endogenous variables having a causal role.
- all.graph: the DAG object for the undecomposed path diagram (via the R package igraph; Amestoy (2017)).
- dec.graph: the DAG object for the decomposed path diagram.

Furthermore, if the error terms are assumed to be spherical, then the
SIRE package simply splits
With regard to the system (13), the corresponding code is
> eq.system <- list(
+ eq1 = y1 ~ y5 + y7, eq2 = y2 ~ z,
+ eq3 = y3 ~ y11, eq4 = y4 ~ y3,
+ eq5 = y5 ~ y10, eq6 = y6 ~ y5 + y9,
+ eq7 = y7 ~ y6, eq8 = y8 ~ y12,
+ eq9 = y9 ~ y7, eq10 = y10 ~ y5,
+ eq11 = y11 ~ y12, eq12 = y12 ~ y4 + y11,
+ eq13 = y13 ~ y2 + y6)
> #fictitious Sigma matrix
> Sigma <- diag(length(eq.system))
> #function call
> decompose.A <- causal_decompose(eq.system , sigma.in = Sigma)
The output is comprised of matrices tkplot()
function of the R package igraph
> tkplot(decompose.A$dec.graph)
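As a cross-check that does not require SIRE, the systematic feedbacks of system (13) can be sketched in base R. The sketch assumes the convention that G[i, j] = 1 when the endogenous variable y_j is explicative in equation i, and it marks a link as interdependent when a return path exists, via a Hadamard product between G and the transposed reachability matrix. The object names (G, R, C.sys, Psi.sys) are illustrative, and the code is not the internal implementation of causal_decompose().

```r
# System (13) written as R formulas, as in the call to causal_decompose()
eq.system <- list(
  eq1  = y1 ~ y5 + y7,  eq2  = y2 ~ z,
  eq3  = y3 ~ y11,      eq4  = y4 ~ y3,
  eq5  = y5 ~ y10,      eq6  = y6 ~ y5 + y9,
  eq7  = y7 ~ y6,       eq8  = y8 ~ y12,
  eq9  = y9 ~ y7,       eq10 = y10 ~ y5,
  eq11 = y11 ~ y12,     eq12 = y12 ~ y4 + y11,
  eq13 = y13 ~ y2 + y6)

# Dependent variable of each equation (left-hand side of the formula)
dep <- vapply(eq.system, function(f) all.vars(f)[1], character(1),
              USE.NAMES = FALSE)
n <- length(dep)

# G[i, j] = 1 when endogenous y_j is an explicative variable in equation i
G <- matrix(0, n, n, dimnames = list(dep, dep))
for (i in seq_len(n)) {
  rhs <- intersect(all.vars(eq.system[[i]])[-1], dep)  # drops exogenous z
  G[i, rhs] <- 1
}

# R[i, j] = 1 when a path of any length leads from y_j to y_i
R <- G
for (k in seq_len(n - 1)) R <- ((R + R %*% G) > 0) * 1

C.sys   <- G * t(R)   # Hadamard product: link j -> i with a return path i -> j
Psi.sys <- G - C.sys  # remaining links, causal at the systematic level

which(C.sys == 1, arr.ind = TRUE)
```

Under these assumptions the interdependent links fall into three closed groups of variables, (y5, y10), (y6, y7, y9) and (y3, y4, y11, y12), while the remaining links among endogenous variables are one-way.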
The following example refers to a matrix
> # indexes of non-null elements of Sigma
> sigma.idx <- cbind(
+ rbind(rep(1,5),c(4,5,8,10,12)), #y1
+ rbind(rep(2,4),c(4,6,8,9)), #y2
+ rbind(rep(3,4),c(6,7,11,13)), #y3
+ rbind(rep(4,6),c(5,6,8,9,10,12)), #y4
+ rbind(rep(5,3),c(8,10,12)), #y5
+ rbind(rep(6,5),c(7,8,9,11,13)), #y6
+ rbind(rep(7,2),c(11,13)), #y7
+ rbind(rep(8,3),c(9,10,12)), #y8
+ rbind(rep(10,1),c(12)), #y10
+ rbind(rep(11,1),c(13))) #y11
> # fictitious Sigma matrix
> low.tri <- as.matrix(Matrix::sparseMatrix(i = sigma.idx[2,] , j = sigma.idx[1,], x = 1,
+ dims = rep(length(eq.system),2)))
> Sigma <- low.tri + t(low.tri) + diag(length(eq.system))
> # function call
> decompose.B <- causal_decompose(eq.system = eq.system,
+ sigma.in = Sigma)
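The same binary matrix can also be assembled with base R matrix indexing, which avoids the dependency on the Matrix package. This is only an alternative sketch of the construction shown above; the index pairs are copied from the listing, and n is set to the number of equations in system (13).

```r
n <- 13  # number of equations in system (13)

# Pairs of correlated error terms: first row = column index, second row = row index
sigma.idx <- cbind(
  rbind(rep(1, 5),  c(4, 5, 8, 10, 12)),     # y1
  rbind(rep(2, 4),  c(4, 6, 8, 9)),          # y2
  rbind(rep(3, 4),  c(6, 7, 11, 13)),        # y3
  rbind(rep(4, 6),  c(5, 6, 8, 9, 10, 12)),  # y4
  rbind(rep(5, 3),  c(8, 10, 12)),           # y5
  rbind(rep(6, 5),  c(7, 8, 9, 11, 13)),     # y6
  rbind(rep(7, 2),  c(11, 13)),              # y7
  rbind(rep(8, 3),  c(9, 10, 12)),           # y8
  rbind(rep(10, 1), c(12)),                  # y10
  rbind(rep(11, 1), c(13)))                  # y11

Sigma <- diag(n)                                     # unit diagonal
Sigma[cbind(sigma.idx[2, ], sigma.idx[1, ])] <- 1    # lower triangle
Sigma[cbind(sigma.idx[1, ], sigma.idx[2, ])] <- 1    # mirrored upper triangle

isSymmetric(Sigma)
```

Indexing with a two-column matrix sets one (row, column) cell per pair, so the symmetry of the binary matrix is obtained by filling both orientations explicitly.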
In this case, the package provides as output matrix tkplot()
function can still be used to obtain
the pictures of the relations among the variables given in Figure
3.
The next section will show how to perform the decomposition with
causal_decompose()
if the structure of
As pointed out in the previous section, empirical evidence aside, the
results of a decomposition based on binary matrices
> data(macroIT)
Following Greene (2003), the model equations have been estimated with 3SLS
by using the R package systemfit (Henningsen and Hamann (2017)). The one-lagged capital stock
causal_decompose()
can also be employed to estimate both
the model via 3SLS and the
This version of causal_decompose() takes the following arguments:

- data: data frame containing all the variables in the equations.
- eq.system: list containing all the equations, as in systemfit.
- resid.est: denotes the method used to estimate
- instruments: set of instruments used to estimate the model, introduced either as a list or as a character vector, as in systemfit.
- sigma.in: not appropriate to empirical context, set to NULL.

The output of this function is a list containing the following objects:

- eq.system: the same list of equations provided as input.
- gamma, C, Psi0, Psi1, A, and Sigma: respectively matrices
- systemfit: the output of the systemfit() function used to estimate the model.
- all.graph: the DAG object for the undecomposed path diagram.
- dec.graph: the DAG object for the decomposed path diagram.
- path: the data-set containing all the paths/relations among the endogenous variables, along with their classification (i.e., causal, interdependent). The graph highlights which interdependent relations work at a systematic level and which are induced by the effect of correlations among residuals.

The code below performs the decomposition using the macroIT data
> #system of equations
> eq.system <- list(eq1 = C ~ CP + I + CP_1,
+                   eq2 = I ~ K + CP_1,
+                   eq3 = WP ~ I + GDP + GDP_1,
+                   eq4 = GDP ~ C + I + GDP_1,
+                   eq5 = CP ~ WP + T,
+                   eq6 = K ~ I + K_1)
> #instruments
> instruments <- ~ T + CP_1 + GDP_1 + K_1
> #decomposition
> dec.macroIT <- causal_decompose(data = macroIT,
+ eq.system = eq.system,
+ resid.est = "noDfCor",
+ instruments = instruments)
Table 1 shows the results of the model estimation. Since
some coefficients are not statistically significant (such as the
coefficient associated to
> #system of equations
> eq.system <- list(eq1 = C ~ CP + CP_1,
+                   eq2 = I ~ K,
+                   eq3 = WP ~ I + GDP_1,
+                   eq4 = GDP ~ C + I + GDP_1,
+                   eq5 = CP ~ WP + T,
+                   eq6 = K ~ I + K_1)
> #instruments
> instruments <- ~ T + CP_1 + GDP_1 + K_1
> #decomposition
> dec.macroIT.new <- causal_decompose(data = macroIT,
+ eq.system = eq.system,
+ resid.est = "noDfCor",
+ instruments = instruments)
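Before looking at the estimates, the systematic structure implied by the restricted equations can be inspected with a small base R sketch. It groups the endogenous variables into blocks of mutually reachable variables, so that a block with more than one variable signals a potential systematic feedback. The sketch is an illustration under the assumption that only the right-hand-side endogenous variables matter; it ignores error correlations and it is not how causal_decompose() works internally.

```r
# Restricted system, written as a named list of formulas
eq.system <- list(eq1 = C ~ CP + CP_1,
                  eq2 = I ~ K,
                  eq3 = WP ~ I + GDP_1,
                  eq4 = GDP ~ C + I + GDP_1,
                  eq5 = CP ~ WP + T,
                  eq6 = K ~ I + K_1)

# Endogenous variables are the left-hand sides of the equations
dep <- vapply(eq.system, function(f) all.vars(f)[1], character(1),
              USE.NAMES = FALSE)
n <- length(dep)

# G[i, j] = 1 when endogenous variable j is explicative in equation i
G <- matrix(0, n, n, dimnames = list(dep, dep))
for (i in seq_len(n)) {
  G[i, intersect(all.vars(eq.system[[i]])[-1], dep)] <- 1
}

# reach[i, j] = 1 when a path of any length leads from variable j to variable i
reach <- G
for (k in seq_len(n - 1)) reach <- ((reach + reach %*% G) > 0) * 1

# Two variables share a block when each one can be reached from the other
mutual <- (reach * t(reach)) > 0
diag(mutual) <- TRUE
blocks <- unique(apply(mutual, 1, function(r) paste(dep[r], collapse = ", ")))
blocks
```

In this sketch the only block with more than one variable contains I and K, which is the pair linked by a direct feedback loop in the restricted model.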
The results of the last estimation process are shown in Table 2. Looking at the Theil inequality indexes (Theil (1961)) reported in the last column of the table, we can see that the estimated equations fit the data very well. In fact, all Theil indexes are close to zero.
Equation | Theil
---|---
1 | 0.0073
2 | 0.0115
3 | 0.0288
4 | 0.004
5 | 0.0074
6 | 0.0063
Equation | Theil
---|---
1 | 0.0076
2 | 0.0114
3 | 0.0304
4 | 0.0042
5 | 0.0073
6 | 0.0062
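To make the reading of the Theil inequality indexes concrete, the following sketch computes the coefficient in the form usually attributed to Theil (1961), which is bounded between 0 (perfect fit) and 1. Whether SIRE uses exactly this variant is an assumption; the helper theil_u() is hypothetical and the series are simulated, not taken from macroIT.

```r
# Hypothetical helper: Theil inequality coefficient, bounded in [0, 1]
theil_u <- function(actual, fitted) {
  rmse <- sqrt(mean((actual - fitted)^2))
  rmse / (sqrt(mean(actual^2)) + sqrt(mean(fitted^2)))
}

set.seed(1)
actual <- 100 + cumsum(rnorm(50))        # simulated series, not macroIT data
fitted <- actual + rnorm(50, sd = 0.5)   # fitted values with small errors

theil_u(actual, fitted)   # small positive value for a close fit
theil_u(actual, actual)   # exactly zero for a perfect fit
```

Values of the order of those reported in the tables therefore correspond to fitted series whose root mean square error is very small relative to the scale of the data.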
The estimated covariance matrix of the structural error terms is given
by
The goal of our testing procedure will be to bring out which of these feedbacks, being significant, are truly effective.
Figure 5 depicts the links operating in this model, using
the function tkplot() of the igraph package. In this figure, a
unidirectional arrow denotes that a variable is explicative for another.
If two variables are explicative one for the other, a direct feedback
loop exists, depicted as two red arrows going in opposite directions.
Instead, a red, dashed, curved, two-headed arrow between two variables
indicates the existence of a feedback induced by error correlation.
> tkplot(dec.macroIT.new$dec.graph)
The significance of these loops has been investigated by using the
function feedback_ml(), which performs the Wald test given in
(50). The 3SLS parameter estimates have been used as
preliminary estimates to obtain the maximum likelihood (ML) estimates of
the parameters needed to build the test statistic. In particular, in
order to reach the global maximum of the log-likelihood, the initial
3SLS parameter estimates have been randomly perturbed a certain number
of times. The optimizer chosen for this purpose is included in the
Rsolnp package, where the function gosolnp is specially designed for the
randomization of starting values. The function feedback_ml() takes the
following arguments:
- data: data frame containing all the variables in the equations.
- out.decompose: the output from the previous causal decomposition, obtained with the command causal_decompose().
- lb and ub: lower and upper bounds of the parameter space (as in gosolnp).
- nrestarts, nsim and seed.in: parameters tuning the number of random initializations (as in gosolnp).

The output of this function is a list containing the following objects:

- rho.est: a data frame containing the estimated feedback loops for a given equation. The first column of this data frame, feedback eqn., provides the indexes of the equations involved in the feedback loop with the equation given in input, while the coefficients associated with the explicative endogenous variables for the equation in question are shown in the column rho.est.
- loglik: the estimated log-likelihood of the best model.
- theta.hessian: the estimated Hessian matrix
- rho.jacobian: the estimated Jacobian matrix
- wald: the value of the Wald test statistic

As an example, let us assume that the interest is in testing the
significance of the feedbacks affecting the second equation, explaining
the endogenous variable
The Wald test for the significance of this feedback is performed by
using the function feedback_ml() specified as follows
> test.E2 <- feedback_ml(data = macroIT,
+ out.decompose = dec.macroIT.new,
+ eq.id = 2,
+ lb = min(dec.macroIT.new$Sigma) - 10,
+ ub = max(dec.macroIT.new$Sigma) + 10,
+ nrestarts = 10,
+ nsim = 20000,
+ seed.in = 1)
By visualizing the estimate of
> test.E2$rho.tbl
Feedback eqn. rho.est
1 6 0.1641469
> test.E2$wald
[,1]
[1,] 4.115221
we can see that the existence of a feedback loop between I and K is confirmed.
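The step from the Wald statistic to a p-value can be sketched with the chi-square distribution in base R. The sketch assumes one restriction per test, hence one degree of freedom; this is an assumption made for illustration, and the resulting values can be compared with the p-values listed in Table 3. The statistics are those printed above and in the table.

```r
# Wald statistics as reported in the text (first value) and in Table 3
wald <- c(4.115221, 0.352, 0.046, 19.595)

# Assumption: a single tested feedback, hence one degree of freedom
p.value <- pchisq(wald, df = 1, lower.tail = FALSE)

round(p.value, 4)
p.value < 0.05   # TRUE marks a feedback that is significant at the 5% level
```

Under this assumption the statistic obtained for the second equation lies slightly above the 5% critical value of the chi-square distribution with one degree of freedom (about 3.84), which is why the feedback between I and K is retained.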
Table 3 shows the results of the test for all the
equations of the model. Looking at the
Equation | Feedback Variable | Joint |
Singular |
||
---|---|---|---|---|---|
386.6 | |||||
4.115 | 0.042 | - | - | ||
25.55 | - | - | |||
95.368 | 84.315 | ||||
0.352 | 0.553 | ||||
0.046 | 0.831 | - | - | ||
19.595 | <0.0001 | - | - |
In the end, the path diagram fully describing the recurrent and interdependent relationships in the model is displayed in Figure 6.
The set of functions worked out in the paper allows a system of
simultaneous equations to be split into recursive and/or interdependent
subsystems. The user can rely on causal_decompose() in two ways: to
assess the presence of interdependent relations with a known structure
of correlation among the error terms, or to estimate the whole model in
presence of empirical data.
The significance of the feedback loops operating in the model is tested
with a Wald test using the feedback_ml() function. The 3SLS parameter
estimates are used as preliminary estimates to obtain the maximum
likelihood ones, which are needed to build the test.
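The idea of perturbing preliminary estimates to search for the global optimum can be illustrated with a toy example. The sketch below swaps in base R optim() for gosolnp so that it runs without extra packages; the objective negloglik() is a hypothetical function with several local minima, not the likelihood of the paper, and start0 merely plays the role of the preliminary 3SLS estimates.

```r
# Toy objective with several local minima (a stand-in for a negative
# log-likelihood; it is NOT the likelihood derived in the paper)
negloglik <- function(theta) sum((theta^2 - 1)^2) + 0.3 * sum(theta)

set.seed(1)
start0 <- c(0.8, 0.8)   # plays the role of the preliminary estimates
lb <- c(-3, -3)         # lower bounds of the parameter space
ub <- c(3, 3)           # upper bounds of the parameter space

# Candidate starting points: the preliminary estimate plus random draws
n.restarts <- 10
starts <- rbind(start0,
                matrix(runif(2 * n.restarts, min = -3, max = 3), ncol = 2))

# One bounded local optimization per starting point
fits <- lapply(seq_len(nrow(starts)), function(i)
  optim(starts[i, ], negloglik, method = "L-BFGS-B", lower = lb, upper = ub))

# Keep the run with the lowest objective value
best <- fits[[which.min(vapply(fits, function(f) f$value, numeric(1)))]]
best$par
best$value
fits[[1]]$value   # value reached from the preliminary estimate alone
```

In this toy case the run started from start0 stops at a local minimum, while at least one of the random starts reaches a lower value of the objective, which is the behaviour the randomization is meant to guard against.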
As for the rationale of our procedure, which rests on a properly devised
test, it is worth taking into account the considerable concern raised
recently in the statistical community about the use of significance
testing (see Wasserstein and Lazar (2016)). In this connection, in order to
avoid improper use of
Moving now on to more technical notes:
In this Appendix we provide the proofs of some relevant formulas of the paper.
Let
If two conformable matrices,
If a non-singular matrix
Now, upon noting that
The proof that
The matrices
Proof
Taking into account that the Hadamard product is both commutative
Now, consider the following theorem (where the symbol
Let
Given this premise, we can now prove (55). To this end,
let us write
Proof of (37). Formula (37) can be proved as
follows. First, note that matrix
Derivation of the log-likelihood for the model
(40)-(42)
The logarithm of the density in (42) is given by
SIRE, igraph, systemfit, Rsolnp
Econometrics, GraphicalModels, Optimization, Psychometrics
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Vacca & Zoia, "Identifying and Testing Recursive vs. Interdependent Links in Simultaneous Equation Models via the SIRE Package", The R Journal, 2019
BibTeX citation
@article{RJ-2019-016, author = {Vacca, Gianmarco and Zoia, Maria Grazia}, title = {Identifying and Testing Recursive vs. Interdependent Links in Simultaneous Equation Models via the SIRE Package}, journal = {The R Journal}, year = {2019}, note = {https://rjournal.github.io/}, volume = {11}, issue = {1}, issn = {2073-4859}, pages = {149-169} }