Topological sensitivity analysis for systems biology
Authors: Ann C. Babtie, Paul Kirk, and Michael P. H. Stumpf
Affiliation: Centre for Integrative Systems Biology and Bioinformatics, Department of Life Sciences, Imperial College London, London SW7 2AZ, United Kingdom
Abstract: Mathematical models of natural systems are abstractions of much more complicated processes. Developing informative and realistic models of such systems typically involves suitable statistical inference methods, domain expertise, and a modicum of luck. Except for cases where physical principles provide sufficient guidance, it will generally also be possible to come up with a large number of potential models that are compatible with a given natural system and any finite amount of data generated from experiments on that system. Here we develop a computational framework to systematically evaluate potentially vast sets of candidate differential equation models in light of experimental and prior knowledge about biological systems. This topological sensitivity analysis enables us to evaluate quantitatively the dependence of model inferences and predictions on the assumed model structures. Failure to consider the impact of structural uncertainty introduces biases into the analysis and potentially gives rise to misleading conclusions.

Using simple models to study complex systems has become standard practice in different fields, including systems biology, ecology, and economics. Although we know and accept that such models do not fully capture the complexity of the underlying systems, they can nevertheless provide meaningful predictions and insights (1). A successful model is one that captures the key features of the system while omitting extraneous details that hinder interpretation and understanding. Constructing such a model is usually a nontrivial task involving stages of refinement and improvement.

When dealing with models that are (necessarily and by design) gross oversimplifications of the reality they represent, it is important that we are aware of their limitations and do not seek to overinterpret them. This is particularly true when modeling complex systems for which there are only limited or incomplete observations. In such cases, we expect there to be numerous models that would be supported by the observed data, many (perhaps most) of which we may not yet have identified. The literature is awash with papers in which a single model is proposed and fitted to a dataset, and conclusions are drawn without any consideration of (i) possible alternative models that might describe the observed behavior and known facts equally well (or even better), or (ii) whether the conclusions drawn from different models (still consistent with current observations) would agree with one another.

We propose an approach to assess the impact of uncertainty in model structure on our conclusions. Our approach is distinct from, and complementary to, existing methods designed to address structural uncertainty, including model selection, model averaging, and ensemble modeling (2–9). Analogous to parametric sensitivity analysis (PSA), which assesses the sensitivity of a model's behavior to changes in parameter values, we consider the sensitivity of a model's output to changes in its inherent structural assumptions. PSA techniques can usually be classified as (i) local analyses, in which we identify a single "optimal" vector of parameter values and then quantify the degree to which small perturbations to these values change our conclusions or predictions; or (ii) global analyses, in which we consider an ensemble of parameter vectors (e.g., samples from the posterior distribution in the Bayesian formalism) and quantify the corresponding variability in the model's output.
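To make the local/global distinction concrete, the sketch below applies both flavors of PSA to a toy two-species ODE model; the model, parameter values, and output summaries are hypothetical illustrations, not taken from this work.

```python
# Minimal sketch of local vs. global parametric sensitivity analysis (PSA)
# for an ODE model. The two-species model and all parameter values are
# illustrative assumptions, not from the paper.
import numpy as np
from scipy.integrate import odeint

def model(x, t, theta):
    """Toy ODE: production and degradation of x0, conversion to x1."""
    k1, k2, k3 = theta
    return [k1 - k2 * x[0], k2 * x[0] - k3 * x[1]]

def output(theta, t=np.linspace(0, 10, 50), x0=(0.0, 0.0)):
    """Model output used for the sensitivity analysis: trajectory of x1."""
    return odeint(model, x0, t, args=(np.asarray(theta),))[:, 1]

theta_opt = np.array([1.0, 0.5, 0.3])  # assumed "optimal" parameter vector

# (i) Local PSA: finite-difference sensitivity of the output to small
# perturbations around the single optimal parameter vector.
eps = 1e-4
base = output(theta_opt)
local_sens = []
for i in range(len(theta_opt)):
    pert = theta_opt.copy()
    pert[i] += eps
    local_sens.append(np.max(np.abs(output(pert) - base)) / eps)

# (ii) Global PSA: propagate an ensemble of parameter vectors (here drawn
# from a broad lognormal distribution; in a Bayesian analysis these would
# be posterior samples) and quantify the variability in the output.
rng = np.random.default_rng(0)
ensemble = rng.lognormal(mean=np.log(theta_opt), sigma=0.5, size=(500, 3))
outputs = np.array([output(th) for th in ensemble])
global_var = outputs.std(axis=0)  # pointwise output variability over time

print("local sensitivities:", np.round(local_sens, 3))
print("max global output s.d.:", global_var.max().round(3))
```

The local analysis conditions on a single parameter vector while the global analysis propagates an ensemble; both, however, condition on the same model structure, which is exactly the assumption that TSA relaxes.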
Although several approaches fall within these categories (10–12), all implicitly condition on a particular model architecture. Here we present a method for performing sensitivity analyses for ordinary differential equation (ODE) models whose architecture is not perfectly known, which is likely to be the case for all realistic complex systems. We do this by considering network representations of our models and assessing the sensitivity of our inferences to the network topology. We refer to our approach as topological sensitivity analysis (TSA).

Here we illustrate TSA in the context of parameter inference, but we could also apply our method to study other conclusions drawn from ODE models (e.g., model forecasts or steady-state analyses). When we use experimental data to infer parameters associated with a specific model, it is critical to assess the uncertainty associated with our parameter estimates (13), particularly if we wish to relate model parameters to physical (e.g., reaction rate) constants in the real world. Too often this uncertainty is estimated only by considering the variation in a parameter estimate conditional on a particular model, while ignoring the component of uncertainty that stems from potential model misspecification. The latter can, in principle, be considered within model selection or averaging frameworks, where several distinct models are proposed and weighted according to their ability to fit the observed data (2–5). However, the models tend to be limited to a small, often diverse, group that act as exemplars for each competing hypothesis but ignore similar model structures that could represent the same hypotheses. Moreover, we know that model selection results can be sensitive to the particular experiments performed (14).

We assume that an initial model, together with parameters or plausible parameter ranges, has been proposed to describe the dynamics of a given system. This model may have been constructed based on expert knowledge of the system, selected from previous studies, or (particularly in the case of large systems) proposed automatically using network inference algorithms (15–19), for example. Using TSA, we aim to identify how reliant any conclusions and inferences are on the particular set of structural assumptions made in this initial candidate model. We do this by identifying alterations to model topology that maintain consistency with the observed dynamics and testing how these changes affect the conclusions we draw (Fig. 1). Analogous to PSA, we may perform local or global analyses: by testing a small set of "close" models with minor structural changes, or by performing large-scale searches of diverse model topologies, respectively. To do this we require efficient techniques for exploring the space of network topologies and, for each topology, inferring the parameters of the corresponding ODE models.

Fig. 1. Overview of TSA applied to parameter inference. (A) Model space includes our initial candidate model and a series of altered topologies that are consistent with our chosen rules (e.g., all two-edge, three-node networks, where nodes indicate species and directed edges show interactions). One topology may correspond to one or several ODE models, depending on the parametric forms we choose to represent interactions. (B) We test each ODE model to see whether it can generate dynamics consistent with our candidate model and the available experimental data.
For TSA, we select a group of these compatible models and compare the conclusions we would draw using each of them. (C) Associated with each model m is a parameter space Θ_m (gray); using Bayesian methods we can infer the joint posterior parameter distribution (dashed contours) for a particular model and dataset. (D) In some cases, equivalent parameters will be present in several selected models (e.g., θ_1, which is associated with the same interaction in models a–c). We can compare the marginal posterior distribution inferred using each model for a common parameter to test whether our inferences are robust to topological changes or instead rely on one specific set of model assumptions (i.e., are sensitive to them). Different models may result in marginal distributions that differ in position and/or shape for equivalent parameters, but we cannot tell from this alone which model better represents reality; establishing that requires model selection approaches (2–4).

Even for networks with relatively few nodes (corresponding to ODE models involving few interacting entities), the number of possible topologies can be enormous. Searching this "model space" presents formidable computational challenges. We use here a gradient-matching parameter inference approach that exploits the fact that the nth node, x_n, in our network representation is conditionally independent of all other nodes given its regulating parents, Pa(x_n) (20–26). The exploration of network topologies is then reduced to the much simpler problem of considering, independently for each n, the possible parent sets of x_n, an approach that is straightforwardly parallelized (see the code sketch below).

We use biological examples to illustrate local and global searches of model spaces to identify alternative model structures that are consistent with available data. In some cases we find that even minor structural uncertainty in model topology can render our conclusions (here, parameter inferences) unreliable and make PSA results positively misleading. However, other inferences are robust across diverse compatible model structures, allowing us to be more confident in assigning scientific meaning to the inferred parameter values.
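A minimal sketch of this per-node search follows. It assumes spline-estimated derivatives, linear interaction terms, and exhaustive enumeration of small parent sets; these are illustrative simplifications, not the specific gradient-matching scheme used in this work.

```python
# Minimal sketch of gradient matching with an independent parent-set search
# per node. Derivatives are estimated from smoothing splines and each
# candidate parent set is scored by a linear least-squares fit to dx_n/dt;
# both choices are illustrative assumptions.
from itertools import combinations
import numpy as np
from scipy.interpolate import UnivariateSpline

def fit_node(n, t, X, max_parents=2):
    """Score every candidate parent set of node n by how well a linear
    combination of the parents' trajectories matches the estimated dx_n/dt."""
    # Smooth node n's observed trajectory and estimate its time derivative.
    spline = UnivariateSpline(t, X[:, n], s=len(t) * 0.01)
    dxdt = spline.derivative()(t)
    nodes = range(X.shape[1])
    scores = {}
    for k in range(1, max_parents + 1):
        for parents in combinations(nodes, k):
            design = X[:, list(parents)]  # regressors: parent trajectories
            coef, *_ = np.linalg.lstsq(design, dxdt, rcond=None)
            scores[parents] = np.sum((design @ coef - dxdt) ** 2)
    # Best-fitting (lowest residual sum of squares) parent sets first.
    return sorted(scores.items(), key=lambda kv: kv[1])

# Usage with synthetic data: t is a vector of observation times and X an
# (n_times, n_nodes) matrix of trajectories (hypothetical example data).
t = np.linspace(0, 10, 50)
X = np.column_stack([np.exp(-0.3 * t), 1 - np.exp(-0.3 * t), np.sin(0.5 * t)])
for n in range(X.shape[1]):
    best_parents, best_rss = fit_node(n, t, X)[0]
    print(f"node {n}: best parent set {best_parents} (RSS={best_rss:.3g})")
```

Because the score for node n depends only on n's own derivative estimate and its candidate parents, the outer loop over nodes can be distributed across processors without communication, which is what makes large-scale searches of model space tractable.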
Keywords: robustness analysis; biological networks; network inference; dynamical systems