---------------------------------------------------------------------------- MCSim User' Manual (Derived from ./doc/mcsim.html) ---------------------------------------------------------------------------- Table of Contents * Software License o PREAMBLE o TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION * Overview o General procedure o Types of simulations o Major changes introduced with version 4.2.0 * Installation o System requirements o Distribution o Machine-Specific Installation * Working Through an Example * Defining Models o Using Mod to process model description files o Syntax of the model description file + General syntax + Global parameter declarations + Special functions + Input functions + Dynamics specifications + Parameter scaling + Output calculations + Comments on style + Note about models * Specifying Simulations o Using the compiled program o Syntax of the simulation definition file o Global specifications + SimType() specification + Integrate() specification + OutputFile() specification + MonteCarlo() specification + Distrib() specification + MCMC() specification + SetPoints() specification o Specifying basic conditions to simulate + Experiment definition + StartTime() specification + Print() specification + PrintStep() specification + Data() specification o Specifying a statistical model + Level definition o Analyzing results o Error Handling * Bibliographic References * Common Pitfalls * Using make * Examples o `linear.model' o `1cpt.model': A sample model description file o `perc.model': A sample model description file o `perc.lsodes.in' * Concept Index ---------------------------------------------------------------------------- Software License GNU GENERAL PUBLIC LICENSE Overview MCSim consists of two pieces, a model generator and a simulation engine. The model generator, mod, was created to facilitate the model maintenance and simulation definition, while keeping execution time fast. Other programs have been created to the same end, the Matlab family of graphical interactive programs being some of the more general and easy to use. Still, many available tools are not optimal for performing time and computer intensive Monte Carlo analyses. MCSim was created specifically to this end: to perform Monte Carlo analyses in a highly optimized, and easy to maintain environment. General procedure Model building and simulation proceeds in four stages: 1. You create with any text editor a model description file. The reference section on mod, later in this manual gives you the syntax to use (see section Defining Models). This syntax allows you to describe the model variables, parameters, equations, inputs and outputs in a C-like fashion without having you to actually know how to write a C program. 2. You instruct the model generator, mod, to preprocess your model description file. Mod creates a C file, called `model.c'. 3. You compile and link the newly created `model.c' file together with the other C program files. MCSim C code is standard, so you should be able to compile it with any standard C compiler, for example GNU gcc. After compiling and linking, an executable simulation program `mcsim' is created, specific of the particular model you have designed. These preprocessing and compilation steps may seem clumsy but they produce the most efficient code for your particular machine. 4. You design any number of simulation specification files and run them with the mcsim program. The simulation specification files describe the kind of simulation to run (simple simulations, Monte Carlo etc.), various settings for the integration algorithm if needed, and a description of one or several experimental conditions or observations to simulate (see section Specifying Simulations). The simulation output is written to standard ASCII files. Little or no knowledge of computer programming is required, unless you want to tailor the program to special needs, beyond what is described in this manual (in which case you should contact us). You need, however, some familiarity with program compilation under your operating system (see section Installation). The software manual for your compiler should be able to help you. Types of simulations Four types of simulations are available: * A DefaultSim simulation will simply solve (eventually integrate) the equations you specified, using the default parameter values and possible overridings imposed in a simulation specification file. User-requested outputs are sent to an output file (see section SimType() specification). * A MonteCarlo simulation will perform repeated (stochastic) simulations across a randomly sampled region of the model parameter space (see section MonteCarlo() specification). * A MCMC simulation performs a series of simulations along a Markov chain in the model parameter space (see section MCMC() specification). These are Monte Carlo simulations in which the choice of a new parameter value is influenced by the current value. They can be used to obtain the Bayesian posterior distribution of the model parameters, given their prior distributions (that you specify) and data for which a likelihood function can be computed. The program handles hierarchical (random effect) statistical models, such as population pharmacokinetic models (see section Specifying a statistical model). * A SetPoints simulation solves the model for a series of specified parameter sets, listed in a separate ASCII file (see section SetPoints() specification). You can create these parameter sets yourself or use the output of a previous Monte Carlo or MCMC simulation. Major changes introduced with version 4.2.0 * A new input function, Spikes() is available. It simulates instantaneous inputs (see section Input functions). * Six new distributions are available for Monte Carlo simulations: InvGamma (inverse-gamma), Piecewise, Normal_v, LogNormal_v, TruncNormal_v, TruncLogNormal_v (see section Distrib() specification). * In Monte Carlo simulations, shape parameters of Distrib() statements can now reference other sampled parameters (see section MonteCarlo() specification). * A simTypeFlag option, in the MCMC() specification, allows the printing of times, data and predictions for easy checking of the model fit. It can also be set to switch MCMC sampling from component by component sampling to vector sampling (see section MCMC() specification). Installation System requirements MCSim is written in ANSI-standard C language. We are distributing the source code and you should be able to compile it for any system, provided you have an ANSI C compiler. On a Unix system we recommend that you install the GNU gcc compiler (freeware) and the make utility (which is in fact standard on most systems). For PC-type computers we recommend the Unix operating system Linux. For the Macintosh, you can use version 8.0 (or higher) of SymantecÕs Think C compiler, although other compilers should work. Distribution MCSim source code is available on Internet through `ftp://sparky.berkeley.edu/pub/mcsim', `http://sparky.berkeley.edu/users/fbois/', and `http://www.gnu.ai.mit.edu/home.html'. Machine-Specific Installation To install on a given machine, download (in binary mode) the distributed archive file to your machine.directory. Move it to an empty directory of any name. Decompress the archive with GNU gunzip (gunzip <archive-name.gz>). Untar the decompressed archive with tar (tar xf <archive-name>; do man tar for further help). On Macintosh machines the programme "Stuffit Expander" should be able to both uncompress and untar the archive. This decompression will create two directories: `mod' and `sim'. Move to the `mod' directory. Under Unix, compile the mod program using the `Makefile' in that directory (the command make should just do that). Under other operating systems, refer to the documentation of your compiler to create an executable `mod' file from the source code provided in the `mod' directory. Move the executable `mod' program you just created to the `sim' directory. You are then ready to use MCSim. This requires creating a model definition file, processing it with the mod program, and compiling the resulting `model.c' file with all the other C files in the `sim' directory. You can then run simulations files. We recommend that you go to the next section of this manual (see section Working Through an Example, which walk you through an example of model building and running. The makefile `Test_mcsim' can be used to test whether the program output on your Unix machine is the same as the one on our machines. Just type: make; make -f Test_mcsim when in your `sim' directory. All input files will be run and their output compared to the corresponding output files. You will need to have the `mod' program already compiled and inside the `sim' directory, or on the command path. In case of differences, don't panic: check the actual differences between the culprit output file and the file `sim.out' produced by the makefile. Small differences may occur from different machine precision. This can happen for random numbers, in which case the Markov chain simulations (MCMC) can diverge greatly after a while. Working Through an Example Pharmacokinetics models describe the transport and transformation of chemical compounds in the body. These models often include nonlinear first-order differential equations. The following example is taken from our own work on the kinetics of tetrachloroethylene (a solvent) in the human body (Bois et al., 1996; Bois et al., 1990) (see section Bibliographic References). Go to the `sim' directory (under Unix) or to the `Development' folder (on a Macintosh). Open the file `perc.model' with any text editor (e.g., emacs or vi under Unix). This file is distributed as an example of a model definition file (see section `perc.model': A sample model description file). You can use it as a template for your own model, but you should leave it unchanged for now. Notice that it defines: * state variables for the model (for which differentials are defined), for example: States = {Q_fat, # Quantity of PERC in the fat Q_wp, # ... in the well-perfused compartment Q_pp, # ... in the poorly-perfused compartment Q_liv, # ... in the liver Q_exh, # ... exhaled Qmet} # Quantity of metabolite formed * output variables (obtainable at any time as analytical functions of the states, inputs and parameters), for example: Outputs = {C_liv, # mg/l in the liver C_alv, # ... in the alveolar air C_exh, # ... in the exhaled air C_ven, # ... in the venous blood Pct_metabolized, # % of the dose metabolized C_exh_ug} # ug/l in the exhaled air * input variables (independent of the others variables, and eventually varying with time), for example: Inputs = {C_inh} # Concentration inhaled * model parameters (independent of time), such as: LeanBodyWt = 55; # lean body weight * system's dynamics (differential or algebraic equations defining the model per se), * parameters' scaling (the parameters used in the equations can be made functions of other parameters: for example volumes can be computed from masses and densities), * equations to compute the output variables. This model definition file as a simple syntax, easy to master. It needs to be turned into a C program file before compilation and linking to the other routines (integration, file management etc.) of MCSim. You will use mod for that. First, quit the editor and return to the operating system. To start mod under Unix just type mod perc.model. On a Macintosh, double click the `Mod' icon; Mod prompts you for the name of your model definition file; Type `perc.model'. After a few seconds, with no error messages if the model definition is syntactically correct, Mod announces that the `model.c' file has been generated. On a Macintosh you need to hit the return key to exit Mod. The next step is to compile and link together the various C files that will constitute the simulation program for your particular model. Note that each time you want to change an equation in your model you will have to change the model definition file and repeat the steps above. However, changing just parameter values or state initial values does not require recompilation since that can be done through simulation specification files. * Under Unix, the simplest is to use the make utility. Just type make and compilation will be done automatically (see section Using make). An executable `mcsim' is created. You can rename it to better describe the fact that it is model specific: rename it `mcsim_perc', for example. * On a Macintosh, or PC, the best is to use a Think C project or a similar programming environment. You should first use the command make or its equivalent to compile the modified `model.c' file and other C files. Then create an application (you should give it a name specific to the model you are developing, e.g., `MCSim Perc'). Refer to your compiler manual for details on how to use your programming environment. Your executable `MCSim Perc' program is now ready to perform simulations. To start your MCSim program just type mcsim_perc (if you gave it that name) under Unix, or double click the `MCSim Perc' Ð or whatever name you specified Ð icon on your Macintosh. After an introductory banner (telling in particular which model file the program has been compiled with), you are prompted for an input file name: type perc.lsodes.in (see section `perc.lsodes.in', to see this file now). The program then prompts you for the output file name: type perc.lsodes.out. After a few seconds or less (depending on your machine) the program announces that it has finished and that the output file is `perc.lsodes.out' (on a Macintosh you should hit the return key to exit the program completely). You can open the output file with any text editor or word processor, you can edit it for input in graphic programs etc. Several other models and simulation specification files are provided with the package as examples (they are in the `sim' directory. Try them and observe the output you obtain. You can then start programming you own models and doing simulations. The next sections of this manual reference the syntax for model definition and simulation specifications. Defining Models Three examples of model simulation files: `linear.model', `1cpt.model' and `perc.model' are included with the program files and appears in Appendix of this manual (see section Examples). Using Mod to process model description files The mod program is a stand-alone facility. It takes a model description file in the "user-friendly" format described below and creates a C language file `model.c' which you will compile and link to the simulation program. Mod allows the user to define equations for the model, assign default values to parameters or default initial values to model variables, and define scaling functions for both the input parameters and the outputs variables. Mod lets the user create and modify models without having to maintain C code. In Unix, the command line syntax for the mod program is: mod [input-file [output-file]] where the brackets indicate that the input and output filenames are optional. If the input filename is not specified, the program will prompt for both. If only the input filename is specified, the output is written by default to the file `model.c'. Unless you feel like doing some makefile programming, we recommend using this default since the makefile for MCSim assumes the C language model file to have this name. You have to have prepared an input file containing a description of the model following the syntax described in the following (see section Syntax of the model description file). On the Macintosh you double-click the `Mod' icon and enter the name of the model definition file at the prompt (on the Macintosh, names can include space characters). Syntax of the model description file The model description file is a text (ASCII) file that consists of several sections, including global declarations, dynamics specifications (eventually with derivative calculations), model scaling, and output computations: # Model description file (this is a comment) <Global parameter specifications> Dynamics { <Equations for calculating the dynamics, or state derivatives> } Scale { <Equations for scaling model parameters> } CalcOutputs { <Equations for scaling the outputs> } where Dynamics, Scale and CalcOutputs are keywords and, if used, must appear as shown, followed by the curly braces which delimit the section. At least one of the sections Dynamics or CalcOutputs should be defined. Dynamics must be used if the model includes differential equations. General syntax The general syntax of the file is as follows: * Comments begin with a pound sign (#) and continue to the end of the line. * Blank lines are allowed and ignored. * All commands can span several lines and are terminated by a semi-colon (;). There should be no blank space before the semi-colon. * Variable assignments have the following syntax: var-name '=' constant-value-or-expression ';' where var-name is any valid C identifier, starting with a letter or underscore (_) and followed by any number of alpha-numeric characters or underscores, up to a maximum of 80. Variable names are case sensitive. Note that the name IFN, in capital letters, is reserved by the program and should not be used as parameter or variable name. The equal sign is needed. The right-hand side expression can be a valid C mathematical expression including already defined variables and ANSI C mathematical functions. Additional functions, called special functions, can be used, which take variable names, constant values or expressions as parameters. Special functions are detailed below (see section Special functions). Finally, there should be no space before the terminating semi-colon. Colon conditional assignments can also be used. Syntax: var-name '=' (<condition> ? <value-if-true> : <value-if-false>); For example: Adj_Parm = (Input > 0.0 ? Parm * Adjust : Parm); In this example, if `Input' is greater than 0, the parameter `Adj_Parm' is computed as the product of `Parm' by `Adjust'; otherwise `Adj_Parm' is equal to `Parm'. The comparison operators allowed are the equality operator ==, and non-equality operators !=, <, >, <>, <= and >=. Global parameter declarations Commands not specified within the delimiting braces of another section are considered to be global declarations. In the global section, you first declare the states, inputs, and outputs variables. There should be at least one state or output variable in your model. * States are variables for which a first-order differential equation is defined (higher orders or partial differential equations are not allowed). * Inputs are variables independent of the others variables, but eventually varying with time (for example an exposure concentration to a chemical). * Outputs are dependent model variables (obtainable at any time as analytical functions of the states, inputs or parameters) that do not have dynamics. The format for declaring each of these variables is the same, and consists of the keyword States, Inputs or Outputs followed by a list of the variable names enclosed in curly braces as shown here: States = {Qb_fat, # Benzene in the fat Qb_bm, # ... in the bone marrow Qb_liv}; # ... in the liver and others Inputs = {Q_gav, # Gavage dose C_inh}; # Inhalation concentration Outputs = {Cb_exp, # Concentration in expired air Cb_ven}; # ... in venous blood After being defined, states, inputs and outputs can then be given initial values (constants or expressions). Note that inputs can also be assigned input functions, described below. Some examples of initialization are shown here: Qb_fat = 0.1; # Default initial value for state variable Qb_fat # Gavage input assigned a periodic exponential input function Q_gav = PerExp(1, 60, 0, 1); # Magnitude of 1.0, # period of 60 time units, # T0 in period is 0, # Rate constant is 1.0 If a global state, input, or output variable is not given an initial value, it will default to zero. Initial values are reset to their specified value by the simulation program at the start of each simulation of an Experiment (see section Experiment definition). All the model parameters you want to be able to change through simulation files should be declared global. Parameters must be given nominal values, following the assignment rules given above. For example: BodyWt = 0.38 + sqrt(0.01); # (kg) Weight of the rat All parameters and variables are computed in double precision floating-point numbers. Initialization values should not be such as to cause computation errors in the model equations; this is likely to cause program crashes (so, for example, do not assign a default value of zero to a parameter appearing alone in a denominator). Note that the order of global declarations matters within the global section itself (i.e., parameters and variables should be defined and initialized before being used in assignments of others), but not with respect to other blocks. A parameter defined at the end of the description file can be used in the dynamics section which may appear at the beginning of the file. Still, such an inverse order should be avoided. For this reason, the format above, where global declarations come first, is strongly suggested to avoid confusing results. Note again that the name IFN, in capital letters, is reserved by the program and should not be used as parameter or variable name. Finally, if a parameter is defined several times, only the first definition is taken into account. Special functions The following special functions (whoe name is case-sensitive) are available to the user for assignment of parameters and variables in the model definition file: * CDFNormal(x): the normal cumulative density function; * erfc(x): the error function; * lnDFNormal(x, mean, sd): the natural logarithm of the normal density function; * lnGamma(x): the natural logarithm of the gamma function; * UniformRandom(min, max): returns a uniformly distributed random variable, sampled between min and max. The algorithm used is that of Park and Miller (Barry, 1996; Park and Miller, 1988; Vattulainen et al., 1994) (see section Bibliographic References). A default random generator seed (314159265.3589793) is used; * NormalRandom(mean, sd): returns a normally distributed random variable. The default random generator seed is used. Input functions They can be used in a special assignments, valid only for input variables. Inputs can be initialized as a constant or expression, or assigned one of the following input functions: * PerDose() specifies a periodic input of constant magnitude. The input begins at initial-time in the period and lasts for exposure-time time units. Syntax: PerDose(<magnitude>, <period>, <initial-time>, <exposure-time>); * PerExp() specifies a periodic exponential input. At time initial-time in the period the input rises instantaneously to magnitude and begins to decay exponentially with the constant decay-constant. The input is turned off once the magnitude reaches a negligible fraction of its original value. Syntax: PerExp(<magnitude>, <period>, <initial-time>, <decay-constant>); * NDoses() specifies a number of stepwise inputs of variable magnitude and their starting times. The first argument, n, is the number of input steps and start times. Next come a list of magnitudes and a list of initial times. Each list is comma-separated. The duration of input step is computed automatically by difference between the specified times. Currently this function can only be used in the simulation description file, and not in the model description file (which simply implies that you cannot use it as a default). Syntax: NDoses(<n>, <list-of-magnitudes>, <list-of-initial-times>); * Spikes() specifies a number of instantaneous inputs of variable magnitude and their exact times of occurence. The first argument, n, is the number of inputs and input times. Next come a list of magnitudes and a list of times. Each list is comma-separated. Currently this function can only be used in the simulation description file, and not in the model description file (which simply implies that you cannot use it as a default). Syntax: Spikes(<n>, <list-of-magnitudes>, <list-of-times>); The arguments of input functions can either be constants or variables. As an example, if `GavMag' and `RateConst' are defined model parameters, then the input variable `Q_gav' can be defined as: Q_gav = PerExp(GavMag, 60, 0, RateConst); In this way the parameters of input functions can, for example, be assigned statistical distributions in Monte Carlo simulations (see section Distrib() specification). Variable dependencies are resolved before the simulation is started. For each of the periodic functions, a single exposure beginning at time initial-time can be specified by giving an effectively infinite period, The first period starts at the initial time of the simulation. Magnitudes change exactly at the times given. Input functions can be combined to give a lot of flexibility (e.g., an input can be sum of some others). Separate inputs can be declared in the global section of the model definition file and combined in the Dynamics and CalcOutputs sections. The only limitation is that for each input function used, a separate input must be defined in the model, even though this function may not be a real element of the model. Dynamics specifications The dynamics specification section begins with the keyword Dynamics and is enclosed in curly braces. The equations given in this section will be called by the integrator at each integration step. Additional variables to those declared in the global section may be used for any calculations within the section. They will be declared as local temporary variables. (Note, for example, the use of `Cout_fat' and `Cout_wp' in the `perc.model' sample file). Local variables are not accessible from the simulation program, or from other sections of the model definition file, so don't try to output them. Each state variable declared in the global section must have one corresponding state equation in the Dynamics section. If a state equation is missing, mod issues an error message such as: Error: State variable 'Q_foo' has no dynamics. If one or more differential equations are missing, no program file will be created. Most error messages are self-explanatory. Where appropriate, they also show a line number in the input file where the error occurred. Beware, however, of cascades of errors generated as a consequence of a first one; so don't panic: start by fixing the first one and rerun mod. The derivative of a state variable is defined using the dt() operator, as shown here: dt(state-variable) '=' constant-value-or-expression ';' The right-hand side can be any valid C expression, including standard math library calls and the special functions mentioned above (see section Special functions). Note, however, that no syntactic check is performed on the library function calls. Their correctness is your responsibility. The dt() operator can also be used in the right-hand side of equations in the dynamics section to refer to the value of a derivative at that point in the calculations. For example: dt(Qm_in) = Qmetabolized - dt(Qm_out); The integration variable (e.g., time) can be accessed if referred to as t, as in: dt(Qm_in) = Qmetabolized - t; Output variables can also be made a function of t in the Dynamics section. Note that while state variables, input variables and model parameters can indeed be used on the right-hand side of equations, they cannot be assigned values in the Dynamics section. If you need a parameter to change with time, declare it as output variable in the global section. Assignments to inputs or parameters in this section causes an error message like the following to be issued: Error: line 48: 'YourParm' used in invalid context. Parameters cannot be defined in Dynamics{} section. Parameter scaling The parameter scaling section begins with the keyword Scale and is enclosed in curly braces. The equations given in this section will define a function that will be called by the simulation program at the beginning of each simulation of an Experiment (see section Experiment definition). They can therefore be used for initialization of the simulations. All model variables and parameters can be changed in this section. Modifications to state variables affect initial values only. Modifying an input is not allowed and state variables can only appear at the left hand side of equations. The dt() operator (see section Dynamics specifications) cannot be used in this section, since derivatives have not yet been computed when the scaling function is called. Output calculations The output calculation section begins with the keyword CalcOutputs and is enclosed in curly braces. The equations given in this section will be called by the simulation program at each output time specified by a Print() or PrintStep() statement (see section Print() specification, and see section PrintStep() specification). In this way, the output scaling is done efficiently, only when values are to be saved, and not at each integration step. Only variables that have been declared with the keyword Outputs can be changed in this section. Assignments to other types of variables cause an error message like the following to be issued: Error: line 56: 'Qb_fat' used in invalid context. Only outputs can be defined in CalcOutputs{} section. Any reference to an input or state variable will use the last calculated (current) value of the input. The dt() operator can appear in the right-hand side of equations, and it refers to values of the derivative as calculated at the last time step (see section Dynamics specifications). Like in the Dynamics section, the integration variable can be accessed if referred to as t, as in: Qx_out = DQx * t; Comments on style For your model file to be readable and understandable, it is useful to use a consistent style of notation. The example file `perc.model' follows such a consistent notation (see section `perc.model': A sample model description file). For example we suggest that: * All variable names begin with a capital letter followed by meaningful lower case subscripts. * Where two subscripts are necessary, they can be separated by an underscore, such as in `Qb_fat'. * Where there is only one subscript an underscore can still be used to increase readability as in `Q_fat'. * Where two words are used in combination to name one item, they can be separated visually by capitalizing each word, as in `BodyWt'. These conventions are suggestions only. The key to have a consistent notation that makes sense to you. Consistency is one of the best ways to: 1. Increase readability, both for others and for yourself. If you have to suspend work for a month or two and then come back to it, the last thing you want is to have to decipher your own file. 2. Decrease the likelihood of mistakes. If all of the equations are coded with a consistent, logical convention, mistakes stand out more readily. Note about models MCSim can easily deal with algebraic models. You do not need to define state variables or a Dynamics section for such models. Simply use input and output variables and paramaters. The model can be specified in the CalcOutputs section. You can use the time t if that is natural for your model. If you do not use t in your model, you will still need to specify "output times" in Print() or PrintStep() statements to obtain outputs: you can use an arbitrary time, such as 1. If you do not use t you will also need do define an Experiment (see section Experiment definition) for each combination of values for the "independent" variables of your model. This may be clumsy if many values are to be used. In that case, you may want to use the variable t to represent something else than time. Ordinary differential models, with algebraic components, can be setup easily with MCSim. Use state variables and specify a Dynamics section. Time, t is the integration variable, but here again, t can be used to represent anything you want. We are not aware of cases in which MCSim has been used for partial differential equations. Some problems might be solved by implementing rudimentary line methods... You can use MCSim for discrete-time dynamic models (or difference models). It's a bit tricky. Assignments in CalcOutput are volatile (not memorized), so the model equations have to be in Dynamics. But the model variables should still be declared as outputs, because they should not be updated by integration. However, you need at least a true state in the Dynamics section, so you should declare a dummy one (and give it a constant derivative of value zero). You also want the calls to Dynamics to be precisely scheduled, so it is best to use the Euler integration routine (see section Integrate() specification) which uses a constant step. Since Euler may call repeatedly Dynamics at any given time, you want to guard against untimely updating... Altogether, we recommend that you examine the sample files `discrete.model' and `discrete.in' provided with the source code for MCSim. Specifying Simulations After having your model defined and processed by mod, and the resulting `model.c' file compiled with the MCSim routines, you are ready to run simulations. For this you need to write a simulation file. An example file `perc.lsodes.in', which works with the perchloroethylene model, appears in Appendix (see section `perc.lsodes.in'). Using the compiled program The simulation environment MCSim provides several types of simulations for the models you create. Simulations are specified in a text file of format similar to that of the model description file. In Unix the command-line syntax for the MCSim program is: mcsim [input-file [output-file]] where the brackets indicate optional arguments. This assume that you have not renamed the executable file; If you have, substitute the name of your executable. If no input and output file names are specified, the program will prompt you for them. If already one file name is given on the command-line, the program will assume it specifies the input file and will prompt you for the output file name. You must provide an input file name. If you just hit the return key when prompted for the output name, the program will use the name you have specified in the input file, if any, or a default name. When the program starts up, it announces which model description file it was created with. The input file describes the simulations to perform and specifies which outputs should be printed out (see section Syntax of the simulation definition file). On the Macintosh you double-click the `MCSim' icon and enter the name of your simulation definition file at the first prompt and then the name of the output file (or just hit return if you want the default or the name you have specified in the input file to be used). Syntax of the simulation definition file The file starts with a global declaration section followed by a number of Experiment (i.e., simulation) definitions (see section Experiment definition), eventually enclosed in a Level definition if Markov chain Monte Carlo simulations are to be performed (see section Specifying a statistical model). Each Experiment defines simulation conditions, from an initial time (or whatever your dependent variable represents) to a final time. The initial values of model state variables, parameter values, input variables, and which outputs are to print at which times can all be changed in a given Experiment. The general syntax of the file is the same as that of mod with two differences: * Variables can only be assigned constant values. * Input variables' assignments can use any input function (including the NDoses() function) or constant values. Expressions are not allowed (unlike in the model definition file where they can be used). Similarly, structural change to the model, for instance, addition of a state, input, output or parameter, cannot be done here and must be done in the model description file. The simulation specification file is read until its end is reached, or until an End command is reached. The general layout of the file is: # Input file (this a comment) <Global modifications and analysis specifications> Experiment { <Specifications for first experiment> } Experiment { <Specifications for second experiment> } # Unlimited number of experiment specifications End # Optional End statement. Everything after this line is ignored For Markov chain Monte Carlo simulations (see section MCMC() specification), the general layout of the file includes Level definitions: # Input file (this is a comment) SimType(MCMC); <Global modifications and analysis specifications> Level { # Up to 10 levels of hierarchy Experiment { Specifications for first experiment } Experiment { Specifications for second experiment } # Unlimited number of experiment specifications } # end Level End # Optional statement. Everything after this line is ignored Global specifications The global section is used to give specifications relevant to all experiments, for example specification of the type of analysis, how the integrator should work, parameter modifications to be used for all experiments, etc. Both the global section and each experiment section can contain modifications to defined model variables. At the beginning of a simulation, all model parameters are initialized to the nominal values specified in the model description file. Next, modifications given in the global section are applied, and finally any modifications for the current experiment are applied. SimType() specification The type of analysis performed is specified using the SimType() specification. Example: SimType(MonteCarlo); The following keywords can be used: * DefaultSim: the list of specified experiments is simulated, * MonteCarlo: the specified experiments are simulated several times with random input parameters (see section MonteCarlo() specification), * MCMC (previously Gibbs): Markov chain Monte Carlo simulations are performed to attain the posterior distribution of the model's parameters, given their prior distributions and data for which the likelihood function can be computed (see section MCMC() specification), * SetPoints: the experiments are simulated using several lists of user-defined parameters values in input (see section SetPoints() specification). If MonteCarlo, MCMC, or SetPoints simulations are requested, additional specifications are needed (see below). Integrate() specification The integrator settings can be changed with the Integrate specification. Two integration routines are provided: Lsodes (which originates from the SLAC Fortran library and is originally based on Gear's routine) (Gear, 1971b; Gear, 1971a; Press et al., 1989) (see section Bibliographic References) and Euler (Press et al., 1989). The syntax for Lsodes is: Integrate(Lsodes, <rtol>, <atol>, <method>); Rtol is a scalar specifying the relative error tolerance for each integration step. Atol is a scalar specifying the absolute error tolerance parameter. They apply to all integration variables (state variables). The estimated local error in a state variable y(i) will be controlled so as to be roughly less (in magnitude) than Thus the local error test passes if, in each component, either the absolute error is less than atol, or the relative error is less than rtol. Set rtol toÊzero for pure absolute error control, and use atol to zero for pure relative error control. Caution: actual (global) errors may exceed these local tolerances, so choose them conservatively. The method flag should be 0 (zero) for non-stiff differential systems and 1 for stiff systems. You should try both and select the fastest for equal accuracy of output, unless insight from your system leads you to choose a priori. In our experience, a good starting point for atol and rtol is about The syntax for Euler is: Integrate(Euler, <time-step>, 0, 0); time-step is a scalar specifying the constant time-step to be taken at each integration step. The next two scalars are reserved for future use and should be set to zero. If the Integrate() specification is not used, the default integration method is Lsodes with parameters We recommend using Lsodes, since is it highly accurate and efficient. Euler can be used for special applications (e.g., in system dynamics) where a constant time step and a simple algorithm are needed. OutputFile() specification The OutputFile() specification allows you to specify a name for the output file of DefaultSim simulations. If this specification is not given the name `sim.out' is used if none has been supplied on the command-line or the initial dialog. The corresponding syntax is: OutputFile("<OutputFilename>"); MonteCarlo() specification Monte Carlo simulations (Hammersley and Handscomb, 1964; Manteufel, 1996) (see section Bibliographic References) require the use of two specifications, MonteCarlo() and Distrib(), which must appear in the global section of the file, before the Experiment sections. Such Monte Carlo specifications are ignored if they appear in an Experiment specification. The MonteCarlo specification gives general information required for the runs: the output file name, the number of runs to perform, and a starting seed for the random number generator. Its syntax is: MonteCarlo("<OutputFilename>", <nRuns>, <RandomSeed>); The output filename is a string field and must be enclosed in quotes. If a null-string "" is given, the default name `simmc.out' will be used. The seed of the pseudo-random number generator can be any positive real number. Seeds between 1.0 and 2147483646.0 are used as is, others are rescaled within those bounds (and a warning is issued). Here is an example of use: MonteCarlo("percsimmc.out", 5, 9386.630); The parameters' sampling distributions are specified by a list of Distrib specifications, as explained in the next section. The format of the output file of Monte Carlo simulations is discussed later (see section Analyzing results). Distrib() specification This specification indicates which variable to sample, and its sampling distribution. One Distrib() specification must be included for each variable to sample. The specification file can include any number of these commands at the global level, or within any Level section in the case of Markov chain Monte Carlo sampling (see section Specifying a statistical model). The syntax is: Distrib(<identifier>, <iType>, [<shape parms>]); The iType field specifies the sampling distribution to use and can be one of following: * Uniform, * LogUniform, * Normal, * Normal_v, * LogNormal, * LogNormal_v, * TruncNormal, * TruncNormal_v, * TruncLogNormal, * TruncLogNormal_v, * Beta, * Gamma, * InvGamma, * Chi2, * Exponential, * Binomial, * Poisson, * Piecewise. The corresponding shape parameters (Bernardo and Smith, 1994; Gelman et al., 1995) (see section Bibliographic References) are as follow: * Uniform and log-uniform distributions: minimum and maximum of the sampling range, real numbers in natural space. * Normal distribution (two reals numbers): mean and standard deviation, the latter being stricly positive. The variant Normal_v takes the variance instead of the standard deviation as second parameter. * Truncated normal distribution (four reals numbers): mean, standard deviation (stricly positive), minimum and maximum. The variant TruncNormal_v takes the variance instead of the standard deviation as second parameter. * LogNormal distribution (two reals numbers): geometric mean (exponential of the mean in log-space) and geometric standard deviation (exponential, stricly superior to 1, of the standard deviation in log-space). The variant LogNormal_v takes the variance (in log-space!) instead of the standard deviation as second parameter. * Truncated Lognormal distribution (four reals numbers): geometric mean and geometric standard deviation (stricly superior to 1), minimum and maximum in natural space. For example: Distrib(Var, TruncLogNormal, 1, 2.718, 0.01, 10) samples Var such that while Var is truncated to fall between 0.01 to 10. The variant TruncLogNormal_v takes the variance (in log-space!) instead of the standard deviation as second parameter. * Beta distribution (at least two strictly positive real numbers): A and B. By default the Beta distribution is defined over the interval [0;1]. If a range is given for the beta distribution, the [0;1] interval is mapped onto the specified range. * Gamma distribution (two strictly positive real numbers): shape a and inverse scale b. * Inverse-gamma distribution (two strictly positive real numbers): shape a and scale b. * Chi-squared distribution (one strictly positive real number): n. This distribution is the same as Gamma(n/2, 1/2). * Exponential distribution (one strictly positive real number): inverse-scale b. * Binomial distribution (two strictly positive numbers, a real and an integer): p (in the interval [0;1]), and N. If N is not input as an integer it will be rounded down during the simulations. * Poisson distribution (a strictly positive real): rate l. * Piecewise distribution (four reals): minimum, a, b, maximum. The distribution has the form of a truncated triangle, with a plateau between a and b. If the distribution is the triangular distribution. The shape parameters of the above distribution specifications can reference other parameters, provided than distributions for these have already been defined. For example: Distrib(A, Normal, 0, 1); Distrib(B, Normal, A, 2); MCMC() specification Markov chain Monte Carlo (MCMC) simulations, used in a Bayesian context, allow the user to specify a statistical model (eventually hierarchical) and sample parameters from their joint posterior distribution, given a prior distribution for each parameter, a set of data to simulate, and corresponding likelihoods. Sampling from the posterior is not immediate: it requires the simulation chain, which start by sampling purely from the prior, to reach equilibrium. Checking that equilibrium is obtained is best achieved, in our opinion, by running multiple independent chains. Hence these computations are very intensive. For a discussion of Markov chain Monte Carlo and convergence issues you should consult the appropriate statistical literature (for example, Bernardo and Smith, 1994; Gelman, 1992; Gelman et al., in press; Gelman et al., 1995; Gelman and Rubin, 1992; Smith, 1991; Smith and Roberts, 1993) (see section Bibliographic References). Technically, MCSim uses Metropolis-Hasting sampling and you do not need to worry about issues of conjugacy or log-concavity of your prior or posterior distributions. Like simple Monte Carlo simulations, MCMC simulations require the use of two specifications, MCMC() and Distrib() and of one special section definition: Level. The syntax for the MCMC() specification is: MCMC("<OutputFilename>", "<RestartFilename>", "", <nRuns>, <simTypeFlag>, <printFrequency>, <itersToPrint>, <RandomSeed>); The output filename is a string field and must be enclosed in quotes. If a null-string "" is given, the default name `MCMC.default.out' will be used. If a restart file name (enclosed in quotes) is given, the first simulations will be read from that file (which must be a text file). This allows you to continue a chain where you left it, since an MCMC output file can be used as a restart file with no change. Note that the first line of the file (which typically contains column headers) is skipped. Also, the number of lines in the file must be less than or equal to nRuns. The first column of the file should be integers, and the following columns (tab- or space-separated) should give the various parameters, in the same order as specified in the list of Distrib() specifications in the input file. The third field is reserved for future use and should just be a pair of empty quotes. The integer nRuns gives the total number of runs to be performed, including the runs eventually read in the restart file. The next field, simTypeFlag should be either 0, 1, or 2. It should be set at zero to start a chain of MCMC simulations. In that case, parameters are updated by Metropolis steps, one at a time. If the value of simTypeFlag is set to 1 or 2, a restart file must also be specified. In the case of 1, the output file will contain codes for the level sequence, experiment numbers, printing times, data values and the corresponding model predictions, computed using the last parameter vector of the restart file. This is useful to quickly check the model fit to the data. If simTypeFlag is equal to 2, the entire restart file is used to compute the parameters' covariance matrix. All parameters are then updated at once using a multivariate normal kernel as proposal distribution of the Metropolis steps. This results in large improvement in speed. However, we recommend that this option be used only when convergence is approximately obtained (therefore, you should run MCMC simulations with simTypeFlag set to 0 first, up to approximate convergence, and then restart the chain with the flag at 2). The integer printFrequency should be set to 1 if you want an output at each iteration, to 2 if you want an output at every other iteration etc. itersToPrint is the number of final iterations for which output is required (e.g., 1000 will request output for the last 1000 iterations; to print all iterations just set this parameter to the value of nRuns). Note that if no restart file is used, the first iteration is always printed, regardless of the value of itersToPrint. Finally, the seed of the pseudo-random number generator can be any positive real number. Seeds between 1.0 and 2147483646.0 are used as is, others are rescaled silently within those bounds. To use the MCMC specification, you must define a statistical model precising each parameter's prior distribution, or conditional distribution (in the case of a hierarchical model), and the data likelihood (i.e., the distribution of observation errors). These distributions must be enclosed in a Level section and are specified with Distrib() statements (see section Specifying a statistical model). In the context of MCMC sampling, MCSim provides an extension of the Distrib() specification. First, the first two shape parameters of distributions may depend on other model parameters. For example: Distrib(A, Normal, 0, 1); Distrib(B, Normal, A, C); The data distribution is given by a similar statement, which uses the specification Prediction() to differentiate data from their predicted counterparts. The Prediction() specification can be used for the first two shape parameters only (therefore, not for ranges, except in the case of uniform or loguniform distributions). If Prediction() is used for the first shape parameter, the variable enclosed in parentheses must be the same as the variable whose distribution is described. There should be one and only one distribution specified for a given type of data in the whole input file (i.e., you cannot redefine a likelihood; this limitation will hopefully be removed in a future release). Note that only states and outputs can use Prediction() specifications (but you can always define an output to be equal to a parameter or an input in your model file). For example: Distrib(y, TruncNormal, Prediction(y), Prediction(z), -10, 10); To recapitulate, the extended Distrib() syntax, for use with MCMC simulations is therefore: Distrib(<identifier>, <iType>, [<shape parms>]); where the first two shape parameters can be Prediction(<identifier>), or any model parameter or numerals, and the last two shape parameters numerals only (this limitation will also be removed in a future release). If a statement like: Distrib(Var, <iType>, Prediction(<Var>), Prediction(<Other_Var>), ...); is used, the two variables Var and Other_Var must have identical output times. It is then useful to group them in the same Print() statement. The other tool MCSim brings you to build a complete statistical model is the Level keyword. The use of this keyword, is described below (see section Specifying a statistical model). Finally, the format of the output file of MCMC simulations is discussed in a later section (see section Analyzing results). SetPoints() specification To impose a series of set points (i.e., already tabulated values for the parameters), the global section can include a SetPoints() specification. It allows you to perform additional simulations with previously Monte Carlo sampled parameter values, eventually filtered. You can also generate parameters values in a systematic fashion, over a grid for example, with another program, and use them as input to MCSim. Importance sampling, latin hypercube sampling, grid sampling, can be accommodated in this way. This command specifies an output filename, the name of a text file containing the chosen parameter values, the number of simulations to perform and a list of model parameters to vary. It has the following syntax: SetPoints("<OutputFilename>", "<SetPointsFilename>", <nRuns>, <identifier>, <identifier>, ...); If a null string is given for the output filename, the set points output will be written to the same default output file used for Monte Carlo analyses, `simmc.out'. The set points file name is required and must refer to an existing file containing the parameter values to use. The first line of the set points file is skipped and can contain column headers, for example. Each of the other lines should contain an integer (e.g., the line number) followed by values of the various parameters in the order indicated in the SetPoints() specification. If extra fields are at the end of each line they are skipped. The first integer field is needed but not used (this allows you to directly use Monte Carlo output files for additional SetPoints simulations). The variable nRuns should be less or equal to the number of lines (minus one) in the set points file. If a zero is given, all lines of the file are read. The format of the output file of set points simulations is discussed below (see section Analyzing results). Following the SetPoints() specification, Distrib() statements can be given for parameters not already in the list (see section Distrib() specification). These parameters will be sampled accordingly before to performing each simulation. The shape parameters of the distribution specifications can reference other parameters, including those of the list. Specifying basic conditions to simulate Any simulation file must define at least one Experiment to simulate. Experiment definition After global simulation specifications, "experiments" must be included in the input file. They define simulation conditions and specify outputs. An "experiment" section starts with the keyword Experiment and is enclosed in curly braces. An "experiment" can make modifications to any model variable or parameter that was defined in the global section of the model description file. The syntax is the same, except that variables or parameters can only take constant values. So, for example, in an experiment the body weight could be modified with: BodyWt = 83.2; This overrides any previously assigned values, even if randomly sampled, for the specified parameter. Inputs can be redefined with the input functions listed in the Mod reference section above (see section Defining Models). Input functions can reference other variables (eventually random), as in: Q_gav = PerExp(GavMag, 60, 0, RateConst); The maximum number of experiments definable is 200. This can be changed by changing MAX_INSTANCES and MAX_EXPERIMENTS in the header file `sim.h' and recompiling. Within an experiment definition, or at the global level (if you want them to apply to all experiments), several additional specifications can also be used: * StartTime(), * Print(), * PrintStep(), * Data(). StartTime() specification The origin of time for a simulation, if it needs to be defined, is specified with the StartTime() specification: StartTime(<initial-time>); If this specification is not given, a value of zero is used by default. The final time is automatically computed to match the largest output time specified in the Print() or PrintStep() statements (see section Print() specification; see section PrintStep() specification). Print() specification The value of any model variable, input, output or parameter can be requested for output with Print() specifications. Their arguments are a list of names of variables (at least one and up to 10), and a list of increasing times at which to output their value: Print(<identifier1>, <identifier2>, ..., <time1>, <time2>, ...); The same output times are used for all the variables specified. The size of the time list is only limited by the available memory. The limit of 10 variables names can be increased by changing MAX_PRINT_VARS in the header file `sim.h' and recompiling. The number of Print() statements you can used in a given Experiment section is only limited by the available memory. PrintStep() specification The value of any model variable, input or parameter can be also output with PrintStep specifications. They allow dense printing, suitable for smooth plots, for example. The arguments are the name of only one variable, the first output time, the last one, and a time increment: PrintStep(<identifier>, <start-time>, <end-time>, <time-step>); The final time has to be superior to the initial time and the time step has to be less than the time span between end and start. If the time step is not an exact divider of the time span the last printing step is shorter and the last output time is still the end-time specified. The number of outputs produced is only limited by the memory available at run time. You can use several PrintStep() in a given Experiment section. Data() specification Experimental observations of model variables, inputs, outputs, or parameters, can be specified with the Data() command. Markov chain Monte Carlo sampling requires that you specify Data() statements (see section MCMC() specification; see section Specifying a statistical model). The data are then used internally to evaluate the likelihood function for the model. The arguments are the name of the variable for which observations exist, and a list of data values: Data(<identifier>, <value1>, <value2>, ...); This specification can only be used with a matching Print() or PrintStep() for the same variable (see section Print() specification; see section PrintStep() specification). You must make sure that there are as many data values in the Data() specification as output time requested in the corresponding Print() or PrintStep(). A data value of "-1" is treated as "missing data" and ignored in likelihood calculations. The convention "-1" can be changed by changing INPUT_MISSING_VALUE in the header file `mc.h' and recompiling. Specifying a statistical model Statistical models are defined in the simulation specification file, rather than in the model definition file. It is necessary to define a statistical model (with parameter dependencies, prior distributions and likelihood) if you want to use MCMC sampling. MCMC sampling will then give you in output a sample of parameters drawn from their joint posterior distribution. Take for example the simple linear regression model: The first thing to do is to define a model to compute y as a function of x. Here is such a model (quite similar to the one distributed with MCSim source code (see section `linear.model'): # --------------------------------------------- # Model definition file for a linear model # --------------------------------------------- Outputs = {y}; # Model parameters Alpha = 0; Beta = 0; Sigma2 = 1; x_bar = 0; CalcOutputs { y = Alpha + Beta * (t - x_bar); } # --------------------------------------------- The parameters' initialization values are arbitrary, and could be anything reasonable. They will be changed or sampled through the input file. Note that is not used in the model equations, but still needs to be defined here in order to be part of the statistical model. On the other hand, Finally x is replaced by the time, t, for convenience. An alternative would be to define an input `x' and use it instead of t. We now need to write an input file specifying the distribution of y (i.e., the likelihood), and the prior distributions of the various parameters. Here is what such a file could look like: # --------------------------------------------------------------- # Simulation input file for a linear regression # --------------------------------------------------------------- SimType (MCMC); MCMC ("linear.MCMC.out", "", "", 50000, 0, 5, 40000, 63453.1961); Level { Distrib(Alpha, Normal_v, 0, 10000); Distrib(Beta, Normal_v, 0, 10000); Distrib(Sigma2, InvGamma, 0.01, 0.01); Distrib(y, Normal_v, Prediction(y), Sigma2); Experiment { x_bar = 3.0; PrintStep (y, 1, 5, 1); Data (y, 1, 3, 3, 3, 5); } } # end Level End # --------------------------------------------------------------- The file begins the obvious SimType() (see section SimType() specification) and MCMC() (see section MCMC() specification) keywords. The keyword Level comes next. Level is used to specify the dependence between model parameters in a hierarchy. There should be at least one Level in every MCMC input file, even for a non-hierarchical model like the one above (actually, "non-hierarchical" models can be thought of has having only one level of hierarchy). See below for further discussion of the Level keyword. You can also look at the MCMC input files provided as examples with MCSim source code. The Distrib() statements define the parameter priors. Normal_v specifications are used since we use variances instead of standard deviations. The inverse-Gamma distribution is used for the variance component, since the precision is supposed to be Gamma-distributed. The likelihood is the distribution of the data, given the model: it is also specified by a Distrib() statement, valid for every y data point. Again, note that the variable is not used. Instead the Prediction(y) specification is used to signify the linear model output. These distributions are in effect for every sub-level or every Experiment included in the current level. The "simulations" to perform, and the corresponding data values, are specified by the Experiment section. Only one Experiment is needed here, but several could be specified. In this section, the value of is provided. The different values of x (time in our model) can be specified via a PrintStep() specification (see section PrintStep() specification), since they are equally spaced. More generally, a Print() specification could have been used (see section Print() specification). The data values are given in a Data() statement. Level definition Markov chain Monte Carlo simulations require the definition of a statistical model and the use of the Level keyword. At least one level must be defined. A level section starts with the keyword Level and is enclosed in curly braces. It can include any number of sub-levels or Experiments. Experiments (where the data are specified) form the lowest level of the hierarchy (see section Experiment definition. There must be one and only one top level and at most 10 sub-levels in the hierarchy. This limit of 10 levels can be increased (up to 255) by changing MAX_LEVELS in the header file `sim.h' and recompiling. A level can make modifications to the sampling distribution of any model parameter. For example: Level { # this is the top level Distrib(A, Uniform, 0, 1); Level { # this is sub-level 1 Distrib(A, Normal, A, 1); Experiment { ... } # experiment 1 Experiment { ... } # experiment 2 } } These distribution assignments apply to all sub-levels of the level where they take place. If several assignments are given, their position within the level section is irrelevant (although a logical order is recommended for clarity). A level can also make modifications to any model parameter that was defined in the global section of the model description file. The syntax is the same, except that variables can only take constant values. So, for example, in an experiment, the parameter A could be modified with: A = 2.0; This overrides any previously assigned values, even if randomly sampled, for the specified parameter. This assignment also applies to the sub-levels of the level where they take place. An important concept to grasp here is that of "instance". In the code fragment given above, the parameter A, defined at sub-level 1, is "cloned" as many times as there are sub-levels or experiments enclosed in sub-level 1 (hence, it will be cloned twice in the example above, once for each Experiment defined). In that way, the parameters distributions defined at one level in fact apply to the next lower sub-level, or at the Experiment level. This convention saves a lot writing and effort in the long run. For example, the uniform distribution assigned to A, at the top level, applies to the sub-level 1. There is only one "clone" of A at sub-level 1 since only one sub-level is included in the top level. In contrast, two normally-distributed "clones" of A will be defined and sampled. The first one will apply to experiment 1, and will be conditioned by the data of that experiment only, and the other will apply to experiment 2. A total of three variables of "type" A will be sampled and will be printed in the output file (coded so that the position in the hierarchy is apparent): the "parent" A(1), a priori uniformly distributed, and two "dependents" A(1.1) and A(1.2), a priori normally distributed around A(1). Analyzing results The output from Monte Carlo or SetPoints simulations is a tab-delimited text file with one row for each run (i.e., parameter set) and one column for each parameter and output in the order specified. Thus each line of the output file is in the following order: <# of run> <parameters> <outputs for Exp 1> <outputs for Exp2> ... The parameters are printed in the order they were sampled or set. The first line gives the column headers. A variable called name requested for output in an experiment i at a time j is labeled name_i.j. The output of Markov chain Monte Carlo simulations is also a text file with one row for each run. It displays a column of iteration labels, and one column for each parameter sampled. The last three columns contain respectively, the sum of the logarithms of each parameter's density given its parents' values (`LnPrior'), the logarithm of the data likelihood (`LnData'), and the sum of the previous two values (`LnPosterior'). The first line gives the column headers. On this line, parameters names are tagged with a code identifying their position in the hierarchy defined by the Level statements. For example, the second instance of a parameter called name placed at the fist level of the hierarchy is labeled name(2); the first instance of the same parameter placed at the second instance of the second level of the hierarchy is labeled name(2.1), etc. The tab-delimited file can easily be imported into your favorite spreadsheet, graphic or statistical package for further analysis. Error Handling If integration fails for an Experiment in DefaultSim simulations no output is generated for that experiment, and the user is warned by an error message on the screen. In MonteCarlo or SetPoints simulations, the corresponding simulation line is not printed, but the iteration number is incremented. Finally, in MCMC simulations, the parameter for which the data likelihood was computed is simply not updated (which implicitly forbids the uncomputable region of the parameter space). In all cases an error message is given on the screen, or wherever the screen output has been redirected. Bibliographic References Barry, T. M. (1996). Recommendations on the testing and use of pseudo-random number generators used in Monte Carlo analysis for risk assessment. Risk Analysis 16:93-105. Bernardo, J. M. and Smith, A. F. M. (1994). Bayesian Theory. Wiley, New York. Bois, F. Y., Gelman, A., Jiang, J., Maszle, D., Zeise, L. and Alexeef, G. (1996). Population toxicokinetics of tetrachloroethylene. Archives of Toxicology 70:347-355. Bois, F. Y., Zeise, L. and Tozer, T. N. (1990). Precision and sensitivity analysis of pharmacokinetic models for cancer risk assessment: tetrachloroethylene in mice, rats and humans. Toxicology and Applied Pharmacology 102:300-315. Gear, C. W. (1971a). Algorithm 407 - DIFSUB for solution of ordinary differential equations [D2]. Communications of the ACM 14:185-190. Gear, C. W. (1971b). The automatic integration of ordinary differential equations. Communications of the ACM 14:176-179. Gelman, A. (1992). Iterative and non-iterative simulation algorithms. Computing Science and Statistics 24:433-438. Gelman, A., Bois, F. Y. and Jiang, J. (1996). Physiological pharmacokinetic analysis using population modeling and informative prior distributions. Journal of the American Statistical Association 91:1400-1412. Gelman, A., Carlin, J. B., Stern, H. S. and Rubin, D. B. (1995). Bayesian Data Analysis. Chapman & Hall, London. Gelman, A. and Rubin, D. B. (1992). Inference from iterative simulation using multiple sequences (with discussion). Statistical Science 7:457-511. Hammersley, J. M. and Handscomb, D. C. (1964). Monte Carlo Methods. Chapman and Hall, London. Manteufel, R. D. (1996). Variance-based importance analysis applied to a complex probabilistic performance assessment. Risk Analysis 16:587-598. Park, S. K. and Miller, K. W. (1988). Random number generators: good ones are hard to find. Communications of the ACM 31:1192-1201. Press, W. H., Flannery, B. P., Teukolsky, S. A. and Vetterling, W. T. (1989). Numerical Recipes (2st ed.). Cambridge University Press, Cambridge. Smith, A. F. M. (1991). Bayesian computational methods. Philosophical Transactions of the Royal Society of London, Series A 337:369-386. Smith, A. F. M. and Roberts, G. O. (1993). Bayesian computation via the Gibbs sampler and related Markov chain Monte Carlo methods. Journal of the Royal Statistical Society Series B 55:3-23. Vattulainen, I., Ala-Nissila, T. and Kankaala, K. (1994). Physical tests for random numbers in simulations. Physical Review Letters 73:2513-2516. Common Pitfalls The following mistakes are particularly easy to make, and sometimes hard to notice, or understand at first. * Putting a space before the end of line ; in the model definition file may causes strange error messages. * Forgetting about type-related arithmetics in C: `1000/882' gives `1' since it is interpreted as an integer division by the compiler. To get a floating-point (usual) division use `1000./882.'. Using make Make is a utility that facilitates doing repetitive tasks like compilation. A `makefile' is a text file that contains a description of what make should do and under what circumstances. For example the compilation `Makefile' included with the MCSim distribution only compile a C-file if it has changed since the last compilation. This means that when you change your model and create a new `model.c' file using mod, only the `model.c' file needs to be compiled to recreate the simulation engine. Before you run make for the first time on a machine you must change some settings in the makefile to specify where your C compiler is on your file system, and some special settings for that compiler. Refer to the documentation (or manual pages in Unix) for your compiler to do this. In the makefile file there are several variables defined which you may need to change. They are described in the makefile itself. You run the make program by entering make or make -f Makefile at the prompt from the directory where the program is that you want to compile. Examples You will find here some examples of model description files and simulation iput files. `linear.model' # Linear Model with a random component # y = A + B * time + N(0,SD_true) # Setting SD_true to zero gives the deterministic version #--------------------------------------------------------- # Outputs Outputs = {y}; # Model Parameters A = 0; B = 1; SD_true = 0; SD_esti = 0; CalcOutputs { y = A + B * t + NormalRandom(0,SD_true); } `1cpt.model': A sample model description file # One Compartment Model # First order input and output #--------------------------------------------------------- # Inputs Inputs = {Dose}; # Outputs Outputs = {C_central, AUC, ln_C_central, ln_AUC, SD_C_computed, SD_A_computed}; # Model Parameters ka = 1; ke = 0.5; F = 1; V = 2; # Statistical Parameters SDb_ka = 0; SDw_ka = 0; SDb_ke = 0; SDw_ke = 0; SDb_V = 0; min_F = 0; max_F = 0; SD_C_central = 0; SD_AUC = 0; CV_C_cen = 0; CV_AUC = 0; CV_C_cen_true = 0; CV_AUC_true = 0; # Calculate Outputs CalcOutputs { # algebraic equation for C_central C_central = (ka != ke ? (exp(-ke * t) - exp(-ka * t)) * F * ka * Dose / (V * (ka - ke))): exp(-ka * t) * ka * t * F * Dose / V); # algebraic equation for AUC AUC = (ka != ke ? ((1 - exp(-ke * t)) / ke - (1 - exp(-ka * t)) / ka) * F * ka * Dose / (V * (ka - ke))): F * Dose * (1 - (1 + ka * t) * exp(-ka * t)) / (V * ke)); C_central = C_central + NormalRandom(0, C_central * CV_C_cen_true); AUC = AUC + NormalRandom(0, AUC * CV_AUC_true); ln_C_central = (C_central > 0 ? log (C_central) : -100); ln_AUC = (AUC > 0 ? log (AUC) : -100); SD_C_computed = (C_central > 0 ? C_central * CV_C_cen : 1e-10); SD_A_computed = (AUC > 0 ? AUC * CV_AUC : 1e-10); } # End of output calculations `perc.model': A sample model description file #--------------------------------------------------------- # perc.model # A four compartment model of Tetrachloroethylene (PERC) # and total metabolites. # Copyright (c) 1993. Don Maszle, Frederic Bois. All rights reserved. #--------------------------------------------------------- # States are quantities of PERC and metabolite formed, they can be output States = {Q_fat, # Quantity of PERC in the fat Q_wp, # ... in the well-perfused compartment Q_pp, # ... in the poorly-perfused compartment Q_liv, # ... in the liver Q_exh, # ... exhaled Qmet}; # Quantity of metabolite formed # Extra outputs are concentrations at various points Outputs = {C_liv, # mg/l in the liver C_alv, # ... in the alveolar air C_exh, # ... in the exhaled air C_ven, # ... in the venous blood Pct_metabolized, # % of the dose metabolized C_exh_ug}; # ug/l in the exhaled air Inputs = {C_inh} # Concentration inhaled # Constants # Conversions from/to ppm: 72 ppm = .488 mg/l PPM_per_mg_per_l = 72.0 / 0.488; mg_per_l_per_PPM = 1/PPM_per_mg_per_l; #--------------------------------------------------------- # Nominal values for parameters # Units: # Volumes: liter # Vmax: mg / minute # Weights: kg # Km: mg / minute # Time: minute # Flows: liter / minute #--------------------------------------------------------- InhMag = 0.0; Period = 0.0; Exposure = 0.0; C_inh = PerDose (InhMag, Period, 0.0, Exposure); LeanBodyWt = 55; # lean body weight # Percent mass of tissues with ranges shown Pct_M_fat = .16; # % total body mass Pct_LM_liv = .03; # liver, % of lean mass Pct_LM_wp = .17; # well perfused tissue, % of lean mass Pct_LM_pp = .70; # poorly perfused tissue, will be recomputed in scale # Percent blood flows to tissues Pct_Flow_fat = .09; Pct_Flow_liv = .34; Pct_Flow_wp = .50; # will be recomputed in scale Pct_Flow_pp = .07; # Tissue/blood partition coeficients PC_fat = 144; PC_liv = 4.6; PC_wp = 8.7; PC_pp = 1.4; PC_art = 12.0; Flow_pul = 8.0; # Pulmonary ventilation rate (minute volume) Vent_Perf = 1.14; # ventilation over perfusion ratio sc_Vmax = .0026; # scaling coeficient of body weight for Vmax Km = 1.0; # The following parameters are calculated from the above values in # the Scale section before the start of each simulation. # They are left uninitialized here. BodyWt = 0; V_fat = 0; # Actual volume of tissues V_liv = 0; V_wp = 0; V_pp = 0; Flow_fat = 0; # Actual blood flows through tissues Flow_liv = 0; Flow_wp = 0; Flow_pp = 0; Flow_tot = 0; # Total blood flow Flow_alv = 0; # Alveolar ventilation rate Vmax = 0; # kg/minute #--------------------------------------------------------- # Dynamics # Define the dynamics of the simulation. This section is # calculated with each integration step. It includes # specification of differential equations. #--------------------------------------------------------- Dynamics { # Venous blood concentrations at the organ exit Cout_fat = Q_fat / (V_fat * PC_fat); Cout_wp = Q_wp / (V_wp * PC_wp); Cout_pp = Q_pp / (V_pp * PC_pp); Cout_liv = Q_liv / (V_liv * PC_liv); # Sum of Flow * Concentration for all compartments dQ_ven = Flow_fat * Cout_fat + Flow_wp * Cout_wp + Flow_pp * Cout_pp + Flow_liv * Cout_liv; # Venous blood concentration C_ven = dQ_ven / Flow_tot; # Arterial blood concentration # Convert input given in ppm to mg/l to match other units C_art = (Flow_alv * C_inh / PPM_per_mg_per_l + dQ_ven) / (Flow_tot + Flow_alv / PC_art); # Alveolar air concentration C_alv = C_art / PC_art; # Exhaled air concentration C_exh = 0.7 * C_alv + 0.3 * C_inh / PPM_per_mg_per_l; # Differentials dt (Q_exh) = Flow_alv * C_alv; dt (Q_fat) = Flow_fat * (C_art - Cout_fat); dt (Q_wp) = Flow_wp * (C_art - Cout_wp); dt (Q_pp) = Flow_pp * (C_art - Cout_pp); # Quantity metabolized in liver dQmet_liv = Vmax * Q_liv / (Km + Q_liv); dt (Q_liv) = Flow_liv * (C_art - Cout_liv) - dQmet_liv; # Metabolite formation dt (Qmet) = dQmet_liv; } # End of Dynamics #--------------------------------------------------------- # Scale # Scale certain model parameters and resolve dependencies # between parameters. Generally the scaling involves a # change of units, or conversion from percentage to actual # units. #--------------------------------------------------------- Scale { # Volumes scaled to actual volumes BodyWt = LeanBodyWt/(1 - Pct_M_fat); V_fat = Pct_M_fat * BodyWt/0.92; # density of fat = 0.92 g/ml V_liv = Pct_LM_liv * LeanBodyWt; V_wp = Pct_LM_wp * LeanBodyWt; V_pp = 0.9 * LeanBodyWt - V_liv - V_wp; # 10% bones # Calculate Flow_alv from total pulmonary flow Flow_alv = Flow_pul * 0.7; # Calculate total blood flow from the alveolar ventilation rate and # the V/P ratio. Flow_tot = Flow_alv / Vent_Perf; # Calculate actual blood flows from total flow and percent flows Flow_fat = Pct_Flow_fat * Flow_tot; Flow_liv = Pct_Flow_liv * Flow_tot; Flow_pp = Pct_Flow_pp * Flow_tot; Flow_wp = Flow_tot - Flow_fat - Flow_liv - Flow_pp; # Vmax (mass/time) for Michaelis-Menten metabolism is scaled # by multiplication of bdw^0.7 Vmax = sc_Vmax * exp (0.7 * log (LeanBodyWt)); } # End of model scaling #--------------------------------------------------------- # CalcOutputs # The following outputs are only calculated just before values # are saved. They are not calculated with each integration step. #--------------------------------------------------------- CalcOutputs { # Fraction of TCE metabolized per day Pct_metabolized = (InhMag ? Qmet / (1440 * Flow_alv * InhMag * mg_per_l_per_PPM) : 0); C_exh_ug = C_exh * 1000; # milli to micrograms } # End of output calculation `perc.lsodes.in' #--------------------------------------------------------- # perc.lsodes.in # # Copyright (c) 1993. Don Maszle, Frederic Bois. All rights reserved. # #--------------------------------------------------------- SimType (DefaultSim); Integrate (Lsodes, 1e-4, 1e-6, 1); #--------------------------------------------------------- # The following experiment is for a simulation of one of Dr. Monster's # exposure experiments described in "Kinetics of Tetracholoroethylene # in Volunteers; Influence of Exposure Concentration and Work Load," # A.C. Monster, G. Boersma, and H. Steenweg, # Int. Arch. Occup. Environ. Health, v42, 1989, pp303-309 # # The paper documents measurements of levels of TCE in blood and # exhaled air for a group of 6 subjects exposed to # different concentrations of PERC in air. # # Inhalation is specified as a dose of magnitude InhMag for the # given Exposure time. # # Inhalation is given in ppm #--------------------------------------------------------- Experiment { InhMag = 72; # ppm Period = 1e10; # Only one dose Exposure = 240; # 4 hour exposure # measurements before end of exposure and at [5' 30'] 2hr 18 42 67 91 139 163 Print (C_exh_ug, 239.9 245 270 360 1320 2760 4260 5700 8580 10020 ); Print (C_ven, 239.9 360 1320 2760 4260 5700 8580 10020 ); } END. Concept Index ' * '#' sign * ';' sign a * Analyzing results * Assignment b * Beta distribution * Bibliographic references * Binomial distribution * Blank lines c * CalcOutputs, output specification * CDFNormal() function * Chi2 distribution * Colon conditional assignment * Comments * Common pitfalls * Comparison operators * Conditional assignment * Cumulative density function, Normal d * Data() specification * DefaultSim * Defining models * Density function, Normal * Derivative specification * Distrib() specification * Distribution, beta * Distribution, binomial * Distribution, chi2 * Distribution, exponential * Distribution, gamma * Distribution, inverse-gamma * Distribution, lognormal * Distribution, lognormal_v * Distribution, loguniform * Distribution, normal * Distribution, normal_v * Distribution, piecewise * Distribution, Poisson * Distribution, triangular * Distribution, truncated lognormal * Distribution, truncated normal * Distribution, trunclognormal_v * Distribution, truncnormal_v * Distribution, uniform * Dt() operator * Dynamics specifications e * Erfc() function * Error function * Error handling * Euler integrator * Examples, Examples * Experiment definition * Exponential distribution f * Function, CDFNormal() * Function, erfc() * Function, lnDFNormal() * Function, lnGamma() * Function, NDoses * Function, NormalRandom() * Function, PerDose() * Function, PerExp() * Function, Spikes() * Function, UniformRandom() * Functions, input * Functions, special g * Gamma distribution * Gamma function * Global specifications i * Input functions * Input variables * Installation * Integrate() specification * Integration routine, Euler * Integration routine, Lsodes * Integration variable * InvGamma distribution l * Level definition * License * LnDFNormal() function * LnGamma() function * Lognormal distribution * LogNormal_v distribution * LogUniform distribution * Lsodes integrator m * Major changes in versions 4.2.0 * Make * Markov-chain Monte Carlo simulations, Markov-chain Monte Carlo simulations * MCMC simulations * MCMC() specification * Mod syntax * Mod usage * Model definition files * Models * Monte Carlo * MonteCarlo() specification n * NDoses() function * Normal cumulative density function * Normal density function * Normal distribution * Normal_v distribution * NormalRandom() function o * Output specification * Output variables * OutputFile() specification * Overview p * Parameter declaration * Parameter scaling * PerDose() function * PerExp() function * Piecewise distribution * Pitfalls * Poisson distribution * Print() specification * PrintStep() specification r * Random number, normal * Random number, uniform s * Scale, scaling specification * Semi-colon * SetPoint() specification * SetPoints simulations * SimType() specification * Simulation definition files * Simulation file, syntax * Software license * Special functions * Specification, Data() * Specification, Distrib() * Specification, Integrate() * Specification, MCMC() * Specification, MonteCarlo() * Specification, OutputFile() * Specification, Print() * Specification, PrintStep() * Specification, SetPoint() * Specification, SimType() * Specification, StartTime() * Specification, statistical model * Specifying simulations * Spikes() function * StartTime() specification * State variables * Statistical model specification * Style * Syntax for mod * Syntax of simulation files t * Triangular distribution * TruncLogNormal distribution * TruncLogNormal_v distribution * TruncNormal distribution * TruncNormal_v distribution u * Uniform distribution * UniformRandom() function * Unix make utility w * Working Through an Example ---------------------------------------------------------------------------- This document was generated on 15 November 1997 using the texi2html translator version 1.51.