This file describes the program "nvc", which stands for "Noise vs. Chaos."

It implements a statistical test designed to distinguish a time series
produced by a low-dimensional chaotic process from one produced by
noise, possibly with a non-white power spectrum.

The method is described in the paper:

Matthew Kennel and Steven Isabelle, "Method to distinguish
possible chaos from colored noise and determine embedding parameters",
to be published in Physical Review A,  September 15, 1992.

The text of the paper can be downloaded from the anonymous FTP archive
on lyapunov.ucsd.edu.  I strongly suggest you read the paper before trying
to use this program; otherwise you'll have no idea what's going on.


The program takes as input a scalar data set and produces as output a
computed "Z statistic" that compares the nonlinear predictability, in
reconstructed phase space, of the input data set against that of matched
noise signals.  It computes the statistic for a range of embedding
dimensions and time delays.
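
To give a concrete picture of the "nonlinear predictability" being compared,
here is a minimal sketch, in C, of a single-nearest-neighbor predictor in a
delay-reconstructed phase space.  This is only my own illustration of the
general idea, not code taken from nvc: the function name, the Euclidean
metric, the use of exactly one neighbor, and the squared-error summary are
all illustrative assumptions.  nvc itself compares the prediction errors of
the real data against those of the fake data sets with a rank statistic;
see the paper for the actual procedure.

/*
 * Illustrative sketch only: not the nvc source.  Given a scalar series
 * x[0..n-1], embedding dimension d, time delay tau, prediction horizon T,
 * and a decorrelation (exclusion) window w, return the mean squared error
 * of a single-nearest-neighbor predictor in the reconstructed phase space.
 */
#include <math.h>
#include <stddef.h>

double nn_prediction_error(const double *x, size_t n,
                           int d, int tau, int T, int w)
{
    size_t first = (size_t)(d - 1) * tau;   /* earliest usable index */
    double sum = 0.0;
    size_t count = 0;

    for (size_t i = first; i + T < n; i++) {
        double best = HUGE_VAL;
        size_t best_j = 0;
        int found = 0;

        /* find the nearest neighbor of the delay vector at i, excluding
           points closer in time than the decorrelation window */
        for (size_t j = first; j + T < n; j++) {
            if ((i > j ? i - j : j - i) <= (size_t)w)
                continue;
            double dist = 0.0;
            for (int k = 0; k < d; k++) {
                double diff = x[i - (size_t)k * tau] - x[j - (size_t)k * tau];
                dist += diff * diff;
            }
            if (dist < best) {
                best = dist;
                best_j = j;
                found = 1;
            }
        }
        if (!found)
            continue;

        /* predict x[i+T] by the neighbor's future value x[best_j+T] */
        double err = x[i + T] - x[best_j + T];
        sum += err * err;
        count++;
    }
    return count ? sum / (double)count : 0.0;
}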


The program itself is written in 1/2 Pascal, 1/2 Modula-2, 1/2 something else.
I translated it to C using the freely available 'p2c' program.

I provide the C files so that you can compile it without p2c.

Command line parameters: (all required)

nvc -N number_of_points, must be a power of two.

       This is the total number of points to read from the time series.

    -delay tmin tmax dt

       The range of time delays (from "tmin" to "tmax" in steps of "dt").

     -pred prediction_time

       This is how far ahead the predictions are computed.  Typical values
       are on the order of the time delay or less.

     -dim dmin dmax

       The range of embedding dimensions to loop over.

     -correl decorrelation_time

       Don't accept as "neighbors" points whose time indices differ from the
       reference point's by less than this number.  This number should be the
       approximate autocorrelation time of the original data set.

     -nfake number_of_fake_datasets

       The number of fake data sets to synthesize for computing the
       statistic.  A higher number here results in more statistical power,
       but a longer running time.

      -decimate decimation_interval

        The interval over which the prediction errors are considered to
        be "statistically independent".  I usually take the autocorrelation
        time of the predictor errors.  There's no obviously correct value
        for this, but it is typically significantly lower than the
        autocorrelation time of the data set itself.  See the paper text.

       -o outputfile

	 Write the results of the computation to this file.  The output lines
	 look like:
	 "time_delay Z-value(d=dmin) Z(d=dmin+1) ... Z(d=dmax)"

       -A ascii_datafile
       -B binary_datafile

	 These specify the name of the data file containing the time series
	 in question: plain ASCII text for -A, raw double-precision binary
	 for -B.  There should be no formatting here, only numbers.
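
As an illustration, a complete command line with all of the required
parameters might look like this (the file names and numerical values are
purely hypothetical; pick ones appropriate to your own data):

    nvc -N 4096 -delay 1 10 1 -pred 2 -dim 2 6 -correl 20 -nfake 20 \
        -decimate 5 -o results.out -A mydata.asc

This would read 4096 points from the ASCII file mydata.asc and, for each
time delay from 1 to 10, write one output line of Z values for embedding
dimensions 2 through 6 to results.out.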


Optional parameters:

	-phasesonly
	   Only randomize the Fourier phases when creating the fake data sets.
	   Try this if you're getting many failures of the K-S test when
	   synthesizing the fake data sets.  I haven't fully explored the
	   consequences of this.  (Not in the paper.)  A sketch of the general
	   phase-randomization idea appears after this list.

	-dumpdata
	   Dump out, in a two-column ASCII file named "DUMPDATA.dat", the
	   gaussianized input data and the synthesized data (real in the
	   first column, fake in the second).  Do this if you want to see
	   exactly what's going on.

	-dumperrs
	   Dump out the predictor errors in an ASCII file named
	   "DUMPERRS.dat".  Note that these have NOT had their absolute
	   value taken yet, which I do before computing the
	   Mann-Whitney-Wilcoxon statistic.  Again, for debugging purposes.
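
For the curious, the general recipe behind phase-randomized fake data sets
(keep the Fourier amplitudes of the series, draw new phases at random, and
transform back) can be sketched as follows.  This is only my own
illustration of the standard idea, not the code nvc uses: it relies on the
FFTW library (which nvc does not use) purely to keep the example short,
and it corresponds roughly to the -phasesonly variant; the default
procedure also involves the gaussianization step mentioned under -dumpdata
(see the paper for the details).

/*
 * Illustrative sketch only: not the nvc source.  Build one phase-randomized
 * surrogate of x[0..n-1] (n even): keep the Fourier amplitudes, draw the
 * phases uniformly at random, and transform back.  Uses the FFTW library
 * (fftw.org) purely for brevity; seed rand() elsewhere as desired.
 */
#include <stdlib.h>
#include <math.h>
#include <fftw3.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

void phase_randomized_surrogate(const double *x, int n, double *out)
{
    fftw_complex *spec = fftw_malloc(sizeof(fftw_complex) * (n / 2 + 1));
    double *tmp = fftw_malloc(sizeof(double) * n);
    fftw_plan fwd = fftw_plan_dft_r2c_1d(n, tmp, spec, FFTW_ESTIMATE);
    fftw_plan inv = fftw_plan_dft_c2r_1d(n, spec, out, FFTW_ESTIMATE);
    int k;

    for (k = 0; k < n; k++)
        tmp[k] = x[k];
    fftw_execute(fwd);

    /* randomize every phase except DC and Nyquist, keeping the amplitudes */
    for (k = 1; k < n / 2; k++) {
        double amp = sqrt(spec[k][0] * spec[k][0] + spec[k][1] * spec[k][1]);
        double phi = 2.0 * M_PI * rand() / (RAND_MAX + 1.0);
        spec[k][0] = amp * cos(phi);
        spec[k][1] = amp * sin(phi);
    }
    fftw_execute(inv);

    /* FFTW's c2r transform is unnormalized, so divide by n */
    for (k = 0; k < n; k++)
        out[k] /= (double)n;

    fftw_destroy_plan(fwd);
    fftw_destroy_plan(inv);
    fftw_free(spec);
    fftw_free(tmp);
}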
		

good luck,
Matt Kennel
mbk@inls1.ucsd.edu
8/15/92