Using Advanced Analyzers — choosing their Parameter Values

Cutting your coat according to your cloth ...

All of these rules of thumb assume a 1,000-point data set running on a 3.6 GHz Intel Pentium 4 with 1 GB of RAM (SPECfp2000 of roughly 1,900). If you do not have a fast machine, or you need an answer in minutes rather than hours, forget about using the advanced methods. Sorry! It's just the way it is.

(Go to http://www.spec.org/ and look up the floating-point benchmark score for your machine; divide 1,900 by that score and you will have the adjustment factor for timings relative to the test case.)
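
As a quick worked example, here is a minimal Python sketch of that calculation; the function name and the sample score of 950 are made up purely for illustration:

    # SPECfp2000 score of the reference machine used for the timings in this guide
    REFERENCE_SPECFP = 1900.0

    def timing_adjustment_factor(my_specfp):
        """Factor by which to multiply the quoted timings for your own machine."""
        return REFERENCE_SPECFP / my_specfp

    # e.g. a machine scoring 950 will take roughly twice as long as the test case
    print(timing_adjustment_factor(950.0))   # -> 2.0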

For all studies we suggest 5,000 runs as a MINIMUM, and at least 10,000 for serious work.

Ordinary random walks run in minutes.

  • BPN analyzer
    • Number of inputs ~ 50-80
    • Number of hidden units ~ 120-200
    • 3-layer network, i.e.:
      • 1 input layer
      • 1 hidden layer
      • 1 output layer

We would then describe this as, say, a 50-120-1 feedforward network.

Using a four-layer network (e.g., 50-120-30-1) MAY be better, but not by much, and it will take a lot longer to train. More than four layers is almost certainly a waste of time.
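
To make the 50-120-1 figure concrete, here is a minimal sketch in Python with NumPy of a three-layer feedforward pass of that shape; it is not the application's own code, and the tanh hidden layer with a linear output is just one common, assumed choice:

    import numpy as np

    n_inputs, n_hidden, n_outputs = 50, 120, 1   # a "50-120-1" network

    rng = np.random.default_rng(0)
    # two sets of connection weights: input -> hidden and hidden -> output
    W1 = rng.normal(scale=0.1, size=(n_inputs, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(scale=0.1, size=(n_hidden, n_outputs))
    b2 = np.zeros(n_outputs)

    def forward(x):
        """One pass through the 3-layer network: input -> hidden -> output."""
        hidden = np.tanh(x @ W1 + b1)   # hidden layer (nonlinear)
        return hidden @ W2 + b2         # output layer (linear)

    # feed in one input pattern of 50 values (random placeholders here)
    x = rng.normal(size=n_inputs)
    print(forward(x))                   # a single predicted value

A four-layer 50-120-30-1 network would insert a second hidden layer of 30 units between the 120-unit layer and the output; the real cost lies in training (backpropagation), not in this forward pass.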

These kinds of studies can be done in a couple of hours, say two to four.

To filter or not to filter?

Filtering can help with speed of training and with accuracy, i.e., the error measures of the trained network. But if you overdo it then, instead of uncovering the 'true signal', you will be introducing spurious order into the problem: you will be training your network to "see things which aren't really there". This is bad enough on its own, but if you subsequently overtrain the network as well, you will probably get some really crazy-looking results.

Experimentation will be necessary to find suitable filters, so if you are in a hurry and don't like "fiddling about" too much, restrict yourself to the MCT algorithm and RAW input data.

(From a researcher's point of view, the interest of the advanced filtering lies in seeing how wavelet transforms, a very exciting mathematical technique, can be used in stock market analysis; as far as I know this is still pretty much virgin territory.)
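
For a flavour of the idea, here is a sketch of wavelet-based smoothing using the third-party PyWavelets package. It is not the application's own filter and has nothing to do with the MCT algorithm; the 'db4' wavelet, the decomposition level and the soft threshold are arbitrary illustrative choices, and thresholding too aggressively is exactly the over-filtering warned about above:

    import numpy as np
    import pywt   # PyWavelets, a third-party wavelet library

    def wavelet_smooth(prices, wavelet="db4", level=3, threshold=0.5):
        """Crude wavelet denoising: decompose, shrink detail coefficients, rebuild."""
        coeffs = pywt.wavedec(prices, wavelet, level=level)
        # keep the coarse approximation, soft-threshold the finer detail coefficients
        coeffs[1:] = [pywt.threshold(c, threshold, mode="soft") for c in coeffs[1:]]
        return pywt.waverec(coeffs, wavelet)

    # e.g. a noisy synthetic random-walk price series
    rng = np.random.default_rng(1)
    prices = np.cumsum(rng.normal(size=1024))
    smoothed = wavelet_smooth(prices)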

  • RNN analyzer
    • Similar to BPN
  • SOM analyzer
  • BCOR analyzer

Data Resolution and Sampling

1,000 points is about:

  • 3 years of DAILY data
  • 3 months of HOURLY data, and so on...

So if you want to use, say, 5,500 data points, or only 700 (I would treat 500 as a minimum dataset size), then the above timings scale linearly with the number of points.
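
Putting this together with the machine-speed adjustment factor from earlier, a rough runtime estimate might look like the following sketch; the two-hour base figure is simply the low end of the "two to four hours" quoted for a BPN study above:

    REFERENCE_POINTS = 1000   # the 1,000-point benchmark data set
    BASE_HOURS = 2.0          # low end of "two to four hours" for a BPN study

    def estimated_hours(n_points, adjustment_factor=1.0):
        """Scale the quoted timing by data set size and by machine speed."""
        return BASE_HOURS * (n_points / REFERENCE_POINTS) * adjustment_factor

    print(estimated_hours(5500))        # 5,500 points on the reference machine
    print(estimated_hours(700, 2.0))    # 700 points on a machine half as fast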

How far back you go in time with your input data depends on how far forward you want to get with your predictions, and on how far into the past you consider the data to be relevant to the current state (note that, unlike physical systems, markets change their behaviour over time; there are no universal laws which govern them). How much data you can use for input is severely limited by your computational resources, as stated above. Obviously, selecting your input data, both its interval and its sampling, is of prime importance.

Data drawn at different sampling intervals will have different statistical properties, i.e., its apparent randomness can vary, and this is to your advantage. That is to say, using lower-resolution data (but not too low) may produce better results than using very high-resolution data, which tends to be very noisy indeed. This is interesting from the point of view of the small investor: he may feel disadvantaged at being unable to afford a real-time, non-delayed, high-frequency, second-by-second datafeed, but in fact having such a service would not necessarily do him any good anyway.
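
As a sketch of what moving to a lower resolution might look like in practice (synthetic NumPy data, and the "last value of each 24-hour block" rule is an assumption for illustration, not this application's resampling method):

    import numpy as np

    rng = np.random.default_rng(2)
    hourly = np.cumsum(rng.normal(size=24 * 90))   # ~3 months of synthetic hourly prices

    # crude downsampling: keep the last value of each 24-hour block as the "daily" price
    daily = hourly.reshape(-1, 24)[:, -1]
    print(len(hourly), "->", len(daily))           # 2160 -> 90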

In choosing a sampling interval suitable for our advanced techniques we can use the histogram for help: if you see 'fat tails' in the histogram of your data at a particular resolution, it means that there are significant correlations within the data at that sampling level, which in turn means that the advanced, and very processor-hungry, algorithms have a reasonable chance of working well.
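
A simple numerical stand-in for eyeballing the histogram is the excess kurtosis of the series' differences: values well above zero indicate fatter tails than a normal distribution would have. Here is a minimal sketch using SciPy; the interpretation threshold is a judgement call, not a rule from this guide:

    import numpy as np
    from scipy.stats import kurtosis

    def excess_kurtosis_of_returns(prices):
        """Fisher (excess) kurtosis of the price differences; values > 0 suggest fat tails."""
        returns = np.diff(prices)
        return kurtosis(returns, fisher=True)

    # a Gaussian random walk has thin-tailed returns, so this should be close to zero
    rng = np.random.default_rng(3)
    prices = np.cumsum(rng.normal(size=1000))
    print(excess_kurtosis_of_returns(prices))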