System evaluation based on past performance: Random Signals Test

Hypothesis Test

We evaluate a system by asking whether its high past performance could have been achieved by random trading with reasonably high probability. Formally, this question is a hypothesis test. The null hypothesis is that the system in question is making random trades.

If this hypothesis cannot be rejected, the system should not be traded. The alternative hypothesis is that, since the system’s performance is so high, the trades it makes are not random. A system should be traded only if the null is rejected in favor of the alternative.

To perform this test, we must know the probability distribution of a performance measure under the null hypothesis of random trading; call it the performance distribution for short. From this performance distribution and a chosen significance level, we calculate a critical value. If c is this critical value, the hypothesis test is:

· H0: System is bad. System’s performance is indistinguishable from the performance of random trading. Performance(System) ≤ c.

· HA: System is good. System’s performance is better than the performance of random trading. Performance(System) > c.
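
As a minimal sketch of this decision rule, assume the performance distribution is already available as a list of simulated performance values and that c is taken as its empirical (1 - alpha) quantile. The function names and the default significance level alpha = 0.05 are illustrative assumptions, not part of the test as stated above.

```python
def critical_value(null_performances, alpha=0.05):
    """Empirical (1 - alpha) quantile of performance draws under random trading."""
    ordered = sorted(null_performances)
    # Position of the quantile, clamped to the last available draw.
    k = min(len(ordered) - 1, int((1 - alpha) * len(ordered)))
    return ordered[k]


def reject_null(system_performance, null_performances, alpha=0.05):
    """True if the system beats the critical value, i.e. H0 is rejected."""
    return system_performance > critical_value(null_performances, alpha)
```

With this rule, a system whose performance does not exceed c is treated as indistinguishable from random trading.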

Distribution of Performance Under the Null Hypothesis

Define the random system as a series of random trades on the same price series that is used to calculate the performance of the system being tested. The performance from one run of the random system is one draw from the distribution under the null hypothesis of random trading. Making an arbitrarily large number of such draws reconstructs the performance distribution.
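
The simulation can be sketched as follows. This is an illustration under several assumptions: each trade is a tuple (contracts, cost, duration), trades are laid end to end on the price series, and performance is measured as net profit, which is only one possible performance measure. The callback `draw_trade` is hypothetical; it stands for whatever sampler produces one random trade and must return a duration of at least one bar.

```python
import random


def run_pnl(prices, trades):
    """Net profit of (contracts, cost, duration) trades laid end to end on prices."""
    t, pnl = 0, 0.0
    last = len(prices) - 1
    for contracts, cost, duration in trades:
        end = min(t + duration, last)  # the final trade is cut off at the end of the data
        pnl += contracts * (prices[end] - prices[t]) - cost
        t = end
        if t == last:
            break
    return pnl


def null_distribution(prices, draw_trade, n_runs=1000, seed=0):
    """Performance of n_runs random systems, i.e. draws under the null hypothesis."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_runs):
        trades, used = [], 0
        # Keep drawing trades until they cover the whole price series.
        while used < len(prices) - 1:
            trade = draw_trade(rng)  # assumed to return duration >= 1
            trades.append(trade)
            used += trade[2]
        draws.append(run_pnl(prices, trades))
    return draws
```

Increasing `n_runs` gives a finer approximation of the performance distribution at the cost of more computation.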

Distribution of random system’s trades

The trades issued by the random system are randomly picked from some distribution. This distribution should closely match the distribution of trades issued by the system being tested, so that the random system and the system being tested cannot be distinguished solely by the types of trades they issue; performance is the only potential distinguishing factor.

All trades are defined by three trade characteristics, namely

1. Number of contracts (for example, -2 for short two contracts);

2. Transaction cost (including commission, slippage, etc.); and

3. Trade duration.

This paper assumes that the three trade characteristics are independent of each other, with one exception: if a trade is of one type (short, flat, or long), the next trade must be of a different type. A trader can model trade characteristics with other dependencies as well. For example, in another model, the average duration of flat trades could differ from that of long or short trades.
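
One way to enforce the constraint on consecutive trade types is sketched below. It assumes that the trade type is the sign of the number of contracts and that, after each trade, the probabilities of the remaining contract counts are renormalized; the renormalization is an assumption of this sketch, as are the function names and the example probabilities.

```python
import random


def trade_type(contracts):
    """-1 for short, 0 for flat, +1 for long."""
    return (contracts > 0) - (contracts < 0)


def draw_positions(contract_probs, n_trades, rng):
    """Draw contract counts so that consecutive trades never share a type."""
    positions, previous = [], None
    for _ in range(n_trades):
        # Keep only contract counts whose type differs from the previous trade.
        allowed = [c for c in contract_probs if trade_type(c) != previous]
        weights = [contract_probs[c] for c in allowed]
        contracts = rng.choices(allowed, weights=weights)[0]
        positions.append(contracts)
        previous = trade_type(contracts)
    return positions


rng = random.Random(42)
probs = {-2: 0.2, -1: 0.1, 0: 0.3, 1: 0.4}  # illustrative probabilities
positions = draw_positions(probs, 10, rng)
```

The sketch requires at least two trade types with positive probability; otherwise no valid next trade exists.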

Estimating distribution of trade characteristics of the system being tested

Estimate the distribution for number of contracts simply from observed probabilities. For example, if there are 40 trades in all and 15 of them are long one contract, the probability of being long one contract is 15/40=37.5%.
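
The calculation above amounts to counting. In this short sketch the list of observed trades is made up apart from the 15 long-one-contract trades out of 40, and the function name is illustrative.

```python
from collections import Counter


def empirical_probabilities(observed_contracts):
    """Observed frequency of each contract count."""
    counts = Counter(observed_contracts)
    n = len(observed_contracts)
    return {contracts: count / n for contracts, count in counts.items()}


# 40 trades, 15 of them long one contract (the remaining split is made up).
observed = [1] * 15 + [-1] * 10 + [0] * 15
probs = empirical_probabilities(observed)
print(probs[1])  # 0.375
```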

Assume that transaction cost has a normal distribution and estimate its mean and standard deviation. Since trade duration can only be positive, assume that it has a truncated normal distribution. Also, draws from this truncated normal have to be rounded to the nearest integer.
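
A rough sketch of these two steps follows. It makes simplifying assumptions: the sample mean and standard deviation are used directly as the parameters of the normal, which is only approximate for a truncated distribution, and truncation is implemented by rejecting any draw that rounds to less than one period. The sample data and function names are hypothetical.

```python
import random
import statistics


def fit_normal(sample):
    """Sample mean and standard deviation, used as the normal's parameters."""
    return statistics.mean(sample), statistics.stdev(sample)


def draw_duration(mu, sigma, rng, minimum=1):
    """Rounded draw from a normal, redrawn until it is at least `minimum`."""
    while True:
        duration = round(rng.gauss(mu, sigma))
        if duration >= minimum:
            return duration


rng = random.Random(1)
cost_mu, cost_sigma = fit_normal([12.0, 15.5, 11.0, 14.0])  # made-up costs
dur_mu, dur_sigma = fit_normal([3, 7, 5, 10, 4])            # made-up durations
cost = rng.gauss(cost_mu, cost_sigma)
duration = draw_duration(dur_mu, dur_sigma, rng)
```

Rejection sampling is slow when most of the normal's mass lies below the minimum, so it suits cases where the mean duration is comfortably positive.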

This approach for estimating the distribution of trade characteristics only works if the system being tested makes enough trades. As a rule of thumb, only apply this approach if the number of trades is at least 30. Alternatively, we can estimate the distribution of trade characteristics using Bayesian estimation. In this approach, the empirical probabilities are combined with prior beliefs about the system. The approach is more involved but is particularly suited for situations in which the system being tested does not make many trades.
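
The text does not specify the Bayesian model. As one possible illustration for the number of contracts, the sketch below assumes a Dirichlet prior expressed as pseudo-counts and returns the posterior mean probabilities; the prior values and the function name are hypothetical.

```python
def posterior_probabilities(counts, prior_pseudocounts):
    """Posterior mean of category probabilities under a Dirichlet prior (assumed model)."""
    keys = set(counts) | set(prior_pseudocounts)
    total = sum(counts.values()) + sum(prior_pseudocounts.values())
    return {
        k: (counts.get(k, 0) + prior_pseudocounts.get(k, 0)) / total
        for k in keys
    }


# Only 4 observed trades; the prior says each type is equally plausible.
observed_counts = {1: 3, -1: 1}
prior = {-1: 1, 0: 1, 1: 1}
probs = posterior_probabilities(observed_counts, prior)
```

With few trades the prior dominates; as the number of trades grows, the estimate approaches the observed probabilities.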

Prof. Alex Strashny
