function openNewWindow(url) { window.open(url,"",'scrollbars=yes'); }

Multifractal Codes



Now the fractal paper is out of the way, the obvious step it to move to multifractal and multiscalar measures. The most common of these are the f(alpha) approach, (e.g., Lawrence papers), structure functions (e.g., Abramenko) and generalised correlation dimension (GCD; below). Specifically the GCD and f(alpha) are related by a Legendre transformation so codes should act as confirmation of each other. There are of course the complimentary multi-scalar approaches (wavelet-Russ, shapelet-Alex etc.). I've developed a generalised correlation dimension approach, keeping in mind the need for precision and accuracy whilst keeping computational time within reason. The first step is to generate a multifractal, so as to test the code. The following will generate this multifractal
calc_coords.pro
multifrac_synth.pro
split_image.pro
IDL>multifrac_synth, mf
IDL>tvim,mf,range=[0,100]
IDL>tvim,mf,range=[0,10]
IDL>tvim,mf,range=[0,1]





This is the multifractal suggested by the Lawrence et al. papers. The left image is scaled from 0 to 100, middle image is 0 to 10, right image is 0 to 1

This figure is generated by dividing a uniform density square into 4 equal areas and dividing the measure according to to probabilities M1 and M2 from a PDF of P(M)=4M, for M=0->0.5 and P(M)=4(1-M) for M=0.5->1.0. Then each sub-area (now each with its own uniform density) is split up according to the same rules. This is repeated down to the single pixel size. The division of the measure makes the result multifractal and the degree of multi-fractality can be calculated exactly (Lawrence et al. 1996). I could do with more test images (Jack pointed out Gaussian random variable is a good one to try).


Generalised Correlation Dimension

The standard generalised correlation dimensions are



where q is any real number (but is normally positive, so as to have a real meaning), r is the dimensionless box size (r=a/sqrt(XY), where a is the boxsize in pixels and the image is X by Y), p_i is the count in each box divided by the total count (hence is a probability) and the summation is over the N boxes (where N is sqrt(XY)/a). For the generated multifractal above the D_q against q spectrum looks like this






The D(0) is 2, as this is the Euclidean dimension of the supporting set, and the more/quicker that D(q) deviates from this value as q increases, the greater the multifractality of the image.

Algorithm Issues

This simple box counting algorithm is known to suffer from problems, so several corrections can be made.

(1) At large box sizes there are few available boxes (i.e., small N) and the result is highly dependent on the position of the grid (or to put it another way, it is highly dependent on the data window). But, instead of counting boxes in a fixed grid, the algorithm takes M random positioned boxes, sums over all these, and normalizes back down to the required number.

(2) Furthermore by repeating this entire operation V times its possible to put an estimate (spread) on the error. And as these values are Gaussian distributed about the real value, this can be passed to the IDL REGRESS program.

(3) At each q the D(q) is then just the result of a linear regression in log space. The problems here are
(A) what is the linear range to be plotted over (i.e. what is the maximum box size which should be included)
(B) getting a reliable goodness of fit test.

Both these can be addressed with by using the Q-test (called using IGAMMA in IDL) which depends on the number of free parameters (number of data points - 2) and the chi-squared value.




This is the result of the linear regression and Q-test over a range of *maximum* boxsizes (from log(10/512)=-1.709 to log(100/512)=-0.709) for q=5, M=100,000 and V=10 . The solid line is the chi-squared (left hand axis), the dotted line is the Q-test (from poor-0 to good-1), the dashed line is the calculated gradient (i.e. D(5)) and the thick solid line is the known value of D(5). As larger and larger boxes are included in the fit, the gradient becomes more precise (smaller error bars) but less accurate (too large). At the same time the chi-squared increases and the q-fit decreases. The important fact here is the sudden increase in chi-squared (and hence decrease in Q-test) occurs at the same maximum box size as where the calculated gradient equals the real value. Hence this test can be used to identify maximum box size and therefore the linear range. Hence the algorithm will know when to accept the fit and stop increasing the box size, decreasing the computational time.

Accuracy and Computational Issues

The biggest trouble is the number at the top of the previous figure - 6238 is the time in seconds (=103mins) to complete this calculation. The computation time is directly proportional to the number of randomly positioned boxes, M, and the number of repeated values, V. I've fixed the maximum boxsize, B, at 40 and ran a few tests. The next two figures are for M=10,000, V=10 (620 secs) and then M=10,000 V=5 (315 secs)





There are a few important things to note. At V=5 the calculated gradient never dips down to the known value. For V=10, the calculated gradient matches better with the known value, but the Q-test (dashed line) now only starts dropping at box sizes (and hence gradients) larger than wanted.




This plot shows the Dq versus q spectrum for M=100,000,V=10 (green), M=10,000, V=10 (orange), M=10,000, V=5 (red) and the calculated Dq spectrum (white line). (The D(0) is trivially 2, I still have a few small tweaks with the D(1) as it needs different coding). It's pretty obvious that to get both precision and accuracy means as large M and V as possible, hence a lot of CPU time.

I've got a couple of further tests

(1) Compare the results of M=100,000,V=10 against M=50,000,V=20 - these will both take the same time, but one may give better results?

RESULT:






Plot of the multifractal spectrum for
M=100,000,V=10 (green - each point takes 6200sec)
M=50,000,V=20 (red - each point takes 6200sec)
M=50,000,V=10 (orange - each point takes 3100sec)
As expected an increased M improves the accuracy, an increased V improves precision.

(2) See how high in q this can go.The Lawrence work gets to q=5 and stops. I think I can get higher. Also check interval in q (I'm simply using integer values of q but any value can be used) which gives acceptable results.

RESULT:






Plot of the multifractal spectrum for
M=50,000,V=10 (orange - each point takes 3100sec) for increasing steps of 0.1 in q
At low q, it is over estimated badly - This is because of the low M value, hence the q-fit test drops below 1 at a high maximum boxsize value - see the comparison of M=100,000, V=10 and then M=10,000 V=10 above. But at higher q, it fits pretty well with the theoretical spectrum, even up to q=6. I've got more tests running now.

(3) Find the minimum combination (i.e., minimum CPU time) of M, V, q and interval in q (I'm simply using integer values of q but any value can be used) which gives acceptable results.

Then we're ready for real data. Comments, suggestions anyone?


View Stats