The two main steps here are:
(i) A cosine correction on the actual B field values. I racked my brains over this for ages. It turns out the correct correction at any point is
correction_factor = sin(arccos(r/r_sun))
where r is the heliocentric distance (measured from disc centre in the image plane, in the same units as the solar radius r_sun). Then B_real = B_measured / correction_factor.
(ii) Unwrap the disc into a 2D cylindrical equal-area map. This one annoyed me too, but it turns out to be pretty simple and can be called a 'Lambert projection' (according to the textbooks). However, a Google search for 'Lambert projection' produces lots of cartwheel-like maps with the north pole at the centre. Those look like the Lambert azimuthal projection, which is a different map from the cylindrical one used here.
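The two steps above can be sketched in a few lines of Python. This is only an illustration, not the code used for these images. The function names, the `min_factor` limb cut-off, and the assumption that solar north is up with the equator passing through disc centre (B0 = 0) are all choices made for the example.

```python
import math

def correction_factor(r, r_sun):
    """sin(arccos(r / r_sun)), which equals sqrt(1 - (r / r_sun)**2)."""
    return math.sin(math.acos(r / r_sun))

def correct_field(b_measured, r, r_sun, min_factor=0.1):
    """Return B_real = B_measured / correction_factor.

    min_factor is a hypothetical cut-off (not from the text): pixels so
    close to the limb that the factor drops below it return None.
    """
    factor = correction_factor(r, r_sun)
    if factor < min_factor:
        return None
    return b_measured / factor

def disc_to_cylindrical(x, y, r_sun):
    """Map a disc position (x, y), measured from disc centre, to
    (longitude in radians, sine of latitude).

    Assumes an orthographic view with solar north up and B0 = 0, so
    x = r_sun * cos(lat) * sin(lon) and y = r_sun * sin(lat).
    Plotting longitude against sin(latitude) gives an equal-area map.
    """
    if x * x + y * y >= r_sun * r_sun:
        raise ValueError("point is on or outside the limb")
    sin_lat = y / r_sun
    cos_lat = math.sqrt(1.0 - sin_lat ** 2)
    # Clamp to guard against tiny floating-point overshoot.
    sin_lon = max(-1.0, min(1.0, x / (r_sun * cos_lat)))
    return math.asin(sin_lon), sin_lat
```

Because the vertical axis is sin(latitude) rather than latitude, every pixel of the unwrapped map covers the same area on the solar surface, which is what makes the later size and flux sums straightforward.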


I also ignore pixels near the edge of the corrected image, because of 'ghost regions'. The active region at the bottom left (50,50) is actually a ghost of the region at (250,250).
(1) Select the pixel with the largest absolute field. This becomes the seed pixel. Extract a window around this point.
(2) Use a boundary extraction technique to define the active region. This uses a 50G (absolute field) threshold. The minimum number of contiguous pixels in a contour, and the maximum distance ('wander region') away from the seed, vary according to the size of the region. This allows bigger regions to extend further from the seed. The wander region is also twice as large in the X-axis as in the Y-axis. This is because I want to make sure I get both polarities of an active region, but I don't want to pull a separate active region into the area.


(3) From the initial seed, and with the boundary defined in (2), grow the active region. Basically any pixel contiguous to the seed is considered a 'candidate'. If any candidate has an absolute field greater than 50G, and appears inside a contour (this was the tricky bit), it becomes part of the seed. Then new candidates are defined, and the whole thing repeats until every pixel in the image is tested. There is a triple check in place here - the image is checked one row at a time, one column at a time, and against the binary thresholded image.
Each defined closed contour is given a different intensity value in the following image. In this case there are 24 defined contours making up this active region.
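Steps (1) and (3) can be illustrated with a simplified region-growing sketch. It is not the actual procedure described above: it leaves out the contour test and the triple check, and only applies the 50G threshold plus an optional rectangular wander limit. The function names, the 8-connected definition of 'contiguous', and the `wander_x`/`wander_y` parameters are assumptions made for the example.

```python
from collections import deque

def find_seed(image):
    """Return (row, col) of the pixel with the largest absolute field."""
    best = (0, 0)
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            if abs(value) > abs(image[best[0]][best[1]]):
                best = (r, c)
    return best

def grow_region(image, seed, threshold=50.0, wander_x=None, wander_y=None):
    """Grow a region from seed over 8-connected pixels with |B| > threshold.

    wander_x / wander_y (hypothetical parameters) cap how many columns /
    rows a pixel may lie from the seed; None means no limit.
    The seed itself is always included.
    """
    n_rows, n_cols = len(image), len(image[0])
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (nr, nc) in region:
                    continue
                if not (0 <= nr < n_rows and 0 <= nc < n_cols):
                    continue
                if wander_y is not None and abs(nr - seed[0]) > wander_y:
                    continue
                if wander_x is not None and abs(nc - seed[1]) > wander_x:
                    continue
                if abs(image[nr][nc]) > threshold:
                    region.add((nr, nc))
                    queue.append((nr, nc))
    return region

# Toy field values in gauss (made up for illustration).
image = [
    [0,  0,    0, 0,  0],
    [0, 60,  -80, 0,  0],
    [0, 70, -200, 0, 90],
    [0,  0,    0, 0,  0],
]
seed = find_seed(image)
region = grow_region(image, seed)
```

In the toy image the 90G pixel on the right is above threshold but not connected to the seed, so it is left for a later pass. Passing `wander_x` as twice `wander_y` would mimic the wider X-axis wander region described in step (2).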

Calculate statistics for this region. These are position (in the corrected map), size (in Mm^2) of the positive field part, the negative field part and total size, flux (in Mx) of the positive field part and of the negative field part, total flux, total absolute flux, and a mixing parameter (how many contours contain both positive and negative flux?).
| x_pn | y_pn | ar_psz | ar_nsz | ar_sz | ar_pfx | ar_nfx | ar_fx | ar_fxabs | mix | contours |
| px | px | Mm^2 | Mm^2 | Mm^2 | *10^15Mx | *10^15Mx | *10^15Mx | *10^15Mx | | |
| 262 | 587 | 7258 | 9010 | 16268 | 95345 | -123386 | -28041 | 218730 | 1 | 24 |
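The size and flux columns can be computed along the following lines. This is a sketch under stated assumptions: the field is in gauss, every pixel of the equal-area map covers the same area `pixel_area_mm2` (a hypothetical parameter), and flux is returned in plain Mx rather than the scaled units of the table. Position, mixing and contour count are left out because the text does not say exactly how they are defined.

```python
CM2_PER_MM2 = 1.0e16  # 1 Mm = 1e8 cm, so 1 Mm^2 = 1e16 cm^2

def region_stats(image, region, pixel_area_mm2):
    """Return sizes (Mm^2) and fluxes (Mx) for a set of (row, col) pixels.

    Keys reuse the column names of the table; 1 G over 1 cm^2 is 1 Mx.
    """
    pos = [image[r][c] for r, c in region if image[r][c] > 0]
    neg = [image[r][c] for r, c in region if image[r][c] < 0]
    scale = pixel_area_mm2 * CM2_PER_MM2  # pixel area in cm^2
    pos_flux = sum(pos) * scale
    neg_flux = sum(neg) * scale
    return {
        "ar_psz": len(pos) * pixel_area_mm2,
        "ar_nsz": len(neg) * pixel_area_mm2,
        "ar_sz": (len(pos) + len(neg)) * pixel_area_mm2,
        "ar_pfx": pos_flux,
        "ar_nfx": neg_flux,
        "ar_fx": pos_flux + neg_flux,
        "ar_fxabs": pos_flux - neg_flux,
    }

# Toy values, made up for illustration.
toy_image = [[100.0, -50.0], [0.0, 200.0]]
toy_region = [(0, 0), (0, 1), (1, 1)]
stats = region_stats(toy_image, toy_region, pixel_area_mm2=2.0)
```

As in the table, the total size is the sum of the positive and negative parts, and the total flux is the signed sum of the two polarities.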
Remove all the pixels belonging to this active region from the corrected image. Find the largest absolute field and repeat the whole process.

Keep repeating until certain selection criteria are no longer satisfied - no pixels are contoured or no pixels with abs(B)>50 remain.
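The outer loop can be sketched as follows. The region extractor is passed in as a function so the sketch stays independent of how a single region is defined. `extract_all_regions`, `extract_region` and the `max_regions` safety cap are hypothetical names and choices for this example, not part of the original procedure.

```python
def extract_all_regions(image, extract_region, threshold=50.0, max_regions=1000):
    """Repeatedly seed, extract and remove regions.

    extract_region(work_image, seed) must return a set of (row, col)
    pixels. The loop stops when no pixel has |B| > threshold, when the
    extractor returns nothing, or at max_regions (a safety cap).
    """
    work = [list(row) for row in image]  # work on a copy
    pixels = [(r, c) for r in range(len(work)) for c in range(len(work[0]))]
    regions = []
    while len(regions) < max_regions:
        # Seed: the pixel with the largest remaining absolute field.
        seed = max(pixels, key=lambda rc: abs(work[rc[0]][rc[1]]))
        if abs(work[seed[0]][seed[1]]) <= threshold:
            break  # no pixels with abs(B) > threshold remain
        region = extract_region(work, seed)
        if not region:
            break  # nothing was contoured
        regions.append(region)
        for r, c in region:
            work[r][c] = 0.0  # remove this region's pixels
    return regions

# Stand-in extractor for illustration: each region is just its seed pixel.
toy = [[0.0, 60.0], [-100.0, 10.0]]
found = extract_all_regions(toy, lambda img, seed: {seed})
```

With the stand-in extractor the toy image yields two one-pixel regions, strongest first, and the 10G pixel is never selected because it is below threshold.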
At this stage there will be a bunch of defined active regions. Each active region in the image below has a different intensity value.

Compare this to the original corrected magnetogram. It seems to have done a pretty good job at defining each active region individually. It has also picked out some plage regions, single polarity regions etc. In this next image the red symbols are the (numbered) areas picked out by the program. The white symbols are active regions as identified by NOAA. Every NOAA active region (excepting limb regions, as I remove these pixels) is identified well.

The stats for each region are below. These can be compared to the flare rates for each region (from the SEC lists). For this day the first region produced 7 C-class and 2 M-class flares. Not surprising - it is the biggest in terms of area, flux and number of contours. It has obvious mixing in one contour. The third region produced 2 C-class flares the previous day. I haven't even thought about the other stats which can be extracted here, but obvious ones include fractal dimension, gradient across a neutral line, multiscale index etc. All this can be repeated for every MDI image in the sample - it's just computer time.
Title: Automated Boundary Extraction and Region Growing Techniques applied to
Solar Magnetograms
Abstract:
We present a novel automated approach to active region extraction from
full disc MDI longitudinal magnetograms. This uses a region growing
technique combined with boundary extraction to identify a number of
enclosed contours belonging to any given active region. This provides an automated, daily record of active regions on the Sun and their physical parameters.