The Data Sets

If you use these data sets, please cite:

Brown, K. M., Barrionuevo, G., Canty, A. J., De Paola, V., Hirsch, J. A., Jefferis, G. S. X. E., Lu, J., Snippe, M., Sugihara, I., & Ascoli, G. A. (2011). The DIADEM data sets: representative light microscopy images of neuronal morphology to advance automation of digital reconstructions. Neuroinformatics, 9(2-3), 143-157.

Note for DIADEM Final Round teams: Some Final Round data set file names have been modified, as requested during the DIADEM Final Round Conference. Specifically, the ‘Cerebellar Climbing Fibers’ and ‘Olfactory Projection Fibers’ image stacks and SWC files are now ordered by number instead of by letter (e.g., ‘CF_A’ used in the Final Round is now ‘CF_3’, whereas ‘CF_1’ and ‘CF_2’ used for the Training and Qualifier Rounds, respectively, remain ‘CF_1’ and ‘CF_2’). Other data sets also use letter-based ordering for Final Round data, but these were left unaltered because the letters are still useful in distinguishing them from other data subsets. Training and Qualifier Round data sets have not been modified.

This documentation provides general instructions on how to use the DIADEM data sets. Specific instructions for each of the 6 data sets can be found at:

1. Cerebellar Climbing Fibers

2. Hippocampal CA3 Interneuron

3. Neocortical Layer 1 Axons

4. Neuromuscular Projection Fibers

5. Olfactory Projection Fibers

6. Visual Cortical Layer 6 Neuron


General Download Instructions:

1. Download RAR files.

2. Extract the RAR files using the software of your choice. As an example, PeaZip (http://peazip.sourceforge.net/) is free and can be downloaded for Windows and Linux platforms.

3. Each RAR file contains both image stacks and corresponding “gold standard” digital reconstructions traced manually or semi-manually by the lab providers.

4. All images are in TIFF format.

5. All gold standard reconstructions are in the non-proprietary SWC format (see FAQ for explanation of SWC format) and are provided as standards against which to evaluate automated reconstruction quality.

6. As subjectivity is an inherent aspect of neuronal reconstruction, subjective reconstruction decisions may be present in the gold standards. These traces serve as gold standards not because they are known to be perfect but because they reach the limits of subjective accuracy.

7. Freely available ImageJ (http://rsbweb.nih.gov/ij/) or Neuromantic (http://www.rdg.ac.uk/neuromantic/) software can be downloaded to open image stacks. Neuromantic can also load SWC files to overlay on corresponding image stacks. Gold standard reconstructions will load in correct XYZ alignment with corresponding image stacks unless otherwise specified in the data set-specific READMEs (see neuromuscular, neocortical, and visual cortical data set READMEs in particular).

8. Neuromantic automatically converts image stacks to 8-bit. ImageJ will load image stacks at their original bit depth but may require changing its default memory restrictions (see instructions). Numerous plugins have been created for ImageJ, some of which may aid visualization and the reconstruction process (see http://rsb.info.nih.gov/ij/plugins).

9. The TIFF files carry no information about the physical size of the captured region of interest. Therefore, ImageJ and Neuromantic both use pixel units for the X and Y coordinates of loaded image stacks and the image sequence number within the stack for Z coordinates (top image Z = 0, next image Z = 1, etc.).

10. Gold standard reconstruction Z values also correspond to image sequence number (top image Z = 0, second image Z = 1, etc.) within their associated image stacks. X and Y values correspond to pixel coordinates.

11. Below the 'Experimental Procedures' section of each data set-specific README, the Z distance between consecutive images is given in pixel units. This value can be used to relate physical Z distances to X and Y values.

12. Neuromantic (as of version 1.7.5) can load multiple image stacks and individually change their coordinate positions in X, Y and Z. Coordinate values for each loaded stack can be entered in the 'Multistack GUI' window (the window opens automatically if an image stack is loaded using 'Load and Add Stack' from the 'File' menu, or it can be accessed from the 'Window' menu). Another change from Neuromantic v1.6.3 is in the way image stacks are loaded: only portions of large image stacks are now loaded, so that they will not crash the system. Consider starting to load a large stack and then canceling, which loads just a small portion of the images; the rest can be loaded by scrolling through the stack. This change may actually impair functionality when loading large image stacks because stack manipulation becomes slow, so keeping the previous version of Neuromantic (v1.6.3) available is recommended.

13. The gold standard reconstructions for all data sets except 'Subset 2' of the Neuromuscular Projection Fibers were originally traced or edited to contain only branch points that split into two daughter branches (i.e., bifurcations). Neuromuscular Projection Fibers 'Subset 2' contains both bifurcations and multifurcations. The DIADEM metric handles both bifurcations and multifurcations, but will score test reconstruction branch points accordingly. Therefore, if your algorithm does not handle multifurcations, you may wish to edit the Neuromuscular Projection Fibers 'Subset 2' gold standard reconstructions to have bifurcations only.

14. For any questions concerning the data sets not answered in the provided READMEs or on the DIADEM website FAQ, contact Kerry Brown (kbrownk at gmail.com).
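
Example (illustrative only): As an illustration of items 5, 9-11 and 13 above, the following Python sketch reads SWC text, rescales Z from image sequence number to pixel units, and lists branch points that have more than two daughter branches. It assumes the common seven-column SWC layout (id, type, x, y, z, radius, parent, with parent = -1 for a root node and '#' starting a comment line). The function names, the sample tree and the Z factor of 3.0 are hypothetical and are not part of the DIADEM distribution; the actual Z factor for a data set is the one listed in its README.

```python
from collections import Counter
from typing import NamedTuple


class SwcNode(NamedTuple):
    node_id: int
    node_type: int
    x: float       # pixel units
    y: float       # pixel units
    z: float       # image sequence number (top image = 0)
    radius: float
    parent_id: int  # -1 marks a root node


def parse_swc(text):
    """Parse SWC text into a list of SwcNode, skipping blank and '#' lines."""
    nodes = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        f = line.split()
        nodes.append(SwcNode(int(f[0]), int(f[1]), float(f[2]), float(f[3]),
                             float(f[4]), float(f[5]), int(f[6])))
    return nodes


def scale_z(nodes, z_pixels_per_image):
    """Convert Z from image sequence number to pixel units."""
    return [n._replace(z=n.z * z_pixels_per_image) for n in nodes]


def multifurcations(nodes):
    """Return the ids of nodes that have more than two children."""
    child_counts = Counter(n.parent_id for n in nodes if n.parent_id != -1)
    return sorted(nid for nid, count in child_counts.items() if count > 2)


# Hypothetical sample tree: node 2 is a bifurcation, node 3 a trifurcation.
SAMPLE_SWC = """\
# id type x y z radius parent
1 1 10.0 20.0 0.0 1.0 -1
2 3 12.0 22.0 1.0 1.0 1
3 3 14.0 24.0 2.0 1.0 2
4 3 15.0 23.0 2.0 1.0 2
5 3 16.0 26.0 3.0 1.0 3
6 3 17.0 27.0 3.0 1.0 3
7 3 18.0 28.0 4.0 1.0 3
"""

nodes = parse_swc(SAMPLE_SWC)
scaled = scale_z(nodes, 3.0)   # 3.0 is an assumed Z factor, not a DIADEM value
print(multifurcations(nodes))  # prints [3]
```

This check only detects multifurcations and does not edit them; it would also report a root node that has more than two children.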