Exposure-age data archiving performance experiment
A notable aspect of the U.S. National Science Foundation’s Antarctic Research program is that it has data-availability and archiving requirements not generally present in other programs. This stems in part from NSF’s responsibility under the Antarctic Treaty system, which obligates treaty nations to report on their scientific activities to other treaty nations. As currently implemented for most project PIs, the requirement is that PIs must create a pointer to any data sets generated during the project in an index maintained by NASA and known as the Antarctic Master Directory (AMD). My understanding of the policy is, furthermore, that not only must a pointer to the data be in the AMD by the end of the grant period, but all data collected in the course of the project must also be publicly available, presumably online, no later than two years after data collection. Although I think this is an excellent policy in principle, and I do my best to comply in letter and spirit, in reality it is not at all trivial to do this in a comprehensive way that makes the data useful and accessible to others. Part of the reason for putting the ICE-D:Antarctica database together was to make it a little easier for me to accomplish this.
Recently, while putting together the ICE-D database, which is intended to contain all known cosmogenic-nuclide exposure-age data for Antarctica, I decided to see exactly what cosmogenic-nuclide data were accessible through the AMD. The hope was that some of the large known inventory of unpublished cosmogenic-nuclide data from Antarctica would be indexed and archived there, which would make it easier to include in the ICE-D database.
This page shows the results of my experiment. Basically, I used the search utility available on the AMD front page to conduct a full-text search for “cosmogenic,” “exposure-age,” and similar terms that seemed relevant. I then attempted to navigate through the links in each AMD entry that the search returned, so as to actually locate the data described in the entry. At the time of writing, I found a total of 34 entries in the AMD whose descriptions indicated that they might point to useful exposure-age data. In 14 of these cases, I was easily able to follow links to obtain a data set that closely resembled what was described in the AMD entry. In an additional 10 cases, I was able to navigate to at least some data that comprised part of the data set described in the AMD entry. In some of these cases I was able to use additional knowledge (for example, I independently knew that the data were located on a different web site, or in a publication, that was not mentioned in the AMD entry) to obtain all the data or verify that they were publicly accessible. In the remaining 10 cases, I was not able to follow links to obtain anything that remotely resembled the data described in the entry; links were either dead or uninformative. Again, this page shows all the details.
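To put those counts in proportion, here is a minimal sketch that tallies the three outcomes and converts them to percentages of the 34 entries. The counts come from the paragraph above; the category labels and the helper name `percent` are my own shorthand for illustration, not anything defined by the AMD.

```python
# Outcome counts as reported in the text; labels are illustrative shorthand.
outcomes = {
    "full data set reachable via links": 14,
    "partial data reachable via links": 10,
    "nothing reachable (dead or uninformative links)": 10,
}

# Total number of AMD entries examined.
total = sum(outcomes.values())


def percent(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


# Percentage of entries falling into each outcome category.
summary = {label: percent(n, total) for label, n in outcomes.items()}

for label, pct in summary.items():
    print(f"{label}: {outcomes[label]} of {total} ({pct}%)")
```

By this tally, roughly two in five entries led directly to the complete data set, and a bit under a third led to nothing usable.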
I think this is interesting. Clearly, the requirement to index data generated in Antarctic research projects does some good: in the aggregate, I obtained a significant amount of information that does not otherwise appear in easily accessible publications. However, many of the AMD entries contain extremely sparse information that may minimally comply with the letter of the NSF indexing requirement, but falls far short of the spirit of the overall goal of public access to data.