Frequently asked questions
What is the Ecological Register?
It's a public, non-profit academic repository for ecological survey data. It's hosted at Macquarie University and operated by John Alroy (john.alroy@mq.edu.au). It has a MySQL backend database and web-database integration software written in Perl. It was opened to the web on 1 January 2013.
What does it include?
The database includes published scientific surveys that provide names of species and counts of individuals seen or collected in particular places. It does not include simple species occurrence records, which can be obtained from any number of other sites. There are no limits with respect to geographic regions or taxonomic groups, so you can find information about plants and animals from all sorts of habitats and from around the world.
What is it for?
Ecological count data (also called abundance distributions) are needed to calculate diversity estimates and descriptive statistics such as sampling-standardized richness, Shannon's H, Hurlbert's PIE, and Fisher's α. These statistics are used to understand how diversity relates to various factors such as climate, altitude, latitude, and disturbance (a central topic within the areas of biogeography and macroecology). Without quantifying these relationships we won't be able to predict the effects of global change on local ecological communities.
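As a rough illustration, all of these statistics can be computed from a single list of per-species counts. The Python sketch below is not the Register's own code (the site software is written in Perl), and the function names are hypothetical. It assumes the standard textbook definitions: Shannon's H with natural logarithms, the unbiased form of Hurlbert's PIE, Fisher's α solved numerically from S = α ln(1 + N/α), and classical individual-based rarefaction as one common kind of sampling-standardized richness.

```python
import math


def shannon_h(counts):
    """Shannon's H = -sum(p_i * ln p_i), using natural logarithms."""
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)


def hurlbert_pie(counts):
    """Hurlbert's PIE: chance that two individuals drawn without
    replacement belong to different species."""
    n = sum(counts)
    if n < 2:
        raise ValueError("need at least two individuals")
    return (n / (n - 1)) * (1 - sum((c / n) ** 2 for c in counts))


def fisher_alpha(counts):
    """Solve S = alpha * ln(1 + N / alpha) for alpha by bisection."""
    positive = [c for c in counts if c > 0]
    s, n = len(positive), sum(positive)
    if s == 0 or s >= n:
        raise ValueError("no finite solution unless 0 < S < N")

    def expected_species(alpha):
        return alpha * math.log(1 + n / alpha)

    # expected_species() increases with alpha, so bracket the root first.
    lo, hi = 1e-12, 1.0
    while expected_species(hi) < s:
        hi *= 2
    for _ in range(200):
        mid = (lo + hi) / 2
        if expected_species(mid) < s:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def rarefied_richness(counts, m):
    """Expected number of species in a random subsample of m individuals
    (classical individual-based rarefaction)."""
    n = sum(counts)
    if not 0 <= m <= n:
        raise ValueError("subsample size must be between 0 and N")
    total = math.comb(n, m)
    return sum(1 - math.comb(n - c, m) / total for c in counts if c > 0)


# Hypothetical sample: four species with ten individuals each.
sample = [10, 10, 10, 10]
print(shannon_h(sample), hurlbert_pie(sample))
print(fisher_alpha(sample), rarefied_richness(sample, 20))
```

Each function takes nothing but the counts, which is why abundance data are required: a plain list of species occurrences has no counts to plug in.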
What can I do with the site?
Right now, you can view references to the scientific publications that yielded the data, and you can view individual ecological samples. There is a map interface (see the links on the home page) and you can download customized subsets of the data. I aim to develop other tools such as a subsampling curve calculator and an abundance distribution fitter. Because body mass is a key focus of the macroecological literature, I am also adding data and tools concerning mass (and other measures of size) that are being integrated with the community-scale abundance data. So far, I have created a new bat body mass data set and uploaded the CRC bird data set. All the data will be provided to all interested academic organizations using web services when the website is in a more developed state.
How can I become involved?
You can ask me for a data entry account, which I'll set up right away, and then go straight to work. My e-mail address is john.alroy@mq.edu.au. Your data will be credited on every page, you will own copyright over your data (see below), and you can change or remove your data at any time.
Aren't there other databases like this one?
To my knowledge, there are no other databases that provide abundance data for all groups on a global scale (although PREDICTS is working towards a similar goal). Most of the nominally related databases focus on individual taxonomic groups or geographic regions, such as AmphibiaWeb, FishBase, or the Atlas of Living Australia. These databases do provide a broad array of information, but they emphasize point occurrence records and include little or no abundance data. The same thing goes for some important, global-scale mashup sites such as GBIF, the Map of Life, and the Encyclopedia of Life. There are a few data sets such as the Mammal Community Database and the Gentry forest surveys that do include abundances, but they are also quite focused. Finally, PREDICTS is focused on recently published data whereas the Register includes as much historical data as possible. Most of these projects are also very different because they (a) include only very simple metadata about the surveys even when counts are given, (b) have no websites with integrated database backends, (c) don't allow volunteer data contributions, and/or (d) aren't being updated on a regular basis.
What is the governance structure?
Many similar databases have a management structure that includes something like a Director plus an Executive Committee. The Ecological Register has no analogous governance structure because it's merely a public data repository. There are no finances to oversee, no group projects to coordinate, no legal issues to navigate, and no difficult policies to deliberate. That said, if enough people become involved I think it would be a good idea to establish something like a Board of Editors.
Who runs the database?
I created it and oversee it. I own the domain name and hold copyright of the software and website design elements, and the site is hosted at my academic institution (Macquarie University). If at any point I am unable to continue working on the site, I will hand it over to another technically competent ecologist who is committed to making the data public and respecting the rights of the contributors.
In what sense is the database public?
It's public because anyone can use the data, anyone can contribute, and most importantly because the contributors own the data. Any contributor can choose to alter or remove the data at any time, and any contributor can choose to make the data available in some other format and on some other platform. It's important to understand that owning data and running a database are unrelated things.
Can scientists actually own data?
Yes. According to well-established law in most major countries, you can assert copyright over data records if they represent novel creative work on your part. By contrast, a simple list of data records directly copied from some other source (such as a phone book) can't be copyrighted. In the case of the Ecological Register, the sample metadata are so complex and so structured that they do represent significant creative work. Therefore, the contributors own the data. One exception is that the U.S. Federal Government holds copyright over intellectual property created by its employees in the course of their routine work. However, the National Science Foundation merely licenses works created by its grant awardees and does not hold copyright.
What are the terms of use?
The data are made available under a standard Creative Commons BY-NC-ND (Attribution/Noncommercial/No Derivative Works) license. This means you can reproduce the data in any medium, but you must credit the data contributor and the Ecological Register; you can't use the data for commercial purposes; and you can't adapt or change the work. I also strongly urge you to tell me if you have used the data in a publication so I can maintain a list on the website (which I will create once I hear about any such papers!).
What are the advantages of relational databases?
The computer science community is highly interested in data mining topics such as ontologies, workflows, and so forth because there are huge amounts of unstructured but deeply interrelated data sets on the web. However, I believe that relational databases are more efficient and reliable for the purpose of conducting large-scale ecological data analyses. There's minimal IT overhead, and in the case of these data there are no overlapping data sets open to the internet that really need to be integrated. I think it's best to focus on getting the data into a well-designed, structured format from the start.
Aren't you also involved in Fossilworks and the Paleobiology Database?
I'm currently running Fossilworks, which is a portal to the PaleoDB that I created in 2013. I'm not involved in the PaleoDB's administration or IT operations. However, I founded the PaleoDB in 1998 (before its name change in 2000), created the website and software, managed the server and backend database, served as the main contact person, chaired its various governance committees, and contributed large amounts of data. I stepped back from all of these roles once I created the Fossilworks and Ecological Register sites.
How does the Ecological Register relate to Fossilworks?
The table structure, software, and website design are completely new and different. There is no exchange of data between the two sites and they are fully independent. However, there are some broad similarities in terms of the focus on published sample data.