Biodiversity Data Journal : Editorial / Correspondence
Corresponding author: John Stephen Wood ([email protected])
Academic editor: Quentin Groom
Received: 16 Mar 2015 | Accepted: 14 Apr 2015 | Published: 17 Apr 2015
© 2015 John Stephen Wood, Fabio Moretzsohn, James Gibeaut.
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation: Wood J, Moretzsohn F, Gibeaut J (2015) Extending Marine Species Distribution Maps Using Non-Traditional Sources. Biodiversity Data Journal 3: e4900. doi: 10.3897/BDJ.3.e4900
Background
Traditional sources of species occurrence data such as peer-reviewed journal articles and museum-curated collections are included in species databases after rigorous review by species experts and evaluators. The distribution maps created in this process are an important component of species survival evaluations, and are used to adapt, extend and sometimes contract polygons used in the distribution mapping process.
New Information
During an IUCN Red List Gulf of Mexico Fishes Assessment Workshop held at The Harte Research Institute for Gulf of Mexico Studies, a session included an open discussion on the topic of including other sources of species occurrence data. During the last decade, advances in portable electronic devices and applications have enabled 'citizen scientists' to record images, locations and data about species sightings, and to submit those data to larger species databases. These applications typically generate point data. Attendees of the workshop expressed an interest in how those data could be incorporated into existing datasets, how best to ascertain their quality and value, and what other alternative data sources are available. This paper addresses those issues, and provides recommendations to ensure quality data use.
Keywords
Species, Distribution, Crowdsource, IUCN, Red List, Protocol, Geographic Information Systems, GIS, Biodiversity Databases, Citizen Science
“How can we standardize the methods used to incorporate point data into distribution range polygons? How can we accelerate the collection of observation (point) data?” These questions were posed to an international group of taxonomists during a workshop held in conjunction with the IUCN Red List Gulf of Mexico Fishes Assessment Workshop at the Harte Research Institute for Gulf of Mexico Studies, on the campus of Texas A&M University-Corpus Christi, in January 2014.
Red List Workshop (Jan. 2014) Attendees
First Name | Last Name | Affiliation
Beth | Polidoro | IUCN/Arizona State University
Bruce | Collette | Smithsonian Institute/ chair of Tuna and Billfishes SSG
Christi | Linardich | IUCN/Old Dominion University
Fabio | Moretzsohn | Harte Research Institute
George | Sedberry | NOAA Office of National Marine Sanctuaries
Gina | Ralph | IUCN/Old Dominion University
Heather | Harwell | IUCN/Christopher Newport University
Hector | Espinosa-Perez | Instituto de Biología, UNAM, Mexico
Howard | Jelks | USGS Southeast Ecological Science Center
James | Tolan | Texas Parks and Wildlife Department, Coastal Fisheries Division
Jeff | Williams | Smithsonian Institute
Jim | Cowan* | Louisiana State University
John | McEachran | Texas A&M University
John | Wood | Harte Research Institute
Jorge | Brenner | The Nature Conservancy, Corpus Christi
Kathy | Goodin | NatureServe
Ken | Lindeman* | Florida Institute of Technology/Co-Chair of Snapper, Sea Bream, Grunt SSG
Kent | Carpenter | Old Dominion University/Manager IUCN Marine Biodiversity Unit
Kyle | Strongin | IUCN/Arizona State University
Labbish | Chao | Museum of Marine Biology & Aquarium, Taiwan/Sciaenidae SSG coordinator
Luiz | Rocha | California Academy of Sciences/ member of Groupers and Wrasses SSG
Luke | Tornabene | Texas A&M University
Maria | Vega Cendejas | CINVESTAV-IPN, Unidad Merida, Mexico
Mia | Comeros-Raynal | IUCN/Old Dominion University
Michelle | Zapp Sluis | Harte Research Institute
Riley | Pollom | Project Seahorse – University of British Columbia Fisheries Centre, Canada
Rodolfo | Claro | Instituto de Oceanología CITMA, La Habana, Cuba
Roger | McManus | IUCN, Arizona
Ross | Robertson | Smithsonian Tropical Research Institute, Panama
Tomas | Camarena Luhrs | National Commission of Natural Protected Areas–SEMARNAT, Mexico
The goal of this workshop was to discuss a methodology for community-based recording of observations of marine species. These data, if collected in a repeatable and consistent manner over a long period of time, will become a valuable reference for distribution mapping for marine species ranges (adapted from
The IUCN Red List of Threatened Species™ is essentially a checklist of taxa that have undergone an extinction risk assessment using the IUCN Red List Categories and Criteria, as shown in
Required Attributes for IUCN Distribution Shapefiles (
Field | ESRI Field Type | Description | Required for Crowdsource Data
ID_NO | Integer | Internal Record ID | Assigned by IUCN
BINOMIAL | String | Scientific name of the species | Recommended but not necessary
BASINID (for freshwater species only) | Integer | River Basin ID (Hydrosheds). (Note that this field is only included when species are mapped using the freshwater mapping protocol) |
PRESENCE | ShortInt | Is/Was the species in this area, codes listed below | Assigned by IUCN
ORIGIN | ShortInt | Why/ How the species is in this area, codes listed below | Assigned by IUCN
SEASONAL | ShortInt | What is the seasonal presence of the species in the area, codes listed below | Assigned by IUCN (by date/time stamp?)
COMPILER | String | Name of the individual/s or institution/s responsible for generating the polygon, if not IUCN. | Yes, with contact information (usually email address)
YEAR | ShortInt | Year in which the polygon was mapped, compiled, or modified | Date Field
CITATION | String | Individual/s or institution /s responsible for providing the data | Assigned by IUCN/app?
SOURCE | String | Source of distribution range given. | Yes (app name?)
DIST_COMM | String | Distribution comments that refer directly to the polygon. | Optional
ISLAND | String | Name of the island the polygon is on | Bay system or other geography?
SUBSPECIES | String | Epithet | Optional
SUBPOP | String | Epithet | Optional
TAX_COMM | String | Taxonomic comments that refer directly to the polygon. Includes notes on polygons pertaining to subspecies or subpopulations. | Assigned by IUCN
LEGEND | String | Code containing the combinations of the presence, origin and seasonality fields determining how the map will be displayed on The IUCN Red List website. | Assigned by IUCN
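As a minimal sketch of how the last column of the attribute table might be applied to an incoming app submission, the following Python check sorts a record's fields into those the submitter must supply, those that are recommended, and those the table reserves for IUCN. The function name `submission_gaps`, the dictionary layout, and the example contact address are hypothetical; CITATION and ISLAND are left out because the table marks their handling as an open question.

```python
# Field groupings read from the "Required for Crowdsource Data" column above.
# CITATION and ISLAND are omitted: the table marks them with open questions.
REQUIRED_FROM_SUBMITTER = ("COMPILER", "YEAR", "SOURCE")
RECOMMENDED_FROM_SUBMITTER = ("BINOMIAL",)
ASSIGNED_BY_IUCN = ("ID_NO", "PRESENCE", "ORIGIN", "SEASONAL", "TAX_COMM", "LEGEND")


def submission_gaps(record):
    """Report which attribute-table fields a crowd-sourced record lacks.

    `record` is a plain dict keyed by the field names of the attribute table
    (a hypothetical layout chosen for this illustration).
    """
    def is_blank(key):
        value = record.get(key)
        return value is None or str(value).strip() == ""

    return {
        "missing_required": [k for k in REQUIRED_FROM_SUBMITTER if is_blank(k)],
        "missing_recommended": [k for k in RECOMMENDED_FROM_SUBMITTER if is_blank(k)],
        # Fields the table reserves for IUCN should not arrive from an app.
        "not_submitter_fields": sorted(k for k in record if k in ASSIGNED_BY_IUCN),
    }


report = submission_gaps(
    {"COMPILER": "A. Diver <diver@example.org>", "YEAR": 2014, "SOURCE": "", "PRESENCE": 1}
)
print(report)
```

A record that fails the required check could be returned to the submitter, while one that only lacks a recommended field could be passed on for expert identification.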
Coded Domain Values for Presence (
Code | Presence
1 | Extant
2 | Probably Extant (discontinued)
3 | Possibly Extant
4 | Possibly Extinct
5 | Extinct (post 1500)
6 | Presence Uncertain
Description of Coded Values:
Extant – The species is known or thought very likely to occur presently in the area, which encompasses localities with current or recent (last 20-30 years) records where suitable habitat at appropriate altitudes remains. Extant ranges are included in the calculation of the extent of occurrence (EOO) and in maps of the historical distribution (See Note 5) of the species.
Probably Extant – This code value has been discontinued for reasons of ambiguity. It may exist in the spatial data but will gradually be phased out.
Possibly Extant – There is no record of the species in the area, but the species may possibly occur, based on the distribution of potentially suitable habitat at appropriate altitudes, although the area is beyond where the species is Extant (i.e., beyond the limits of known or likely records), and the degree of probability of the species occurring is lower (e.g., because the area is beyond a geographic barrier, or because the area represents a considerable extension beyond areas of known or probable occurrence). Identifying Possibly Extant areas is useful to flag areas where the taxon should be searched for. Possibly Extant ranges are not included in the calculation of EOO or in maps of the current and / or historical distribution of the taxon.
Possibly Extinct – The species was formerly known or thought very likely to occur in the area (post 1500 AD), but it is most likely now extirpated from the area because habitat loss and/or other threats are thought likely to have extirpated the species, and there have been no confirmed recent records despite searches. Possibly Extinct ranges are not included in the calculation of EOO, but are included in maps of the historical distribution of the taxon.
Extinct – The species was formerly known or thought very likely to occur in the area (post 1500 AD), but it has been confirmed that the species no longer occurs because exhaustive searches have failed to produce recent records, and the intensity and timing of threats could plausibly have extirpated the taxon. Extinct ranges are not included in the calculation of EOO, but are included in maps of the historical distribution of the taxon.
Presence Uncertain – A record exists of the species' presence in the area, but this record requires verification or is rendered questionable owing to uncertainty over the identity or authenticity of the record, or the accuracy of the location. Presence Uncertain records are not included in the calculation of EOO or in maps of the historical distribution of the taxon.
Notes:
1. These codes are mutually exclusive, e.g. a polygon coded as “Extant” cannot also be coded as “Extinct”.
2. In accordance with the Red List Categories and Criteria, Extant polygons can include inferred or projected sites of present occurrence (see the Guidelines for Using the IUCN Red List Categories and Criteria for further guidance).
3. When there is uncertainty as to whether or not a species still occurs in an area in which it was formerly known to occur (usually because there have been no recent surveys), it is necessary for assessors to judge whether it is more appropriate to assign a coding of Extant or Possibly Extinct (based on available knowledge of remaining habitat, intensity of threats, adequacy of searches, and other evidence).
4. EOO calculations should be based on polygons coded as Extant only.
5. Maps of the historical range of a species can be produced by combining polygons coded as Extant, Probably Extant, Possibly Extinct, and Extinct.
6. The old Presence code 2 (Probably Extant) is now discontinued.
Coded Domain Values for Origin (
Code | Origin
1 | Native
2 | Reintroduced
3 | Introduced
4 | Vagrant
5 | Origin Uncertain
Description of Coded Values:
Native – The species is/was native to the area.
Reintroduced – The species is/was reintroduced through either direct or indirect human activity.
Introduced – The species is/was introduced outside of its historical distribution range through either direct or indirect human activity.
Vagrant – The species is/was recorded once or sporadically, but it is known not to be native to the area.
Origin Uncertain – The species’ provenance in an area is not known (it may be native, reintroduced or introduced).
Note: These codes are mutually exclusive; a polygon coded as “Native” cannot also be coded as “Introduced”.
Coded Domain Values for Seasonality.
Code | Seasonality
1 | Resident
2 | Breeding Season
3 | Non-breeding Season
4 | Passage
5 | Seasonal Occurrence Uncertain
Description of Coded Values:
Resident – The species is/was known or thought very likely to be resident throughout the year.
Breeding Season – The species is/was known or thought very likely to occur regularly during the breeding season and to breed.
Non-breeding Season – The species is/was known or thought very likely to occur regularly during the non-breeding season. In the Eurasian and North American contexts, this encompasses ‘winter’.
Passage – The species is/was known or thought very likely to occur regularly during a relatively short period(s) of the year on migration between breeding and non-breeding ranges.
Seasonal Occurrence Uncertain – The species is/was present, but it is not known if it is present during part or all of the year.
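The three coded domains can be held as simple lookup tables, which makes it easy to reject an invalid code and to apply Notes 4 and 5 of the Presence codes (EOO from Extant polygons only; historical maps from Extant, Probably Extant, Possibly Extinct and Extinct). The sketch below is illustrative only: the function `describe_polygon` and the names of the two code sets are hypothetical, not part of any IUCN tool.

```python
# Coded domain values transcribed from the three tables above.
PRESENCE = {
    1: "Extant",
    2: "Probably Extant (discontinued)",
    3: "Possibly Extant",
    4: "Possibly Extinct",
    5: "Extinct (post 1500)",
    6: "Presence Uncertain",
}
ORIGIN = {1: "Native", 2: "Reintroduced", 3: "Introduced", 4: "Vagrant", 5: "Origin Uncertain"}
SEASONAL = {
    1: "Resident",
    2: "Breeding Season",
    3: "Non-breeding Season",
    4: "Passage",
    5: "Seasonal Occurrence Uncertain",
}

EOO_PRESENCE_CODES = {1}                 # Note 4: Extant polygons only
HISTORICAL_PRESENCE_CODES = {1, 2, 4, 5}  # Note 5: Extant, Probably Extant, Possibly Extinct, Extinct


def describe_polygon(presence, origin, seasonal):
    """Decode one polygon's codes and report how its presence code is used."""
    for name, domain, code in (
        ("PRESENCE", PRESENCE, presence),
        ("ORIGIN", ORIGIN, origin),
        ("SEASONAL", SEASONAL, seasonal),
    ):
        if code not in domain:
            raise ValueError(f"{name} code {code!r} is not in the coded domain")
    return {
        "presence": PRESENCE[presence],
        "origin": ORIGIN[origin],
        "seasonal": SEASONAL[seasonal],
        "counts_toward_eoo": presence in EOO_PRESENCE_CODES,
        "in_historical_map": presence in HISTORICAL_PRESENCE_CODES,
    }


print(describe_polygon(presence=4, origin=1, seasonal=1))
```

Because each field holds a single integer, the mutual exclusivity stated in the notes follows from the data model itself.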
The IUCN Red List Review Process (
The majority of assessments appearing on the IUCN Red List are carried out by members of the IUCN Species Survival Commission (SSC), appointed Red List Authorities (RLAs), Red List Partners, or participants of IUCN-led assessment projects (
A detailed guidance document ‘Documentation standards and consistency checks for IUCN Red List assessments and species accounts’ (
Extent of Occurrence (EOO) is defined as the area contained within the shortest continuous imaginary boundary which can be drawn to encompass all the known, inferred or projected sites of present occurrence of a taxon, excluding cases of vagrancy (species far out of their typical range). This measure may exclude discontinuities or disjunctions within the overall distributions of taxa (e.g., large areas of obviously unsuitable habitat; but see 'area of occupancy'). Extent of occurrence can often be measured by a minimum convex polygon (the smallest polygon in which no internal angle exceeds 180 degrees and which contains all the sites of occurrence).
Area of Occupancy (AOO) is defined as the area within its 'Extent of Occurrence' (see definition above) which is occupied by a taxon, excluding cases of vagrancy. The measure reflects the fact that a taxon will not usually occur throughout the area of its extent of occurrence, which may, for example, contain unsuitable habitats. The area of occupancy is the smallest area essential at any stage to the survival of existing populations of a taxon (e.g. colonial nesting sites, feeding sites for migratory taxa). The size of the area of occupancy will be a function of the scale at which it is measured, and should be at a scale appropriate to relevant biological aspects of the taxon. The criteria include values in km², and thus to avoid errors in classification, the area of occupancy should be measured on grid squares (or equivalents) which are sufficiently small.
The definitions above are taken directly from: http://www.iucnredlist.org/static/categories_criteria_2_3#definitions.
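To make the two definitions concrete, the following sketch estimates EOO as the area of a minimum convex polygon around a set of points, and AOO as the summed area of occupied grid squares. Several choices here are assumptions made for illustration rather than part of the IUCN definitions: a spherical Earth of radius 6371 km, a cylindrical equal-area projection so that planar areas come out in km², a default cell size of 2 km, and the function names themselves. Cells in this projection have equal area but are not square on the ground away from the equator.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius of a spherical Earth


def to_equal_area(lon, lat):
    """Project WGS 1984 degrees to km with a cylindrical equal-area projection."""
    return (
        EARTH_RADIUS_KM * math.radians(lon),
        EARTH_RADIUS_KM * math.sin(math.radians(lat)),
    )


def convex_hull(points):
    """Minimum convex polygon (Andrew's monotone chain), counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Planar polygon area by the shoelace formula."""
    total = 0.0
    for i in range(len(vertices)):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % len(vertices)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def eoo_km2(lonlat_points):
    """Extent of occurrence: area of the convex hull of the projected points."""
    return polygon_area(convex_hull([to_equal_area(lon, lat) for lon, lat in lonlat_points]))


def aoo_km2(lonlat_points, cell_km=2.0):
    """Area of occupancy: number of occupied grid cells times the cell area."""
    cells = set()
    for lon, lat in lonlat_points:
        x, y = to_equal_area(lon, lat)
        cells.add((math.floor(x / cell_km), math.floor(y / cell_km)))
    return len(cells) * cell_km ** 2


sightings = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.5, 0.5)]
print(eoo_km2(sightings), aoo_km2(sightings))
```

With only five sightings the EOO is in the thousands of km² while the AOO is a few tens of km², which illustrates why the two measures are reported separately and why AOO depends so strongly on the chosen cell size.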
Distribution maps display a polygon intended to communicate that a species probably only occurs within its extent, which is based on known occurrences, knowledge of habitat preferences, remaining suitable habitats, elevation (or depth) limitations, and other expert knowledge. The point data from which these polygons are derived (which can include line-based data from transects, polygon data from a defined area such as a national park, and grid data, that is, observations or survey records from a regular grid) are obtained from published peer-reviewed literature, ‘grey’ literature (academic or government literature that is not formally published), field observations, biodiversity and taxonomic databases such as the Global Biodiversity Information Facility (GBIF) and Ocean Biodiversity Information System (OBIS), museum and other curated collections, or from taxonomic expert knowledge. There is a wide variety in the quality and quantity of these data. There are also online utilities, such as GeoCAT (
The existing IUCN Red List mapping protocol for marine species differs from that for terrestrial species, primarily in that bathymetry may be used to delineate species range limits, much like elevation limitations may be used to limit the ranges of terrestrial species. It also differs from the mapping protocol for freshwater fishes, where drainage basins are typically used for determining and delineating range extents. The IUCN protocol for converting marine observation point data into distribution polygons involves a three-step process: Step One: Plot Observation Points, Step Two: Expand the Range, and Step Three: Refine the Range.
In Step One, point observation data are plotted. Since data often come from a diverse range of formats and sources, methods for plotting data points will vary. All data should be plotted in the Geographic Coordinate System, WGS-1984.
In Step Two of this protocol, the range is extrapolated based on the extent of suitable habitat (ESH) in the area and expert knowledge of the species and its requirements. Surrounding areas of similar habitat may be included. For terrestrial species, there are various other factors such as elevation, temperature, and even natural physical barriers, such as oceans. Marine species range may be affected by depth, water temperature gradients, salinity ranges, photic zone depths, and O2 concentrations. Often, this extrapolation is accomplished by buffering the point data, and then creating a convex polygon that surrounds the available points.
In Step Three, areas that are deemed unsuitable for a species are removed from the extrapolated habitat polygon(s). Note that this extrapolation and elimination of areas may produce discontinuous (non-contiguous) polygons, and may change the Extent of Occurrence and Area of Occupancy. The result is the best representation of the species’ likely occurrence or distribution based on currently available information. The Area of Occupancy (AOO) will reflect influences from both biotic and abiotic factors.
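The three steps can be sketched in a few lines of Python. This is a simplified, grid-based stand-in and not the IUCN GIS workflow: the range is represented as a set of grid cells rather than polygons, the buffer is expressed in decimal degrees, depth is the only suitability criterion, and the bathymetry is supplied by the caller as a hypothetical `depth_m(lon, lat)` function. All function and parameter names are assumptions made for this illustration.

```python
import math


def plot_points(records):
    """Step One: keep only coordinates that are valid WGS 1984 decimal degrees."""
    return [(lon, lat) for lon, lat in records if -180 <= lon <= 180 and -90 <= lat <= 90]


def expand_range(points, buffer_deg, cell_deg):
    """Step Two: collect grid cells whose centre lies within buffer_deg of a point."""
    cells = set()
    reach = math.ceil(buffer_deg / cell_deg)
    for lon, lat in points:
        col0 = math.floor(lon / cell_deg)
        row0 = math.floor(lat / cell_deg)
        for col in range(col0 - reach, col0 + reach + 1):
            for row in range(row0 - reach, row0 + reach + 1):
                centre_lon = (col + 0.5) * cell_deg
                centre_lat = (row + 0.5) * cell_deg
                if math.hypot(centre_lon - lon, centre_lat - lat) <= buffer_deg:
                    cells.add((col, row))
    return cells


def refine_range(cells, cell_deg, depth_m, min_depth_m, max_depth_m):
    """Step Three: drop cells whose depth falls outside the species' depth range."""
    kept = set()
    for col, row in cells:
        depth = depth_m((col + 0.5) * cell_deg, (row + 0.5) * cell_deg)
        if min_depth_m <= depth <= max_depth_m:
            kept.add((col, row))
    return kept


def toy_depth_m(lon, lat):
    """Made-up bathymetry: a shoreline at 28 N, deepening by 100 m per degree south."""
    return max(0.0, (28.0 - lat) * 100.0)


observations = plot_points([(-97.0, 27.5), (-200.0, 10.0)])  # second point is invalid
expanded = expand_range(observations, buffer_deg=0.5, cell_deg=0.25)
refined = refine_range(expanded, 0.25, toy_depth_m, min_depth_m=10.0, max_depth_m=60.0)
print(len(observations), len(expanded), len(refined))
```

In a production workflow the buffering and clipping would normally be done on polygons in GIS software with a real bathymetric layer; the grid version only shows how Step Three can shrink, and potentially fragment, the range produced in Step Two.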
A ‘Best Practices’ section of the tutorial offers several ‘rules of thumb’ to go by:
The distribution map produced with this protocol represents the taxon’s distribution within its overall range for communication and/or conservation planning purposes; it may not equate to either the spread of extinction risk (Extent Of Occurrence) or the occupied range area (Area Of Occupancy) as defined by the IUCN Red List Categories and Criteria, but can be used to support these measurements.
Several questions remain that this protocol does not address, such as: which point data sources should be included, which areas should be eliminated, and how seasonality should be represented (separate GIS file, separate polygon within the same GIS file…).
Much of the workshop discussion focused on bringing in additional data and data sources. With the advent of numerous portable electronic devices, including smartphones, with different applications and interfaces and GPS/mapping capabilities, new and exciting sources of species and species/location data are available, which could be included with current datasets. Crowdsourcing, commonly known as ‘citizen science’, is a manner of collecting data and observations in which collaborators who may lack credentials and formal institutional affiliation can contribute to the work of taxonomists and scientists. For example, rather than requiring a master’s degree in ichthyology, a citizen science project might ask if a candidate can learn to identify a particular species of fish using a dichotomous key (
The
To illustrate crowdsourcing, consider several examples:
This small sampling of crowd-sourced data collection applications emphasizes the need to achieve a consensus on whether information collected in this manner can be used to enhance the current point and polygon observation data used to determine range and distribution extent information.
Crowd-sourced data from some of the ‘apps’ mentioned above would require that a minimum set of data fields be filled in; other attributes should be added from existing and expert knowledge or specimen voucher information. The scientific name (binomial), the name of the compiler or submitter, and the citation (organization or app name) should be collected and added to the geo-tagged image information when available. Many of the apps and website entry points currently available fail to generate usable data, because they do not conform to taxonomic standards, or lack georeferencing. Database curators and developers now have access to several 'toolkits', such as that available from
The Museum of Vertebrate Zoology (MVZ) at Berkeley publishes a guide (
The traditional sources of biodiversity data include, but are not limited to, museum collections, taxonomic monographs, and biodiversity databases, which obtain much of their data from the first two sources. Individual specimens and observations within these collections come from a variety of sources, including published and unpublished (grey) literature, amateur naturalists, and volunteer recorders (
In addition to biodiversity databases, surveys, often conducted by state and federal fisheries agencies, are another source of biodiversity data. The Texas Parks and Wildlife Department has been conducting seine, gill net and trawl surveys since the 1970s. The Louisiana Department of Wildlife and Fisheries has been collecting fishery-independent data since 1988, from programs utilizing various gear and sampling techniques. The Florida Fisheries-Independent Monitoring program began the same year. The Southeast Area Monitoring and Assessment Program (SEAMAP) Gulf of Mexico component has been operational since 1981, planning, coordinating and conducting surveys for the Gulf States Marine Fisheries Commission. The Secretary of Agriculture in Mexico and the Instituto Nacional de Pesca (National Fisheries Institute) coordinate and conduct scientific and technological research on fisheries.
Biodiversity databases and literature contain vast amounts of distribution and taxonomic information; however, the quality, scope and scale of the data vary. To address this potential problem, data should be verified and vetted by species experts and other knowledgeable workers before the information can be incorporated into Red List assessments. Taxonomic information can be verified in authoritative taxonomic databases such as the Integrated Taxonomic Information System (ITIS), the World Register of Marine Species (WoRMS), and other initiatives, which count on the assistance of taxonomic experts to keep the information as current as possible.
Biodiversity databases such as GBIF, OBIS, and the Red List usually rely on multiple biodiversity and taxonomic databases to keep information current. It is recommended that any change in taxonomy be fully documented and linked to the source of the taxonomic authority. Similarly, information on potential misidentifications should be provided to avoid potential problems.
Distributional data can contain several kinds of error, including incomplete or vague locality descriptions, incorrect information in the original source, mistakes in transcribing data from hand-written labels and field log books, transposition of latitude and longitude, and GPS or other instrument error or calibration problems. Therefore, biodiversity databases should have fields for accuracy and data confidence, ideally reviewed by staff or an expert. If a data point is considered problematic, it should be flagged as such so that users can evaluate its usefulness. There are numerous database validation tools available.
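As an illustration of flagging rather than deleting problematic points, the sketch below attaches quality flags to a single coordinate pair. The flag names, the function `flag_record`, the 10 km uncertainty threshold and the rough Gulf of Mexico bounding box are all assumptions made for this example; an operational database would take such values from its own standards.

```python
# Rough Gulf of Mexico bounding box (west, south, east, north), an illustrative assumption.
GULF_BBOX = (-98.0, 18.0, -80.0, 31.0)


def flag_record(lat, lon, bbox=None, uncertainty_m=None, max_uncertainty_m=10000.0):
    """Return a list of quality flags for one observation; an empty list means no issue found."""
    if lat is None or lon is None:
        return ["missing_coordinates"]

    flags = []
    if not (-90 <= lat <= 90) or not (-180 <= lon <= 180):
        flags.append("coordinates_out_of_range")
    if lat == 0 and lon == 0:
        flags.append("zero_zero_coordinates")

    if bbox is not None:
        west, south, east, north = bbox
        inside = west <= lon <= east and south <= lat <= north
        # If swapping the two values puts the point inside the study area,
        # a latitude/longitude transposition is the likely cause.
        swapped_inside = west <= lat <= east and south <= lon <= north
        if not inside:
            flags.append(
                "possible_lat_lon_transposition" if swapped_inside else "outside_study_area"
            )

    if uncertainty_m is None:
        flags.append("no_accuracy_recorded")
    elif uncertainty_m > max_uncertainty_m:
        flags.append("low_positional_accuracy")
    return flags


print(flag_record(27.7, -97.3, GULF_BBOX, uncertainty_m=15))   # a clean record
print(flag_record(-97.3, 27.7, GULF_BBOX, uncertainty_m=15))   # latitude and longitude swapped
```

Keeping the flags alongside the record, instead of discarding the point, lets reviewers and experts decide later whether the observation is usable.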
This paper is not intended to present all the possible combinations of crowdsourcing data or species evaluations, but instead should serve as a starting point for further discussion.
The review process:
Crowd-sourced data:
The authors wish to acknowledge the International Union for Conservation of Nature Red List species evaluators, team members, and leaders, especially the Gulf of Mexico Fishes Assessment Workshop attendees, experts and facilitators from the Species Survival Commission.