DOI

Introduction

This document provides URI identifiers, definitions, and examples for all the attributes (column headers) used in the comma-delimited text files included in the data package (gothenburg_bird_data_package-v2.0.0) available to download at https://zenodo.org/records/15490818.

The document is divided into two main sections:

  1. Darwin Core Attributes: Standard Darwin Core attributes (also known as terms) present in the dataset. For a complete list and detailed definitions of Darwin Core attributes, please refer to https://dwc.tdwg.org/list/.
  2. Custom Attributes: Non-Darwin Core attributes that provide additional information unique to this dataset.

Darwin Core Attributes

occurrenceID

Identifier: http://rs.tdwg.org/dwc/terms/occurrenceID

Definition:
A Universally Unique Identifier (UUID) for the occurrence.

Examples:
f15d5955-39ec-40fc-ab42-b483469ffddf


basisOfRecord

Identifier: http://rs.tdwg.org/dwc/terms/basisOfRecord

Definition:
The type of the record.

Examples:
MachineObservation


scientificName

Identifier: http://rs.tdwg.org/dwc/terms/scientificName

Definition:
The full scientific name, with author and date information if known, after reclassifying and excluding records based on the technical validation. The name follows the scientific name currently valid for the taxon according to the Swedish taxonomic database (dyntaxa.se).

Examples:
Turdus merula (Linnaeus, 1758)
Parus major (Linnaeus, 1758)


eventDate

Identifier: http://rs.tdwg.org/dwc/terms/eventDate

Definition:
The date during which the occurrence record was recorded, following the ISO 8601 date-time standard.

Examples:
2024-05-02
2024-04-29


eventTime

Identifier: http://rs.tdwg.org/dwc/terms/eventTime

Definition:
The local time interval during which the event occurred, following the ISO 8601 date-time standard. The “/” solidus character separates start and end times in the representation of a time interval.

Examples:
06:02:03/06:02:06
23:48:57/23:49:00


decimalLatitude

Identifier: http://rs.tdwg.org/dwc/terms/decimalLatitude

Definition:
The latitude (in decimal degrees) of the acoustic recorder’s location.

Examples:
57.69609


decimalLongitude

Identifier: http://rs.tdwg.org/dwc/terms/decimalLongitude

Definition:
The longitude (in decimal degrees) of the acoustic recorder’s location.

Examples:
11.96408


geodeticDatum

Identifier: http://rs.tdwg.org/dwc/terms/geodeticDatum

Definition:
The coordinate reference system used for the location.

Examples:
EPSG:4326


country

Identifier: http://rs.tdwg.org/dwc/terms/country

Definition:
The name of the country of the occurrence record.

Examples:
Sweden


countryCode

Identifier: http://rs.tdwg.org/dwc/terms/countryCode

Definition:
A two-letter standard abbreviation for the country of the occurrence record.

Examples:
SE


taxonRank

Identifier: http://rs.tdwg.org/dwc/terms/taxonRank

Definition:
The taxonomic rank of the most specific name in the scientificName.

Examples:
species


kingdom

Identifier: http://rs.tdwg.org/dwc/terms/kingdom

Definition:
The full scientific name of the kingdom in which the taxon is classified.

Examples:
Animalia


phylum

Identifier: http://rs.tdwg.org/dwc/terms/phylum

Definition:
The full scientific name of the phylum or division in which the taxon is classified.

Examples:
Chordata


class

Identifier: http://rs.tdwg.org/dwc/terms/class

Definition:
The full scientific name of the class in which the taxon is classified.

Examples:
Aves


order

Identifier: http://rs.tdwg.org/dwc/terms/order

Definition:
The full scientific name of the order in which the taxon is classified.

Examples:
Passeriformes
Anseriformes


family

Identifier: http://rs.tdwg.org/dwc/terms/family

Definition:
The full scientific name of the family in which the taxon is classified.

Examples:
Turdidae
Haematopodidae


genus

Identifier: http://rs.tdwg.org/dwc/terms/genus

Definition:
The full scientific name of the genus in which the taxon is classified.

Examples:
Turdus
Motacilla


taxonID

Identifier: http://rs.tdwg.org/dwc/terms/taxonID

Definition:
The unique LSID (Life Science Identifier) of the taxon according to the Swedish taxonomic database (dyntaxa.se) provided by the Swedish Species Information Center (Artportalen).

Examples:
urn:lsid:dyntaxa.se:Taxon:102998
urn:lsid:dyntaxa.se:Taxon:102964


Custom Attributes

globalSortOrder

Identifier: https://smog-chalmers.github.io/BirdMonitoringGothenburg/#globalSortOrder

Definition:
A taxon-specific attribute provided by the Swedish Species Information Centre (Artportalen). It is an integer value, which can be used to enable a taxonomic sort order of all Swedish taxa handled in the Swedish taxonomic database (dyntaxa.se).

Examples:
58146
57141


BirdNETClass

Identifier: https://smog-chalmers.github.io/BirdMonitoringGothenburg/#BirdNETClass

Definition:
The taxon scientific name originally classified by the BirdNET model, prior to the technical validation and reclassification of records.

Examples:
Turdus merula
Parus major


BirdNETConfidence

Identifier: https://smog-chalmers.github.io/BirdMonitoringGothenburg/#BirdNETConfidence

Definition:
The BirdNET built-in confidence score for species detection, ranging from 0 to 1 (Wood and Kahl, 2024). The dataset includes only species detections with a minimum confidence score of 0.85.

Examples:
0.8608
0.9701


expertValidated

Identifier: https://smog-chalmers.github.io/BirdMonitoringGothenburg/#expertValidated

Definition:
Indicates whether the occurrence record has been validated by an expert ornithologist.

Examples:
Yes
No


isIsolated

Identifier: https://smog-chalmers.github.io/BirdMonitoringGothenburg/#isIsolated

Definition:
Indicates whether a bird detection is isolated, i.e., not preceded or followed by another detection of the same species within a 9-seconds window (at a model’s detection sensitivity of 1.0, no segment overlap, and a minimum confidence threshold of 0.1).

Examples:
Yes
No


BirdNETClassAccuracy

Identifier: https://smog-chalmers.github.io/BirdMonitoringGothenburg/#BirdNETClassAccuracy

Definition:
The accuracy of the BirdNET species classification based on the random subset of validated records by the ornithologist (see Figure 4 in the paper).

Examples:
1.00
0.07


misclassificationProbabilities

Identifier: https://smog-chalmers.github.io/BirdMonitoringGothenburg/#misclassificationProbabilities

Definition:
The likelihood that a species classified by the BirdNET model is actually another species. The misclassification probability of species A being B is calculated as the ratio of the number of times species A was found to be species B, divided by the total number of the ornithologist-validated records of species A.

Examples:
Anas platyrhynchos (0.04), Corvus cornix (0.94), Corvus monedula (0.02)
Larus canus (1.00)


reclassified

Identifier: https://smog-chalmers.github.io/BirdMonitoringGothenburg/#reclassified

Definition:
Indicates whether the occurrence record has been reclassified from the original BirdNET classification, following the decision tree shown in Figure 5 of the paper.

Examples:
Yes
No


occurrenceProbability

Identifier: https://smog-chalmers.github.io/BirdMonitoringGothenburg/#occurrenceProbability

Definition:
An estimation of the likelihood of a species occurrence, based on expert validation and model predictions, where:

  • occurrenceProbability = 1.00: For expert-validated records (expertValidated == "Yes") that were confirmed to be correct or reclassified based on a clear suggestion from the ornithologist (reclassified == "Yes").
  • 0.5 < occurrenceProbability < 1.00: For non-expert-validated records (expertValidated == "No"), this value shows the model-predicted probability of occurrence, based on a trained Random Forest classifier (see the paper for more details).

Recommended thresholds:

  • Maximizing specificity (occurrenceProbability >= 0.83): Prioritizes minimizing false positives, potentially increasing false negatives. It ensures that when a non-expert-validated record is predicted, the prediction is more likely to be correct, but it may miss some true occurrences.
  • Balancing sensitivity and specificity (occurrenceProbability >= 0.75): Optimizes both true positives and true negatives, minimizing both false positives and false negatives. This threshold is determined using Youden’s J statistic (Youden, 1950).

Examples:
1.00
0.51


commonNameSwedish

Identifier: https://smog-chalmers.github.io/BirdMonitoringGothenburg/#commonNameSwedish

Definition:
The common Swedish name of the taxon.

Examples:
koltrast
blåmes


commonNameEnglish

Identifier: https://smog-chalmers.github.io/BirdMonitoringGothenburg/#commonNameEnglish

Definition:
The recommended common English name of the taxon according to the Swedish taxonomic database (dyntaxa.se).

Examples:
Common Blackbird
Eurasian Blue Tit


detectionDistanceInMeters

Identifier: https://smog-chalmers.github.io/BirdMonitoringGothenburg/#detectionDistanceInMeters

Definition:
The expected radius, in meters, around the acoustic recorder within which sounds were detected. The values are suggested based on the maximum detection distances for typical sound amplitudes and frequencies observed in playback experiments by Sethi et al. (2021).

Examples:
20-70


siteName

Identifier: https://smog-chalmers.github.io/BirdMonitoringGothenburg/#siteName

Definition:
The name of the site where the recorder was located.

Examples:
Chalmerska Vasaparken
Renströmsparken


bcl

Identifier: https://smog-chalmers.github.io/BirdMonitoringGothenburg/#bcl

Definition:
The building type surrounding the recorder, as described in Berghauser Pont et al. (2019a, 2019b), where:

  • bcl2: Compact low-rise buildings
  • bcl3: Dense mid-rise buildings
  • bcl5: Compact mid-rise buildings
  • ref_bcl6: Reference spacious mid-rise buildings
  • ref: Reference non-built green space

Examples:
bcl2
ref_bcl6


References

Berghauser Pont, M., Stavroulaki, G., Bobkova, E., Gil, J., Marcus, L., Olsson, J., Sun, K., Serra, M., Hausleitner, B., Dhanani, A., Legeby, A., 2019a. The spatial distribution and frequency of street, plot and building types across five European cities. Environ. Plan. B Urban Anal. City Sci. 46, 1226–1242. https://doi.org/10.1177/2399808319857450 

Berghauser Pont, M., Stavroulaki, G., Marcus, L., 2019b. Development of urban types based on network centrality, built density and their impact on pedestrian movement. Environ. Plan. B Urban Anal. City Sci. 46, 1549–1564. https://doi.org/10.1177/2399808319852632 

Wood, C.M., Kahl, S., 2024. Guidelines for appropriate use of BirdNET scores and other detector outputs. J. Ornithol. 165, 777–782. https://doi.org/10.1007/s10336-024-02144-5  

Sethi, S.S., Fossøy, F., Cretois, B., Rosten, C.M., 2021. Management relevant applications of acoustic monitoring for Norwegian nature – The Sound of Norway, 31. Norsk institutt for naturforskning (NINA)

Youden, W.J., 1950. Index for rating diagnostic tests. Cancer3, 32–35. DOI: doi: 10.1002/1097-0142(1950)3:1<32::AID-CNCR2820030106>3.0.CO;2-3


Citaion

Eldesoky, A. H., Gil, J., Kindvall, O., Stavroulaki, I., Jonasson, L., Bennet, D., Yang, W., Martínez, A., Lichter, R., Petrou, F., & Berghauser Pont, M. (2025). A bird species occurrence dataset from passive audio recordings across dense urban areas in Gothenburg, Sweden [Data set]. Zenodo. https://doi.org/10.5281/zenodo.15490818


Contact

If you have any questions, contact us at