Using Deep Convolutional Neural Networks and Big Data to Model the Distribution of Birds in the Americas
Species distribution modeling (SDM) is a widely used tool for modeling, predicting, and mapping the geographic distributions of species, and it provides essential information to support biological, ecological, and environmental studies as well as biodiversity conservation. SDM is based on environmental niche theory, which describes how species respond to environmental conditions; these species-environment relationships are highly complex and nonlinear. Yet existing approaches to SDM often simplify the form of the species niche for modeling convenience. Moreover, these approaches often do not exploit the spatial configuration of environmental conditions, which can strongly influence species' habitat use, when modeling species distributions. In addition, limited by the availability of species data, most species distribution models are trained on relatively small datasets.
It is well established that deep neural networks can learn and represent highly complex nonlinear functions (e.g., a species' environmental niche), and that convolutional neural networks can capture spatial patterns in feature layers (e.g., environmental variables) when learning the target function, although such networks require a large number of labeled training samples (e.g., species occurrence vs. absence). In the biological sciences, numerous volunteers around the globe contribute and share large volumes of species data through citizen science initiatives such as the eBird project. Such big species data can be used to train deep convolutional neural networks for species distribution modeling.
This project proposes to apply artificial intelligence (AI) and big data to species distribution modeling. Specifically, bird occurrence data from eBird will be used to train deep convolutional neural networks to model, predict, and map the geographic distribution of birds in the Americas.
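
The role of spatial configuration can be illustrated with a small toy example. The sketch below is not the project's model; it is a hand-written 2D convolution in plain Python, and the two 4x4 "forest cover" patches, the edge kernel, and all function names are hypothetical and chosen only for illustration. Both patches have the same mean cover, so a summary that ignores spatial arrangement cannot tell them apart, whereas a single convolutional filter responds differently to each.

```python
# Illustrative sketch only: toy grids, a hand-picked kernel, hypothetical names.

def conv2d_valid(grid, kernel):
    """Slide the kernel over the grid (no padding) and return the response map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(grid) - kh + 1
    out_w = len(grid[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0.0
            for di in range(kh):
                for dj in range(kw):
                    total += grid[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        out.append(row)
    return out


def mean(grid):
    """Non-spatial summary: average value over all cells."""
    values = [v for row in grid for v in row]
    return sum(values) / len(values)


def max_response(grid, kernel):
    """Largest absolute filter response anywhere in the patch."""
    return max(abs(v) for row in conv2d_valid(grid, kernel) for v in row)


# Two toy patches of a binary environmental layer (1 = forest, 0 = open).
# Both have 50% forest cover, arranged differently.
clustered = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
]
fragmented = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
]

# A simple vertical-edge kernel. In a trained network such weights would be
# learned from data rather than fixed by hand.
edge_kernel = [
    [1, -1],
    [1, -1],
]

print("mean cover:", mean(clustered), mean(fragmented))
print("edge response:", max_response(clustered, edge_kernel),
      max_response(fragmented, edge_kernel))
```

Here the filter detects the continuous forest edge in the clustered patch but gives no response to the checkerboard pattern. A deep convolutional network stacks many learned filters of this kind, which is what allows it to use spatial context that a per-cell model would miss.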