Phytoplankton forms the basis of aquatic food webs, and shifts in community composition reflect changes in environmental conditions. Despite this recognized importance, the processes that maintain spatial and temporal community structure and biodiversity at the base of the aquatic food web remain poorly described and understood. A recognized challenge hampering the validation of ecological models and meta-analyses is the scarcity of large phytoplankton data sets. In contrast to other aquatic data, harmonization of quantitative phytoplankton data sets from different sources and academic institutions has remained a major challenge. Here we demonstrate and examine the processes used to compile and harmonize a large multi-sourced phytoplankton data set covering 40 yr of monitoring and over 15 000 quantitative samples from the Baltic Sea. We show differences in data quality among countries and analyze autocorrelation scales in the field data. Phytoplankton community composition showed positive autocorrelation at temporal scales of less than 30 d and a recurrent pattern at a yearly interval. Both total biomass and community composition showed positive spatial autocorrelation, but the scale and strength of the autocorrelation depended on the extent of the data. We introduce a new strategy for selecting the best-performing model to assess regional taxon richness in phytoplankton field data. The Weibull 4-parameter model showed both the best fit to the data and robust parameter estimates across varying sample sizes.
- Species richness