Spatiotemporal Neural Network for PM2.5 Estimation

We fitted a neural network to PM2.5 monitoring data from the AQS network, using the above variables as predictors. The relationships between the input variables and PM2.5 can be highly nonlinear, with complex interactions, and neural networks can in principle model any type of nonlinearity.⁷¹,⁷² Details of the neural network, such as its structure and training method, are described in the supplementary material.
All input variables covered the entire study area, but some were not available in early years or had higher proportions of missing values. Missing values were especially common in the Terra and Aqua AOD data. We handled the missing values and the differing temporal coverage as follows. We used a calibration method to fill in missing values in Aqua AOD data from 2003 to 2012 and Terra AOD data from 2001 to 2012, based on the association of GEOS-Chem outputs and land-use terms with non-missing AOD.⁵⁶ For the other variables, which had a low fraction of missing values, we interpolated values at the grid cells where they were missing. Regarding temporal coverage, GEOS-Chem outputs, land-use terms, MODIS outputs, and meteorological variables were available throughout the study period, whereas OMI data, Aqua AOD, and Terra AOD were unavailable in earlier years. For years with one or more unavailable variables, we fitted the model with the remaining available variables.
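The gap-filling and variable-selection steps can be illustrated with a minimal Python sketch. It is not the authors' calibration method (which is described in reference 56 and was implemented in Matlab): it assumes a single always-available predictor and an ordinary least-squares line, and the names `fit_line`, `fill_missing_aod`, and `available_variables` are hypothetical.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x with one predictor."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def fill_missing_aod(aod, predictor):
    """Replace None entries in `aod` using a line fitted on the observed pairs.

    Assumption: `predictor` stands in for an always-available covariate
    (for example a GEOS-Chem output); the real calibration uses more terms.
    """
    observed = [(p, a) for p, a in zip(predictor, aod) if a is not None]
    intercept, slope = fit_line([p for p, _ in observed],
                                [a for _, a in observed])
    return [a if a is not None else intercept + slope * p
            for a, p in zip(aod, predictor)]


def available_variables(year, first_year):
    """Names of the variables whose record has started by `year`.

    `first_year` maps a variable name to its first available year; the
    caller supplies these values, none are taken from the paper.
    """
    return sorted(name for name, start in first_year.items() if start <= year)
```

For example, `fill_missing_aod([0.2, None, 0.4, None], [1.0, 2.0, 3.0, 4.0])` fits the line on the two observed pairs and fills the two gaps from it, while `available_variables` returns the predictor set a model for a given year would be fitted with.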
Most previous studies used only in situ variables for modeling. However, information from neighboring cells can be informative as well. For example, nearby road density, forest coverage, and other land-use variables, as well as nearby PM2.5 measurements, either influence or correlate with local PM2.5 measurements, so including them can improve model performance. We accounted for spatial correlation by using convolutional layers in the neural network.⁷³ A convolutional layer is computed by applying a convolution kernel to an input layer, so that values from neighboring cells are combined through the kernel function. The kernel takes the form of a function (e.g., a weighted average with Gaussian weights based on distance) that produces a scalar estimate from the multidimensional inputs. A convolutional layer thus aggregates nearby information and can simulate some form of autocorrelation. We included convolutional layers for land-use terms and nearby PM2.5 measurements as additional predictor variables to account for spatial autocorrelation. Multiple convolutional layers were incorporated to allow the neural network to model more complex autocorrelation or possible interactions with other variables (Supplementary material).
In addition to nearby grid cells, observations from nearby days for the same grid cell can also be informative. To incorporate this, we first fitted a neural network and obtained an initial prediction for PM2.5. We then computed temporal convolutional layers and fitted the neural network again with them (Figure S5).
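As an illustration of the kernel example given above (a weighted average with Gaussian weights based on distance), the following Python sketch computes a neighborhood feature for one grid cell and the analogous feature over nearby days. It uses a fixed kernel rather than the layers actually fitted in the study, and the function names, the `radius`, `half_window`, and `sigma` parameters, and the choice to exclude the center value are all assumptions made for illustration.

```python
import math


def gaussian_weight(distance, sigma):
    """Unnormalized Gaussian weight for a given distance."""
    return math.exp(-(distance ** 2) / (2.0 * sigma ** 2))


def spatial_feature(grid, row, col, radius, sigma):
    """Gaussian-weighted average of the cells around (row, col).

    Cells holding None (no value) and the center cell itself are skipped.
    Distances are measured in grid-cell units.
    """
    num = den = 0.0
    for r in range(max(0, row - radius), min(len(grid), row + radius + 1)):
        for c in range(max(0, col - radius),
                       min(len(grid[r]), col + radius + 1)):
            if (r, c) == (row, col) or grid[r][c] is None:
                continue
            w = gaussian_weight(math.hypot(r - row, c - col), sigma)
            num += w * grid[r][c]
            den += w
    return num / den if den > 0 else None


def temporal_feature(series, day, half_window, sigma):
    """Gaussian-weighted average of values on nearby days, excluding `day`."""
    num = den = 0.0
    for d in range(max(0, day - half_window),
                   min(len(series), day + half_window + 1)):
        if d == day or series[d] is None:
            continue
        w = gaussian_weight(abs(d - day), sigma)
        num += w * series[d]
        den += w
    return num / den if den > 0 else None
```

Each function returns a single scalar per cell (or per day), which could then be supplied to a model as an additional predictor. Because closer neighbors receive larger weights, the feature reflects local conditions more than distant ones.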
To validate model results and guard against overfitting, we used 10-fold cross-validation, in which all monitoring sites were randomly divided into ten groups, each containing 10% of the sites. The model was trained on 90% of the data and used to predict PM2.5 at the remaining 10%; the process was repeated for each of the other splits. Assembling the predictions from the ten 10% testing sets yielded predicted PM2.5 for all monitors. We computed the correlation between predicted and monitored PM2.5. Spatial and temporal R² values were also calculated; details of the R² calculations are given in the supplementary material.
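The site-based cross-validation described above can be sketched in Python as follows. This is a minimal illustration, not the authors' Matlab code: `fit` and `predict` are placeholders for whatever model is used, the record layout `(site_id, features, pm25)` is an assumption, and only an overall squared correlation is computed, since the spatial and temporal R² definitions are given in the supplementary material rather than here.

```python
import random


def assign_folds(site_ids, n_folds=10, seed=0):
    """Randomly assign each monitoring site to one of `n_folds` folds."""
    sites = sorted(set(site_ids))
    random.Random(seed).shuffle(sites)
    return {site: k % n_folds for k, site in enumerate(sites)}


def cross_validate(records, fit, predict, n_folds=10, seed=0):
    """Return held-out predictions aligned with `records`.

    `records` is a list of (site_id, features, pm25) tuples. All records
    from a site fall in the same fold, so each site is predicted by a
    model that never saw that site's data.
    """
    folds = assign_folds([r[0] for r in records], n_folds, seed)
    predictions = [None] * len(records)
    for k in range(n_folds):
        model = fit([r for r in records if folds[r[0]] != k])
        for i, r in enumerate(records):
            if folds[r[0]] == k:
                predictions[i] = predict(model, r[1])
    return predictions


def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mo, mp = sum(observed) / n, sum(predicted) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov * cov / (var_o * var_p)
```

Splitting by site rather than by individual observation matters here: a random split of daily records would let the model see other days from the same monitor during training, which would tend to overstate how well it predicts at locations with no monitor.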
The trained neural network was then used to make daily PM2.5 predictions for each 1 km × 1 km grid cell.
All programming was implemented in MATLAB (version R2014a, The MathWorks, Inc.).


Di Q., Kloog I., Koutrakis P., Lyapustin A., Wang Y., & Schwartz J. (2016). Assessing PM2.5 Exposures with High Spatiotemporal Resolution across the Continental United States. Environmental Science & Technology, 50(9), 4712-4721.

Publication 2016



Other organizations : Harvard University, National Aeronautics and Space Administration, Goddard Space Flight Center, University of Maryland, Baltimore County


Variable analysis

independent variables

GEOS-Chem outputs
Land-use terms
MODIS outputs
Meteorological variables
OMI data
Aqua AOD
Terra AOD

dependent variables

PM2.5

control variables

None explicitly mentioned
