TIES 2022

Welcome to the 2022 Annual Meeting of The International Environmetrics Society (TIES)

November 17-18, 2022. Virtual TIES 2022 Annual Meeting

Watch our recorded talks!

The International Environmetrics Society (TIES) is a non-profit organization that aims to foster the development and use of statistical and other quantitative methods in the environmental sciences, environmental engineering and environmental monitoring and protection. To this end, the Society promotes the participation of statisticians, mathematicians, scientists and engineers in the solution of environmental problems and emphasizes the need for collaboration and for clear communication between individuals from different disciplines and between researchers and practitioners. The Society further promotes these objectives by conducting meetings and producing publications, and by encouraging a broad membership of statisticians, mathematicians, engineers, scientists and others interested in furthering the role of statistical and mathematical techniques in service to the environment.

CONFERENCE PROGRAM (Central Time Zone, USA)

Thursday, November 17, 2022: 8:45 am CT -- 2:40 pm CT

8:45 am -- 8:50 am. Welcome address (Yulia R. Gel, TIES President)
8:50 am -- 9:50 am. [VIDEO] Keynote speech: Raphaël Huser (KAUST, Saudi Arabia). Modeling High-Impact Environmental Extremes in Complex Settings: Recent Progress and Modern Challenges (Organizer & Chair: Yulia R. Gel, University of Texas at Dallas, USA)
9:50 am -- 11:05 am. Session 1. Environmental exposure modeling: between challenges and advances (Organizer & Chair: Monica Pirani, Imperial College, UK)
- 9:50 am -- 10:15 am. [VIDEO] Ben Swallow (University of St Andrews, UK). Bayesian causal inference for environmental exposures using potential outcomes: challenges and opportunities
- 10:15 am -- 10:40 am. [VIDEO] Eliane R. Rodrigues (UNAM, Mexico). A bivariate spatio-temporal model to estimate the risk of occurrences of emergency alerts in Mexico City
- 10:40 am -- 11:05 am. [VIDEO] Joshua Warren (Yale School of Public Health, USA). A Bayesian framework for incorporating exposure uncertainty into health analyses with application to air pollution and stillbirth
11:05 am -- 11:10 am. Break
11:10 am -- 12:25 pm. Session 2. Statistical models for animal movements (Organizer & Chair: Alessio Pollice, University of Bari, Italy)
- 11:10 am -- 11:35 am. [VIDEO] Mevin Hooten (The University of Texas at Austin, USA). Animal movement models with mechanistic selection functions
- 11:35 am -- 12:00 pm. [VIDEO] Paul Blackwell (University of Sheffield, UK). Bayesian Inference for Diffusion and Piecewise Deterministic Models of Animal Movement
- 12:00 pm -- 12:25 pm. [VIDEO] Gianluca Mastrantonio (Politecnico di Torino, Italy). Modelling the movement of multiple animals that share behavioral features
12:25 pm -- 1:25 pm. Session 3. Early career mentoring panel discussion (Organizer & Chair: Amira Elayouty, Cairo University, Egypt)

Please send your questions and queries regarding the mentoring program in advance to mentoring@environmetrics.org. We will also take questions during the live event, but submitting them in advance will help us determine the focus topics. All questions are welcome and encouraged but we can't promise to answer all questions given the session time.

[VIDEO]

- Panelist. Marian Scott (Professor of Environmental Statistics, University of Glasgow, UK)
- Panelist. Sylvia Esterby (Associate Professor Emeritus of Statistics, University of British Columbia Okanagan, Canada)
- Panelist. Nicola Justice (Assistant Professor of Mathematics, Pacific Lutheran University, USA)
1:25 pm -- 2:40 pm. Session 4. Climate resilience: challenges and new perspectives (Organizer & Chair: Ignacio Segovia-Dominguez, NASA-JPL, USA)
- 1:25 pm -- 1:50 pm. [VIDEO] Devan Becker (Wilfrid Laurier University, Canada). Statistics+Data Science: Ignitions and Burn Areas in Space and Time
- 1:50 pm -- 2:15 pm. [VIDEO] Nicholas LaHaye (NASA-JPL, USA). Segmentation of wildfires, smoke plumes, and burn scars using multi-sensor input and unsupervised and supervised machine learning for improved spatiotemporal coverage and facilitation of automated tracking
- 2:15 pm -- 2:40 pm. [VIDEO] Zhiwei Zhen (University of Texas at Dallas, USA). From Geometric Deep Learning to Visualization: Uncovering Hidden Patterns in Climate-induced Biosurveillance and Resilience

Friday, November 18, 2022: 8:45 am CT -- 2:10 pm CT

8:45 am -- 10:00 am. Session 5. Data fusion for environmental applications (Organizer & Chair: Claire Miller, University of Glasgow, Scotland)
- 8:45 am -- 9:10 am. [VIDEO] Xiaoyu Xiong (University of Exeter, UK). Data fusion with Gaussian processes for estimation of environmental hazard events
- 9:10 am -- 9:35 am. Alejandro Coca-Castro (Alan Turing Institute, UK). Probabilistic downscaling of UK surface soil moisture fusing in-situ records and terrain layers
- 9:35 am -- 10:00 am. [VIDEO] Alicia Gressent (INERIS, France). Data fusion for air quality mapping using low-cost sensor observations
10:00 am -- 11:00 am. [VIDEO] Keynote speech: Auroop R. Ganguly (Northeastern University, USA). Climate science and resilience with physics-guided informatics on big and small data (Organizer & Chair: Monica Pirani, Imperial College, UK)
11:00 am -- 11:10 am. Break
11:10 am -- 12:40 pm. TIES working group session (Organizer: TIES Membership Committee, Chair: Monica Pirani, Imperial College, UK)
- 11:10 am -- 11:35 am. [VIDEO] Matthew Wheeler (National Institute of Environmental Health Sciences, USA). Mixed Bayesian compressed regression for multivariate models for large correlated geospatial datasets
- 11:35 am -- 12:00 pm. [VIDEO] Marta Blangiardo (Imperial College London, UK). A dependent Bayesian Dirichlet Process model for source apportionment of particle number size distribution
- 12:00 pm -- 12:25 pm. [VIDEO] Christopher Wikle (University of Missouri, USA). An Illustration of Model Agnostic Explainability Methods Applied to Environmental Data
- 12:25 pm -- 12:40 pm. [VIDEO] Joint Discussion
12:40 pm -- 1:55 pm. Session 6. Climate justice: the data science perspective (Organizer: Yuzhou Chen, Temple University, USA, Chair: Yulia Gel, UTDallas, USA)
- 12:40 pm -- 1:05 pm. Marco Tedesco (Columbia University, USA). Integrating Public Socioeconomic, Physical Risk, and Housing Data for Climate Justice Metrics: Miami and Tampa test cases
- 1:05 pm -- 1:30 pm. [VIDEO] Lelia Marie Hampton (Massachusetts Institute of Technology, USA). Opportunities for Machine Learning for Climate Justice
- 1:30 pm -- 1:55 pm. [VIDEO] Yuzhou Chen (Temple University, USA). Assessing Urban Form and Climate Justice with Deep Learning
1:55 pm -- 2:10 pm. Farewell remarks
2:10 pm. [VIDEO] Adjourn

Each invited-session talk takes 25 min, including Q&A.

Please see further details below.

November 17th, 2022

Plenary Talk

8:50 am

(Central Time)

Raphaël Huser

Associate Professor of Statistics

King Abdullah University of Science and Technology (KAUST)

Saudi Arabia

Modeling High-Impact Environmental Extremes in Complex Settings: Recent Progress and Modern Challenges

Abstract

Rare, low-probability events often lead to the biggest impacts. Therefore, the development of cutting-edge statistical approaches for modeling, predicting and quantifying environmental risks associated with natural hazards is of utmost importance. Climate scientists and related stakeholders, such as engineers and insurers, have indeed realized that under climate change, the greatest environmental, ecological, and infrastructural risks and damages are often caused by changes in the intensity, frequency, spatial extent, and persistence of extreme events, rather than changes in their average behavior. However, while datasets are often massive in modern-day applications, extreme events are always scarce by nature. This makes it very challenging to provide reliable risk assessment and prediction, especially when extrapolation to yet-unseen levels is required. To overcome these limitations, specialized extreme-value models and efficient inference methods have been developed in the recent past. In this presentation, I will first provide an overview of recent progress we have made to develop novel methodology that transcends classical extreme-value theory, with a focus on new flexible sub-asymptotic models for spatial extremes, which improve the prediction of rare events with unknown tail dependence structure. I will then describe modern methodological obstacles that arise with “big models” for massive and complex extremes data, and will show how statistical machine learning can help solve some of these challenges. In order to estimate high quantiles in complex spatio-temporal settings, I will describe a novel partially-interpretable neural network framework for extreme quantile regression, which combines the pragmatism, predictive skill, and computational efficiency of deep learning methods with the strength and resilience of theoretically-justified extreme-value methods. I will finally end the talk with some thoughts on open questions and possible solutions.

Bio

Raphaël Huser is an Associate Professor of Statistics in the Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division at the King Abdullah University of Science and Technology (KAUST), Saudi Arabia, where he leads the Extreme Statistics (extSTAT) research group. He started his career at KAUST initially as a Postdoctoral Research Fellow in 2014, and was then appointed Assistant Professor of Statistics in 2015, before transitioning to his current role as an Associate Professor in 2022. Before joining KAUST, Huser received his Ph.D. degree in statistics from the Swiss Institute of Technology (EPFL), Switzerland, in 2013. He also holds a B.S. in Mathematics and an M.S. in Applied Mathematics from EPFL. Huser has received several awards for his research work, including the 2014 EPFL Doctorate Award, the 2015 Lambert Award from the Swiss Statistical Association, the 2019 ENVR Early Investigator Award from the Section on Statistics and the Environment (ENVR) of the American Statistical Association (ASA), and now the 2022 Abdel El-Shaarawi Early Investigator Award from The International Environmetrics Society (TIES). He is also an Elected Member of the International Statistical Institute (ISI), and is currently serving as an Associate Editor for five statistics journals, namely Environmetrics, Extremes, the Journal of Agricultural, Biological and Environmental Statistics, the Journal of the Royal Statistical Society: Series C (Applied Statistics), and Statistics and Computing. Huser's research focuses on the development of new flexible and theoretically-motivated statistical models, as well as computationally efficient inference methods, for extreme events in complex systems arising in various applications from environmental sciences and finance.
His work aims at making an impact in statistics of extremes and beyond, by improving models, prediction, and quantification of risk associated with extreme events in high-dimensional, spatio-temporal, non-stationary settings.

RECORDING AVAILABLE <click here>

Session 1. Environmental exposure modeling: between challenges and advances

Ben Swallow (University of St Andrews, UK)

Lecturer in Statistics. School of Mathematics and Statistics.

Bayesian causal inference for environmental exposures using potential outcomes: challenges and opportunities

Abstract. Studies of environmental exposure impacts on human health are regularly conducted using correlative regression models combined with observational data. However, to more formally allocate a causal mechanism to the study, alternative paradigms need to be utilised. In this talk, I will review the potential outcomes framework and discuss remaining challenges and opportunities for determining causal effects in observational data.

Bio. Ben's research is in Bayesian computational modelling for complex systems, including mechanistic models and spatio-temporal data. He completed his PhD at St Andrews on inference and model selection in multivariate ecological citizen science data, followed by postdocs on emulation and inference in stochastic high-dimensional systems. Recent work has focused on epidemiological data, from both data-driven and methodological perspectives. His research spans applications in ecology, epidemiology and systems biology.

RECORDING AVAILABLE <click here>

Eliane R. Rodrigues (UNAM, Mexico)

Researcher. Instituto de Matematicas.

A bivariate spatio-temporal model to estimate the risk of occurrences of emergency alerts in Mexico City

Abstract. In order to reduce population exposure to high levels of pollution and, therefore, the health hazard that comes with this exposure, Mexico City environmental authorities have implemented a series of preventive measures. Among them are emergency alerts, which are declared whenever high levels of ozone and/or PM10 occur. In the present talk a bivariate spatio-temporal model is considered to predict local pollution emergencies and to assess compliance with the Mexican air quality standards. Hourly ozone and PM10 measurements from 24 stations across Mexico City collected during 2017 are analyzed. As a consequence of the results obtained, future pollutant levels may be predicted using current weather conditions and recent pollutant concentrations, as well as the probability of future pollution emergencies. On the one hand, we have high predicted probabilities of pollution emergencies limited to a few time periods in a year. On the other hand, we have high predicted probabilities of exceedances of the Mexican air quality standards on a nearly daily basis. This is joint work with Philip A. White, Alan E. Gelfand, and Guadalupe Tzintzun.

Bio. Eliane R. Rodrigues is a Researcher at the Institute of Mathematics at the National University of Mexico (UNAM). She holds a Ph.D. in Applied Probability from Queen Mary and Westfield College, University of London, UK; an M.Sc. in Probability from the University of Brasilia, Brazil; and an undergraduate degree in Mathematics from the State University of Sao Paulo (UNESP), Brazil. Her field of research is applied probability with a focus on stochastic modeling with applications to air pollution, genetics, and epidemiology.

RECORDING AVAILABLE <click here>

Joshua Warren (Yale School of Public Health, USA)

Associate Professor. Biostatistics.

A Bayesian framework for incorporating exposure uncertainty into health analyses with application to air pollution and stillbirth

Abstract. Studies of the relationships between environmental exposures and adverse health outcomes often rely on a two-stage statistical modeling approach, where exposure is modeled/predicted in the first stage and used as input to a separately fit health outcome analysis in the second stage. Uncertainty in these predictions is frequently ignored, or accounted for in an overly simplistic manner when estimating the associations of interest. Working in the Bayesian setting, we propose a flexible kernel density estimation (KDE) approach for fully utilizing posterior output from the first stage modeling/prediction to make accurate inference on the association between exposure and health in the second stage, derive the full conditional distributions needed for efficient model fitting, detail its connections with existing approaches, and compare its performance through simulation. Our KDE approach is shown to generally have improved performance across several settings and model comparison metrics. Using competing approaches, we investigate the association between lagged daily ambient fine particulate matter levels and stillbirth counts in New Jersey (2011–2015), observing an increase in risk with elevated exposure 3 days prior to delivery. The newly developed methods are available in the R package KDExp.

Bio. Joshua Warren is an associate professor in the Department of Biostatistics at the Yale School of Public Health. He received his Ph.D. in statistics from North Carolina State University in 2011. Dr. Warren’s research focuses on statistical methods in public health with an emphasis on environmental health problems. Much of his work involves introducing spatial and spatiotemporal models in the Bayesian setting to learn more about associations between environmental exposures, such as air pollution, and various health outcomes including preterm birth, low birth weight, and congenital anomalies. He is also interested in developing and applying spatiotemporal models in collaborative settings such as epidemiology, geography, nutrition, and glaucoma research. His theoretical and methodological interests include multiple topics in spatial/spatiotemporal modeling and Bayesian nonparametrics.

RECORDING AVAILABLE <click here>

Session 2. Statistical models for animal movements

Mevin Hooten (The University of Texas at Austin, USA)

Professor. Statistics and Data Sciences.

Animal movement models with mechanistic selection functions

Abstract. A suite of statistical methods is used to study animal movement. Most of these methods treat animal telemetry data in one of three ways: as discrete processes, as continuous processes, or as point processes. We briefly review each of these approaches and then focus on the latter. In the context of point processes, so-called resource selection analyses are among the most common ways to statistically treat animal telemetry data. However, most resource selection analyses provide inference based on approximations of point process models. The forms of these models have been limited to a few types of specifications that provide inference about relative resource use and, less commonly, probability of use. For more general spatio-temporal point process models, the most common type of analysis often proceeds with a data augmentation approach that is used to create a binary data set that can be analyzed with conditional logistic regression. We show that the conditional logistic regression likelihood can be generalized to accommodate a variety of alternative specifications related to resource selection. We then provide an example of a case where a spatio-temporal point process model coincides with that implied by a mechanistic model for movement expressed as a partial differential equation derived from first principles of movement. We demonstrate that inference from this form of point process model is intuitive (and could be useful for management and conservation) by analyzing a set of telemetry data from a mountain lion in Colorado, USA, to understand the effects of spatially explicit environmental conditions on movement behavior of this species.

Bio. Mevin Hooten is a Professor at The University of Texas at Austin, an ASA Fellow, and Distinguished Achievement Award winner from the ASA Section on Statistics and the Environment. He has authored over 160 publications including 3 textbooks, one of which is on the topic of animal movement modeling. His research focuses on the development of statistical methods for ecological and environmental data using Bayesian and spatio-temporal approaches.

RECORDING AVAILABLE <click here>

Paul Blackwell (University of Sheffield, UK)

Professor. School of Mathematics and Statistics.

Bayesian Inference for Diffusion and Piecewise Deterministic Models of Animal Movement

Abstract. Animal movement takes place in continuous time, and there are distinct advantages to formulating suitable models in continuous time too, particularly in dealing with missing or irregular data and in combining sources of data. I will talk about some recent developments in inference for two broad classes of models: those in which the animal's position or velocity follows a diffusion process, switching between parameter sets representing different behaviours, and those in which the animal's velocity is a piecewise constant function of time. Exploiting a formal analogy between movement modelling and continuous-time MCMC algorithms, all of these models can be extended to incorporate the modelling of resource selection.

Bio. Paul Blackwell is a statistician and modeller, whose first degree was in mathematics from the University of Warwick. His main research is in Bayesian statistics applied in ecology and environmental science. A particular interest is in the modelling of wildlife movement and its implications for understanding behaviour and resource use, and for synthesis of telemetry and survey data. He has also worked on multi-model ensemble methods in ecosystem modelling, methodology for the construction of calibration curves for radiocarbon dating, forest growth models, layer counting in ice cores, and a range of other applications in ecology, environmental science and engineering. He recently held a Leverhulme Research Fellowship.

RECORDING AVAILABLE <click here>

Gianluca Mastrantonio (Politecnico di Torino, Italy)

Associate Professor. Mathematical Sciences.

Modelling the movement of multiple animals that share behavioral features

Abstract. We propose a model that can be used to infer the behaviour of multiple animals, which is defined as a set of hidden Markov models, based on the sticky hierarchical Dirichlet process, with a shared base-measure, and a step and turn with an attractive point (STAP) emission distribution. The latent classifications represent the behaviour assumed by the animals, which is described by the STAP parameters. Given the latent classifications, the animals are independent. Hence, the animals may share, in different behaviours, the set or a subset of the parameters, allowing us to investigate the similarities between them. The number of latent behaviours, for each animal, is estimated as a model parameter through the Dirichlet process.

Bio. Gianluca Mastrantonio is an Associate Professor in the Department of Mathematical Sciences at the Politecnico di Torino (Italy). His interests are in Bayesian methods for statistical modelling and computation, with particular attention to environmental applications. His main contributions are in the fields of circular statistics, spatio-temporal modelling, mixture models, genetic data and animal movement.

RECORDING AVAILABLE <click here>

Session 3. Early career mentoring panel discussion

RECORDING AVAILABLE <click here>

Marian Scott (University of Glasgow, UK)

Professor of Environmental Statistics. Departments of Math and Statistics.

Sylvia Esterby (University of British Columbia Okanagan, Canada)

Associate Professor Emeritus. Department of Statistics.

Nicola Justice (Pacific Lutheran University, USA)

Assistant Professor. Department of Mathematics.

Session 4. Climate resilience: challenges and new perspectives

Devan Becker (Wilfrid Laurier University, Canada)

Assistant Professor. Mathematics.

Statistics+Data Science: Ignitions and Burn Areas in Space and Time

Abstract. When fires are expected to be large, do we also expect more fires to occur? Using data from British Columbia, we developed a novel statistical model to investigate this idea. Our model involves a spatially continuous model that finds areas where large fires are associated with increased ignitions. Some unexpected parameter estimates lead to new insights into fire behaviour, and the spatial estimates help to inform our intuition about the relationship between ignitions and burn areas. In an extension, we augment our analysis with an unsupervised machine learning technique in order to detect patterns in our spatial estimates and provide inspiration for future research avenues.

Bio. After completing his BSc in Mathematics at Wilfrid Laurier University, he moved to Western University for his MSc and PhD in Statistics. In these studies, he primarily focused on spatial point process modelling with applications to locations of forest fires as well as shot locations in professional hockey games. His postdoctoral research was also at Western, but in the department of Pathology and Laboratory Medicine. He started soon after the global pandemic, and so his research pivoted to COVID-19 research. He developed models for detecting new genetic variants using genetic sequence data sampled from wastewater. In his current position, he is building on these models by extending them through time and space, and detecting variants without knowing their genetic sequence.

RECORDING AVAILABLE <click here>

Nicholas LaHaye (NASA-JPL, USA)

Data Scientist. Jet Propulsion Laboratory, Caltech.

Segmentation of wildfires, smoke plumes, and burn scars using multi-sensor input and unsupervised and supervised machine learning for improved spatiotemporal coverage and facilitation of automated tracking

Abstract. At present, to detect and track wildfires and associated smoke plumes, instrument-specific retrieval algorithms need to be developed and applied to individual instruments. Such development is labor intensive and requires domain-specific parameters and instrument-specific calibration metrics, alongside manual efforts to track retrieved objects across multiple scenes. Previously, we developed an unsupervised machine learning method that uses level 1 (L1) radiances from various satellite and airborne imagers, as well as the fusion of datasets, where applicable, as input. This method was tested on data collected during the Fire Influence on Regional to Global Environments and Air Quality (FIREX-AQ) campaign. The clustered output of the unsupervised models accurately identifies wildfires, smoke plumes, and burn scars, and establishes the basis for future automated tracking capabilities. In this work, we build on the initial capabilities to improve detection quality and the ability to represent the shape of the smoke plumes in order to better facilitate automated tracking. These updates involve using supervised machine learning in conjunction with the unsupervised models for improved speed and performance, and utilization of shape approximations and cluster distribution information to assist with certainty when tracking plumes.

Bio. Nick LaHaye is a Data Scientist at the Jet Propulsion Laboratory (JPL). He earned his Ph.D. in Computational and Data Sciences from Chapman University in Orange, California, USA, and has been at JPL for 10 years. His current research includes the development and application of a software system called SIT-FUSE, which applies unsupervised machine learning for data fusion, feature extraction, image segmentation, and instance tracking. As well as researching and building new functionality, he is currently involved with applying the software in different research projects across various Earth Science domains to segment wildfires, smoke, and volcanic ash plumes, characterize harmful algal blooms, and identify water bodies in Multi-Look FFSAR scenes from Sentinel-6.

RECORDING AVAILABLE <click here>

Zhiwei Zhen (University of Texas at Dallas, USA)

Ph.D. Candidate. Department of Mathematical Sciences.

From Geometric Deep Learning to Visualization: Uncovering Hidden Patterns in Climate-induced Biosurveillance and Resilience

Abstract. Virtually all aspects of our societal functioning -- from food security to energy supply to healthcare -- depend on the dynamics of environmental factors. Nevertheless, the social dimensions of weather and climate are noticeably less explored. By harnessing the strength of geometric deep learning and NASA satellite-based observations, we build a consensus machine learning model to investigate the complex interrelationships between environmental factors and vulnerability of different socio-economic groups to infectious diseases, with a particular focus on clinical severity of COVID-19. We also develop a new interactive visualization interface that can be used by healthcare professionals, policy makers and other stakeholders to better understand resilience of various communities to adverse climate and weather events.

Bio. Research Assistant focusing on Geometric Machine Learning, Anomaly Detection and Graph Classification tasks.

RECORDING AVAILABLE <click here>

November 18th, 2022

Plenary Talk

10:00 am

(Central Time)

Auroop R. Ganguly

Professor / Chief Scientist

Northeastern University / US DOE’s Pacific Northwest National Laboratory

USA

Climate science and resilience with physics-guided informatics on big and small data

Abstract

Global climate and earth system models (ESMs), which numerically solve partial differential equations with high performance simulations, continue to have knowledge gaps and exhibit intrinsic variability for stakeholder relevant variables and resolutions. Data-driven sciences integrated with process understanding, especially the physics or biogeochemistry that may not be fully captured within the simulations, are critical to improve model parameterizations, develop a comprehensive characterization of variability and uncertainty, and extract scientific insights from archived model simulations. Furthermore, data-driven discrete event simulations have been proposed to incorporate societal dimensions such as management of watersheds in the land component of earth system models. While data from archived model simulations and remote sensors are Big in terms of volume and velocity, in-situ sensor or historical data in general may be short and noisy, especially for extremes. The first part of this presentation will rely on our work at the Sustainability and Data Sciences Laboratory (SDS Lab) and the extant literature to understand how integrated physics and data-driven sciences, such as extreme value statistics, spatiotemporal machine learning or nonlinear dynamics, designed for both big and small data, can help address these challenges. The second part of the presentation will focus on how statistics and machine learning can address challenges in three areas of critical and growing importance: prediction and predictability of short-term extreme weather or hydrological events in a changing climate, credible downscaling of climate model simulations at scales of relevance to stakeholders, and inference of nonlinear dependence and data-driven causality leading to attributions in climate science. 
The presentation will conclude with a short discussion on making climate science actionable by relying not just on governmental or intergovernmental action but also through innovations in the private sector via large corporations and sustainable startups.

Bio

Auroop R. Ganguly is a Professor at Northeastern University with a joint appointment at the US DOE’s Pacific Northwest National Laboratory as a Chief Scientist. He has twenty-four years of full-time professional experience in the US spanning academia, a government national laboratory, and private industry. He has published in interdisciplinary venues such as Nature and PNAS, has authored award-winning papers in top-tier machine learning conferences as well as in disciplinary journals in the geosciences and civil engineering, has been cited in all three of the United Nations' recent IPCC AR6 assessment reports, has been invited to United Nations review panels, and has been quoted in global and national media outlets such as Newsweek and the New York Times. Ganguly holds a PhD from MIT and is a Fellow of the American Society of Civil Engineers.

RECORDING AVAILABLE <click here>

Session 5. Data fusion for environmental applications

Xiaoyu Xiong (University of Exeter, UK)

Postdoctoral research fellow. Mathematics and Statistics.

Data fusion with Gaussian processes for estimation of environmental hazard events

Abstract. Environmental hazard events such as extra-tropical cyclones or windstorms that develop in the North Atlantic can cause severe societal damage. Environmental hazard is quantified by the hazard footprint, a spatial area describing potential damage. However, environmental hazards are never directly observed, so estimation of the footprint for any given event is primarily reliant on station observations (e.g., wind speed in the case of a windstorm event) and physical model hindcasts. Both data sources are indirect measurements of the true footprint, and here we present a general statistical framework to combine the two data sources for estimating the underlying footprint. The proposed framework extends current data fusion approaches by allowing a structured Gaussian process discrepancy between the physical model and the true footprint, while retaining the elegance of how the "change of support" problem is dealt with. Simulation is used to assess the practical feasibility and efficacy of the framework, which is then illustrated using data on windstorm Imogen.

Bio. Xiaoyu is a Postdoctoral Research Fellow in the Department of Mathematics and Statistics at the University of Exeter. Her research interests focus on using statistical and machine learning models as a tool for solving problems in the real world, such as making better decisions under uncertainty or answering questions about processes we have data for. She specialises in Gaussian process (GP) modelling, uncertainty quantification (UQ) and their applications. Xiaoyu currently works on a project investigating UQ methods for propagating and quantifying uncertainty in hierarchies of numerical codes. Between Feb 2021 and Oct 2022, she worked on the ‘Uncertainty Quantification for Expensive COVID-19 Simulation Models’ project, where she used GP emulation and history matching to calibrate a high-resolution spatial Covid-19 simulation model in real time, enabling fast high-resolution forecasts of Covid-19 spread with accurate uncertainty under policy interventions. Between Dec 2017 and Jan 2021, she worked on the project ‘Big data methods for improving windstorm footprint prediction (BigFoot)’, where she developed data blending frameworks based on GPs for improved wind gust speed prediction accuracy. Xiaoyu received her PhD in Computing Science (Machine Learning) in 2017 from the University of Glasgow. The topic of her PhD was Adaptive Multiple Importance Sampling for Gaussian processes.

RECORDING AVAILABLE <click here>

Alejandro Coca-Castro (Alan Turing Institute, UK)

Postdoctoral Research Associate. Data Science for Science and Humanities.

Probabilistic downscaling of UK surface soil moisture fusing in-situ records and terrain layers

Abstract. Access to high spatial resolution soil moisture data has mainly been provided either by high-resolution satellite observations or by downscaling existing coarse-resolution satellite- and/or process-based soil moisture datasets. In this talk we will present advances in the latter, introducing and discussing the performance of convolutional neural process (ConvNP) models for probabilistically downscaling surface soil moisture in the UK. We train the models using data from the COSMOS-UK sensor network (~70k observations), Global Multi-resolution Terrain Elevation Data 2010 (GMTED2010) and coarse climate estimates including top layer soil moisture (0–7 cm) from ERA5 reanalysis gridded data at 25 km spacing across the globe. The target resolution is guided by the ERA5-Land product, a replay of the land component of ERA5 with a finer native resolution of 9 km but requiring specialist expertise and enormous quantities of supercomputer time. We assess the performance of the best trained ConvNPs against the reference ERA5-Land top layer soil moisture dataset and benchmark methods (naive Gaussian process, linear regression, and nearest neighbour interpolation). The comparison against naive approaches and the reference dataset has led us to map performance gains and investigate the optimal hyperparameters and feature configuration of ConvNPs. While the preliminary results are promising, further work is being conducted to generate more spatially coherent and consistent downscaled predictions. In this regard, we will improve the model performance by injecting further contextual information driving soil moisture, including slope, land cover and use, and soil porosity, among others.

Bio. Alejandro is a Postdoctoral Research Associate at The Alan Turing Institute, with a background in Physical Geography. His research focuses on modelling Earth systems and environmental phenomena using artificial intelligence and data science. His current research involves the development of probabilistic data-driven models for the intelligent fusion of data from a wide range of sources (satellite, reanalysis, in-situ surface sensors, among others) to help predict environmental and climate variables in terrestrial biomes.

Alicia Gressent (INERIS, France)

Research engineer. Atmospheric modeling and environmental mapping.

Data fusion for air quality mapping using low-cost sensor observations

Abstract. Recent technological developments and strongly increased interest in public information have led to fast-growing use of sensors for air quality monitoring. This work aims to make the most of these sensors, despite their very high uncertainty, to contribute to i) public awareness, ii) the monitoring of air quality, iii) the assessment of individual exposure and iv) the improvement of modeling and emission inventories. We used PM10 observations in Nantes (a French city), provided by sensors that were installed in the city center and deployed on service vehicles in November 2018, for air quality mapping to show the potential added value with respect to the dispersion model (ADMS-Urban) calculations (Gressent et al., 2020). For this purpose, data fusion was performed by combining preprocessed sensor observations and the annual average of the dispersion model based on a universal kriging procedure. A sensitivity study highlights the importance of accurately estimating the measurement uncertainty of the devices to ensure relevant air quality mapping. In addition, further work is needed on the sampling design to ensure the spatial representativeness of the observations and on the optimization of the sensor deployment to get more accurate and consistent estimates.

Bio. Alicia Gressent has been a research engineer in the Atmospheric Modelling and Environmental Mapping Unit of the National Institute for the Industrial Environment and Risks (INERIS) since 2018. She holds a PhD in Atmospheric Sciences from University Paul Sabatier Toulouse III and worked as a postdoctoral associate at MIT (Massachusetts Institute of Technology, USA) in the Center for Global Change Sciences. She has authored and co-authored papers on atmospheric chemistry modeling. Her main expertise at INERIS is in the field of pollutant mapping and pollution modelling using geostatistical techniques as well as pollution dispersion on the urban scale. She is part of the Emergency Situation Response Unit (CASU) at INERIS that provides public authorities with immediate decision-making support in the event of observed or imminent technological dangers to humans or the environment. Alicia Gressent has been involved in the tasks exploring new data flows in air quality mapping and establishing methodology for city ranking within the European Environment Agency’s European Topic Centers ETC/ATNI and ETC/HE.

RECORDING AVAILABLE <click here>

TIES working group session

Matthew Wheeler (National Institute of Environmental Health Sciences, USA)

Scientist/Researcher. Biostatistics and Computational Biology Branch.

Mixed Bayesian compressed regression for multivariate models for large correlated geospatial datasets

Abstract. Modeling complex high-dimensional geostatistical data presents many computational challenges, which have led to substantive algorithmic developments, beyond the possible need for high-performance computing. Even with these developments, the challenges of model fitting and multivariate inference, especially within a Bayesian statistical framework, are still substantial. Here, we offer an extension of the efficient new sampling algorithm developed by Moran and Wheeler (2022), named Fast Increased Fidelity Approximate Gaussian Process (FIFA-GP), to multivariate spatial data observed at fixed locations of a region. This algorithm takes advantage of H-matrix approximations of the matrices comprising the GP posterior covariance, which reduces the computational complexity from cubic to near linear. We demonstrate the scalability of the proposed approach using synthetic data as well as existing geospatial ecological data.

Bio. Matt Wheeler earned his Ph.D. in Biostatistics from the University of North Carolina at Chapel Hill. He researches non-parametric Bayesian methods applied to environmental health for various problems, including toxicity testing and high dimensional data. His research has garnered numerous awards, and in 2016, President Obama awarded him the President's Early Career Award for Scientists and Engineers.

RECORDING AVAILABLE <click here>

Marta Blangiardo (Imperial College London, UK)

Professor of Biostatistics. Epidemiology and Biostatistics.

A dependent Bayesian Dirichlet Process model for source apportionment of particle number size distribution

Abstract. The relationship between particle exposure and health risks has been well established in recent years. Particulate matter (PM) is made up of different components coming from several sources, which might have different levels of toxicity. Hence, identifying these sources is an important task in order to implement effective policies to improve air quality and population health. The problem of identifying sources of particulate pollution has already been studied in the literature. However, current methods require an a priori specification of the number of sources and do not include information on covariates in the source allocations. Here, we propose a novel Bayesian non-parametric approach to overcome these limitations. In particular, we model source contribution using a Dirichlet process as a prior for source profiles, which allows us to estimate the number of components that contribute to particle concentration rather than fixing this number beforehand. To better characterise them, we also include meteorological variables (wind speed and direction) as covariates within the allocation process via a flexible Gaussian kernel. We apply the model to apportion particle number size distribution measured near London Gatwick Airport (UK) in 2019. When analyzing these data, we are able to identify the most common PM sources, as well as new sources that have not been identified with the commonly used methods.
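
As a rough illustration of why a Dirichlet process prior avoids fixing the number of sources in advance, the sketch below draws mixture weights with the standard truncated stick-breaking construction. It is not the speakers' model: the covariate-dependent allocation is omitted, and the function name, the concentration value `alpha`, the truncation level and the 0.01 threshold are all assumptions chosen for illustration.

```python
import random

def stick_breaking_weights(alpha, n_components, seed=0):
    """Draw mixture weights from a truncated stick-breaking construction.

    alpha is the Dirichlet process concentration parameter; smaller values
    put most of the mass on a few components.
    """
    rng = random.Random(seed)
    weights = []
    remaining = 1.0  # length of the stick not yet assigned
    for _ in range(n_components - 1):
        v = rng.betavariate(1.0, alpha)  # fraction of the remaining stick
        weights.append(remaining * v)
        remaining *= 1.0 - v
    weights.append(remaining)  # leftover mass goes to the last component
    return weights

# Hypothetical settings: up to 20 candidate sources, concentration 1.0.
weights = stick_breaking_weights(alpha=1.0, n_components=20)

# Components with non-negligible weight (threshold is an arbitrary choice).
effective_sources = sum(1 for w in weights if w > 0.01)
```

The truncation level only caps the number of candidate components; how many of them carry appreciable weight is determined by the draw (and, in a full model, by the data).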

Bio. I am a professor of Biostatistics in the Department of Epidemiology and Biostatistics and I lead the Biostatistics and Data Science theme of the MRC Centre for Environment and Health (http://www.environment-health.ac.uk/). My research contributions are related to the development of statistical models, particularly in the context of Bayesian hierarchical models, and mostly in the field of environmental epidemiology, where the data are characterised by spatial (and/or temporal) structure. In particular I have worked on statistical disease mapping using methods that take into account the spatial and spatio-temporal structures and dependencies, as well as on models to integrate multiple data sources to better characterise environmental exposures. More recently I have been working on Bayesian hierarchical models where interrupted time-series were coupled with spatial correlation to better account for the complex nature of the data to evaluate the effect of policies on reproductive and mental health. I am part-time seconded at the Alan Turing Institute (www.turing.ac.uk), working with the RSS-Turing Lab to use spatio-temporal models to answer a range of research questions related to COVID-19 incidence and prevalence. I have a PhD in Applied Statistics from the University of Florence (Italy) and a degree in Statistics, Demography and Social Sciences from the University of Milan-Bicocca (Italy).

RECORDING AVAILABLE <click here>

Christopher Wikle (University of Missouri, USA)

Curators' Distinguished Professor and Chair. Statistics.

An Illustration of Model Agnostic Explainability Methods Applied to Environmental Data

Abstract. Historically, the two primary criticisms statisticians have had of machine learning and deep neural models are their lack of uncertainty quantification and the inability to do inference (i.e., to explain what inputs are important). Explainable AI has developed in the last few years as a sub-discipline of computer science and machine learning to mitigate these concerns (as well as concerns of fairness and transparency in deep modeling). In 2021-22, a TIES working group brought together a broad group of statisticians and environmental modelers from the United States, Canada, and India to explore these issues in the context of modern environmental models. In this talk, our focus is on explaining which inputs are important in models for predicting environmental data. In particular, we focus on three general methods for explainability that are model agnostic and thus applicable across a breadth of models without internal explainability: “feature shuffling”, “interpretable local surrogates”, and “occlusion analysis”. We describe particular implementations of each of these and illustrate their use with a variety of models, all applied to the problem of long-lead forecasting monthly soil moisture in the North American corn belt given sea surface temperature anomalies in the Pacific Ocean.
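
To make the idea of "feature shuffling" concrete, here is a minimal sketch of permutation importance for a black-box predictor, written with the Python standard library only. It is an illustration under stated assumptions, not the working group's implementation: the toy model, the synthetic inputs, and the names `mse` and `shuffle_importance` are hypothetical.

```python
import random

def mse(model, X, y):
    """Mean squared error of the model's predictions on rows X against targets y."""
    return sum((model(row) - t) ** 2 for row, t in zip(X, y)) / len(y)

def shuffle_importance(model, X, y, n_repeats=20, seed=0):
    """Feature-shuffling (permutation) importance for any callable model.

    For each input column, shuffle its values across rows, re-score the
    model, and report the mean increase in error over the baseline.
    """
    rng = random.Random(seed)
    baseline = mse(model, X, y)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the rows with only column j permuted.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(model, X_perm, y) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

# Hypothetical black-box model: depends strongly on input 0, weakly on
# input 1, and not at all on input 2.
def toy_model(row):
    return 3.0 * row[0] + 0.5 * row[1]

data_rng = random.Random(1)
X = [[data_rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(200)]
y = [toy_model(row) for row in X]

importances = shuffle_importance(toy_model, X, y)
```

Because the procedure only calls the model as a function, the same code applies to any predictor, which is what makes the method model agnostic. An input the model ignores gets an importance of zero, since shuffling it leaves every prediction unchanged.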

Bio. Christopher K. Wikle is Curators’ Distinguished Professor and Chair of Statistics at the University of Missouri (MU), with additional appointments in Soil, Environmental and Atmospheric Sciences and the Truman School of Public Affairs. He received a PhD co-major in Statistics and Atmospheric Science in 1996 from Iowa State University. He was a research fellow at the National Center for Atmospheric Research from 1996-1998, after which he joined the MU Department of Statistics. His research interests are in spatial and spatio-temporal statistics applied to environmental, ecological, geophysical, agricultural and federal survey applications, with particular interest in dynamics. His work has been concerned with formulating computationally efficient deep hierarchical Bayesian models motivated by scientific principles, with more recent work at the interface of deep neural models in machine learning. Awards include elected Fellow of the American Statistical Association (ASA), Institute of Mathematical Statistics (IMS), elected Fellow of the International Statistical Institute (ISI), Distinguished Alumni Award from the College of Liberal Arts and Sciences at Iowa State University, ASA Environmental (ENVR) Section Distinguished Achievement Award, co-awardee 2017 ASA Statistical Partnership Among Academe, Industry, and Government (SPAIG) Award, the MU Chancellor’s Award for Outstanding Research and Creative Activity in the Physical and Mathematical Sciences, the Outstanding Graduate Faculty Award, and Outstanding Undergraduate Research Mentor Award. His book Statistics for Spatio-Temporal Data (co-authored with Noel Cressie) was the 2011 PROSE Award winner for excellence in the Mathematics Category by the Association of American Publishers and the 2013 DeGroot Prize winner from the International Society for Bayesian Analysis. His latest book, Spatio-Temporal Statistics with R, with Andrew Zammit-Mangion and Noel Cressie, was published in 2019 and won the 2019 Taylor and Francis award for Outstanding Reference/Monograph in the Science and Medicine category. Dr. Wikle is Associate Editor for several journals and is one of six inaugural members of the Statistics Board of Reviewing Editors for Science.

RECORDING AVAILABLE <click here>

JOINT DISCUSSION - RECORDING AVAILABLE <click here>

Session 6. Climate justice: the data science perspective

Lelia Marie Hampton (Massachusetts Institute of Technology, USA)

Ph.D. Student. Electrical Engineering and Computer Science.

Opportunities for Machine Learning for Climate Justice

Abstract. Climate change does not impact groups equally. Marginalized groups, including racially marginalized people, people in the Global South, disabled people, women, children, and so on, experience and will experience climate impacts at a more severe level. However, some impacts of climate change on various groups are still not well understood. For instance, climate health and epidemiology is a budding field where we do not understand as much as we could. In a world with growing data supplies and computational power, as well as advanced statistical inference, we can gain more insight than ever into our data. Machine learning offers an opportunity to provide insightful inference and analysis on these issues. For example, the emerging field of deep causal inference can provide a basis for direct and indirect effects of climate on human health and infectious diseases. In this talk, we aim to present some open areas of opportunities to address questions in climate justice with machine learning.

Bio. Lelia Marie Hampton is a Ph.D. student in Computer Science at the Massachusetts Institute of Technology. Their research interests are (in no order) applied machine learning (e.g., global pandemics, climate justice, online harassment), Black feminist philosophy of AI, and AI safety (i.e. long tails, distribution shift). They are an MIT Presidential Fellow, Alfred P. Sloan Scholar, and Social and Ethical Responsibilities of Computing (SERC) Scholar. Currently, they serve as the co-president of both the Black Graduate Student Association and the Academy of Courageous Minority Engineers. They earned a Bachelor of Science in Computer Science, Summa Cum Laude, with minors in comparative women's studies and mathematics from Spelman College (Class of 2020) where they were inducted into Phi Beta Kappa. During undergrad, they interned at the MIT Media Lab, Microsoft Research, Georgia Tech Research Institute, and NASA.

RECORDING AVAILABLE <click here>

Marco Tedesco (Columbia University, USA)

Integrating Public Socioeconomic, Physical Risk, and Housing Data for Climate Justice Metrics: Miami and Tampa test cases

Yuzhou Chen (Temple University, USA)

Assistant Professor. Department of Computer and Information Sciences.

Assessing Urban Form and Climate Justice with Deep Learning

Abstract. Extracting features embedded in images is a standard routine in computer vision applications and is essential to earth and urban observation research when applying such techniques to satellite or aerial imagery. Deep learning architectures have gained popularity in facilitating common analysis tasks such as land use and land cover (LULC) classification and have been increasingly utilized in transfer learning approaches to study socio-demographic aspects of the urban environment. However, extracting interpretable urban structures currently relies on applying deep learning models to high-resolution imagery, and whether the success is transferable to lower resolution settings is yet to be fully examined. To capture and explore such important connectivity information, in this study we utilize a superpixel graph representation of a satellite dataset and propose S$^2$-GNN on superpixel graphs, i.e., a novel superpixel Graph Neural Network (GNN) based model for unsupervised representation learning based on mutual information maximization between graph-level representation and simplicial complex-level representation. Following our proposed approach, S$^2$-GNN can encode both global structural and higher-order (sub)structure features into graph-level representations. We conduct a case study in urbanized San Diego County regions using the 30m resolution satellite imagery from Landsat 8 (Collection 2, Level 2, Tier 1) at the scale of $15\times 15$ pixels for each tile. Extensive experiments on satellite air pollution datasets in San Diego County suggest that S$^2$-GNN significantly outperforms Convolutional Neural Network (CNN) baselines on unsupervised image classification.

Bio. Yuzhou Chen is a tenure-track Assistant Professor in the Department of Computer and Information Sciences at Temple University. Before that, he was a Postdoctoral Scholar in the Department of Electrical and Computer Engineering at Princeton University. He received his Ph.D. in Statistics at Southern Methodist University in 2021. He was a research fellow at the Lawrence Berkeley National Laboratory, the National Renewable Energy Laboratory, and INRIA. He received his M.S. degree from the University of Texas at Dallas. His main research interests are in geometric deep learning, topological data analysis, knowledge discovery in graphs and spatio-temporal data, with applications to power systems, biosurveillance and blockchain data analytics.

RECORDING AVAILABLE <click here>

ADJOURN - RECORDING AVAILABLE <click here>