Overview
This vignette benchmarks heatwave3 against heatwaveR on a realistic gridded SST dataset, the full OSTIA South-West Indian Ocean reanalysis (400 lon × 200 lat, ca. 49,800 ocean pixels, 14,276 daily time steps, 1982–2021).
We compare three configurations:
- heatwaveR, serial per-pixel R + C++ (the standard approach)
- heatwave3 (1 thread), pure C++ with single-threaded execution
- heatwave3 (12 threads), pure C++ with OpenMP parallelism
Test data
sst_file <- "/Volumes/OceanData/OSTIA_East_Coast_MHW/SWIO_Jan1982-Dec2021.nc"
# 400 lon × 200 lat at 0.05° resolution
# 15–35°E, 38–28°S (Agulhas Current system)
# 14,276 daily time steps (1982-01-01 to 2021-12-31)
# ca. 49,796 ocean pixels (remainder is land)heatwave3: single-threaded
library(heatwave3)
clim_file <- tempfile(fileext = ".nc")
event_file <- tempfile(fileext = ".nc")
t_clim <- system.time(
ts2clm3(sst_file, clim_file,
climatologyPeriod = c("1982-01-01", "2011-12-31"),
var_name = "analysed_sst", n_threads = 1L)
)
t_event <- system.time(
detect_event3(sst_file, clim_file, event_file,
var_name = "analysed_sst", n_threads = 1L)
)
cat("Climatology:", round(t_clim[3], 1), "sec\n")
cat("Detection: ", round(t_event[3], 1), "sec\n")
cat("Total: ", round(t_clim[3] + t_event[3], 1), "sec\n")heatwave3: 12 threads
t_clim_12 <- system.time(
ts2clm3(sst_file, clim_file,
climatologyPeriod = c("1982-01-01", "2011-12-31"),
var_name = "analysed_sst", n_threads = 12L)
)
t_event_12 <- system.time(
detect_event3(sst_file, clim_file, event_file,
var_name = "analysed_sst", n_threads = 12L)
)
cat("Climatology:", round(t_clim_12[3], 1), "sec\n")
cat("Detection: ", round(t_event_12[3], 1), "sec\n")
cat("Total: ", round(t_clim_12[3] + t_event_12[3], 1), "sec\n")heatwaveR: serial baseline
heatwaveR processes one pixel at a time. We time a 20-pixel sample (data pre-loaded to isolate computation from I/O) and extrapolate to the full grid.
library(heatwaveR)
library(ncdf4)
nc <- nc_open(sst_file)
time_raw <- ncvar_get(nc, "time")
dates <- as.Date("1981-01-01") + time_raw / 86400
# Pre-load one longitude column (200 lat pixels)
sst_col <- ncvar_get(nc, "analysed_sst",
start = c(200, 1, 1), count = c(1, 200, -1))
nc_close(nc)
ocean_idx <- which(!is.na(sst_col[, 1]))
n_test <- 20L
t_hwr <- system.time({
for (j in ocean_idx[1:n_test]) {
clm <- ts2clm(data.frame(t = dates, temp = as.numeric(sst_col[j, ])),
climatologyPeriod = c("1982-01-01", "2011-12-31"))
ev <- detect_event(clm)
}
})
per_pixel <- t_hwr[3] / n_test
n_ocean <- 49796 # pre-counted
estimated <- per_pixel * n_ocean
cat("Per pixel: ", round(per_pixel, 3), "sec\n")
cat("Estimated total:", round(estimated), "sec (",
round(estimated / 60, 1), "min)\n")Results
Benchmarked on an Apple M3 Pro (12-core), macOS 15.5, R 4.5.3.
| Method | Climatology | Detection | Total | Speedup |
|---|---|---|---|---|
| heatwaveR (serial) | n/a | n/a | ca. 4,116 sec (68.7 min) | 1× |
| heatwave3 (1 thread) | 174 sec | 82 sec | 257 sec (4.3 min) | 16× |
| heatwave3 (12 threads) | 60 sec | 55 sec | 115 sec (1.9 min) | 36× |
The speedup comes from three sources:
- C++ algorithms. Climatology (sliding-window percentile) and event detection (run-length encoding, gap joining, 19 metrics) are implemented in pure C++ rather than R + data.table.
- OpenMP parallelism. Each pixel is independent, so OpenMP distributes pixels across cores with zero coordination overhead.
- Direct NetCDF I/O. Reads the full data slab in one C call via libnetcdf, avoiding R’s per-pixel ncdf4 random-access overhead.
Scaling notes
- The 12-thread run is 2.2× faster than single-threaded, not 12×, because the full 400×200 grid must first be read into memory (ca. 900 MB for this dataset) and the NetCDF I/O is single-threaded. The parallelism benefit is in the computation phase only.
- For grids that fit in memory, I/O accounts for roughly 40–60% of wall time. Larger grids with more ocean pixels show better parallel scaling as the compute fraction grows.
- On Linux with gcc OpenMP (which has lower thread-creation overhead than macOS libomp), parallel scaling is typically better.
Benguela region: daily files, full pipeline
This benchmark applies heatwave3 to a production-scale analysis, the Benguela upwelling system, using 16,049 individual daily OSTIA files (1982–2025) with the WMO 1991–2020 climatological baseline.
The subsetted OSTIA files cover 7–20°E, 35–17°S at 0.05° resolution (260 lon × 360 lat = 93,600 pixels). This is a region with strong upwelling variability and a mix of ocean and land pixels.
library(heatwave3)
ostia_dir <- "/Volumes/OceanData/Tom/OSTIA"
clim_file <- tempfile(fileext = ".nc")
event_file <- tempfile(fileext = ".nc")
# Climatology from ca. 16,000 daily files (WMO 1991-2020 baseline)
t_clim <- system.time(
ts2clm3(
file_in = ostia_dir,
file_out = clim_file,
climatologyPeriod = c("1991-01-01", "2020-12-31"),
var_name = "analysed_sst",
n_threads = 12L
)
)
# Event detection on the same daily files
t_event <- system.time(
detect_event3(
file_in = ostia_dir,
clim_file = clim_file,
file_out = event_file,
var_name = "analysed_sst",
n_threads = 12L
)
)Benguela results
| Step | Time | Detail |
|---|---|---|
| Read + merge 16,049 daily files | included | sorted, deduplicated, assembled into 3D cube |
| Climatology (12 threads) | 95 sec | 30-year WMO baseline, 93,600 pixels × 366 DOYs |
| Event detection (12 threads) | 108 sec | threshold exceedance, RLE, gap joining, 19 metrics |
| Total | 203 sec (3.4 min) | 8.4 million events detected |
For comparison, processing 93,600 pixels serially with heatwaveR at ca. 0.083 sec/pixel would take ca. 2.2 hours. heatwave3 completes the same analysis in 3.4 minutes, a ca. 38× speedup.
