Optimizing rasterio window reads for memory efficiency
Read a large GeoTIFF in constant, predictable RAM by aligning rasterio.windows.Window to the file’s native block_shapes, passing out_dtype directly to src.read(), and operating on arrays in-place. The minimal pattern:
import rasterio
from rasterio.windows import Window
import numpy as np
with rasterio.open("sentinel2_B04.tif") as src:
block_h, block_w = src.block_shapes[0]
window = Window(0, 0, block_w, block_h) # one native tile
chunk = src.read(window=window, out_dtype=np.float32, masked=True)
That single call reads exactly one compressed tile, casts it to float32 inside GDAL, and tracks nodata without an extra allocation — your RAM footprint stays flat no matter how large the file is on disk.
Why this arises in remote sensing workflows
Multi-gigabyte satellite scenes, DEMs, and LiDAR-derived grids cannot be loaded into RAM in one call on a typical workstation. Yet the naive src.read() — no window, no dtype control — materialises the entire raster and silently promotes uint16 reflectance to float64, consuming up to eight times the raw pixel memory. This pattern is part of the Handling Pixel Resolution and Scaling workflow, which governs how spatial resolution choices propagate through every read, resample, and write operation in a pipeline. For the broader context of how raster storage is structured see Core Raster Fundamentals & STAC Mapping.
The problem is ubiquitous at the I/O boundary: processing scripts written to test on 100 MB tiles break silently at production when scenes are 4–12 GB. Block-aligned windowed reads are the correct architectural response.
Environment and setup
| Package | Minimum version | Role on this page |
|---|---|---|
rasterio |
1.3 | Windowed reads, block_shapes, masked reads |
numpy |
1.24 | In-place array operations, dtype constants |
psutil |
5.9 | Peak RSS monitoring (validation only) |
pip install "rasterio>=1.3" "numpy>=1.24" "psutil>=5.9"
How block structure controls memory cost
Remote sensing formats — GeoTIFF, Cloud-Optimized GeoTIFF (COG) — store pixels in compressed tiles or strips rather than as one contiguous array. When you request a window that cuts across tile boundaries, GDAL must decompress every intersecting tile in full, extract the pixels that fall inside your window, and discard the rest. A window offset by a single pixel from the tile grid can force GDAL to decompress four tiles instead of one.
The diagram below shows a 4 × 3 tile grid. Two windows of identical pixel size have very different I/O costs depending on their alignment.
Rasterio exposes the native tile dimensions through src.block_shapes, a list of (height, width) tuples — one per band. Aligning your chunk size to these values keeps GDAL operating at maximum efficiency.
Complete working example
The function below iterates a raster in block-aligned chunks, reads each one with out_dtype and masked=True, and processes in-place. It works on local GeoTIFFs and on COGs read over HTTP with equal efficiency.
import rasterio
from rasterio.windows import Window
import numpy as np
from collections.abc import Generator
def iter_windows_aligned(
src_path: str,
chunk_px: int = 2048,
out_dtype: np.dtype = np.float32,
) -> Generator[tuple[np.ma.MaskedArray, Window], None, None]:
"""
Yield (chunk, window) pairs covering the full raster.
Windows are snapped to the dataset's native block grid so GDAL
decompresses only the tiles it actually needs. out_dtype is
applied at the I/O layer — no intermediate array is ever allocated.
"""
with rasterio.open(src_path) as src:
# block_shapes is a list of (rows, cols) tuples, one per band.
# Bands in a well-formed GeoTIFF share the same block shape.
block_h, block_w = src.block_shapes[0]
# Round chunk_px up to the nearest block multiple so windows
# land on exact tile boundaries regardless of chunk_px choice.
win_h = max(block_h, (chunk_px // block_h) * block_h)
win_w = max(block_w, (chunk_px // block_w) * block_w)
height, width = src.height, src.width
for row_off in range(0, height, win_h):
for col_off in range(0, width, win_w):
# Clamp to raster extents — the last row/column of
# windows is almost always smaller than win_h / win_w.
h = min(win_h, height - row_off)
w = min(win_w, width - col_off)
window = Window(col_off, row_off, w, h)
# out_dtype → GDAL casts during decompression (zero extra alloc).
# masked=True → MaskedArray tracks nodata without a separate bool array.
chunk: np.ma.MaskedArray = src.read(
window=window,
out_dtype=out_dtype,
masked=True,
)
# In-place operation: no new array allocated.
# Replace fill_value with 0 so downstream math is safe.
np.ma.fix_invalid(chunk, copy=False, fill_value=0)
yield chunk, window
# --- Usage example ---
for chunk, win in iter_windows_aligned("dem_10m.tif", chunk_px=4096):
# chunk shape: (bands, win_height, win_width)
mean_elevation = float(chunk.mean())
print(f"Window {win}: mean elevation = {mean_elevation:.1f} m")
Key decisions explained
win_h = max(block_h, (chunk_px // block_h) * block_h)— roundschunk_pxdown to a block multiple, ensuring the window origin always falls on a tile boundary. Without this, achunk_px=2000against 256-px tiles produces misaligned offsets from the second iteration onwards.out_dtypeinsrc.read()— forces GDAL to cast during decompression. Calling.astype(np.float32)after the read allocates the fulluint16array first and then a secondfloat32copy;out_dtypeavoids the first allocation entirely.masked=True— returnsnumpy.ma.MaskedArray. This encodes nodata without allocating a separate boolean array of identical shape, and it propagates correctly through NumPy reductions.np.ma.fix_invalid(chunk, copy=False)— normalises masked values in-place before arithmetic. Operations such aschunk * 0.0001on unmasked fill values can inject extreme numbers into statistics.
Variant patterns
Variant 1 — Dtype promotion guard for spectral index pipelines
When computing a ratio such as NDVI that naturally produces float64 intermediate values, keep float32 throughout by assigning with the out parameter:
import rasterio
from rasterio.windows import Window
import numpy as np
with rasterio.open("sentinel2_8band.tif") as src:
block_h, block_w = src.block_shapes[0]
window = Window(0, 0, block_w * 4, block_h * 4)
# Read NIR (band 4) and Red (band 3) as float32
data = src.read(
indexes=[4, 3],
window=window,
out_dtype=np.float32,
masked=True,
)
nir, red = data[0], data[1]
# Allocate output once; reuse in every iteration
ndvi = np.empty_like(nir)
np.divide(nir - red, nir + red + 1e-8, out=ndvi)
# ndvi stays float32 — no silent float64 promotion
This pattern feeds directly into spectral index calculation pipelines where band-math chains can otherwise silently widen to float64 at the first arithmetic operator.
Variant 2 — Overlapping windows for neighbourhood filters
Convolution, gradient, and morphological filters need surrounding context at chunk edges to avoid seam artifacts. Expand each window by a pad, clamp to bounds, then slice back to the valid region after reading:
import rasterio
from rasterio.windows import Window
import numpy as np
PAD = 32 # pixel overlap on each side
with rasterio.open("slope_raster.tif") as src:
block_h, block_w = src.block_shapes[0]
height, width = src.height, src.width
win_h, win_w = block_h * 8, block_w * 8
for row_off in range(0, height, win_h):
for col_off in range(0, width, win_w):
# Valid (unpadded) extents
h = min(win_h, height - row_off)
w = min(win_w, width - col_off)
# Padded extents — clamped to raster bounds
p_col = max(0, col_off - PAD)
p_row = max(0, row_off - PAD)
p_col_end = min(width, col_off + w + PAD)
p_row_end = min(height, row_off + h + PAD)
padded_window = Window(
p_col, p_row,
p_col_end - p_col,
p_row_end - p_row,
)
chunk = src.read(window=padded_window, out_dtype=np.float32, masked=True)
# Slice back to the valid region (exclude padding)
row_slice = slice(row_off - p_row, row_off - p_row + h)
col_slice = slice(col_off - p_col, col_off - p_col + w)
valid = chunk[:, row_slice, col_slice]
# Apply filter to `chunk`, extract results from `valid`
_ = valid # placeholder for downstream filter
Variant 3 — COG over HTTP (no local download)
Cloud-Optimized GeoTIFFs store tiles at known byte offsets. Block-aligned windows map directly to HTTP range requests, so the windowed-read pattern works unchanged over HTTPS or S3 — GDAL’s VSICURL driver fetches only the required byte ranges:
import rasterio
from rasterio.windows import Window
import numpy as np
COG_URL = "https://sentinel-cogs.s3.us-west-2.amazonaws.com/sentinel-s2-l2a-cogs/36/N/YF/2023/7/S2A_36NYF_20230715_0_L2A/B04.tif"
with rasterio.open(COG_URL) as src:
block_h, block_w = src.block_shapes[0] # typically 512×512 for Sentinel-2 COGs
window = Window(0, 0, block_w, block_h) # fetch exactly one HTTP range
chunk = src.read(window=window, out_dtype=np.float32, masked=True)
print(f"Block shape: {block_h}×{block_w} | Chunk shape: {chunk.shape}")
STAC items discovered via pystac-client carry asset HREFs that can be passed directly here — no intermediate download step required.
Common errors
AttributeError: 'NoneType' object has no attribute '__iter__' when unpacking block_shapes
Cause: the file has no internal tiling — strips are used instead, and block_shapes[0] is (1, width). Fix: guard with block_h, block_w = src.block_shapes[0]; block_w = block_w if block_w < src.width else 512 and set a sensible fallback chunk width.
Silent float64 promotion — memory use doubles mid-pipeline
Cause: a scalar operation such as chunk * 0.0001 (Python float is float64) widens the array. Fix: use np.float32(0.0001) as the scalar, or pass out=chunk to keep the result in the same dtype buffer.
rasterio.errors.WindowError: window extends beyond the dataset bounds
Cause: the clamping logic is wrong — usually min(win_w, width - col_off) was written as min(win_w, width). Fix: always subtract the current offset when clamping window height and width.
Related
- Handling Pixel Resolution and Scaling — parent workflow; covers how native resolution, overview levels, and resampling strategy set the context for every read operation.
- Understanding Cloud-Optimized GeoTIFF Structure — how COG internal tile layout maps to HTTP range requests; block-aligned windows are the bridge between the two.
- How to Read COG Headers Without Downloading Full Files — inspect block shape, overview levels, and compression type before committing to a chunking strategy.
- Spectral Index Calculation Pipelines — cross-pillar link: band-math pipelines that operate on windowed reads benefit directly from the dtype-control patterns on this page.
- Advanced Resampling and Upscaling Techniques — when resolution changes between read and write, resampling occurs inside the window; understanding block alignment first prevents compounding overhead.