Can I use bilinear resampling on a Sentinel-2 SCL cloud mask?

No. Bilinear resampling generates fractional values between integer class codes, which breaks bitwise operations and np.unique() checks. Always use nearest neighbor for SCL and any other categorical layer.

Should I reproject or resample first?

Always reproject first. Resampling after reprojection introduces a second interpolation pass that compounds spectral error. Chain both operations in a single rasterio.warp.reproject() call to apply one transformation.

Which resampling method preserves NDVI accuracy when upscaling 20 m bands to 10 m?

Bilinear is the standard choice: it is fast, avoids the ringing artifacts that cubic convolution can produce near high-contrast edges, and introduces only mild smoothing that does not materially shift NDVI distributions in most study areas.

Choosing the right resampling method for Sentinel-2

Use Resampling.nearest for Sentinel-2 QA masks and the Scene Classification Layer, and Resampling.bilinear (or Resampling.cubic) for continuous reflectance bands. The minimal snippet:

import rasterio
from rasterio.enums import Resampling

# Reflectance band: 20 m → 10 m
with rasterio.open("S2_B05_20m.tif") as src:
    data = src.read(
        1,
        out_shape=(src.height * 2, src.width * 2),
        resampling=Resampling.bilinear,
    )

# QA / SCL mask: nearest neighbor preserves integer class codes
with rasterio.open("S2_SCL_20m.tif") as src:
    scl = src.read(
        1,
        out_shape=(src.height * 2, src.width * 2),
        resampling=Resampling.nearest,
    )

Why this choice arises in Sentinel-2 workflows

Sentinel-2’s MultiSpectral Instrument delivers 13 bands at three native resolutions: 10 m (B2, B3, B4, B8), 20 m (B5–B8A, B11, B12), and 60 m (B1, B9, B10). Almost every pixel-wise operation — index calculation, classification, or time-series stacking — requires aligning these to a single common grid. The resampling algorithm you choose controls radiometric fidelity, mask integrity, and whether downstream models see physically meaningful values.

This page is part of the Advanced Resampling & Upscaling Techniques workflow, which sits inside the broader Satellite Processing Workflows & Index Pipelines.

Environment & setup

Package	Minimum version	Role
`rasterio`	1.3	Band I/O, `Resampling` enum, in-memory warp
`numpy`	1.24	Array operations, histogram comparison
`scipy`	1.11	KS-test for histogram validation (optional)

Install with:

pip install "rasterio>=1.3" "numpy>=1.24" "scipy>=1.11"

Algorithm reference

Before writing code it helps to understand what each kernel actually computes, because the error modes differ:

Resampling.nearest — assigns the closest input pixel’s value without any arithmetic. No fractional outputs are ever produced. Mandatory for the Sentinel-2 Scene Classification Layer (SCL), cloud probability masks, and any integer-coded label layer. Applying bilinear or cubic to these layers generates in-between floats (e.g. 3.7 between cloud class 3 and shadow class 4) that break np.unique() checks and astype(np.uint8) conversions.
Resampling.bilinear — weighted average of the four nearest pixels. Fast, memory-efficient, and the practical default for upscaling 20 m reflectance bands to 10 m. Introduces slight smoothing that suppresses high-frequency sensor noise but can soften abrupt field boundaries. Almost always preferable to cubic when working with cloud and shadow masking pipelines where edge integrity matters only after masking.
Resampling.cubic — cubic convolution across a 4 × 4 pixel neighbourhood. Sharper edges than bilinear; best for visual composites and precision index derivation. Prone to overshoot (ringing) near high-contrast transitions such as water-land or urban-rural edges. Clamp outputs to [0, 10000] before writing to prevent negative reflectance or values beyond the sensor’s DN range.
Resampling.average — arithmetic mean of all source pixels that fall inside the target cell. Required when downscaling 10 m data to 25 m, 100 m, or 500 m. Preserves energy conservation. Use for feeding regional models that ingest coarser spatial data.
Resampling.mode — most frequent integer value in the target cell. The only correct choice for categorical downscaling; prevents artificial class generation that average would produce.

Complete working example

The function below aligns an entire Sentinel-2 L2A scene to a uniform 10 m grid, applying the correct algorithm per layer type. It writes two outputs: a reflectance GeoTIFF stack and a co-registered SCL mask.

from __future__ import annotations

import numpy as np
import rasterio
from rasterio.enums import Resampling
from rasterio.transform import Affine

# Band paths: keys are band names, values are (file_path, is_categorical)
BAND_REGISTRY: dict[str, tuple[str, bool]] = {
    "B02": ("S2_B02_10m.tif", False),  # Blue reflectance — native 10 m
    "B03": ("S2_B03_10m.tif", False),  # Green reflectance — native 10 m
    "B04": ("S2_B04_10m.tif", False),  # Red reflectance — native 10 m
    "B08": ("S2_B08_10m.tif", False),  # NIR reflectance — native 10 m
    "B05": ("S2_B05_20m.tif", False),  # Red-edge — must upsample 20 m → 10 m
    "B11": ("S2_B11_20m.tif", False),  # SWIR — must upsample 20 m → 10 m
    "SCL": ("S2_SCL_20m.tif", True),   # Scene Classification — categorical
}

TARGET_RESOLUTION_M = 10  # target grid in metres


def resample_band(
    src_path: str,
    target_height: int,
    target_width: int,
    is_categorical: bool,
) -> tuple[np.ndarray, Affine, dict]:
    """
    Read one Sentinel-2 band and resample it to (target_height, target_width).

    Returns the array, the rescaled affine transform, and the rasterio metadata dict.
    """
    method = Resampling.nearest if is_categorical else Resampling.bilinear

    with rasterio.open(src_path) as src:
        # Compute how much the native pixel size differs from the target
        native_pixel_m = src.res[0]                # e.g. 20.0 for a 20 m band
        scale = native_pixel_m / TARGET_RESOLUTION_M  # 2.0 when upscaling 20→10

        # Read with out_shape triggers GDAL's warp engine inline — no temp file needed
        data = src.read(
            1,
            out_shape=(target_height, target_width),
            resampling=method,
        )

        # Rescale the affine transform so the written GeoTIFF has correct georef
        new_transform: Affine = src.transform * src.transform.scale(
            src.width / target_width,   # x scale factor
            src.height / target_height, # y scale factor
        )

        meta = src.meta.copy()
        meta.update(
            {
                "height": target_height,
                "width": target_width,
                "transform": new_transform,
                "dtype": "int16" if not is_categorical else "uint8",
            }
        )

    return data, new_transform, meta


def build_aligned_stack(
    output_reflectance: str,
    output_scl: str,
    ref_band: str = "B02",
) -> None:
    """
    Align all bands to the 10 m grid defined by ref_band, write a reflectance
    stack and a co-registered SCL mask.
    """
    # Derive target dimensions from the reference (native 10 m) band
    with rasterio.open(BAND_REGISTRY[ref_band][0]) as ref:
        target_h, target_w = ref.height, ref.width
        base_meta = ref.meta.copy()

    reflectance_bands = {
        name: resample_band(path, target_h, target_w, categorical)
        for name, (path, categorical) in BAND_REGISTRY.items()
        if not categorical
    }

    # Write multi-band reflectance GeoTIFF (band order: B02, B03, B04, B08, B05, B11)
    band_order = ["B02", "B03", "B04", "B08", "B05", "B11"]
    stack = np.stack(
        [reflectance_bands[b][0] for b in band_order],
        axis=0,
    )

    out_meta = base_meta.copy()
    out_meta.update({"count": len(band_order), "dtype": "int16"})

    with rasterio.open(output_reflectance, "w", **out_meta) as dst:
        dst.write(stack)

    # Write SCL mask
    scl_data, _, scl_meta = resample_band(
        BAND_REGISTRY["SCL"][0], target_h, target_w, is_categorical=True
    )
    with rasterio.open(output_scl, "w", **scl_meta) as dst:
        dst.write(scl_data, 1)

    print(f"Wrote {output_reflectance}  shape={stack.shape}")
    print(f"Wrote {output_scl}  unique SCL classes: {np.unique(scl_data)}")


if __name__ == "__main__":
    build_aligned_stack(
        output_reflectance="S2_L2A_10m_stack.tif",
        output_scl="S2_SCL_10m.tif",
    )

Variant patterns

Reprojection + resampling in a single call

When moving from the native Sentinel-2 UTM zone to an equal-area projection such as EPSG:3035 (Europe) or EPSG:6933 (global), chain both operations inside a single rasterio.warp.reproject() call. Splitting them — reproject then resample separately — applies two successive interpolation passes that compound spectral error and inflate RMSE during ground-truth validation. The one-pass approach, covered in the mastering CRS transformations workflow, prevents this.

import numpy as np
import rasterio
from rasterio.crs import CRS
from rasterio.enums import Resampling
from rasterio.warp import calculate_default_transform, reproject

TARGET_CRS = CRS.from_epsg(3035)  # ETRS89 / LAEA Europe — equal-area


def reproject_and_resample(
    src_path: str,
    dst_path: str,
    target_res_m: float = 10.0,
    is_categorical: bool = False,
) -> None:
    """
    Reproject a Sentinel-2 band to TARGET_CRS while resampling to target_res_m
    in a single GDAL warp call — avoids double-interpolation.
    """
    method = Resampling.nearest if is_categorical else Resampling.bilinear

    with rasterio.open(src_path) as src:
        transform, width, height = calculate_default_transform(
            src.crs,
            TARGET_CRS,
            src.width,
            src.height,
            *src.bounds,
            resolution=target_res_m,  # request explicit pixel size in the target CRS
        )

        dst_meta = src.meta.copy()
        dst_meta.update(
            {
                "crs": TARGET_CRS,
                "transform": transform,
                "width": width,
                "height": height,
            }
        )

        with rasterio.open(dst_path, "w", **dst_meta) as dst:
            for band_idx in src.indexes:
                reproject(
                    source=rasterio.band(src, band_idx),
                    destination=rasterio.band(dst, band_idx),
                    src_transform=src.transform,
                    src_crs=src.crs,
                    dst_transform=transform,
                    dst_crs=TARGET_CRS,
                    resampling=method,
                )

Dask-enabled resampling for large scenes or time series

For a full Sentinel-2 tile (110 × 110 km at 10 m = ~100 million pixels per band) or a time-series stack, load via xarray and rioxarray with Dask chunking. This keeps memory bounded and allows parallel computation before writing to a Cloud-Optimized GeoTIFF.

import xarray as xr
import rioxarray  # noqa: F401  — registers .rio accessor on xr.DataArray

# Open the 20 m B05 band as a lazy Dask-backed DataArray
b05 = xr.open_dataarray(
    "S2_B05_20m.tif",
    engine="rasterio",
    chunks={"x": 2048, "y": 2048},  # ~16 MB per chunk at int16
)

# Upsample to 10 m — rio.reproject_match aligns to a reference DataArray
b02 = xr.open_dataarray("S2_B02_10m.tif", engine="rasterio")

# reproject_match uses bilinear by default; pass Resampling.nearest for masks
b05_10m = b05.rio.reproject_match(b02)

# Compute and write; Dask executes in parallel across chunks
b05_10m.rio.to_raster("S2_B05_10m_dask.tif", driver="GTiff")

Aggregating to coarser grids for regional modeling

When feeding a spectral index calculation pipeline at coarser scales (e.g. integrating with MODIS or ERA5 data), simple pixel centre sampling discards sub-pixel information. Use Resampling.average to preserve energy conservation across the aggregated cell:

import rasterio
from rasterio.enums import Resampling

SCALE_FACTOR = 25  # 10 m → 250 m

with rasterio.open("S2_NDVI_10m.tif") as src:
    out_h = src.height // SCALE_FACTOR
    out_w = src.width // SCALE_FACTOR

    ndvi_250m = src.read(
        1,
        out_shape=(out_h, out_w),
        resampling=Resampling.average,  # area-weighted mean over 25×25 cells
    )

    new_transform = src.transform * src.transform.scale(
        src.width / out_w,
        src.height / out_h,
    )

    meta = src.meta.copy()
    meta.update({"height": out_h, "width": out_w, "transform": new_transform})

    with rasterio.open("S2_NDVI_250m.tif", "w", **meta) as dst:
        dst.write(ndvi_250m, 1)

Common errors

ValueError: cannot convert float NaN to integer Caused by applying Resampling.bilinear to the SCL layer, which produces fractional values, followed by astype(np.uint8). Fix: always pass Resampling.nearest for any layer opened from an SCL or QA file.

Negative reflectance values after cubic resampling Cubic convolution overshoots near high-contrast edges (water-land, urban-rural). Clamp immediately after reading: data = np.clip(data, 0, 10000). Consider switching to bilinear for coastal or mixed study areas where edge artifacts compound.

rasterio.errors.CRSError: Input shapes do not share CRS Occurs when stacking a reprojected band (e.g. EPSG:3035) with a band still in its native UTM zone. Reproject all layers to the common CRS before resampling. Keep a single canonical CRS defined at pipeline entry and propagate it through every step using rasterio.warp.reproject().

Advanced Resampling & Upscaling Techniques — parent workflow covering the full range of interpolation strategies, super-resolution, and multi-sensor fusion.
Cloud & Shadow Masking Strategies — apply SCL-based masks after resampling to avoid edge-mismatch artifacts.
Spectral Index Calculation Pipelines — build NDVI, EVI, and SWIR indices on the aligned band stack produced here.
Mastering CRS Transformations in rasterio — one-pass reproject-and-resample patterns that avoid double-interpolation error.
Handling Pixel Resolution & Scaling — affine transform mechanics and scaling factors that underpin every resampling call.