Reading images¶
The fitsy.ImageHdu class wraps an image extension. Pixel
data is returned as a numpy.ndarray whose shape is the
reverse of axes (numpy is row-major, while
FITS lists axes fastest-first).
"""Read image pixels: NAXIS / numpy axis order, dtype handling.
Run from the repo root:
python examples/python/reading_images.py
"""
import fitsy
with fitsy.open("examples/data/ngc2403.fits.gz") as f:
hdu = f[0]
print("axes (NAXIS1, NAXIS2):", hdu.axes) # [1448, 2172]
print("data.shape (numpy): ", hdu.data.shape) # (2172, 1448)
assert hdu.axes == [hdu.data.shape[1], hdu.data.shape[0]]
# `data` is returned in physical units with an appropriate dtype
# (BZERO / BSCALE / BLANK applied automatically).
print("dtype:", hdu.data.dtype)
hdu.data is always returned in physical units with an appropriate
numpy dtype, matching the behavior of astropy.io.fits.
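The scaling rule itself is physical = BZERO + BSCALE * raw. As a minimal plain-numpy sketch of what such a read path does internally (this is not fitsy's actual code, and the keyword values are illustrative — the common unsigned-16-bit convention):

```python
import numpy as np

# Raw 16-bit pixels exactly as stored on disk (FITS is big-endian).
raw = np.array([0, 1, 32767], dtype=">i2")

# Hypothetical header keywords for illustration.
BSCALE, BZERO = 1.0, 32768.0

# physical = BZERO + BSCALE * raw, promoted to a native float dtype.
physical = BZERO + BSCALE * raw.astype(np.float64)

print(physical)        # [32768. 32769. 65535.]
print(physical.dtype)  # float64
```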
Working with large images¶
hdu.data materialises the entire array in memory. fitsy reads
file bytes lazily: fitsy.open() parses every HDU’s header
up front (small, bounded by the FITS 2880-byte block size) but
defers reading each HDU’s pixel/table bytes until that HDU is
actually accessed. Opening a 50-GB multi-extension mosaic costs
only the headers; touching f[3].data reads HDU 3’s data section
and nothing else.
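That deferral pattern can be sketched in a few lines. The class below is hypothetical (it is not fitsy's internals): the header is kept eagerly, while the data section is recorded only as a byte offset and length and read on first access.

```python
import os

class LazyHdu:
    """Hypothetical sketch of header-now, data-later HDU access."""

    def __init__(self, fd, header, data_offset, data_nbytes):
        self.fd = fd
        self.header = header          # parsed eagerly at open() time
        self._offset = data_offset    # where the data section starts
        self._nbytes = data_nbytes    # how many bytes it occupies
        self._data = None             # nothing read yet

    @property
    def data(self):
        if self._data is None:        # first touch: one positional read
            self._data = os.pread(self.fd, self._nbytes, self._offset)
        return self._data             # later touches hit the cache
```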
The numpy array returned by hdu.data is always a freshly
allocated, native-byte-order array. FITS stores numeric data
big-endian, so on every modern little-endian host the read path
must byteswap to deliver a native dtype – and byteswapping
inherently allocates a fresh buffer. fitsy embraces this:
one explicit copy on read, then every downstream operation runs
at full native speed.
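The swap-and-copy step is a single numpy operation. A plain-numpy sketch of the idea (not fitsy's actual read path):

```python
import numpy as np

big = np.arange(4, dtype=">i4")  # big-endian, as FITS stores it

# One explicit copy: byteswap the data and relabel the dtype as
# native order, so every downstream numpy op runs at full speed.
native = big.astype(big.dtype.newbyteorder("="))

assert native.dtype.isnative
assert np.array_equal(native, big)
```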
For images that do not fit in RAM, or when only a small region is
needed, use section:
with fitsy.open("big.fits") as f:
    hdu = f[0]
    tile = hdu.section[100:200, 100:200]  # reads and decodes only this slice
The section reader uses positional pread to fetch only the
requested rows – the full image is never materialised. (One
exception: when the header carries non-trivial BSCALE /
BZERO / BLANK scaling, the partial-region read falls back
to materialising the full array so the scaling pass can run.
Touching hdu.data first will preload the array, after which
section reads slice the in-memory copy for free.)
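The offset arithmetic behind a positional row read is straightforward. A self-contained sketch using POSIX os.pread (illustrative only — fitsy's reader is in Rust, and this assumes a C-contiguous 2-D image at a known byte offset):

```python
import os
import tempfile

import numpy as np

# A (ny, nx) big-endian image written to a scratch file.
ny, nx = 100, 50
dtype = np.dtype(">f4")
arr = np.arange(ny * nx, dtype=dtype).reshape(ny, nx)

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(arr.tobytes())
    path = tmp.name

fd = os.open(path, os.O_RDONLY)
data_start = 0                      # byte offset of the data section
row_bytes = nx * dtype.itemsize
r0, r1 = 10, 20                     # requested rows [10, 20)

# Offset and length come from the shape alone, so the rest of the
# image is never read.
buf = os.pread(fd, (r1 - r0) * row_bytes, data_start + r0 * row_bytes)
os.close(fd)

rows = np.frombuffer(buf, dtype=dtype).reshape(r1 - r0, nx)
assert np.array_equal(rows, arr[r0:r1])
os.unlink(path)
```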
When the file is opened with mode='update', assigning to a
section writes only the affected pixel bytes back to disk via
positional pwrite – no full-image rewrite is performed:
with fitsy.open("big.fits", mode="update") as f:
    f[0].section[100:200, 100:200] = patch
In-place writes are not crash-safe: a process death mid-patch can
leave the file with some rows updated and others not. This matches
astropy’s mmap-backed update path. Callers that need atomicity
should snapshot the file before patching (or use a temp-file +
rename rewrite via writeto).
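The snapshot-then-rename pattern can be wrapped in a small helper. This is a hypothetical sketch (not part of fitsy's API): patch a private copy, then atomically swap it into place.

```python
import os
import shutil
import tempfile

def atomic_patch(path, patch_fn):
    """Run patch_fn against a copy of path, then atomically replace
    the original. Hypothetical helper, not fitsy's API."""
    dirname = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=dirname)  # same filesystem as path
    os.close(fd)
    shutil.copyfile(path, tmp)               # snapshot
    try:
        patch_fn(tmp)                        # in-place writes hit the copy
        os.replace(tmp, path)                # atomic rename on POSIX
    except BaseException:
        os.unlink(tmp)                       # original left untouched
        raise
```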
In-place writes require contiguous slicing (start:stop with
step 1, plus non-negative integer indices) on an HDU with
identity scaling (BSCALE=1, BZERO=0, no BLANK).
Anything else (fancy indexing, negative steps, Ellipsis,
boolean masks, or scaled HDUs) raises a ValueError so that
the caller can either narrow the key or explicitly opt into a
full-file rewrite via hdu.data[...] = value.
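The validation rule above amounts to a small key check. A hypothetical sketch (not fitsy's actual code) that accepts only non-negative integer indices and start:stop slices with step 1:

```python
def check_inplace_key(key):
    """Reject any key that would break a contiguous in-place write.
    Hypothetical validator, for illustration only."""
    for k in key if isinstance(key, tuple) else (key,):
        if isinstance(k, int) and k >= 0:
            continue                      # plain non-negative index
        if (isinstance(k, slice)
                and k.step in (None, 1)   # contiguous only
                and all(b is None or (isinstance(b, int) and b >= 0)
                        for b in (k.start, k.stop))):
            continue                      # start:stop, step 1
        raise ValueError(
            f"in-place write needs contiguous, non-negative slicing; got {k!r}")
```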
Limitations of lazy reads¶
The lazy-read and in-place-write paths above apply to plain image HDUs (BITPIX-encoded pixel arrays). Other HDU kinds currently materialise their data section in full on first access:
- Tile-compressed images (BINTABLE with ZIMAGE = T, unwrapped via
  the Rust FitsFile::image accessor) are decompressed tile-by-tile
  into one contiguous numpy array on the first .data access. Section
  reads and in-place writes are not supported for compressed HDUs –
  the underlying tiles must be re-encoded as a unit, so
  hdu.section[...] = patch falls through to a full-image rewrite on
  the next flush().
- Random-groups HDUs hold the entire data section in a single numpy
  structured array; there is no streaming accessor.
- ASCII tables and BINTABLE HDUs read all rows eagerly into the
  per-HDU data cache. Per-row or per-column streaming is not yet
  implemented; for very wide or very long tables consider opening
  the file, processing one HDU at a time, and dropping references to
  release memory.
If you need partial reads from these formats today, the workaround
is to open the file, materialise the HDU you need, immediately
extract the subset you care about, and let the rest go out of
scope. The Rust FitsFile::verify_checksums helper streams
each HDU’s data section in 1-MiB chunks without populating the
cache, so it is safe to call on arbitrarily large files.
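The chunked-streaming idea is independent of the checksum algorithm. A Python sketch of the same shape (hypothetical helper; CRC-32 stands in here, while FITS checksums use a different algorithm):

```python
import zlib

def stream_section(path, offset, length, chunk=1 << 20):
    """Stream `length` bytes starting at `offset` in 1-MiB pieces,
    never holding more than one chunk in memory. Illustrative only."""
    crc = 0
    with open(path, "rb") as f:
        f.seek(offset)
        remaining = length
        while remaining:
            piece = f.read(min(chunk, remaining))
            if not piece:
                raise EOFError("data section truncated")
            crc = zlib.crc32(piece, crc)
            remaining -= len(piece)
    return crc
```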