Parallel Processing

Parallel processing can speed up data workflows by performing operations simultaneously. However, the HDF5 library maintains complex internal state that is easily corrupted if multiple workers write to the same file at the same time.

The Safety Rule: Always Lock

h5lite is not inherently safe for concurrent writing.

While the underlying HDF5 library may support thread-safety for specific low-level operations, h5lite utilizes HDF5’s High-Level APIs (specifically the Dimension Scales API) to manage R attributes like names and dimnames. These High-Level APIs are not thread-safe.

Therefore, strictly follow this rule:

If multiple processes or threads access the same HDF5 file, you must use an external locking mechanism (mutex or file lock) to serialize the write operations.

Without locking, you risk race conditions that can corrupt your data or the HDF5 file structure itself.
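
The locking pattern itself is independent of h5lite and of R. The following is a minimal, language-agnostic sketch in Python, not h5lite code: a small counter file stands in for the HDF5 file, and the names exclusive_lock, locked_increment, and run_workers are hypothetical helpers invented for this illustration. It assumes a Unix-like system (the fcntl module is not available on Windows) and an advisory lock, which only protects the file if every writer takes the lock before writing.

```python
import fcntl
import os
import tempfile
import threading
from contextlib import contextmanager


@contextmanager
def exclusive_lock(lock_path):
    """Block until an exclusive advisory lock on lock_path is held."""
    # A separate lock file is used so the data file is never opened
    # by a worker that does not yet hold the lock.
    with open(lock_path, "a") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            yield
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


def locked_increment(data_path):
    """Read-modify-write a counter file (a stand-in for an HDF5 write)."""
    with exclusive_lock(data_path + ".lock"):
        with open(data_path, "r+") as f:
            value = int(f.read() or 0)
            f.seek(0)
            f.write(str(value + 1))
            f.truncate()
        # The data file is closed (and flushed) before the lock is released.


def run_workers(data_path, n_workers=4, n_writes=25):
    """Start several workers that all write to the same file under the lock."""
    with open(data_path, "w") as f:
        f.write("0")

    def worker():
        for _ in range(n_writes):
            locked_increment(data_path)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    with open(data_path) as f:
        return int(f.read())


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "counter.txt")
    # With the lock in place, no increment is lost.
    print(run_workers(path))
```

The key design choice is that the whole read-modify-write sequence sits inside the lock, so only one worker touches the file at a time. The same structure applies to an HDF5 file: acquire the lock, perform the write, close the file, then release the lock.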
