Hash Utilities
- class vernamveil.blake3(data=b'', key=None, length=32)
Bases: object
A hashlib-style BLAKE3 hash object using the C backend.
This class provides a BLAKE3 hash object with a hashlib-like interface, using the C backend for fast hashing. It accumulates data via update() calls and processes all chunks in C upon digest().
Initialise a BLAKE3 hash object.
- Parameters:
data (bytes or bytearray or memoryview) – Initial data to hash. Defaults to an empty byte string.
key (bytes or bytearray or memoryview, optional) – Optional key for keyed hashing. If None, no key is used.
length (int) – Desired output length in bytes. Default is 32 bytes.
- property block_size: int
The size of the internal block used for hashing.
- Returns:
64 bytes, which is the standard block size for BLAKE3.
- Return type:
int
- copy()
Return a copy of the current blake3 hash object.
- digest(length=None)
Compute the BLAKE3 hash of the accumulated data with optional keying and length.
- Parameters:
length (int, optional) – Desired output length in bytes. If None, uses the default length set during initialisation.
- Returns:
The BLAKE3 hash digest of the accumulated data, optionally keyed and of specified length.
- Return type:
bytearray
- Raises:
RuntimeError – If the C-backed BLAKE3 module is not available.
- property digest_size: int
The size of the hash output in bytes.
- Returns:
32 bytes by default; the value can be set with the length parameter during initialisation.
- Return type:
int
- hexdigest(length=None)
Compute the BLAKE3 hash of the accumulated data and return it as a hexadecimal string.
- Parameters:
length (int, optional) – Desired output length in bytes. If None, uses the default length set during initialisation.
- Returns:
The BLAKE3 hash digest of the accumulated data as a hexadecimal string.
- Return type:
str
- property name: str
The name of the hash algorithm.
- Returns:
The name of the hash algorithm, which is “blake3”.
- Return type:
str
- update(data)
Update the hash object with additional data.
- Return type:
None
- Parameters:
data (bytes or bytearray or memoryview) – Data to add to the hash.
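Because the class follows the hashlib conventions (update(), digest(), hexdigest(), copy()), the accumulate-then-digest pattern can be sketched with the standard library. The example below uses hashlib.blake2b purely as a stand-in so that it runs without the C backend; the helper incremental_hexdigest is a hypothetical name, not part of vernamveil. Going by the signature documented above, a blake3 object would be constructed as blake3(data, key=..., length=...) rather than with digest_size, and its digest() returns a bytearray.

```python
import hashlib


def incremental_hexdigest(hash_obj, chunks):
    """Feed each chunk through update() and return the hex digest.

    Works with any hashlib-style object (hypothetical helper).
    """
    for chunk in chunks:
        hash_obj.update(chunk)
    return hash_obj.hexdigest()


# Stand-in hash: hashlib.blake2b in keyed mode with a 32-byte output.
key = b"k" * 32
one_shot = hashlib.blake2b(b"hello world", key=key, digest_size=32)

# The same input supplied in two pieces yields the same digest.
streamed = hashlib.blake2b(key=key, digest_size=32)
streamed_hex = incremental_hexdigest(streamed, [b"hello ", b"world"])

# copy() gives an independent object: updating the copy leaves the
# original's accumulated state untouched.
snapshot = streamed.copy()
snapshot.update(b"!")
```

Splitting the input across several update() calls does not change the result, and a copy can be extended without affecting the object it was taken from.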
- vernamveil.fold_bytes_to_uint64(hashes, fold_type='view')
Fold each row of a 2D uint8 hash output into a uint64 integer (big-endian).
- Parameters:
hashes (np.ndarray[tuple[int, int], np.dtype[np.uint8]]) – 2D array of shape (n, H) where H >= 8.
fold_type (Literal["full", "view"]) – Folding strategy. “view”: Fastest; reinterprets the first 8 bytes as uint64. “full”: Slower; folds all bytes in the row using bitwise operations. Default is “view”.
- Returns:
1D array of length n, each element is the folded uint64 value of the corresponding row.
- Return type:
np.ndarray[tuple[int], np.dtype[np.uint64]]
- Raises:
ValueError – If the input array is not 2D or has fewer than 8 columns.
ValueError – If fold_type is not ‘full’ or ‘view’.
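The two folding strategies can be illustrated with a small NumPy sketch. The "view" branch follows the description above (the first 8 bytes of each row read as a big-endian uint64). The "full" branch is only an assumption: the reference says all bytes are folded with bitwise operations but does not give the formula, so a rotate-and-XOR fold is used here for illustration. All function names are hypothetical and this is not the library's implementation.

```python
import numpy as np


def fold_view(hashes):
    """Read the first 8 bytes of each row as one big-endian uint64."""
    first8 = np.ascontiguousarray(hashes[:, :8])
    # ">u8" is big-endian uint64; astype converts to native byte order.
    return first8.view(">u8").reshape(-1).astype(np.uint64)


def fold_full(hashes):
    """Fold every byte of each row (assumed scheme: rotate left 8, XOR)."""
    acc = np.zeros(hashes.shape[0], dtype=np.uint64)
    for col in range(hashes.shape[1]):
        rotated = (acc << np.uint64(8)) | (acc >> np.uint64(56))
        acc = rotated ^ hashes[:, col].astype(np.uint64)
    return acc


def fold_sketch(hashes, fold_type="view"):
    """Validate the input, then dispatch on fold_type."""
    if hashes.ndim != 2 or hashes.shape[1] < 8:
        raise ValueError("expected a 2D array with at least 8 columns")
    if fold_type == "view":
        return fold_view(hashes)
    if fold_type == "full":
        return fold_full(hashes)
    raise ValueError("fold_type must be 'full' or 'view'")


rows = np.array(
    [[0, 0, 0, 0, 0, 0, 0, 1],
     [1, 0, 0, 0, 0, 0, 0, 0]],
    dtype=np.uint8,
)
view_result = fold_sketch(rows, "view")
full_result = fold_sketch(rows, "full")
```

With exactly 8 columns, the assumed rotate-and-XOR fold coincides with the big-endian view; the two strategies only diverge when rows are wider than 8 bytes.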
- vernamveil.hash_numpy(i, seed=None, hash_name='blake2b', hash_size=None)
Compute a 2D NumPy array of uint8 by applying a hash function to each index, optionally using a seed as a key.
If no seed is provided, the index is hashed directly.
This function optionally uses cffi to call a custom C library, which wraps an optimised C implementation (with OpenMP and OpenSSL) for efficient, parallelised hashing from Python. If the C module is not available, a NumPy fallback is used.
- Parameters:
i (np.ndarray[tuple[int], np.dtype[np.uint64]]) – NumPy array of indices (dtype should be unsigned 64-bit integer).
seed (bytes or bytearray, optional) – The seed bytes are prepended to the index. If None, hashes only the index.
hash_name (Literal["blake2b", "blake3", "sha256"]) – Hash function to use (“blake2b”, “blake3” or “sha256”). “blake3” is only available if the C extension is installed. Defaults to “blake2b”.
hash_size (int, optional) – Size of the hash output in bytes. Must be 64 for blake2b, greater than 0 for blake3, and 32 for sha256. If None, the default size for the selected hash algorithm is used. Defaults to None.
- Returns:
A 2D array of shape (n, H) where H is the hash output size in bytes. Each row contains the full hash output for the corresponding input.
- Return type:
np.ndarray[tuple[int, int], np.dtype[np.uint8]]
- Raises:
ValueError – If hash_size is not 64 for blake2b, not greater than 0 for blake3, or not 32 for sha256.
ValueError – If the hash algorithm is not supported.
ValueError – If hash_name is “blake3” but the C extension is not available.
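The per-index hashing described above can be sketched in pure Python with hashlib. This is an illustration under stated assumptions, not the library's C path or its fallback: the function name hash_indices_sketch is hypothetical, each index is assumed to be encoded as 8 big-endian bytes (the reference does not specify the encoding), and the seed is prepended to the index as the seed parameter describes. Only “blake2b” and “sha256” are covered, since “blake3” needs the C extension.

```python
import hashlib

import numpy as np

# Constructors and fixed output sizes for the two stdlib-backed options.
_HASHES = {"blake2b": (hashlib.blake2b, 64), "sha256": (hashlib.sha256, 32)}


def hash_indices_sketch(i, seed=None, hash_name="blake2b"):
    """Hash each index into one row of an (n, H) uint8 array."""
    if hash_name not in _HASHES:
        raise ValueError(f"unsupported hash algorithm: {hash_name!r}")
    constructor, size = _HASHES[hash_name]
    prefix = bytes(seed) if seed is not None else b""

    out = np.empty((len(i), size), dtype=np.uint8)
    for row, idx in enumerate(i):
        # Assumption: 8-byte big-endian encoding of the index.
        payload = prefix + int(idx).to_bytes(8, "big")
        out[row] = np.frombuffer(constructor(payload).digest(), dtype=np.uint8)
    return out


indices = np.arange(3, dtype=np.uint64)
seeded = hash_indices_sketch(indices, seed=b"seed")
unseeded = hash_indices_sketch(indices, hash_name="sha256")
```

Each row of the result holds the full digest for the corresponding index, so the output has 64 columns for blake2b and 32 for sha256. A Python loop like this is slow for large arrays, which is the motivation for the parallelised C path mentioned above.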