Hash Utilities

class vernamveil.blake3(data=b'', key=None, length=32)

Bases: object

A hashlib-style BLAKE3 hash object using the C backend.

The interface mirrors hashlib. Data passed to update() is accumulated, and all chunks are hashed in C when digest() is called.

Initialise a BLAKE3 hash object.

Parameters:
  • data (bytes or bytearray or memoryview) – Initial data to hash. Defaults to an empty byte string.

  • key (bytes or bytearray or memoryview, optional) – Optional key for keyed hashing. If None, no key is used.

  • length (int) – Desired output length in bytes. Default is 32 bytes.

property block_size: int

The size of the internal block used for hashing.

Returns:

64 bytes, which is the standard block size for BLAKE3.

Return type:

int

copy()

Return a copy of the current blake3 hash object.

Returns:

A new blake3 object with the same state as the current one.

Return type:

blake3
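
The copy semantics described above can be sketched as follows. This is a minimal illustration that uses hashlib.blake2b as a stand-in so that it runs without the C backend; it assumes that vernamveil.blake3.copy() behaves like its hashlib counterpart, as the hashlib-style interface suggests.

```python
import hashlib

# Stand-in for vernamveil.blake3 (assumption: same copy() semantics).
base = hashlib.blake2b(b"common prefix")

# The copy starts from the same accumulated state as the original.
fork = base.copy()

# After copying, the two objects evolve independently.
base.update(b"A")
fork.update(b"B")

base_digest = base.digest()
fork_digest = fork.digest()
```

Copying is useful when many messages share a prefix: the prefix is fed in once and each variant continues from a copy.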

digest(length=None)

Compute the BLAKE3 hash of the accumulated data, using the key set during initialisation (if any) and the requested output length.

Parameters:

length (int, optional) – Desired output length in bytes. If None, uses the default length set during initialisation.

Returns:

The BLAKE3 hash digest of the accumulated data, optionally keyed and of specified length.

Return type:

bytearray

Raises:

RuntimeError – If the C-backed BLAKE3 module is not available.

property digest_size: int

The size of the hash output in bytes.

Returns:

The digest size in bytes: 32 by default, or the length set during initialisation.

Return type:

int

hexdigest(length=None)

Compute the BLAKE3 hash of the accumulated data and return it as a hexadecimal string.

Parameters:

length (int, optional) – Desired output length in bytes. If None, uses the default length set during initialisation.

Returns:

The BLAKE3 hash digest of the accumulated data as a hexadecimal string.

Return type:

str

property name: str

The name of the hash algorithm.

Returns:

The name of the hash algorithm, which is “blake3”.

Return type:

str

update(data)

Update the hash object with additional data.

Return type:

None

Parameters:

data (bytes or bytearray or memoryview) – Data to add to the hash.
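
The update-then-digest flow shared by hashlib-style objects can be sketched as below. The helper hash_in_chunks is hypothetical and not part of vernamveil, and hashlib.blake2b is used as a stand-in so that the sketch runs without the C backend. With the C extension installed, passing vernamveil.blake3 as the factory should work the same way, assuming it is fully hashlib-compatible.

```python
import hashlib


def hash_in_chunks(factory, chunks):
    """Feed chunks to any hashlib-style object and return its hex digest.

    Hypothetical helper for illustration; `factory` is any zero-argument
    callable returning an object with update() and hexdigest().
    """
    h = factory()
    for chunk in chunks:
        h.update(chunk)  # add more data to the running hash
    return h.hexdigest()


# Stand-in factory; swap in `vernamveil.blake3` when the C backend is present.
chunked = hash_in_chunks(hashlib.blake2b, [b"hello ", b"world"])
one_shot = hashlib.blake2b(b"hello world").hexdigest()
```

Hashing in chunks gives the same result as hashing the concatenated data in one call. Note that the documented digest() returns a bytearray, whereas the hashlib stand-in returns bytes.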

vernamveil.fold_bytes_to_uint64(hashes, fold_type='view')

Fold each row of a 2D uint8 hash output into a uint64 integer (big-endian).

Parameters:
  • hashes (np.ndarray[tuple[int, int], np.dtype[np.uint8]]) – 2D array of shape (n, H) where H >= 8.

  • fold_type (Literal["full", "view"]) – Folding strategy. “view”: Fastest; reinterprets the first 8 bytes as uint64. “full”: Slower; folds all bytes in the row using bitwise operations. Default is “view”.

Returns:

1D array of length n, each element is the folded uint64 value of the corresponding row.

Return type:

np.ndarray[tuple[int], np.dtype[np.uint64]]

Raises:
  • ValueError – If the input array is not 2D or has fewer than 8 columns.

  • ValueError – If fold_type is not ‘full’ or ‘view’.
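
To make the two folding strategies concrete, here is a pure-Python sketch that operates on a single row. It is not the library's vectorised implementation. fold_row_view follows the documented "view" behaviour (first 8 bytes, big-endian). fold_row_xor is only one plausible reading of "full", since the exact bitwise operations are not stated here. Both function names are hypothetical.

```python
def fold_row_view(row: bytes) -> int:
    """Interpret the first 8 bytes of a hash row as a big-endian uint64."""
    if len(row) < 8:
        raise ValueError("row must contain at least 8 bytes")
    return int.from_bytes(row[:8], "big")


def fold_row_xor(row: bytes) -> int:
    """Hypothetical 'full'-style fold: XOR successive 8-byte big-endian words.

    Assumption: the library's actual 'full' fold may combine bytes differently.
    """
    if len(row) < 8:
        raise ValueError("row must contain at least 8 bytes")
    acc = 0
    for start in range(0, len(row), 8):
        # A short trailing chunk is read as a smaller integer (assumption).
        acc ^= int.from_bytes(row[start:start + 8], "big")
    return acc


row = bytes(range(16))           # 00 01 02 ... 0f
view_value = fold_row_view(row)  # 0x0001020304050607
xor_value = fold_row_xor(row)    # 0x0808080808080808
```

The "view" style ignores everything after the first 8 bytes, which is why it is cheaper; a fold over the whole row lets every byte of the hash influence the result.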

vernamveil.hash_numpy(i, seed=None, hash_name='blake2b', hash_size=None)

Compute a 2D NumPy array of uint8 by applying a hash function to each index, optionally using a seed as a key.

If no seed is provided, the index is hashed directly.

When available, this function uses cffi to call a custom C library that wraps an optimised C implementation (with OpenMP and OpenSSL) for efficient, parallelised hashing from Python. If the C module isn’t available, a NumPy fallback is used.

Parameters:
  • i (np.ndarray[tuple[int], np.dtype[np.uint64]]) – NumPy array of indices (dtype should be unsigned 64-bit integer).

  • seed (bytes or bytearray, optional) – The seed bytes are prepended to the index. If None, hashes only the index.

  • hash_name (Literal["blake2b", "blake3", "sha256"]) – Hash function to use (“blake2b”, “blake3” or “sha256”). “blake3” is only available if the C extension is installed. Defaults to “blake2b”.

  • hash_size (int, optional) – Size of the hash output in bytes. Must be 64 for blake2b, 32 for sha256, and greater than 0 for blake3. If None, the default size for the selected hash algorithm is used. Defaults to None.

Returns:

A 2D array of shape (n, H) where H is the hash output size in bytes. Each row contains the full hash output for the corresponding input.

Return type:

np.ndarray[tuple[int, int], np.dtype[np.uint8]]

Raises:
  • ValueError – If hash_size is invalid for the selected algorithm: it must be 64 for blake2b, 32 for sha256, and greater than 0 for blake3.

  • ValueError – If a hash algorithm is not supported.

  • ValueError – If hash_name is “blake3” but the C extension is not available.
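
The per-index hashing described above can be approximated with the standard library alone. The sketch below is not the library's implementation. The function name hash_indices is hypothetical, the seed is prepended as the seed parameter describes, and the 8-byte little-endian encoding of each index is an assumption, so the resulting bytes are not guaranteed to match the output of hash_numpy.

```python
import hashlib
import struct

# Only the algorithms that hashlib provides; blake3 needs the C extension.
_HASHES = {"blake2b": hashlib.blake2b, "sha256": hashlib.sha256}


def hash_indices(indices, seed=None, hash_name="blake2b"):
    """Return one full-length digest per index (hypothetical reference helper)."""
    if hash_name not in _HASHES:
        raise ValueError(f"unsupported hash algorithm: {hash_name!r}")
    prefix = bytes(seed) if seed is not None else b""
    rows = []
    for i in indices:
        # Assumption: each index is encoded as an unsigned 64-bit
        # little-endian integer before hashing.
        payload = prefix + struct.pack("<Q", i)
        rows.append(_HASHES[hash_name](payload).digest())
    return rows


blake_rows = hash_indices([0, 1, 2], seed=b"seed")
sha_rows = hash_indices([0, 1, 2], seed=b"seed", hash_name="sha256")
```

Each row has the algorithm's full digest length (64 bytes for blake2b, 32 for sha256), mirroring the (n, H) shape of the documented return value.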