simpleml.persistables.hashing
¶
Mixin classes to handle hashing
Module Contents¶
Classes¶
Mixin class to hash any object |
|
A subclass of pickler, to do cryptographic hashing, rather than |
|
Special case the hasher for when numpy is loaded. |
|
Class used to ensure the hash of Sets is preserved |
|
Class used to hash objects that won’t normally pickle |
Functions¶
|
Quick calculation of a hash to identify uniquely Python objects |
-
class
simpleml.persistables.hashing.
CustomHasherMixin
[source]¶ Bases:
object
Mixin class to hash any object
-
classmethod
custom_hasher
(cls, object_to_hash, custom_class_proxy=type(object.__dict__))[source]¶ Adapted from: https://stackoverflow.com/questions/5884066/hashing-a-dictionary Makes a hash from a dictionary, list, tuple or set to any level, that contains only other hashable types (including any lists, tuples, sets, and dictionaries). In the case where other kinds of objects (like classes) need to be hashed, pass in a collection of object attributes that are pertinent. For example, a class can be hashed in this fashion:
custom_hasher([cls.__dict__, cls.__name__])
A function can be hashed like so:
custom_hasher([fn.__dict__, fn.__code__])
python 3.3+ changes the default hash method to add an additional random seed. Need to set the global PYTHONHASHSEED=0 or use a different hash function
-
classmethod
-
class
simpleml.persistables.hashing.
Hasher
(hash_name='md5')[source]¶ Bases:
Pickler
A subclass of pickler, to do cryptographic hashing, rather than pickling.
This takes a binary file for writing a pickle data stream.
The optional protocol argument tells the pickler to use the given protocol; supported protocols are 0, 1, 2, 3 and 4. The default protocol is 3; a backward-incompatible protocol designed for Python 3.
Specifying a negative protocol version selects the highest protocol version supported. The higher the protocol used, the more recent the version of Python needed to read the pickle produced.
The file argument must have a write() method that accepts a single bytes argument. It can thus be a file object opened for binary writing, an io.BytesIO instance, or any other custom object that meets this interface.
If fix_imports is True and protocol is less than 3, pickle will try to map the new Python 3 names to the old module names used in Python 2, so that the pickle data stream is readable with Python 2.
-
class
simpleml.persistables.hashing.
NumpyHasher
(hash_name='md5', coerce_mmap=False)[source]¶ Bases:
simpleml.persistables.hashing.Hasher
Special case the hasher for when numpy is loaded.
- hash_name: string
The hash algorithm to be used
- coerce_mmap: boolean
Make no difference between np.memmap and np.ndarray objects.
-
class
simpleml.persistables.hashing.
_ConsistentSet
(set_sequence)[source]¶ Bases:
object
Class used to ensure the hash of Sets is preserved whatever the order of its items.
-
class
simpleml.persistables.hashing.
_MyHash
(*args)[source]¶ Bases:
object
Class used to hash objects that won’t normally pickle
-
simpleml.persistables.hashing.
hash
(obj, hash_name='md5', coerce_mmap=False)[source]¶ Quick calculation of a hash to identify uniquely Python objects containing numpy arrays. Parameters ———– hash_name: ‘md5’ or ‘sha1’
Hashing algorithm used. sha1 is supposedly safer, but md5 is faster.
- coerce_mmap: boolean
Make no difference between np.memmap and np.ndarray