simpleml.persistables.hashing module
Mixin classes to handle hashing
class simpleml.persistables.hashing.CustomHasherMixin
Bases: object
Mixin class to hash any object
custom_hasher(object_to_hash, custom_class_proxy=<class 'mappingproxy'>)
Adapted from: https://stackoverflow.com/questions/5884066/hashing-a-dictionary
Makes a hash from a dictionary, list, tuple, or set, nested to any level, that contains only other hashable types (including any lists, tuples, sets, and dictionaries). When other kinds of objects (like classes) need to be hashed, pass in a collection of the pertinent object attributes. For example, a class can be hashed in this fashion:
custom_hasher([cls.__dict__, cls.__name__])
A function can be hashed like so:
custom_hasher([fn.__dict__, fn.__code__])
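Python 3.3+ salts the built-in hash() of str and bytes objects with a random per-process seed by default, so those hashes are not reproducible across interpreter runs. Set the environment variable PYTHONHASHSEED=0 before starting the interpreter, or use a different hash function.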
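The note above suggests using a different hash function when reproducibility matters. The following is a minimal sketch of that idea, not simpleml's actual custom_hasher: `stable_hash` is a hypothetical helper that walks nested containers and feeds a canonical string into hashlib, so the result does not depend on PYTHONHASHSEED.

```python
import hashlib


def stable_hash(obj):
    """Hypothetical helper (not part of simpleml): deterministic digest of
    nested dicts, lists, tuples, and sets built from repr()-able leaves."""
    if isinstance(obj, dict):
        # Sort the hashed (key, value) pairs so insertion order is irrelevant
        parts = sorted((stable_hash(k), stable_hash(v)) for k, v in obj.items())
        payload = "dict:" + repr(parts)
    elif isinstance(obj, (set, frozenset)):
        # Sets have no defined order, so sort the element digests
        parts = sorted(stable_hash(item) for item in obj)
        payload = "set:" + repr(parts)
    elif isinstance(obj, (list, tuple)):
        # Sequences keep their order; the type name separates list from tuple
        parts = [stable_hash(item) for item in obj]
        payload = type(obj).__name__ + ":" + repr(parts)
    else:
        # Leaf values: assumes repr() is stable for the types being hashed
        payload = type(obj).__name__ + ":" + repr(obj)
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


config_a = {"name": "model", "layers": [64, 32], "tags": {"x", "y"}}
config_b = {"tags": {"y", "x"}, "layers": [64, 32], "name": "model"}
same_digest = stable_hash(config_a) == stable_hash(config_b)
```

The leaf branch relies on repr(), which is only reasonable for simple values such as numbers and strings; objects whose repr() embeds a memory address would need their pertinent attributes passed in instead, as described above.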
class simpleml.persistables.hashing.Hasher(hash_name='md5')
Bases: pickle._Pickler
A subclass of the pickler that computes a cryptographic hash of an object instead of pickling it.
dispatch = {
    <class 'NoneType'>: <function _Pickler.save_none>,
    <class 'bool'>: <function _Pickler.save_bool>,
    <class 'int'>: <function _Pickler.save_long>,
    <class 'float'>: <function _Pickler.save_float>,
    <class 'bytes'>: <function _Pickler.save_bytes>,
    <class 'str'>: <function _Pickler.save_str>,
    <class 'tuple'>: <function _Pickler.save_tuple>,
    <class 'list'>: <function _Pickler.save_list>,
    <class 'dict'>: <function _Pickler.save_dict>,
    <class 'set'>: <function Hasher.save_set>,
    <class 'frozenset'>: <function _Pickler.save_frozenset>,
    <class 'function'>: <function _Pickler.save_global>,
    <class 'type'>: <function Hasher.save_global>,
    <class 'builtin_function_or_method'>: <function Hasher.save_global>
}
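The general idea of hashing through the pickle machinery can be sketched with the public pickle.Pickler API. This is an illustration under stated assumptions, not the Hasher implementation: `HashWriter` and `pickle_digest` are hypothetical names, and the sketch streams the pickle bytes into hashlib rather than overriding the dispatch table.

```python
import hashlib
import pickle


class HashWriter:
    """Hypothetical file-like sink: hashes written bytes instead of storing them."""

    def __init__(self, hash_name="md5"):
        self._hash = hashlib.new(hash_name)

    def write(self, data):
        # pickle.Pickler only requires a write() method on its file argument
        self._hash.update(data)
        return len(data)

    def hexdigest(self):
        return self._hash.hexdigest()


def pickle_digest(obj, hash_name="md5"):
    """Hypothetical helper: digest of the pickle stream produced for obj."""
    writer = HashWriter(hash_name)
    # Pin the protocol so the byte stream does not change with the default
    pickle.Pickler(writer, protocol=4).dump(obj)
    return writer.hexdigest()


digest = pickle_digest({"weights": [0.1, 0.2], "bias": 0.5})
```

A plain pickle stream serializes sets in iteration order, so this sketch is not guaranteed to be stable for sets across interpreter runs; the dispatch table above lists a dedicated Hasher.save_set entry for that type.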
class simpleml.persistables.hashing.NumpyHasher(hash_name='md5', coerce_mmap=False)
Bases: simpleml.persistables.hashing.Hasher
Special case the hasher for when numpy is loaded.
simpleml.persistables.hashing.hash(obj, hash_name='md5', coerce_mmap=False)
Quick calculation of a hash to uniquely identify Python objects containing numpy arrays.

Parameters
----------
hash_name: 'md5' or 'sha1'
    Hashing algorithm used. sha1 is supposedly safer, but md5 is faster.
coerce_mmap: boolean
    Make no difference between np.memmap and np.ndarray.
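To make the hash_name choice concrete, the sketch below compares the two documented algorithms using hashlib directly. It does not call simpleml; `digest_bytes` is a hypothetical helper, and the restriction to 'md5' and 'sha1' simply mirrors the parameter description above.

```python
import hashlib


def digest_bytes(data, hash_name="md5"):
    """Hypothetical helper: hex digest of raw bytes with a chosen algorithm."""
    if hash_name not in ("md5", "sha1"):
        raise ValueError("hash_name must be 'md5' or 'sha1'")
    return hashlib.new(hash_name, data).hexdigest()


payload = b"example array buffer"
md5_digest = digest_bytes(payload, "md5")    # 128-bit digest, 32 hex characters
sha1_digest = digest_bytes(payload, "sha1")  # 160-bit digest, 40 hex characters
```

The longer sha1 digest gives more collision resistance at some cost in speed; neither algorithm is considered collision-resistant against deliberate attacks, so both are best treated as identifiers rather than security measures.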