mgkit.utils.dictionary module
Dictionary utils
class mgkit.utils.dictionary.HDFDict(file_name, table, cast=<type 'int'>)
Bases: object
New in version 0.3.1.
Uses a table in an HDFStore (from pandas) as a dictionary. The table must be indexed to perform well. Read only.
Note
The dictionary cannot be modified, and a ValueError is raised if the table is not in the file.
mgkit.utils.dictionary.apply_func_to_values(dictionary, func)
New in version 0.1.12.
Assuming a dictionary whose values are iterables, func is applied to each element of each iterable, returning a set of all transformed elements for each key.
Parameters: - dictionary (dict) – dictionary whose values are iterables
- func (func) – function to apply to the dictionary values
Returns: dictionary with transformed values
Return type: dict
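The behaviour described above can be sketched in plain Python. This is a minimal illustration, not mgkit's implementation: the name apply_to_values_sketch and the sample data are hypothetical, and it assumes one set of transformed elements is built per key.

```python
def apply_to_values_sketch(dictionary, func):
    """Hypothetical stand-in for the behaviour described above."""
    # Build one set of transformed elements for each key.
    return {
        key: {func(value) for value in values}
        for key, values in dictionary.items()
    }


# Made-up sample data: identifiers with inconsistent case.
gene_map = {"gene1": ["k00001", "k00002"], "gene2": ["k00002", "K00002"]}
upper = apply_to_values_sketch(gene_map, str.upper)
```

Because the results are sets, elements that become equal after the transformation are collapsed into one.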
class mgkit.utils.dictionary.cache_dict_file(iterator, skip_lines=0)
Bases: object
New in version 0.3.0.
Used to cache the result of a function that yields tuples (key and value). The class behaves like a dictionary: if the key is found in the internal dictionary, the corresponding value is returned; otherwise the iterator is advanced until the key is found.
Example
>>> from mgkit.io.blast import parse_accession_taxa_table
>>> i = parse_accession_taxa_table('nucl_gb.accession2taxid.gz', key=0)
>>> d = cache_dict_file(i)
>>> d['AH001684']
4400
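The lazy caching idea can be illustrated without mgkit or a data file. The sketch below is an assumption-laden stand-in: the class name LazyCacheDict is hypothetical, skip_lines is not modelled, and raising KeyError when the iterator is exhausted is a choice made here, not documented behaviour.

```python
class LazyCacheDict:
    """Hypothetical stand-in: caches (key, value) pairs from an iterator."""

    def __init__(self, iterator):
        self._iterator = iter(iterator)
        self._cache = {}

    def __getitem__(self, key):
        # Return the cached value if the key was already seen.
        if key in self._cache:
            return self._cache[key]
        # Otherwise advance the iterator, caching every pair on the way.
        for other_key, value in self._iterator:
            self._cache[other_key] = value
            if other_key == key:
                return value
        # Assumption: a key that never appears raises KeyError.
        raise KeyError(key)


pairs = iter([("a", 1), ("b", 2), ("c", 3)])
cache = LazyCacheDict(pairs)
first = cache["b"]        # consumes ("a", 1) and ("b", 2)
earlier = cache["a"]      # served from the cache, iterator untouched
remaining = list(pairs)   # whatever the lookup did not need to read
```

Only the part of the input needed to answer a lookup is read, which is useful when the source is a large file.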
next()
mgkit.utils.dictionary.combine_dict(keydict, valuedict)
Combine two dictionaries when the values of keydict are iterables. The combined dictionary has the same keys as keydict, and its values are sets containing all the values that valuedict associates with the elements of the corresponding keydict value.
Resulting dictionary will be
Parameters: Return dict: combined dictionary
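A minimal sketch of the combination described above, under two assumptions that the text does not confirm: the values of valuedict are themselves iterables, and elements missing from valuedict are skipped. The function name and the sample data are hypothetical.

```python
def combine_dict_sketch(keydict, valuedict):
    """Hypothetical stand-in for the behaviour described above."""
    combined = {}
    for key, elements in keydict.items():
        combined[key] = set()
        for element in elements:
            # Assumption: elements missing from valuedict are skipped.
            combined[key].update(valuedict.get(element, ()))
    return combined


keydict = {"A": [1, 2, 3]}
valuedict = {1: ["a", "b"], 2: ["c"], 3: ["d", "e"]}
combined = combine_dict_sketch(keydict, valuedict)
```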
mgkit.utils.dictionary.combine_dict_one_value(keydict, valuedict)
Combine two dictionaries: each value of keydict is used as a key in valuedict, and the resulting dictionary is composed of keydict keys and valuedict values.
Same as combine_dict(), but each value in keydict is a single element that is a key in valuedict.
Parameters: Return dict: combined dictionary
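The single-value case can be sketched as a direct lookup. The function name and data are hypothetical, and this sketch lets a missing key raise KeyError, which may differ from what mgkit does.

```python
def combine_one_value_sketch(keydict, valuedict):
    """Hypothetical stand-in for the behaviour described above."""
    # Each keydict value is looked up directly as a key in valuedict.
    # Assumption: a value missing from valuedict raises KeyError here.
    return {key: valuedict[value] for key, value in keydict.items()}


keydict = {"A": 1, "B": 2}
valuedict = {1: "a", 2: "b"}
combined_single = combine_one_value_sketch(keydict, valuedict)
```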
mgkit.utils.dictionary.filter_nan(ratios)
Returns a dictionary with the NaN values taken out.
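As an illustration, the filtering can be sketched with the standard library, assuming that each value is a single float. The function name is hypothetical, and math.isnan is a choice made here rather than what the library necessarily uses.

```python
import math


def filter_nan_sketch(ratios):
    """Hypothetical stand-in: drop items whose value is NaN."""
    # Assumption: every value is a single float.
    return {
        key: value
        for key, value in ratios.items()
        if not math.isnan(value)
    }


ratios = {"g1": 0.5, "g2": float("nan"), "g3": 1.2}
clean = filter_nan_sketch(ratios)
```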
mgkit.utils.dictionary.filter_ratios_by_numbers(ratios, min_num)
Returns only the items of a dictionary whose value, an iterable, has a length equal to or greater than min_num.
Parameters: Return dict: filtered dictionary
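A minimal sketch of the length filter, assuming the values support len(). The function name and the sample data are hypothetical.

```python
def filter_by_length_sketch(ratios, min_num):
    """Hypothetical stand-in for the behaviour described above."""
    # Keep an item only if its value has at least min_num elements.
    return {
        key: values
        for key, values in ratios.items()
        if len(values) >= min_num
    }


ratios = {"g1": [0.1, 0.2, 0.3], "g2": [0.5], "g3": [0.7, 0.9]}
kept = filter_by_length_sketch(ratios, 2)
```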
mgkit.utils.dictionary.find_id_in_dict(s_id, s_dict)
Finds a value s_id in a dictionary whose values are iterables. Returns a list of the keys whose values contain s_id.
Parameters: Return list: list of keys in which s_id was found
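The lookup can be sketched as a list comprehension. The function name and data are hypothetical, and the order of the returned keys simply follows the dictionary's iteration order in this sketch.

```python
def find_id_sketch(s_id, s_dict):
    """Hypothetical stand-in for the behaviour described above."""
    # Collect every key whose iterable value contains s_id.
    return [key for key, values in s_dict.items() if s_id in values]


s_dict = {"k1": ["a", "b"], "k2": ["b", "c"], "k3": ["d"]}
found = find_id_sketch("b", s_dict)
not_found = find_id_sketch("z", s_dict)
```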
mgkit.utils.dictionary.link_ids(id_map, black_list=None)
Given a dictionary whose values (iterables) can be linked back to other keys, it returns a dictionary in which the keys are the original keys and the values are sets of keys to which they can be linked.
Becomes:
Parameters: - id_map (dict) – dictionary of keys to link
- black_list (iterable) – iterable of values to skip in making the links
Return dict: linked dictionary
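One possible reading of the linking rule, sketched under a stated assumption: two keys are linked when their values share at least one element that is not in black_list. The function name and the sample data are hypothetical, and the library may define the link differently.

```python
def link_ids_sketch(id_map, black_list=None):
    """Hypothetical stand-in: link keys that share a value."""
    skipped = set() if black_list is None else set(black_list)
    # Drop black-listed values before looking for shared elements.
    usable = {key: set(values) - skipped for key, values in id_map.items()}
    return {
        key: {
            other
            for other, other_values in usable.items()
            if other != key and values & other_values
        }
        for key, values in usable.items()
    }


id_map = {"k1": ["v1", "v2"], "k2": ["v3", "v4"], "k3": ["v1", "v3"]}
linked = link_ids_sketch(id_map)
linked_without_v1 = link_ids_sketch(id_map, black_list=["v1"])
```

With "v1" skipped, "k1" no longer shares anything with "k3", so its set of links is empty.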
mgkit.utils.dictionary.merge_dictionaries(dicts)
New in version 0.3.1.
Merges keys and values from a list/iterable of dictionaries. The resulting dictionary’s values are converted into sets, with the assumption that the values are one of the following: float, str, int, bool
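A minimal sketch of the merge, assuming every value is a single hashable scalar as the description states. The function name and data are hypothetical.

```python
def merge_dictionaries_sketch(dicts):
    """Hypothetical stand-in for the behaviour described above."""
    merged = {}
    for dictionary in dicts:
        for key, value in dictionary.items():
            # Each key collects the distinct values seen across all inputs.
            merged.setdefault(key, set()).add(value)
    return merged


merged = merge_dictionaries_sketch(
    [{"a": 1, "b": "x"}, {"a": 2, "b": "x"}, {"c": True}]
)
```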
mgkit.utils.dictionary.reverse_mapping(map_dict)
Given a dictionary in the form: key->[v1, v2, .., vN], returns a dictionary in the form: v1->[key1, key2, .., keyN]
Parameters: map_dict (dict) – dictionary to reverse
Return dict: reversed dictionary
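The reversal can be sketched as follows. The function name and data are hypothetical, and collecting the keys into sets rather than lists is an assumption of this sketch.

```python
def reverse_mapping_sketch(map_dict):
    """Hypothetical stand-in for the behaviour described above."""
    reversed_map = {}
    for key, values in map_dict.items():
        for value in values:
            # Assumption: keys are collected into a set for each value.
            reversed_map.setdefault(value, set()).add(key)
    return reversed_map


map_dict = {"k1": ["v1", "v2"], "k2": ["v1"]}
reversed_map = reverse_mapping_sketch(map_dict)
```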
mgkit.utils.dictionary.split_dictionary_by_value(value_dict, threshold, aggr_func=<function median>, key_filter=None)
Splits a dictionary, whose values are iterables, into two dictionaries based on a threshold:
- one in which the result of aggr_func is lower than the threshold (first)
- one in which the result of aggr_func is equal or greater than the threshold (second)
Parameters: - value_dict (dict) – dictionary to be split
- threshold (number) – must be comparable to the result of aggr_func
- aggr_func (func) – function used to aggregate the dictionary values
- key_filter (iterable) – if specified, only these keys will be in the resulting dictionaries
Returns: two dictionaries
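The split can be sketched with the standard library. This is an illustration under assumptions: the function name is hypothetical, statistics.median stands in for whichever median function the library uses as its default, and keys in key_filter that are absent from value_dict are simply ignored.

```python
from statistics import median


def split_by_value_sketch(value_dict, threshold, aggr_func=median,
                          key_filter=None):
    """Hypothetical stand-in for the behaviour described above."""
    below, at_or_above = {}, {}
    for key, values in value_dict.items():
        # Assumption: keys outside key_filter are left out of both results.
        if key_filter is not None and key not in key_filter:
            continue
        if aggr_func(values) < threshold:
            below[key] = values
        else:
            at_or_above[key] = values
    return below, at_or_above


value_dict = {"g1": [1, 2, 3], "g2": [5, 6, 7], "g3": [4, 4]}
below, at_or_above = split_by_value_sketch(value_dict, 4)
only_g1, only_rest = split_by_value_sketch(
    value_dict, 4, key_filter=["g1", "g2"]
)
```

A value whose aggregate equals the threshold, such as "g3" here, lands in the second dictionary.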