mzutils package
Subpackages
Submodules
mzutils.control_models module
- class mzutils.control_models.PIDModel(Kp: None | float | List[float], Ki: None | float | List[float], Kd: None | float | List[float], setpoints: List[float], lower_bounds: None | float | List[float] = None, upper_bounds: None | float | List[float] = None, steady_actions: None | List[float] = None)[source]
Bases:
object
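The class body is not documented here; as a rough, hypothetical sketch (not the package's actual implementation) of what a PID controller with gains and a setpoint typically computes per step:

```python
# Hypothetical single-loop PID sketch; PIDModel's real API may differ.
class SimplePID:
    def __init__(self, Kp, Ki, Kd, setpoint):
        self.Kp, self.Ki, self.Kd = Kp, Ki, Kd
        self.setpoint = setpoint
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, measurement, dt=1.0):
        # Classic discrete PID law: u = Kp*e + Ki*sum(e*dt) + Kd*de/dt
        error = self.setpoint - measurement
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return (self.Kp * error
                + self.Ki * self.integral
                + self.Kd * derivative)
```

PIDModel additionally takes lists of gains, setpoints, and bounds, suggesting one such loop per controlled variable with outputs clipped to the bounds.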
mzutils.ctsv_funcs module
- mzutils.ctsv_funcs.append_tsv(file_path, rows)[source]
- Parameters:
file_path –
rows – a list of rows to be written in the tsv file. The rows are lists of items.
- Returns:
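The equivalent operation can be sketched with the standard csv module (the function name below is illustrative, not the package's source):

```python
import csv

def append_tsv_sketch(file_path, rows):
    # Append each row (a list of items) to the file as one
    # tab-separated line; the file is created if it does not exist.
    with open(file_path, "a", newline="") as f:
        writer = csv.writer(f, delimiter="\t")
        writer.writerows(rows)
```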
- mzutils.ctsv_funcs.beautify_csv_lines(lst: list)[source]
the list contains sub_lists of different lengths. This function pads them so they can be written cleanly. Returns a list of sub_lists of rows to write to csv.
- mzutils.ctsv_funcs.beautify_csv_lines_horizontal(lst: list)[source]
the list contains sub_lists of different lengths. This function pads them so they can be written cleanly. Returns a list of sub_lists, all of the same length.
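The padding step both beautify helpers rely on might look roughly like this (an illustrative sketch, not the package's code):

```python
def pad_sublists(lst, element=""):
    # Pad every sublist with `element` so all rows
    # share the length of the longest one.
    width = max(len(sub) for sub in lst)
    return [sub + [element] * (width - len(sub)) for sub in lst]
```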
- mzutils.ctsv_funcs.read_tsv(file_path)[source]
read a tsv file into a nested python list.
- Parameters:
file_path –
- Returns:
a nested list of rows.
- mzutils.ctsv_funcs.segment_large_csv(file_path, destination_path, segmentation_length, duplicate_header=False)[source]
segment a large file into several smaller files at a destination. If duplicate_header is True, the first line of the original large file is duplicated into every segmented file, so the length of each segmented file is segmentation_length + 1.
- Parameters:
file_path –
destination_path –
segmentation_length –
duplicate_header –
- Returns:
how many files were segmented.
- mzutils.ctsv_funcs.segment_large_tsv(file_path, destination_path, segmentation_length, duplicate_header=False)[source]
segment a large file into several smaller files at a destination. If duplicate_header is True, the first line of the original large file is duplicated into every segmented file, so the length of each segmented file is segmentation_length + 1.
- Parameters:
file_path –
destination_path –
segmentation_length –
duplicate_header –
- Returns:
how many files were segmented.
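The segmentation logic described above can be sketched with the standard library (names below are illustrative, not the package's internals):

```python
import os

def segment_file_sketch(file_path, destination_path, segmentation_length,
                        duplicate_header=False):
    # Split file_path into chunks of segmentation_length lines each.
    # With duplicate_header=True the first line is repeated at the top
    # of every chunk (each chunk then has segmentation_length + 1 lines).
    # Returns the number of files written.
    os.makedirs(destination_path, exist_ok=True)
    with open(file_path) as f:
        lines = f.readlines()
    header = []
    if duplicate_header:
        header, lines = lines[:1], lines[1:]
    count = 0
    for start in range(0, len(lines), segmentation_length):
        chunk = lines[start:start + segmentation_length]
        out = os.path.join(destination_path, "part_%d.txt" % count)
        with open(out, "w") as f:
            f.writelines(header + chunk)
        count += 1
    return count
```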
mzutils.data_structures module
- class mzutils.data_structures.SeedData(save_path, seeds, resume_from={})[source]
Bases:
object
A dictionary that aims to average the evaluated mean_episode_return across different random seeds. Also controls where to resume the experiments from.
- class mzutils.data_structures.SimplePriorityQueue(maxsize=0)[source]
Bases:
object
a simple wrapper around heapq.
>>> q = SimplePriorityQueue()
>>> q.put((2, "Harry"))
>>> q.put((3, "Charles"))
>>> q.put((1, "Riya"))
>>> q.put((4, "Stacy"))
>>> q.put((0, "John"))
>>> print(q.nlargest(3))
[(4, 'Stacy'), (3, 'Charles'), (2, 'Harry')]
>>> print(q.nsmallest(8))
[(1, 'Riya'), (2, 'Harry'), (3, 'Charles'), (4, 'Stacy')]
>>> print(q.get())
(1, 'Riya')
>>> print(q.get())
(2, 'Harry')
>>> print(q.get())
(3, 'Charles')
>>> print(q.get())
(4, 'Stacy')
>>> print(q.get())
None
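A heapq-based wrapper with this interface could look roughly like the sketch below. Note the doctest above drops the (0, "John") entry, which suggests extra bookkeeping (perhaps tied to maxsize) that this illustrative sketch does not reproduce:

```python
import heapq

class SimplePriorityQueueSketch:
    # Illustrative re-implementation; not the package's source.
    def __init__(self, maxsize=0):
        self.heap = []

    def put(self, item):
        heapq.heappush(self.heap, item)

    def get(self):
        # Pop the smallest item; return None when empty.
        if not self.heap:
            return None
        return heapq.heappop(self.heap)

    def nlargest(self, n):
        return heapq.nlargest(n, self.heap)

    def nsmallest(self, n):
        return heapq.nsmallest(n, self.heap)
```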
mzutils.gym_space_management module
- mzutils.gym_space_management.denormalize_spaces(space_normalized, max_space=None, min_space=None, skip_columns=None, fill_value=0.0)[source]
inverse counterpart of normalize_spaces below; space_normalized can be the whole normalized original space or just one row in the normalized space
- mzutils.gym_space_management.list_of_str_to_numpy_onehot_dict(lst)[source]
create a onehot lookup dictionary according to the list of strings passed in
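A one-hot lookup dictionary of this kind can be built from an identity matrix (sketch under assumptions; the function name is illustrative):

```python
import numpy as np

def str_list_to_onehot_dict(lst):
    # Map the i-th string to the i-th row of an identity matrix,
    # i.e. a one-hot vector of length len(lst).
    eye = np.eye(len(lst))
    return {s: eye[i] for i, s in enumerate(lst)}
```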
- mzutils.gym_space_management.normalize_spaces(space, max_space=None, min_space=None, skip_columns=None, fill_value=0.0)[source]
normalize each column of observation/action space to be in [-1, 1] such that it looks like a Box space. space can be the whole original space (X by D) or just one row in the original space (D,).
- Parameters:
space – numpy array
max_space – numpy array, the maximum value of each column of the space; normally we would get this from reading the dataset or from prior knowledge.
min_space – numpy array, the minimum value of each column of the space; normally we would get this from reading the dataset or from prior knowledge.
skip_columns – numpy array or list, columns to skip from normalization.
fill_value – float, the value to fill in the normalized space where the original space is masked.
So, if you don't want a part of the space to be normalized, you can pass in a masked array; masked values are automatically filled with fill_value in the normalized space. The return value is a tuple of filled_re_space, max_space, and min_space, with the original re_space keeping its mask. e.g.
>>> a = np.array(range(24), dtype=np.float64).reshape(4, 6)
>>> a = np.where(a > 21, np.nan, a)
>>> a = np.ma.array(a, mask=np.isnan(a))
>>> b, max, min = mzutils.normalize_spaces(a)
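The column-wise [-1, 1] normalization described above corresponds to the standard min-max map, sketched here without the masking/skip-column machinery (illustrative, not the package's code):

```python
import numpy as np

def normalize_to_unit_box(space, max_space=None, min_space=None):
    # Column-wise min-max scaling into [-1, 1]:
    #   x_norm = 2 * (x - min) / (max - min) - 1
    space = np.asarray(space, dtype=np.float64)
    if max_space is None:
        max_space = space.max(axis=0)
    if min_space is None:
        min_space = space.min(axis=0)
    normalized = 2.0 * (space - min_space) / (max_space - min_space) - 1.0
    return normalized, max_space, min_space
```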
mzutils.json_funcs module
mzutils.list_funcs module
- mzutils.list_funcs.pad_list(lst: list, length: int, element='')[source]
pad a list to length with element; the returned list will have length equal to length.
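A minimal sketch of this behavior (the truncation of over-long lists is an assumption drawn from "length equal to length"; not the package's code):

```python
def pad_list_sketch(lst, length, element=""):
    # Return a copy padded with `element` (or truncated) so the
    # result has exactly `length` items.
    return (lst + [element] * length)[:length]
```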
- mzutils.list_funcs.pop_indices(lst, indices)[source]
pop items from lst given a list or tuple of indices. This function modifies lst directly in place.
>>> pop_indices([1, 2, 3, 4, 5, 6], [0, 4, 5])
[2, 3, 4]
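An in-place removal like this is usually done by deleting in descending index order (illustrative sketch, not the package's source):

```python
def pop_indices_sketch(lst, indices):
    # Delete in descending index order so earlier removals
    # don't shift the positions still to be removed.
    for i in sorted(indices, reverse=True):
        lst.pop(i)
    return lst
```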
mzutils.numpy_funcs module
mzutils.os_funcs module
- mzutils.os_funcs.basename_and_extension(file_path)[source]
>>> file_path = "a/b.c"
>>> basename_and_extension(file_path)
('b', '.c')
- Parameters:
file_path –
- Returns:
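The doctest above matches what os.path primitives provide (sketch, not the package's code):

```python
import os

def basename_and_extension_sketch(file_path):
    # Drop the directory part, then split name from extension.
    base = os.path.basename(file_path)
    return os.path.splitext(base)
```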
- mzutils.os_funcs.clean_dir(dir_path, just_files=True)[source]
Clean up a directory.
- Parameters:
dir_path –
just_files – if just_files=False, also remove all directory trees in that directory.
- mzutils.os_funcs.documents_segementor_on_word_length(documents_dir, store_dir, max_length, language='english', clean_store_dir=False)[source]
segment a long document into several small documents based on the nltk-tokenized word length; sentence structure is kept.
- Parameters:
documents_dir – where all documents are located.
store_dir – where to store segmented documents.
max_length – document segments' max length.
language – for the use of nltk; defaults to english.
clean_store_dir –
- Returns:
number of documents after segmentation.
- mzutils.os_funcs.get_checkpoints_in_loc(in_path, keywords=['checkpoint-'], files_or_folders='folders')[source]
This function loops through in_path to find all files/folders whose names include all keywords, with files_or_folders='files'/'folders' selecting which kind. Again, in_path can be a file path or a dir path. The function is meant to grab all checkpoint-XXXX in a folder.
- mzutils.os_funcs.get_things_in_loc(in_path, just_files=True, endswith=None)[source]
in_path can be a file path or a dir path. This function returns a list of file paths in in_path if in_path is a dir, or within the parent path of in_path if it is not a dir. just_files=False lets the function go recursively into the subdirs.
- Parameters:
endswith – None or a list of file extensions (to end with).
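The listing behavior described above can be sketched with os.walk (an illustration under assumptions, not the package's source):

```python
import os

def list_files_sketch(in_path, just_files=True, endswith=None):
    # Collect file paths in a directory (or the directory containing
    # a file), optionally recursing and filtering by extension.
    root = in_path if os.path.isdir(in_path) else os.path.dirname(in_path)
    results = []
    if just_files:
        for name in os.listdir(root):
            p = os.path.join(root, name)
            if os.path.isfile(p):
                results.append(p)
    else:
        for dirpath, _, filenames in os.walk(root):
            results.extend(os.path.join(dirpath, n) for n in filenames)
    if endswith is not None:
        results = [p for p in results
                   if any(p.endswith(ext) for ext in endswith)]
    return results
```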
- mzutils.os_funcs.helper_document_segmentor(documents_dir, store_dir, name, max_length, language)[source]
- mzutils.os_funcs.loop_through_copy_files_to_one_dir(looped_dir, target_dir, include_link=False)[source]
function to loop through nested directories and copy all the files to a target directory.
- Parameters:
looped_dir –
target_dir – a directory string.
- mzutils.os_funcs.loop_through_return_abs_file_path(looped_dir)[source]
function to loop through nested directories and return file absolute paths in a list.
- Parameters:
looped_dir –
- Returns:
list
- mzutils.os_funcs.loop_through_store_files_to_list(looped_dir, encoding='utf-8')[source]
function to loop through nested directories and store the content of each file into a list separately. This function does not care about symbolic links inside the nested directories.
- Parameters:
looped_dir –
encoding –
- Returns:
list
- mzutils.os_funcs.loop_through_store_lines_to_list(looped_dir, encoding='utf-8')[source]
function to loop through nested directories and store the lines of all files into a list. This function does not care about symbolic links inside the nested directories.
- Parameters:
looped_dir –
encoding –
- Returns:
list
- mzutils.os_funcs.mkdir_p(dir_path)[source]
mkdir -p functionality in python.
- Parameters:
dir_path –
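On modern Python this is a one-liner over os.makedirs (sketch, not the package's code):

```python
import os

def mkdir_p_sketch(dir_path):
    # `mkdir -p`: create intermediate directories as needed,
    # and do not error if the path already exists.
    os.makedirs(dir_path, exist_ok=True)
```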
- mzutils.os_funcs.parent_dir_and_name(file_path)[source]
>>> file_path = "a/b.c"
>>> parent_dir_and_name(file_path)
('/root/.../a', 'b.c')
- Parameters:
file_path –
- Returns:
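The doctest above suggests an absolute-path split, which os.path provides directly (sketch, not the package's code):

```python
import os

def parent_dir_and_name_sketch(file_path):
    # Absolute parent directory plus the file's own name.
    return os.path.split(os.path.abspath(file_path))
```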
- mzutils.os_funcs.save__init__args(values, underscore=False, overwrite=False, subclass_only=False)[source]
Use in __init__() only; assign all args/kwargs to instance attributes. To maintain precedence of args provided to subclasses, call this in the subclass before super().__init__() if save__init__args() also appears in the base class, or use overwrite=True. With subclass_only==True, only args/kwargs listed in the current subclass apply. usage:
>>> class AgentModel:
...     def __init__(
...         self,
...         meta_info_attr_size=7,
...         obs_shape=(3, 64, 64),
...         reward_shape=(1,),
...         n_agents=1,
...         obs_last_action=False,
...         obs_agent_id=True,
...         rnn_hidden_dim=64,
...         based_on='observation',
...         n_actions=11,
...         use_cuda=True,):
...         save__init__args(locals())
>>> a = AgentModel()
>>> a.rnn_hidden_dim
64
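The core mechanism, setting every local from __init__ onto the instance, can be sketched as follows (the sketch ignores the overwrite/subclass_only options):

```python
def save_init_args_sketch(values, underscore=False):
    # `values` is the locals() dict captured inside __init__;
    # assign every entry except `self` as an instance attribute.
    self = values["self"]
    prefix = "_" if underscore else ""
    for name, value in values.items():
        if name != "self":
            setattr(self, prefix + name, value)

class AgentModel:
    # Minimal usage example mirroring the doctest above.
    def __init__(self, rnn_hidden_dim=64, n_actions=11, use_cuda=True):
        save_init_args_sketch(locals())
```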
- mzutils.os_funcs.set_local_vars_from_yaml(yaml_loc, name_space_dict)[source]
set local variables from a yaml file.
- Parameters:
yaml_loc – your yaml file location.
name_space_dict – a dictionary that contains the local variables, e.g. locals().
- Returns:
None
For example, if your yaml file contains a variable called num_workers whose value is the integer 4, then
>>> set_local_vars_from_yaml('path_to_file.yaml', locals())
>>> num_workers
4