ml4co_kit.utils.file_utils

Utilities for downloading, compressing, and extracting files.

Functions

check_file_path(file_path)

Check whether the directory of the file path exists; if not, create it.

compress_folder(folder, compress_path)

Compresses a folder into the specified output format.

download(file_path, url[, md5, retries])

The specific download function with three different methods.

extract_archive(archive_path, extract_path)

Extracts an archive into the specified extract path.

get_md5(file_path)

Get the md5 of the specified file.

pull_file_from_huggingface(repo_id, ...[, ...])

split_txt_file(file_path, lines_per_file, ...)

Split the txt file into multiple parts.

ml4co_kit.utils.file_utils.check_file_path(file_path: str)[source]

Check whether the directory of the file path exists; if not, create it.
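The behavior described above can be approximated with the standard library alone. The following is a minimal sketch, not the library's implementation; the helper name ensure_parent_dir is hypothetical.

```python
import os


def ensure_parent_dir(file_path: str) -> None:
    """Create the parent directory of file_path if it is missing.

    Hypothetical helper for illustration; not part of ml4co_kit.
    """
    parent = os.path.dirname(file_path)
    # An empty parent means the path is relative to the current directory,
    # so there is nothing to create.
    if parent:
        # exist_ok=True makes the call safe to repeat.
        os.makedirs(parent, exist_ok=True)
```

Calling the helper before opening a file for writing avoids a FileNotFoundError when intermediate directories are missing.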

ml4co_kit.utils.file_utils.compress_folder(folder: str, compress_path: str)[source]

Compresses a folder into the specified output format. Raises ValueError if the output file format is unsupported.

Parameters:
  • folder – string, the path to the folder to be compressed.

  • compress_path – string, the path to the output file.

Supported formats:
  • .zip: ZIP format

  • .tar.gz: tar.gz format

Example
# We again use tsp_uniform for illustration.
# If you haven't downloaded it yet, please download it as mentioned above.
>>> from ml4co_kit import compress_folder

# Compress the folder
>>> compress_folder("dataset/tsp_uniform_20240825", "dataset/tsp_uniform_20240825.tar.gz")

ml4co_kit.utils.file_utils.download(file_path: str, url: str, md5: str | None = None, retries: int = 5)[source]

The specific download function with three different methods.
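The signature suggests a retry loop with an optional MD5 check. The sketch below illustrates only that pattern with urllib from the standard library; it does not reproduce the three download methods, which this page does not describe. The name download_with_retries and the wait_seconds parameter are assumptions made for illustration.

```python
import hashlib
import shutil
import time
import urllib.request
from pathlib import Path
from typing import Optional


def download_with_retries(
    file_path: str,
    url: str,
    md5: Optional[str] = None,
    retries: int = 5,
    wait_seconds: float = 1.0,  # assumed parameter, not in the original signature
) -> None:
    """Download url to file_path, retrying on failure or MD5 mismatch.

    Hypothetical sketch; not the ml4co_kit implementation.
    """
    last_error: Optional[Exception] = None
    for attempt in range(retries):
        try:
            # Stream the response body straight to disk.
            with urllib.request.urlopen(url) as response, open(file_path, "wb") as out:
                shutil.copyfileobj(response, out)
            if md5 is None:
                return
            actual = hashlib.md5(Path(file_path).read_bytes()).hexdigest()
            if actual == md5.lower():
                return
            last_error = ValueError(f"MD5 mismatch: expected {md5}, got {actual}")
        except OSError as exc:  # urllib.error.URLError is a subclass of OSError
            last_error = exc
        if attempt < retries - 1:
            time.sleep(wait_seconds)
    raise RuntimeError(f"Download failed after {retries} attempts") from last_error
```

Verifying the checksum inside the loop means a corrupted transfer is retried in the same way as a network error.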

ml4co_kit.utils.file_utils.extract_archive(archive_path: str, extract_path: str)[source]

Extracts an archive into the specified extract path. Raises ValueError if the archive file format is unsupported.

Parameters:
  • archive_path – string, path to the archive file.

  • extract_path – string, path to the extraction directory.

Supported formats:
  • .zip: ZIP format

  • .tar.gz: tar.gz format
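Both compress_folder and extract_archive choose a format from the file suffix and raise ValueError otherwise. A minimal sketch of that dispatch using zipfile and tarfile follows; the function names and the archive layout (folder contents stored at the archive root) are assumptions, not the library's actual behavior.

```python
import os
import tarfile
import zipfile


def compress_folder_sketch(folder: str, compress_path: str) -> None:
    """Hypothetical stand-in for compress_folder (stdlib only)."""
    if compress_path.endswith(".zip"):
        with zipfile.ZipFile(compress_path, "w", zipfile.ZIP_DEFLATED) as zf:
            for root, _, files in os.walk(folder):
                for name in files:
                    full = os.path.join(root, name)
                    # Store paths relative to the folder being compressed.
                    zf.write(full, os.path.relpath(full, folder))
    elif compress_path.endswith(".tar.gz"):
        with tarfile.open(compress_path, "w:gz") as tf:
            tf.add(folder, arcname=".")
    else:
        raise ValueError(f"Unsupported archive format: {compress_path}")


def extract_archive_sketch(archive_path: str, extract_path: str) -> None:
    """Hypothetical stand-in for extract_archive (stdlib only)."""
    if archive_path.endswith(".zip"):
        with zipfile.ZipFile(archive_path) as zf:
            zf.extractall(extract_path)
    elif archive_path.endswith(".tar.gz"):
        with tarfile.open(archive_path, "r:gz") as tf:
            tf.extractall(extract_path)
    else:
        raise ValueError(f"Unsupported archive format: {archive_path}")
```

In this sketch the output archive should live outside the folder being compressed; otherwise the partially written archive would be picked up as one of its own members.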

Example
# We again use tsp_uniform for illustration.
# If you haven't downloaded it yet, please download it as mentioned above.
>>> from ml4co_kit import extract_archive

# Extract the archive
>>> extract_archive("dataset/tsp_uniform_20240825.tar.gz", "dataset/tsp_uniform_20240825")

ml4co_kit.utils.file_utils.get_md5(file_path: str)[source]

Get the md5 of the specified file.
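For reference, a file's MD5 digest can be computed with hashlib by reading the file in chunks, which keeps memory use flat for large datasets. This is a minimal sketch; the name md5_of_file and the chunk size are assumptions, and the actual return format of get_md5 is not stated here.

```python
import hashlib


def md5_of_file(file_path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal MD5 digest of a file.

    Hypothetical helper for illustration; not part of ml4co_kit.
    """
    digest = hashlib.md5()
    with open(file_path, "rb") as f:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```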

ml4co_kit.utils.file_utils.pull_file_from_huggingface(repo_id: str, repo_type: str, filename: str, save_path: str, hf_token: str | None = None)[source]

ml4co_kit.utils.file_utils.split_txt_file(file_path: str | Path, lines_per_file: int, save_dir: str)[source]

Split the txt file into multiple parts.
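A line-based split of this kind might look like the sketch below. The function name, the output file naming scheme, and the returned list of paths are all assumptions made for illustration; the naming that split_txt_file actually uses is not documented here.

```python
from pathlib import Path
from typing import List, Union


def split_lines_into_files(
    file_path: Union[str, Path], lines_per_file: int, save_dir: str
) -> List[Path]:
    """Write consecutive groups of lines_per_file lines to separate files.

    Hypothetical helper for illustration; not part of ml4co_kit.
    """
    if lines_per_file <= 0:
        raise ValueError("lines_per_file must be positive")
    src = Path(file_path)
    out_dir = Path(save_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    with src.open("r", encoding="utf-8") as f:
        lines = f.readlines()

    parts: List[Path] = []
    for index, start in enumerate(range(0, len(lines), lines_per_file)):
        # Assumed naming scheme: <stem>_part_<index><suffix>
        part = out_dir / f"{src.stem}_part_{index}{src.suffix}"
        part.write_text("".join(lines[start:start + lines_per_file]), encoding="utf-8")
        parts.append(part)
    return parts
```

The last part holds the remainder when the line count is not a multiple of lines_per_file.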