antimatter.filetype.infer

Module Contents

Functions

infer_filetype(path: str) → str

Infers the file type of a given file using both libmagic and file extension methods.

infer_by_extension(path: str) → str

Infers the file type based on the file extension, mapping common file extensions to their respective file types.

infer_by_magic(path: str) → str

Infers the file type using the libmagic library, which analyzes the content of the file.

antimatter.filetype.infer.infer_filetype(path: str) → str

Infers the file type of a given file using both libmagic and the file extension. It first tries to identify the file type with libmagic. If libmagic is not available or cannot determine the file type, it falls back to the file extension.

Parameters:

path – The path to the file whose file type needs to be inferred.

Returns:

The inferred file type as a string (‘txt’, ‘csv’, ‘json’, ‘parquet’), or an empty string if the file type cannot be determined.
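
The fallback order described above can be sketched as a chain of detectors tried in sequence. This is a minimal illustration, not the antimatter implementation: `first_nonempty` and the two stub detectors are hypothetical names invented for this example.

```python
from typing import Callable, Iterable


def first_nonempty(path: str, detectors: Iterable[Callable[[str], str]]) -> str:
    """Return the first non-empty result from the detectors, or "" if all fail."""
    for detect in detectors:
        result = detect(path)
        if result:
            return result
    return ""


# Hypothetical stand-ins for content-based and extension-based detection.
def stub_by_magic(path: str) -> str:
    return ""  # pretend libmagic is unavailable or inconclusive


def stub_by_extension(path: str) -> str:
    return "csv" if path.lower().endswith(".csv") else ""


result = first_nonempty("data.csv", [stub_by_magic, stub_by_extension])
```

Because an empty string signals "unknown", a caller can chain any number of detectors this way without special-casing a missing library.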

antimatter.filetype.infer.infer_by_extension(path: str) → str

Infers the file type based on the file extension. This function maps common file extensions to their respective file types.

Parameters:

path – The path to the file including its extension.

Returns:

The inferred file type as a string (‘txt’, ‘csv’, ‘json’, ‘parquet’), or an empty string if the extension does not match known types.
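
An extension-to-type mapping of this kind might look like the sketch below. The table contents, the case-insensitive matching, and the name `sketch_infer_by_extension` are assumptions made for illustration; the extensions the library actually recognizes may differ.

```python
import os

# Assumed mapping: only the four types named in this page's return values.
_EXTENSION_TO_TYPE = {
    ".txt": "txt",
    ".csv": "csv",
    ".json": "json",
    ".parquet": "parquet",
}


def sketch_infer_by_extension(path: str) -> str:
    """Map the final extension of `path` to a file type, or "" if unknown."""
    _, ext = os.path.splitext(path)
    # Lower-casing is an assumption so that "DATA.CSV" matches ".csv".
    return _EXTENSION_TO_TYPE.get(ext.lower(), "")
```

Note that `os.path.splitext` only looks at the last suffix, so a path such as `archive.csv.gz` yields `.gz` and, in this sketch, an empty string.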

antimatter.filetype.infer.infer_by_magic(path: str) → str

Infers the file type using the libmagic library, which analyzes the content of the file. This method is more reliable than inferring based on the file extension but requires the libmagic library to be installed. If libmagic is not installed, this function returns an empty string.

Note: Identification of some file types, such as ‘parquet’, may be unreliable because libmagic often reports a generic MIME type such as ‘application/octet-stream’ for them.

Parameters:

path – The path to the file for which the file type is to be inferred.

Returns:

The inferred file type as a string (‘txt’, ‘csv’, ‘json’), or an empty string if the file type cannot be determined or libmagic is not available.
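
Content-based detection with a graceful fallback can be sketched as follows. This assumes the third-party `python-magic` package (imported as `magic`) as the libmagic binding; which binding the library uses is not stated here. The MIME table and the names `mime_to_filetype` and `sketch_infer_by_magic` are hypothetical.

```python
# Assumed MIME-to-type table; libmagic's actual output varies by version.
_MIME_TO_TYPE = {
    "text/plain": "txt",
    "text/csv": "csv",
    "application/json": "json",
}


def mime_to_filetype(mime: str) -> str:
    """Translate a MIME type into a short file type, or "" if unrecognized."""
    return _MIME_TO_TYPE.get(mime, "")


def sketch_infer_by_magic(path: str) -> str:
    """Return a file type from the file's content, or "" if libmagic is unusable."""
    try:
        import magic  # python-magic; import fails if libmagic is missing
    except ImportError:
        return ""
    try:
        mime = magic.from_file(path, mime=True)
    except Exception:
        # Unreadable file or an incompatible binding: treat as undetermined.
        return ""
    return mime_to_filetype(mime)
```

Generic results such as `application/octet-stream` are absent from the table, so binary formats like parquet fall through to an empty string in this sketch, consistent with the note above.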