docx_parser_converter.docx_to_txt.docx_to_txt_converter module

class docx_parser_converter.docx_to_txt.docx_to_txt_converter.DocxToTxtConverter(docx_file: bytes, use_default_values: bool = True)

Bases: object

Class to convert DOCX files to plain text format.
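
The constructor takes the raw bytes of a DOCX file rather than a file path, so the file has to be read into memory first. The following is a minimal sketch of that step, assuming the document lives on disk. load_docx_bytes is a hypothetical helper and not part of this library; the ZIP check relies only on the fact that DOCX files are ZIP packages.

```python
import io
import zipfile
from pathlib import Path


def load_docx_bytes(path):
    """Read a DOCX file into memory (hypothetical helper, not part of the library)."""
    data = Path(path).read_bytes()
    # A DOCX file is a ZIP package, so reject anything that is not a ZIP archive.
    if not zipfile.is_zipfile(io.BytesIO(data)):
        raise ValueError(f"{path} is not a ZIP-based DOCX file")
    return data
```

The returned bytes can then be passed as the docx_file argument, for example DocxToTxtConverter(load_docx_bytes('input.docx'), use_default_values=True), where 'input.docx' is a placeholder path.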

convert_to_txt(indent: bool = False, extract_tables: bool = True) → str

Convert the DOCX document to plain text.

Parameters:
  • indent (bool) – Whether to apply indentation. Default is False.

  • extract_tables (bool) – Whether to extract table contents. Default is True.

Returns:

Plain text representation of the document.

Return type:

str

Example

txt_content = converter.convert_to_txt(indent=True, extract_tables=True)

save_txt_to_file(txt_content: str, output_path: str) → None

Save the generated plain text to a file.

Parameters:
  • txt_content (str) – The plain text content.

  • output_path (str) – The output file path.

Example

converter.save_txt_to_file(txt_content, 'output_path.txt')
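
The two methods are typically used together: build the converter from bytes, convert, then save. A rough end-to-end sketch follows, using only the signatures documented on this page. docx_to_txt_file is a hypothetical wrapper, not part of the library, and it takes the converter class as an argument so the sketch can also run against a stand-in class.

```python
from pathlib import Path


def docx_to_txt_file(converter_cls, docx_path, txt_path, indent=False, extract_tables=True):
    """Convert one DOCX file to a text file (hypothetical helper).

    converter_cls is assumed to behave like DocxToTxtConverter as documented
    here: built from bytes, with convert_to_txt and save_txt_to_file methods.
    """
    docx_bytes = Path(docx_path).read_bytes()
    converter = converter_cls(docx_bytes, use_default_values=True)
    txt_content = converter.convert_to_txt(indent=indent, extract_tables=extract_tables)
    converter.save_txt_to_file(txt_content, str(txt_path))
    return txt_content
```

With the library installed, this would be called as docx_to_txt_file(DocxToTxtConverter, 'input.docx', 'output.txt', indent=True), where both file names are placeholders.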