This guide offers concise steps to install and validate your `unstructured` installation. For a more comprehensive installation guide, please refer to the Full Installation page.
Install the core library with pip:

    pip install unstructured

Plain text files, HTML, XML, JSON, and emails are supported immediately, without any additional dependencies.
If you need to process other document types, you can install the required extras by following the Full Installation guide.
If you can import the library and partition a document without errors, your `unstructured` installation is successful and ready for use.
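
A minimal way to validate the installation is sketched below. It assumes only that `partition` can be imported from `unstructured.partition.auto`; the function name `validate_unstructured_install` and the throwaway `sample.txt` file are illustrative and not part of the library.

```python
import importlib.util
import tempfile
from pathlib import Path


def validate_unstructured_install() -> bool:
    """Return True if `unstructured` imports and can partition a tiny text file."""
    # Cheap check first: is the package importable at all?
    if importlib.util.find_spec("unstructured") is None:
        return False

    try:
        from unstructured.partition.auto import partition

        with tempfile.TemporaryDirectory() as tmp:
            # Hypothetical sample file; plain text needs no extra dependencies.
            sample = Path(tmp) / "sample.txt"
            sample.write_text("Quick check\n\nThis sentence should come back as an element.")
            elements = partition(filename=str(sample))
    except Exception:
        # Any failure (missing dependency, broken install) counts as "not ready".
        return False

    # A non-empty list of elements means the core install works.
    return len(elements) > 0


if __name__ == "__main__":
    ok = validate_unstructured_install()
    print("unstructured is ready" if ok else "unstructured is not usable yet")
```

The function returns a boolean rather than raising, so it can be dropped into a setup script or a CI step without extra error handling.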
The following section covers basic concepts and usage patterns in `unstructured`. After reading this section, you should be able to:
- Partition a document with the `partition` function.
- Understand how documents are structured in `unstructured`.

The example documents in this section come from the `unstructured` repo.
Before running the code in this section, make sure you’ve installed the `unstructured` library and all dependencies using the instructions in the Quick Start section.
The `partition` function detects the filetype of the source document and routes it to the appropriate partitioning function. You can try out the `partition` function by running the cell below.
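
A minimal sketch of such a call follows, assuming `partition` is imported from `unstructured.partition.auto` and accepts a `filename` argument. The helper names `partition_and_summarize` and `count_element_types`, and the path in the usage line, are hypothetical and only serve the illustration.

```python
from collections import Counter


def count_element_types(elements) -> Counter:
    """Count elements by class name (for example Title or NarrativeText)."""
    return Counter(type(el).__name__ for el in elements)


def partition_and_summarize(path: str) -> Counter:
    """Partition the document at `path` and summarize the element types found."""
    # Imported lazily so count_element_types stays usable without the library.
    from unstructured.partition.auto import partition

    # partition() detects the filetype and routes to the matching partitioner.
    elements = partition(filename=path)

    # Each element renders as its extracted text via str().
    for el in elements[:5]:
        print(f"{type(el).__name__}: {el}")

    return count_element_types(elements)
```

Calling `partition_and_summarize("path/to/document.html")` with a real file prints the first few elements and returns a count per element type, which is a quick way to see how the document was split up.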
The `partition` function uses `libmagic` for filetype detection. If `libmagic` is not present and the user passes a filename, `partition` falls back to detecting the filetype using the file extension. `libmagic` is required if you’d like to pass a file-like object to `partition`. We highly recommend installing `libmagic`, and you may observe different file detection behaviors if `libmagic` is not installed.
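
The difference between passing a filename and passing a file-like object can be sketched as follows. This is an illustration under stated assumptions: `partition` is taken to accept `filename` and `file` keyword arguments, and the presence of the `magic` Python module (the python-magic bindings) is used as a rough stand-in for "libmagic is available". The helper names `python_magic_importable` and `partition_source` are hypothetical.

```python
import importlib.util
import os


def python_magic_importable() -> bool:
    """Rough check: are the python-magic bindings (module `magic`) installed?

    Assumption: this only approximates libmagic availability, since the
    bindings can be present while the underlying shared library is missing.
    """
    return importlib.util.find_spec("magic") is not None


def partition_source(source):
    """Partition either a path or an open binary file-like object."""
    is_path = isinstance(source, (str, os.PathLike))

    # Per the note above, file-like input relies on libmagic for detection,
    # so fail early with a clear message instead of a confusing detection error.
    if not is_path and not python_magic_importable():
        raise RuntimeError(
            "libmagic does not appear to be installed; pass a filename instead "
            "or install libmagic to partition file-like objects."
        )

    from unstructured.partition.auto import partition

    if is_path:
        # With a filename, detection can fall back to the file extension.
        return partition(filename=os.fspath(source))
    return partition(file=source)
```

For example, `partition_source("report.pdf")` takes the filename route, while `with open("report.pdf", "rb") as f: partition_source(f)` takes the file-like route; `report.pdf` is a placeholder name, and PDF support needs the corresponding extras installed.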
With the `unstructured` library, you’ll have a basic workflow set up and running in a few minutes!
For more detailed information about specific components or advanced features, explore the rest of the documentation.