If you haven’t installed Docker on your machine, you can find the installation guide here.
We build multi-platform images to support both x86_64 and Apple silicon hardware. Using docker pull should download the appropriate image for your architecture. However, if needed, you can specify the platform with the --platform flag, e.g., --platform linux/amd64.
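To illustrate how a host architecture corresponds to a --platform value, here is a minimal sketch using Python's standard platform module. The helper name docker_platform and the mapping table are assumptions made for illustration; they are not part of the project.

```python
import platform

# Assumed mapping from common machine names to Docker platform strings.
_MACHINE_TO_PLATFORM = {
    "x86_64": "linux/amd64",
    "amd64": "linux/amd64",
    "arm64": "linux/arm64",    # reported by macOS on Apple silicon
    "aarch64": "linux/arm64",  # reported by Linux on 64-bit ARM
}


def docker_platform(machine=None):
    """Return a --platform value for a machine name (hypothetical helper).

    When no name is given, the current host's machine type is used.
    """
    name = (machine or platform.machine()).lower()
    try:
        return _MACHINE_TO_PLATFORM[name]
    except KeyError:
        raise ValueError(f"unrecognized machine type: {name!r}") from None
```

Calling docker_platform() with no argument looks up the current host. In most cases this is unnecessary, since docker pull selects the matching image on its own.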
We create Docker images for every push to the main branch. These images are tagged with the respective short commit hash (like fbc7a69) and the application version (e.g., 0.5.5-dev1). The most recent image also receives the latest tag. To use these images, pull them from our repository:
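The tagging scheme can be sketched as a small helper that assembles a docker pull command for a given tag. The repository path used below is an assumption (this section does not state it), and pull_command is a hypothetical helper; substitute the repository you actually pull from.

```python
import shlex

# Assumed repository path; it is not given in this section.
REPOSITORY = "downloads.unstructured.io/unstructured-io/unstructured"


def pull_command(tag="latest", platform=None, repository=REPOSITORY):
    """Build a `docker pull` argument list for a tag (hypothetical helper)."""
    cmd = ["docker", "pull"]
    if platform:
        # Optional architecture override, e.g. "linux/amd64".
        cmd += ["--platform", platform]
    cmd.append(f"{repository}:{tag}")
    return cmd


def as_shell(cmd):
    """Render an argument list as a single shell-safe command string."""
    return shlex.join(cmd)
```

The tag can be latest, a short commit hash such as fbc7a69, or a version such as 0.5.5-dev1. The resulting list can be passed to subprocess.run, or printed with as_shell and pasted into a terminal.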
You can also build your own Docker image. If you only plan to parse a single type of data, you can accelerate the build process by excluding certain packages or requirements needed for other data types. Refer to the Dockerfile to determine which lines are necessary for your requirements.
    make docker-build

    # start a bash shell inside the running Docker container
    make docker-start-bash
Once inside the running Docker container, you can directly test the library using Python’s interactive mode:
    python3
    >>> from unstructured.partition.pdf import partition_pdf
    >>> elements = partition_pdf(filename="example-docs/layout-parser-paper-fast.pdf")
    >>> from unstructured.partition.text import partition_text
    >>> elements = partition_text(filename="example-docs/fake-text.txt")
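As a quick sanity check on the result, one option is to count the returned elements by type. This is a minimal sketch: summarize is a hypothetical helper, and it assumes only that elements is an iterable of objects whose class names describe the element type.

```python
from collections import Counter


def summarize(elements):
    """Count elements by class name (hypothetical helper)."""
    return Counter(type(el).__name__ for el in elements)
```

Inside the container, summarize(elements).most_common() would list the element types found in the parsed document, most frequent first.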