Depending on your needs, Unstructured provides OCR-based and Transformer-based models to detect elements in documents. These models are useful for detecting complex document layouts and predicting element types.
To use these models, set strategy to hi_res as shown above.
In the unstructured and unstructured-api libraries, the model_name parameter is being deprecated. Use the hi_res_model_name parameter when specifying a model.
detectron2_onnx is a computer vision model by Facebook AI that provides object detection and segmentation algorithms with ONNX Runtime. It is the fastest model with the hi_res strategy.
yolox is a single-stage real-time object detector that modifies YOLOv3 with a DarkNet53 backbone.
yolox_quantized runs faster than YoloX, with speed closer to that of Detectron2.
chipper (beta version) is Unstructured’s in-house image-to-text model, based on transformer-based Visual Document Understanding (VDU) models.
Unstructured will download the model specified in the UNSTRUCTURED_HI_RES_MODEL_NAME environment variable. If the variable is not defined, it will download the default model.
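As a minimal sketch, the environment variable can be set from Python before any partitioning code runs. The helper set_hi_res_model and the HI_RES_MODELS tuple are hypothetical names introduced here for illustration; only the variable name and the model names come from this page.

```python
import os

# Model names listed in this section (not an exhaustive or authoritative list).
HI_RES_MODELS = ("detectron2_onnx", "yolox", "yolox_quantized", "chipper")


def set_hi_res_model(name: str) -> None:
    """Hypothetical helper: select a hi_res model through the environment."""
    if name not in HI_RES_MODELS:
        raise ValueError(f"unknown hi_res model: {name!r}")
    # Unstructured reads this variable to decide which model to download.
    os.environ["UNSTRUCTURED_HI_RES_MODEL_NAME"] = name


set_hi_res_model("yolox")
```

Setting the variable early, before the library loads a model, is the safer ordering. The same effect can be had by exporting the variable in the shell that launches the process.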
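There are three ways to use a non-default model:
1. Store the model name in the UNSTRUCTURED_HI_RES_MODEL_NAME environment variable.
2. Pass the model name to the partition function.
3. Use the UnstructuredDetectronModel class in the unstructured-inference library.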
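Passing the model name to the partition function might look like the sketch below. It assumes partition is importable from unstructured.partition.auto and accepts the strategy and hi_res_model_name parameters named on this page; hi_res_kwargs, partition_with_model, and the file name example.pdf are hypothetical and used only for illustration.

```python
from typing import Any, Dict


def hi_res_kwargs(filename: str, model_name: str) -> Dict[str, Any]:
    """Hypothetical helper: keyword arguments for a hi_res partition call."""
    return {
        "filename": filename,
        "strategy": "hi_res",             # layout models run under hi_res
        "hi_res_model_name": model_name,  # replaces the deprecated model_name
    }


def partition_with_model(filename: str, model_name: str):
    """Partition a file with a specific hi_res model."""
    # Assumed import path; imported lazily so the module loads without the library.
    from unstructured.partition.auto import partition

    return partition(**hi_res_kwargs(filename, model_name))


kwargs = hi_res_kwargs("example.pdf", "yolox")
```

Calling partition_with_model("example.pdf", "yolox") would then return the detected elements, provided the library and the model's dependencies are installed.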
The UnstructuredDetectronModel class in unstructured_inference.models.detectron2 uses the faster_rcnn_R_50_FPN_3x model pretrained on DocLayNet, but any model in the model zoo can be used by passing different construction parameters. UnstructuredDetectronModel is a light wrapper around LayoutParser’s Detectron2LayoutModel object and accepts the same arguments.
Using Your Own Object Detection Model
To integrate your custom detection and extraction models into the unstructured_inference pipeline, start by wrapping your model in the UnstructuredObjectDetectionModel class. This class acts as an intermediary between your detection model and the Unstructured workflow.
Ensure your UnstructuredObjectDetectionModel subclass implements two methods:
The predict method, which accepts a PIL.Image.Image and returns a list of LayoutElements that carry your model’s results.
The initialize method, which loads and prepares your model for inference.
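The two-method contract described above can be sketched as follows. To keep the example runnable without the library, it uses local stand-ins: ObjectDetectionModelStub and LayoutElementStub are hypothetical substitutes for UnstructuredObjectDetectionModel and the library's layout element type, and their fields are assumptions rather than the library's actual definitions. FakeImage mimics only the size attribute of PIL.Image.Image.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class LayoutElementStub:
    """Stand-in for a layout element: a bounding box plus a predicted type."""
    x1: float
    y1: float
    x2: float
    y2: float
    type: str
    text: Optional[str] = None


class ObjectDetectionModelStub(ABC):
    """Stand-in base class exposing the initialize/predict contract."""

    @abstractmethod
    def initialize(self, *args: Any, **kwargs: Any) -> None: ...

    @abstractmethod
    def predict(self, image: Any) -> List[LayoutElementStub]: ...


class WholePageModel(ObjectDetectionModelStub):
    """Toy detector that labels the entire page as a single region."""

    def __init__(self) -> None:
        self.ready = False

    def initialize(self, default_type: str = "Text") -> None:
        # A real model would load weights and move them to a device here.
        self.default_type = default_type
        self.ready = True

    def predict(self, image: Any) -> List[LayoutElementStub]:
        if not self.ready:
            raise RuntimeError("call initialize() before predict()")
        width, height = image.size  # PIL images expose size as (width, height)
        return [LayoutElementStub(0, 0, width, height, self.default_type)]


class FakeImage:
    """Minimal image stand-in with a PIL-like size attribute."""
    size = (612, 792)


model = WholePageModel()
model.initialize()
elements = model.predict(FakeImage())
```

In a real integration the subclass would derive from UnstructuredObjectDetectionModel and return the library's own layout element objects; the stand-ins here only illustrate the shape of the two methods.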