You can change a workflow’s preconfigured strategy only through Custom workflow settings.
- VLM: For the highest-quality transformation of these file types: .bmp, .gif, .heic, .jpeg, .jpg, .pdf, .png, .tiff, and .webp.
- High Res: For all other supported file types, and for the generation of bounding box coordinates.
- Fast: For text-only documents.
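The guidance in the list above can be sketched as a small helper. This is only an illustration and not Unstructured's actual routing logic: the function name `suggest_strategy`, its parameters, and the order in which the rules are checked are all assumptions made for this example.

```python
from pathlib import Path

# File types that the list above associates with the VLM strategy.
VLM_EXTENSIONS = {
    ".bmp", ".gif", ".heic", ".jpeg", ".jpg", ".pdf", ".png", ".tiff", ".webp",
}


def suggest_strategy(
    filename: str,
    text_only: bool = False,
    need_bounding_boxes: bool = False,
) -> str:
    """Suggest a strategy name for a file (hypothetical helper).

    Assumed precedence, not taken from the documentation:
    bounding boxes first, then text-only documents, then file type.
    """
    if need_bounding_boxes:
        # Bounding box coordinates are listed under High Res.
        return "High Res"
    if text_only:
        # Text-only documents are listed under Fast.
        return "Fast"
    if Path(filename).suffix.lower() in VLM_EXTENSIONS:
        return "VLM"
    # All other supported file types are listed under High Res.
    return "High Res"


print(suggest_strategy("scan.PNG"))
print(suggest_strategy("report.docx"))
print(suggest_strategy("notes.pdf", text_only=True))
```

A caller that needs bounding box coordinates for a scanned PDF would pass `need_bounding_boxes=True`, and the sketch would return High Res even though .pdf appears in the VLM list.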
Images and tables in PDF files
The differences between the partitioning strategies are clearest in how each strategy handles images and tables within PDF files. For example, the Fast partitioning strategy skips images in PDF files altogether:
For tables, the Fast strategy interprets table cells in PDF files as a mixture of title, list, and uncategorized text elements:
By itself, the High Res strategy processes images in PDF files, but sometimes with limited output:
However, when combined with the image description enrichment, the High Res strategy can process images in PDF files with better results:
For tables in PDF files, the High Res strategy outputs the table’s text and an HTML representation of the table:
When combined with the table description and tables to HTML enrichments, the High Res strategy can process tables in PDF files with even richer output:
The VLM strategy processes images in PDF files and outputs image summaries and text as HTML elements. The following example shows GPT-4o by OpenAI being used. If the Auto strategy is selected in this example, Unstructured will route to the VLM strategy for processing:
For tables in PDF files, the VLM strategy outputs the table’s text and an HTML representation of the table, similar to the High Res strategy.
The following example shows GPT-4o by OpenAI being used. If the Auto strategy is selected in this example, Unstructured will route to the VLM strategy for processing:
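Because both the High Res and VLM strategies emit an HTML representation of each table, downstream code can recover rows and cells from that HTML. The following is a minimal sketch that uses only the Python standard library. The sample HTML string and the names `TableRows` and `html_table_to_rows` are made up for illustration, and real output may contain more markup than this simple parser handles (for example, nested tables or merged cells).

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of each <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # completed rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def html_table_to_rows(html: str) -> list:
    """Return the table's rows as lists of cell text (hypothetical helper)."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows


# Made-up sample standing in for a table's HTML representation.
sample_html = (
    "<table><tr><th>Item</th><th>Qty</th></tr>"
    "<tr><td>Apples</td><td>3</td></tr></table>"
)
rows = html_table_to_rows(sample_html)
print(rows)
```

Keeping the parser in the standard library avoids extra dependencies; a dedicated HTML library may be a better fit for tables with complex structure.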
Handwriting and multilanguage characters in PDF files
The partitioning strategies also differ in how each strategy handles handwriting and multilanguage characters within PDF files. For example, the Fast partitioning strategy skips handwriting in PDF files altogether. The Fast strategy processes multilanguage characters in PDF files with limited output, depending on the language. In the following example, Japanese hiragana characters are processed as text, but the output can be very difficult to work with:
For handwriting, the High Res strategy typically produces unusable results, for example:
For multilanguage characters, the High Res strategy also typically produces unusable results, for example failing to recognize Japanese hiragana characters:
The VLM strategy can produce great results for handwriting, such as this example that uses GPT-4o by OpenAI:
The VLM strategy also has great support for recognizing multilanguage characters, such as this example that uses GPT-4o by OpenAI to recognize Japanese hiragana characters:
Supported languages
Fast partitioning accepts any text inputs, though automatic language detection of those inputs is restricted to langdetect.
High Res partitioning leverages Tesseract OCR. For the list of languages that Tesseract supports, see: Languages/Scripts supported in different versions of Tesseract.
Language support for VLM depends on the model used. The list of supported languages for a particular model is maintained by that model’s provider. For the list of languages that each model supports, see the following, where provided:
- Anthropic
  - Claude 3.5 Sonnet: Arabic, Bengali, Chinese (Simplified), English, French, German, Hindi, Indonesian, Italian, Japanese, Korean, Portuguese (Brazil), Spanish, Swahili, and Yoruba are mentioned. (Source)
- OpenAI
  - GPT-4o: Arabic, Chinese, English, French, German, Gujarati, Hindi, Italian, Japanese, Korean, Marathi, Persian, Portuguese, Russian, Spanish, Tamil, Telugu, Turkish, Urdu, and Vietnamese are mentioned. (Source)
- Amazon Bedrock
  - Claude 3.5 Sonnet: “English, Spanish, Japanese, and multiple other languages” (Source)
  - Claude 3 Opus: “English, Spanish, Japanese, and multiple other languages” (Source)
  - Claude 3 Haiku: “English, Spanish, Japanese, and multiple other languages” (Source)
  - Claude 3 Sonnet: “English, Spanish, Japanese, and multiple other languages” (Source)
  - Amazon Nova Pro: “200+ languages” (Source)
  - Amazon Nova Lite: “200+ languages” (Source)
  - Meta Llama 3.2 90B Instruct: “English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai” (Source)
  - Meta Llama 3.2 11B Instruct: “English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai” (Source)

