cannot import name 'Pix2StructForConditionalGeneration' and AttributeError in 'AutoProcessor'
#2
by pathikg - opened
I avoided the Pix2Struct import error by using transformers version 4.28.0.dev0. I don't know whether this is the right way, but I couldn't get past the error with the pip-installed release. About the second error: AutoProcessor depends on torchvision, so try installing it if you haven't already.
Pix2Struct was introduced in transformers v4.28.0; you can check the release notes here: https://github.com/huggingface/transformers/releases
To make sure you have the latest version, run: !pip install --upgrade transformers
Once the installation is complete, you should be able to use Pix2Struct in your code.
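If you want to confirm the version before importing, here is a minimal sketch of a check. The helper names supports_pix2struct and installed_transformers_version are made up for this example, not part of transformers. It only compares the major and minor version numbers, so a dev build such as 4.28.0.dev0 counts as new enough.

```python
import re
from importlib import metadata

# Pix2Struct first shipped in transformers v4.28.0 (per the release notes).
MIN_VERSION = (4, 28)


def supports_pix2struct(version: str) -> bool:
    """Return True if a transformers version string is 4.28 or newer.

    Hypothetical helper: only the leading "major.minor" part is compared,
    so suffixes like ".dev0" are ignored.
    """
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return (int(match.group(1)), int(match.group(2))) >= MIN_VERSION


def installed_transformers_version():
    """Return the installed transformers version, or None if it is missing."""
    try:
        return metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_transformers_version()
    if version is None:
        print("transformers is not installed")
    elif supports_pix2struct(version):
        print(f"transformers {version} should include Pix2Struct")
    else:
        print(f"transformers {version} predates Pix2Struct; upgrade to 4.28.0 or later")
```

If the check reports an older version, upgrading as described above and restarting the notebook runtime should make the import available.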

