PP-Chart2Table

This model was released on 2025-05-20 and added to Hugging Face Transformers on 2026-03-18.

PyTorch

Overview

PP-Chart2Table is a state-of-the-art (SOTA) multimodal model developed by the PaddlePaddle team that specializes in chart parsing for both Chinese and English. Its performance is driven by a novel “Shuffled Chart Data Retrieval” training task which, combined with a refined token masking strategy, significantly improves its efficiency in converting charts to data tables. The model is further strengthened by a data synthesis pipeline that uses high-quality seed data, retrieval-augmented generation (RAG), and LLM persona design to create a richer, more diverse training set. To handle large-scale unlabeled, out-of-distribution (OOD) data, the team applied a two-stage distillation process that improves adaptability and generalization on real-world data.

Model Architecture

PP-Chart2Table adopts a multimodal fusion architecture that combines a vision tower for chart feature extraction and a language model for table structure generation, enabling end-to-end chart-to-table conversion.

Usage

Single input inference

The example below demonstrates how to convert a chart image into a data table with PP-Chart2Table using Pipeline; the model can also be loaded through AutoModel.

from transformers import pipeline

pipe = pipeline("image-text-to-text", model="PaddlePaddle/PP-Chart2Table_safetensors")

# PPChart2TableProcessor uses hardcoded "Chart to table" instruction internally via chat template
conversation = [
    {
        "role": "user",
        "content": [
            {
                "type": "image",
                "url": "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/chart_parsing_02.png",
            },
        ],
    },
]
result = pipe(text=conversation)
print(result[0]["generated_text"])
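
The same single-image conversion can be sketched with the AutoModel API. This is a minimal sketch rather than the official example: it assumes the checkpoint loads through AutoModelForImageTextToText (the class the image-text-to-text pipeline relies on) and that the processor's chat template accepts the same conversation format as above. The helper names build_conversation and chart_to_table, and the max_new_tokens value, are illustrative choices, not part of the library.

```python
def build_conversation(image_url):
    """Single-turn conversation holding one chart image (same shape as the Pipeline example)."""
    return [{"role": "user", "content": [{"type": "image", "url": image_url}]}]


def chart_to_table(
    image_url,
    model_id="PaddlePaddle/PP-Chart2Table_safetensors",
    max_new_tokens=1024,  # assumption: an arbitrary generation budget, tune as needed
):
    """Hypothetical helper: load the model and processor, then generate the table text."""
    # Imported lazily so build_conversation works without torch/transformers installed.
    from transformers import AutoModelForImageTextToText, AutoProcessor

    processor = AutoProcessor.from_pretrained(model_id)
    model = AutoModelForImageTextToText.from_pretrained(model_id)

    # The chat template supplies the instruction text; only the image is passed in.
    inputs = processor.apply_chat_template(
        build_conversation(image_url),
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)

    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Keep only the newly generated tokens by slicing off the prompt.
    new_tokens = output_ids[:, inputs["input_ids"].shape[1]:]
    return processor.batch_decode(new_tokens, skip_special_tokens=True)[0]
```

Calling chart_to_table with the demo image URL used above downloads the checkpoint and should return the generated table as a string.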

Batched inference

To run batched inference with PP-Chart2Table, pass a list of conversations instead of a single one. The example below uses Pipeline; the model can also be loaded through AutoModel.

from transformers import pipeline

pipe = pipeline("image-text-to-text", model="PaddlePaddle/PP-Chart2Table_safetensors")

# PPChart2TableProcessor uses hardcoded "Chart to table" instruction internally via chat template
conversation = [
    {
        "role": "user",
        "content": [
            {
                "type": "image",
                "url": "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/chart_parsing_02.png",
            },
        ],
    },
]
result = pipe(text=[conversation, conversation])
print(result[0][0]["generated_text"])
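
When the batch covers several different charts, the conversations can be built from a list of image URLs and the outputs collected in input order. The sketch below assumes the nested result layout shown in the example above (one list per input, whose first item carries "generated_text"); build_batch, collect_texts, and run_batch are hypothetical helper names, not library functions.

```python
def build_batch(image_urls):
    """One single-turn conversation per chart image."""
    return [
        [{"role": "user", "content": [{"type": "image", "url": url}]}]
        for url in image_urls
    ]


def collect_texts(results):
    """Extract the generated text for each input, assuming the layout
    result[i][0]["generated_text"] used in the batched example above."""
    return [per_input[0]["generated_text"] for per_input in results]


def run_batch(image_urls, model_id="PaddlePaddle/PP-Chart2Table_safetensors"):
    """Hypothetical helper: run the pipeline over several chart images."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import pipeline

    pipe = pipeline("image-text-to-text", model=model_id)
    return collect_texts(pipe(text=build_batch(image_urls)))
```

Keeping the conversation building and result unpacking in separate functions makes both easy to check without loading the model.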

PPChart2TableConfig

class transformers.PPChart2TableConfig

(
    transformers_version: str | None = None
    architectures: list[str] | None = None
    output_hidden_states: bool | None = False
    return_dict: bool | None = True
    dtype: typing.Union[str, ForwardRef('torch.dtype'), NoneType] = None
    chunk_size_feed_forward: int = 0
    is_encoder_decoder: bool = False
    id2label: dict[int, str] | dict[str, str] | None = None
    label2id: dict[str, int] | dict[str, str] | None = None
    problem_type: typing.Optional[typing.Literal['regression', 'single_label_classification', 'multi_label_classification']] = None
    vision_config: dict | transformers.configuration_utils.PreTrainedConfig | None = None
    text_config: dict | transformers.configuration_utils.PreTrainedConfig | None = None
    image_token_index: int = 151859
    image_seq_length: int = 576
    tie_word_embeddings: bool = True
)

Parameters

  • vision_config (Union[dict, PreTrainedConfig], optional) — The config object or dictionary of the vision backbone.
  • text_config (Union[dict, PreTrainedConfig], optional) — The config object or dictionary of the text backbone.
  • image_token_index (int, optional, defaults to 151859) — The image token index used as a placeholder for input images.
  • image_seq_length (int, optional, defaults to 576) — Sequence length of one image embedding.
  • tie_word_embeddings (bool, optional, defaults to True) — Whether to tie weight embeddings according to model’s tied_weights_keys mapping.

This is the configuration class to store the configuration of a PP-Chart2Table model. It is used to instantiate a PP-Chart2Table model according to the specified arguments, defining the model architecture. Instantiating a configuration with the defaults will yield a configuration similar to that of the PaddlePaddle/PP-Chart2Table_safetensors checkpoint.

Configuration objects inherit from PreTrainedConfig and can be used to control the model outputs. Read the documentation from PreTrainedConfig for more information.
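
To illustrate how image_token_index and image_seq_length relate, the sketch below follows a pattern common to vision-language models: the prompt reserves image_seq_length placeholder tokens per image, and the embeddings at those positions are later replaced by the vision tower's features. This is a plain-Python illustration under that assumption, not the library's implementation; build_input_ids and merge_image_features are hypothetical helpers, and the token ids and tuple "embeddings" are toy values.

```python
IMAGE_TOKEN_INDEX = 151859  # default image_token_index from PPChart2TableConfig
IMAGE_SEQ_LENGTH = 576      # default image_seq_length from PPChart2TableConfig


def build_input_ids(prefix_ids, suffix_ids, num_images=1):
    """Reserve IMAGE_SEQ_LENGTH placeholder slots per image between two text spans."""
    placeholders = [IMAGE_TOKEN_INDEX] * (IMAGE_SEQ_LENGTH * num_images)
    return list(prefix_ids) + placeholders + list(suffix_ids)


def merge_image_features(input_ids, text_embeds, image_features):
    """Replace the embedding at every placeholder position with an image feature."""
    positions = [i for i, tok in enumerate(input_ids) if tok == IMAGE_TOKEN_INDEX]
    if len(positions) != len(image_features):
        raise ValueError(
            f"{len(positions)} placeholder tokens but {len(image_features)} image features"
        )
    merged = list(text_embeds)
    for pos, feature in zip(positions, image_features):
        merged[pos] = feature
    return merged


# Toy data: two text tokens, one image, one trailing text token.
input_ids = build_input_ids(prefix_ids=[1, 2], suffix_ids=[3])
text_embeds = [("text", tok) for tok in input_ids]
image_features = [("image", i) for i in range(IMAGE_SEQ_LENGTH)]
merged = merge_image_features(input_ids, text_embeds, image_features)

print(len(input_ids))                     # 579 = 2 + 576 + 1
print(merged[1], merged[2], merged[-1])   # ('text', 2) ('image', 0) ('text', 3)
```

The length check mirrors why the two values must agree with the processor: a prompt with the wrong number of placeholders cannot be matched one-to-one with the image features.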

Example:

>>> from transformers import GotOcr2ForConditionalGeneration, PPChart2TableConfig

>>> # Initializing a PPChart2Table style configuration
>>> configuration = PPChart2TableConfig()

>>> # Initializing a model from the PaddlePaddle/PP-Chart2Table_safetensors style configuration
>>> model = GotOcr2ForConditionalGeneration(configuration)  # the underlying architecture is GOT-OCR2

>>> # Accessing the model configuration
>>> configuration = model.config

PPChart2TableImageProcessor

class transformers.PPChart2TableImageProcessor

( **kwargs: typing_extensions.Unpack[transformers.processing_utils.ImagesKwargs] )

Parameters

  • **kwargs (ImagesKwargs, optional) — Additional image preprocessing options. Model-specific kwargs are listed above; see the TypedDict class for the complete list of supported arguments.

Constructs a PPChart2TableImageProcessor image processor.

PPChart2TableImageProcessorPil

class transformers.PPChart2TableImageProcessorPil

( **kwargs: typing_extensions.Unpack[transformers.processing_utils.ImagesKwargs] )

Parameters

  • **kwargs (ImagesKwargs, optional) — Additional image preprocessing options. Model-specific kwargs are listed above; see the TypedDict class for the complete list of supported arguments.

Constructs a PPChart2TableImageProcessorPil image processor.

PPChart2TableProcessor

class transformers.PPChart2TableProcessor

( image_processor = None tokenizer = None chat_template = None **kwargs )

Parameters

  • image_processor (PPChart2TableImageProcessor) — The image processor is a required input.
  • tokenizer (tokenizer_class) — The tokenizer is a required input.
  • chat_template (str) — A Jinja template to convert lists of messages in a chat into a tokenizable string.

Constructs a PPChart2TableProcessor which wraps an image processor and a tokenizer into a single processor.

PPChart2TableProcessor offers all the functionalities of PPChart2TableImageProcessor and tokenizer_class. See PPChart2TableImageProcessor and tokenizer_class for more information.
