Exploring Hugging Face Jobs with a Gradio interface — toward Job Recipes

Community Article · Published March 10, 2026

You may have noticed that Hugging Face recently introduced Jobs, a way to remotely execute Python scripts as UV workloads on managed compute with selectable hardware flavors.
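For context, launching such a workload from Python looks roughly like the sketch below. It assumes the Jobs helpers shipped in recent releases of huggingface_hub; the script name, hardware flavor and environment variable are placeholders.

```python
# Minimal sketch of launching a UV script as a Job, assuming the Jobs helpers
# in a recent huggingface_hub release. Script name, flavor and env are placeholders.
from huggingface_hub import run_uv_job

job = run_uv_job(
    "generate.py",               # a UV script with inline dependency metadata
    flavor="a10g-small",         # hardware flavor to run on (placeholder)
    env={"NUM_PROMPTS": "16"},   # plain environment variables passed to the worker
)
print(job.id)
```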

While quietly experimenting with these new tools, I wanted a nicer UX without having to use the CLI or a terminal, and a practical way to launch and monitor Jobs without constantly jumping between pages.

So I ended up building a Gradio Space around it.

The initial goal was simple: run batch generative workloads — text → audio, image or video models — on remote hardware instead of locally.

But once the first version worked, the interesting part quickly shifted. Running the model itself wasn’t the hard part. Everything around the job was.

Where should prompts live?
Where do outputs go?
How do you track what happened during a run?
How do you explore results afterward?

The Space gradually evolved into an interface designed to keep the entire workflow in one place.

Treating Jobs as runs

Instead of thinking of a Job as just a script execution, the app treats it as a run.

Each run stores its prompts, parameters, worker code and outputs in Hugging Face datasets. Because the structure is predictable, the Space can reconstruct the run afterward, preview results and link outputs back to the prompts that produced them.
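To make that structure concrete, a run's artifacts might look something like the manifest sketched below. The field names are hypothetical, not the Space's actual schema; they only illustrate the idea of keeping prompts, parameters and outputs together.

```python
# Hypothetical run manifest, illustrating the kind of structure a run can expose.
# The actual schema used by the Space may differ; every field here is illustrative.
run_manifest = {
    "run_id": "2026-03-10_12-00-00",            # placeholder identifier
    "model_id": "some-org/some-model",          # placeholder model
    "flavor": "a10g-small",                     # hardware flavor the job ran on
    "parameters": {"num_inference_steps": 30},
    "items": [
        {"prompt": "a quiet harbor at dawn", "output": "outputs/000.png"},
        {"prompt": "a storm over the dunes", "output": "outputs/001.png"},
    ],
}
```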

Jobs are also tracked in real time.

The app continuously polls their status and displays key information such as job state, hardware flavor, runtime, billed runtime and estimated compute cost. All runs launched during the session appear in a Job Explorer, which makes it easy to monitor multiple jobs at once and inspect results as soon as they complete.
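Under the hood, this kind of tracking can be built on the Jobs helpers in huggingface_hub. Here is a rough polling sketch; the state names and the exact shape of the returned object are assumptions to adapt.

```python
# Rough polling sketch, assuming the inspect_job helper from huggingface_hub.
# The shape of the returned status and the state names may differ; adapt as needed.
import time
from huggingface_hub import inspect_job

def wait_for_job(job_id: str, interval: float = 10.0):
    while True:
        info = inspect_job(job_id=job_id)
        stage = getattr(info.status, "stage", info.status)   # be lenient about the shape
        print(f"{job_id}: {stage}")
        if str(stage) not in ("PENDING", "RUNNING"):          # assumed non-terminal states
            return info
        time.sleep(interval)
```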

The idea was simply to keep the information that matters visible from the Space, and reduce context switching.

Building on the Hugging Face ecosystem

Another nice aspect of the project is that it relies almost entirely on existing components of the Hugging Face ecosystem.

The interface runs as a Gradio Space.
Workloads execute through Hugging Face Jobs.
Inputs and outputs are stored in Datasets.
Models can be integrated using Diffusers pipelines and information available in the model card.
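For the Datasets part, for example, pushing a run's artifacts comes down to a single upload call; the repo id and folder layout below are placeholders, not the ones the Space actually uses.

```python
# Sketch: uploading a run's artifacts to a dataset repo with huggingface_hub.
# The repo id and folder layout are placeholders, not the Space's actual ones.
from huggingface_hub import HfApi

api = HfApi()
api.upload_folder(
    folder_path="run_artifacts",              # local folder written by the worker
    repo_id="username/text2any-runs",         # placeholder dataset repo
    repo_type="dataset",
    path_in_repo="runs/2026-03-10_12-00-00",  # one subfolder per run
)
```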

Because these pieces already work well together, it becomes possible to orchestrate an entire workflow while staying inside the Space.

Results can also be explored directly from the dataset artifacts written by the job worker, since they expose the full structure of each run.

Adding models through plugins

While building the app, another pattern quickly appeared: most models require the same ingredients.

A UI describing parameters.
A payload describing the run.
A worker that executes the model.

To make that easier, the Space uses a simple plugin structure.

Each model defines a Gradio UI plugin describing the interface and payload, along with a minimal job worker implementing the model logic and declaring its dependencies.

The worker focuses only on loading the model and generating outputs. Everything else — batching prompts, generating manifests, saving artifacts and uploading results — is handled automatically by the runtime.
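In practice, a worker under these conventions can stay very small. The sketch below is hypothetical: the function names, the payload fields and the way the runtime calls them are assumptions, meant only to show that the worker is limited to loading a model and producing outputs.

```python
# Hypothetical worker sketch: model loading and per-prompt generation only.
# Batching, manifests, artifact saving and uploads are handled around it by the
# runtime, so none of that appears here. All names and fields are illustrative.
import torch
from diffusers import DiffusionPipeline

def load_model(payload: dict):
    """Build the pipeline once, from parameters declared by the UI plugin."""
    pipe = DiffusionPipeline.from_pretrained(payload["model_id"], torch_dtype=torch.float16)
    return pipe.to("cuda")

def generate(pipe, prompt: str, payload: dict):
    """Produce one output for one prompt; where it is stored is up to the runtime."""
    return pipe(prompt, num_inference_steps=payload.get("steps", 30)).images[0]
```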

For each launched job, the exact worker script that was executed is generated automatically and stored alongside the run inputs. It can also be inspected directly from the interface, which makes it easy to see exactly how the job ran.

Once this structure was in place, integrating new models became mostly declarative.

Toward Job Recipes

After integrating several models this way, the structure started to feel very consistent.

Most model integrations follow the same pattern: a plugin defining the interface and payload, and a minimal worker describing the model logic and dependencies.

Because this structure is so predictable, it turns out to be surprisingly easy for a language model to generate fully compatible model plugins and workers automatically.

In practice this has proven to be very robust: once the conventions are known, an LLM can produce workers that fit directly into the system and run correctly inside the Jobs environment.

This starts to hint at an interesting direction: a structure where Job recipes can be generated automatically, using simple conventions that describe how a model should run as a Job.

What comes next

For now the Space focuses on prompt-based generation models: text → audio, image and video.

These models were a natural starting point because they benefit from GPU execution and batch processing.

The next step is to explore how the same structure can apply to other types of inference workloads.

More details about the system and its conventions are available directly in the app for those who want to explore further.

If you're interested in Hugging Face Jobs or want to try adding models, feel free to explore and contribute to the Space.

→ https://huggingface.co/spaces/fffiloni/text2any-batch-hf-jobs-orchestration
