Instructions for using OpenGenerativeAI/Bifrost-R1-32B with libraries and local apps.

Libraries

How to use OpenGenerativeAI/Bifrost-R1-32B with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="OpenGenerativeAI/Bifrost-R1-32B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
# Run the chat and print the generated conversation
print(pipe(messages))

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("OpenGenerativeAI/Bifrost-R1-32B")
model = AutoModelForCausalLM.from_pretrained("OpenGenerativeAI/Bifrost-R1-32B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use OpenGenerativeAI/Bifrost-R1-32B with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "OpenGenerativeAI/Bifrost-R1-32B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "OpenGenerativeAI/Bifrost-R1-32B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
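
The same request can be sent from Python. The following is a minimal sketch using only the standard library, assuming the vLLM server started above is listening on `localhost:8000`. The helper names `build_chat_request` and `chat` are illustrative and not part of vLLM.

```python
import json
import urllib.request

MODEL_ID = "OpenGenerativeAI/Bifrost-R1-32B"


def build_chat_request(prompt, base_url="http://localhost:8000", model=MODEL_ID):
    """Build a POST request for the OpenAI-compatible chat completions route."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, **kwargs):
    """Send the request and return the assistant's reply text."""
    request = build_chat_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # OpenAI-compatible servers return the reply under choices[0].message.content
    return body["choices"][0]["message"]["content"]


# Usage (needs a running server):
# print(chat("What is the capital of France?"))
```

Passing `base_url="http://localhost:30000"` should target a server on the port used by the SGLang commands in this document, since both servers are described here as OpenAI-compatible.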

Use Docker

docker model run hf.co/OpenGenerativeAI/Bifrost-R1-32B

SGLang

How to use OpenGenerativeAI/Bifrost-R1-32B with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "OpenGenerativeAI/Bifrost-R1-32B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "OpenGenerativeAI/Bifrost-R1-32B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "OpenGenerativeAI/Bifrost-R1-32B" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "OpenGenerativeAI/Bifrost-R1-32B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use OpenGenerativeAI/Bifrost-R1-32B with Docker Model Runner:
```
docker model run hf.co/OpenGenerativeAI/Bifrost-R1-32B
```



---
license: apache-2.0
language:
- en
base_model:
- Qwen/Qwen2.5-32B
pipeline_tag: text-generation
library_name: transformers
tags:
- Bifröst
- Bifrost
- code
- reasoning
inference:
  parameters:
    temperature: 0
widget:
- messages:
  - role: user
    content: >-
      Generate secure production code for [task] in python with proper input
      validation, current cryptographic standards, least privilege principles,
      comprehensive error handling, secure logging, and defense-in-depth.
      Include security-focused comments and explain critical security decisions.
      Follow OWASP/NIST standards.
---

## Bifröst-R1-32B (Reasoning)

![image/png](https://cdn-uploads.huggingface.co/production/uploads/64a834a8895fd6416e29576f/sAXfe0cQdULI_GEVxBstw.png)

Bifröst-R1-32B (Reasoning) is a language model built on the Qwen2 architecture (base model: Qwen/Qwen2.5-32B) and fine-tuned for secure, efficient, enterprise-grade code generation with reasoning. It is designed to meet strict standards of safety, accuracy, and reliability, so that organizations can streamline software development workflows while prioritizing security and compliance.

### Model Details
- Model Name: Bifröst-R1-32B
- Base Architecture: qwen2
- Application: Enterprise Secure Code Generation
- Release Date: 08-March-2025

### Intended Use
Bifröst is designed explicitly for:
- Generating secure, efficient, and high-quality code.
- Supporting development tasks within regulated enterprise environments.
- Enhancing productivity by automating routine coding tasks without compromising security.
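
The model card metadata includes a widget prompt with a `[task]` placeholder and sets the temperature to 0. The following is a minimal sketch of filling that placeholder and building an OpenAI-style chat payload. The names `PROMPT_TEMPLATE` and `build_payload` are hypothetical, and reusing the widget settings outside the hosted widget is an assumption rather than a documented recommendation.

```python
MODEL_ID = "OpenGenerativeAI/Bifrost-R1-32B"

# Prompt text copied from the widget example in the model card metadata.
PROMPT_TEMPLATE = (
    "Generate secure production code for [task] in python with proper input "
    "validation, current cryptographic standards, least privilege principles, "
    "comprehensive error handling, secure logging, and defense-in-depth. "
    "Include security-focused comments and explain critical security decisions. "
    "Follow OWASP/NIST standards."
)


def build_payload(task: str) -> dict:
    """Fill the [task] placeholder and build an OpenAI-style chat payload."""
    task = task.strip()
    if not task:
        raise ValueError("task must be a non-empty description")
    return {
        "model": MODEL_ID,
        "messages": [
            {"role": "user", "content": PROMPT_TEMPLATE.replace("[task]", task)}
        ],
        # The metadata sets temperature to 0 for the hosted widget.
        "temperature": 0,
    }


payload = build_payload("a password hashing helper")
print(payload["messages"][0]["content"])
```

The resulting dictionary can be serialized to JSON and posted to an OpenAI-compatible chat completions endpoint.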

### Features
- Security-Focused Training: Specialized training regimen emphasizing secure coding practices, vulnerability reduction, and adherence to security standards.
- Enterprise-Optimized Performance: Tailored to support various programming languages and enterprise frameworks with robust, context-aware suggestions.
- Compliance-Driven Design: Incorporates features to aid in maintaining compliance with industry-specific standards (e.g., GDPR, HIPAA, SOC 2).

### Limitations
- Bifröst should be used under human supervision to ensure code correctness and security compliance.
- Model-generated code should undergo appropriate security and quality assurance checks before deployment.
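
As one illustration of such a check, the sketch below runs a very small static pre-check on generated Python source: it confirms the code parses and flags a few calls that usually deserve a closer look. The deny-list and the names `RISKY_CALLS` and `precheck` are hypothetical choices for this example; this is not a method described by the model authors, and it does not replace human review or dedicated security tooling.

```python
import ast
from typing import List

# Hypothetical deny-list for illustration only.
RISKY_CALLS = {"eval", "exec", "compile", "os.system", "pickle.loads"}


def _call_name(node: ast.Call) -> str:
    """Return a dotted name such as 'os.system' for simple call targets."""
    func = node.func
    if isinstance(func, ast.Name):
        return func.id
    if isinstance(func, ast.Attribute) and isinstance(func.value, ast.Name):
        return f"{func.value.id}.{func.attr}"
    return ""


def precheck(source: str) -> List[str]:
    """Return findings for generated Python source; empty means no flags."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"syntax error on line {exc.lineno}: {exc.msg}"]
    findings = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Call) and _call_name(node) in RISKY_CALLS:
            findings.append(f"line {node.lineno}: call to {_call_name(node)}")
    return findings


generated = "import os\nos.system('ls')\n"
for finding in precheck(generated):
    print(finding)
```

An empty result only means that none of the listed patterns were found; it says nothing about whether the code is secure.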

### Ethical Considerations
- Users are encouraged to perform regular audits and compliance checks on generated outputs.
- Enterprises should implement responsible AI practices to mitigate biases or unintended consequences.