Quickstart

Get started with Standard Model Biomedicine’s biological world model. This guide walks you through setting up your environment and downloading SMB-v1-Structure.

New: This quickstart now features SMB-v1-Structure — our flagship JEPA-based multimodal foundation model for oncology.

Prerequisites

GPU support is strongly recommended. SMB-v1-Structure (1.7B parameters) requires approximately 16GB GPU memory for inference.

Before you begin, ensure you have the following installed:

Python 3.10+ — Required for running the models
pip — Python package manager
CUDA (recommended) — For GPU acceleration with NVIDIA GPUs
Git — For cloning repositories
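The prerequisites above can be checked from Python before installing anything. The following is a minimal sketch, not part of the official quickstart script; the function name `check_prerequisites` is hypothetical, and it only tests whether each tool is on your `PATH`.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 10)):
    """Return a dict mapping each prerequisite name to True or False."""
    return {
        # Compare the running interpreter against the minimum version.
        "python": sys.version_info >= min_python,
        # Either executable name counts as pip being installed.
        "pip": shutil.which("pip") is not None or shutil.which("pip3") is not None,
        "git": shutil.which("git") is not None,
        # nvidia-smi ships with the NVIDIA driver; it is optional (GPU only).
        "nvidia-smi": shutil.which("nvidia-smi") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name:12s} {'found' if ok else 'MISSING'}")
```

A missing `nvidia-smi` entry is not fatal: it only suggests that no NVIDIA driver is installed and that the CPU-only install path applies.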

One-Command Setup (Recommended)

Run the quickstart script to automatically configure your environment:

curl -fsSL https://raw.githubusercontent.com/standardmodelbio/quickstart/main/quickstart.sh -o quickstart.sh && source quickstart.sh

This script will:

Create a Python 3.10 virtual environment named standard_model
Install PyTorch with CUDA support (if available)
Install HuggingFace libraries (transformers, datasets, accelerate)
Install SMB utilities (smb-biopan-utils)
Download the Standard Model to your local machine

Manual Installation

If you prefer to set up your environment manually:

Create Virtual Environment

python3 -m venv standard_model
source standard_model/bin/activate

Install PyTorch

CUDA 12.x

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

CUDA 11.x

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

CPU Only

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
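Choosing between the three install commands can be scripted. The sketch below maps a CUDA version string to one of the index URLs listed in this section; the helper names `pytorch_index_url` and `pip_command` are hypothetical, and the mapping covers only the three options shown here.

```python
from typing import Optional

# Index URLs copied from the three install commands in this section.
INDEX_URLS = {
    "12": "https://download.pytorch.org/whl/cu121",
    "11": "https://download.pytorch.org/whl/cu118",
    None: "https://download.pytorch.org/whl/cpu",
}


def pytorch_index_url(cuda_version: Optional[str]) -> str:
    """Map a CUDA version such as '12.2' (or None for CPU) to an index URL."""
    major = cuda_version.split(".")[0] if cuda_version else None
    if major not in INDEX_URLS:
        raise ValueError(f"No install command listed for CUDA {cuda_version}")
    return INDEX_URLS[major]


def pip_command(cuda_version: Optional[str]) -> str:
    """Build the full pip command for the given CUDA version."""
    return (
        "pip install torch torchvision torchaudio --index-url "
        + pytorch_index_url(cuda_version)
    )


if __name__ == "__main__":
    print(pip_command("12.2"))
    print(pip_command(None))
```

Note that the CUDA version reported by `nvidia-smi` is the highest version the driver supports, so treat it as an upper bound when picking a wheel.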

Install Dependencies

pip install transformers datasets accelerate huggingface_hub pandas
pip install git+https://github.com/standardmodelbio/smb-biopan-utils.git

Download SMB-v1-Structure

from huggingface_hub import snapshot_download

snapshot_download("standardmodelbio/SMB-v1-1.7B-Structure")

Environment Activation

After setup, activate the environment in each new shell session before using the models:

source standard_model/bin/activate
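To confirm that the environment is actually active, you can compare the interpreter's prefixes from Python. This is a small sketch with hypothetical helper names; it relies only on the standard behavior that `sys.prefix` differs from `sys.base_prefix` inside a `venv`.

```python
import sys
from pathlib import Path


def in_virtualenv() -> bool:
    """True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def active_env_name() -> str:
    """Name of the directory the interpreter is running from."""
    return Path(sys.prefix).name


if __name__ == "__main__":
    if in_virtualenv():
        print(f"Active virtual environment: {active_env_name()}")
    else:
        print("No virtual environment is active.")
```

If the environment was created as shown above, the reported name should be `standard_model`.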

Verify Installation

Verify that everything is working correctly:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Check PyTorch and CUDA
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")

# Load SMB-v1-Structure
model_id = "standardmodelbio/SMB-v1-1.7B-Structure"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    device_map="auto"
)

print("SMB-v1-Structure loaded successfully!")
print(f"Model parameters: {sum(p.numel() for p in model.parameters()):,}")

SMB-v1-Structure is a world model, not a text generator. It predicts patient states in latent space, not tokens. See the Embeddings Guide for the full workflow.

Download Other Models

Download additional models from the Standard Model family:

from huggingface_hub import snapshot_download

snapshot_download("standardmodelbio/smb-ehr-4b")

Troubleshooting

CUDA Not Detected

Ensure your NVIDIA drivers are up to date. Run nvidia-smi to verify that the GPU is accessible.
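When `torch.cuda.is_available()` returns `False`, the cause is usually either a CPU-only PyTorch build or a missing driver. The sketch below separates the two cases; the function names and messages are illustrative, not official diagnostics. It uses `torch.version.cuda`, which is `None` on CPU-only builds.

```python
import shutil
from typing import Optional


def diagnose_cuda(torch_cuda_build: Optional[str],
                  cuda_available: bool,
                  driver_tool_found: bool) -> str:
    """Return a short hint explaining why CUDA may not be detected."""
    if cuda_available:
        return "CUDA is available."
    if torch_cuda_build is None:
        return "PyTorch is a CPU-only build; reinstall from a CUDA wheel index."
    if not driver_tool_found:
        return "nvidia-smi was not found; install or update the NVIDIA driver."
    return "Driver and CUDA build are present; check driver/CUDA compatibility."


def diagnose_current_environment() -> str:
    """Collect the three inputs from the live environment (needs torch)."""
    import torch  # imported lazily so the helper above works without torch

    return diagnose_cuda(
        torch.version.cuda,
        torch.cuda.is_available(),
        shutil.which("nvidia-smi") is not None,
    )
```

Call `diagnose_current_environment()` inside the activated environment to get the hint for your machine.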

Out of Memory

SMB-v1-Structure requires ~16GB GPU memory. On smaller GPUs, load the model in half precision (torch.float16 or torch.bfloat16) or with quantization, as described under Reducing Memory Usage.

Model Access Denied

Some models may require authentication. Run huggingface-cli login with your token.
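Authentication can also be done from Python, which is convenient in notebooks and scripts. This sketch assumes you have stored your token in the `HF_TOKEN` environment variable; the wrapper name `login_from_env` is hypothetical, while `login` is the function exported by `huggingface_hub`.

```python
import os


def login_from_env(var_name: str = "HF_TOKEN") -> bool:
    """Log in to the Hugging Face Hub using a token from the environment.

    Returns False without contacting the Hub when the variable is unset.
    """
    token = os.environ.get(var_name)
    if not token:
        return False
    # Imported lazily so the check above works even before installation.
    from huggingface_hub import login

    login(token=token)
    return True


if __name__ == "__main__":
    if not login_from_env():
        print("HF_TOKEN is not set; run `huggingface-cli login` instead.")
```

Keeping the token in an environment variable avoids hard-coding it in source files.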

Slow Download

Model downloads can be large (several GB). Ensure you have a stable connection and sufficient disk space.
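A quick disk-space check before downloading can save a failed transfer. The sketch below uses only the standard library; the 20 GB threshold is an illustrative assumption rather than a documented requirement, and the home directory is checked on the assumption that the download cache lives under it.

```python
import shutil
from pathlib import Path


def has_free_space(path: str, required_gb: float) -> bool:
    """True when the filesystem containing `path` has enough free space."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 1024**3


if __name__ == "__main__":
    target = str(Path.home())  # assumption: the cache lives under home
    required_gb = 20           # hypothetical threshold, adjust as needed
    if has_free_space(target, required_gb):
        print(f"At least {required_gb} GB free under {target}.")
    else:
        print(f"Less than {required_gb} GB free under {target}.")
```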

Reducing Memory Usage

For GPUs with less memory, use half-precision or quantization:

import torch
from transformers import AutoModelForCausalLM

# Load in bfloat16: 16-bit storage with the same exponent range as float32
model = AutoModelForCausalLM.from_pretrained(
    "standardmodelbio/SMB-v1-1.7B-Structure",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

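# Or use 8-bit quantization (requires bitsandbytes)
# pip install bitsandbytes
from transformers import BitsAndBytesConfig

model = AutoModelForCausalLM.from_pretrained(
    "standardmodelbio/SMB-v1-1.7B-Structure",
    trust_remote_code=True,
    # Passing load_in_8bit=True directly is deprecated in recent transformers releases
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto"
)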
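To see why lower precision helps, it is useful to estimate how much memory the weights alone occupy: parameter count times bytes per parameter. The sketch below is back-of-the-envelope arithmetic with a hypothetical helper name; it ignores activations, buffers, and framework overhead, so actual usage will be higher than these figures.

```python
# Bytes used to store one parameter in each data type.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2, "int8": 1}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Approximate memory for model weights only, in GB (10^9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    params = 1.7e9  # SMB-v1-Structure parameter count stated in this guide
    for dtype in ("float32", "bfloat16", "int8"):
        print(f"{dtype:9s} ~{weight_memory_gb(params, dtype):.1f} GB")
```

For 1.7B parameters this gives roughly 6.8 GB in float32, 3.4 GB in bfloat16, and 1.7 GB in int8. The gap between these numbers and the stated 16 GB requirement is the room needed for everything other than the weights.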

Hardware Requirements

Model	Parameters	Minimum GPU Memory	Recommended GPU Memory
SMB-v1-Structure	1.7B	16 GB	32 GB
SMB-EHR-4B	4B	24 GB	48 GB
SMB-Vision-Base	97.2M	4 GB	8 GB
SMB-Language-8B	8B	32 GB	80 GB
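The table can be turned into a quick lookup that tells you which models a given GPU can run. This is an illustrative sketch: the numbers are copied from the table above, and the function name `models_that_fit` is hypothetical.

```python
# (minimum GB, recommended GB) copied from the hardware requirements table.
REQUIREMENTS = {
    "SMB-v1-Structure": (16, 32),
    "SMB-EHR-4B": (24, 48),
    "SMB-Vision-Base": (4, 8),
    "SMB-Language-8B": (32, 80),
}


def models_that_fit(gpu_memory_gb: float) -> dict:
    """Map each runnable model to 'recommended' or 'minimum'."""
    result = {}
    for name, (minimum, recommended) in REQUIREMENTS.items():
        if gpu_memory_gb >= recommended:
            result[name] = "recommended"
        elif gpu_memory_gb >= minimum:
            result[name] = "minimum"
        # Models below their minimum are left out of the result.
    return result


if __name__ == "__main__":
    for name, level in models_that_fit(24).items():
        print(f"{name}: meets {level} requirement")
```

With a 24 GB GPU, for example, SMB-v1-Structure and SMB-EHR-4B meet their minimum requirement and SMB-Vision-Base meets its recommended one, while SMB-Language-8B is excluded.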

Next Steps

Embeddings

Extract patient embeddings with dummy data examples.

Linear Probing

Train classifiers on embeddings for prediction tasks.

Models

Explore the full model catalog and capabilities.

HuggingFace

Browse all models on HuggingFace.
