An open description standard for defining cross-platform, multi-format AI models.


What is model.yaml?

AI models are often available in multiple formats and variants, while different machines support diverse engines such as llama.cpp and MLX. This diversity can make it challenging for end users to reason about the choices available to them.

model.yaml addresses this by providing a standard description format that defines a model along with multiple possible sources (potentially in different formats). It delegates to the client program (e.g. LM Studio) the responsibility of choosing the most suitable variant to download and the appropriate engine to run it with, while presenting simplified information to the user.

Models in LM Studio's model catalog are all defined using model.yaml.

model: org/model-name
base:
  - key: hf-repo/model-name-instruct-gguf
    sources:
      - type: huggingface
        user: hf-user
        repo: Model-Name-Instruct-GGUF
  - key: hf-repo/model-name-instruct-MLX
    sources:
      - type: huggingface
        user: another-hf-user
        repo: Model-Name-Instruct-MLX
metadataOverrides:
  compatibilityTypes:
    - gguf
    - safetensors

Specification Draft 1.0

The model.yaml format defines a structured way to specify AI models, their concrete sources, configurations, and metadata. Feel free to contribute to this open standard as it evolves.

You can find a TypeScript implementation of the specification in lmstudio-js.

Core Fields

model Required

The identifier for the model in the format organization/name. This determines where the model will be published and how it's referenced.

model: qwen/qwen3-8b

base Required

Defines the underlying model(s) that this virtual model points to. Can be either:

  • A string referencing another virtual model, forming a chain
  • An array of concrete model specifications with download sources

base:
  - key: lmstudio-community/qwen3-8b-gguf
    sources:
      - type: huggingface
        user: lmstudio-community
        repo: Qwen-3-8B-GGUF
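
For the string form, base references another virtual model by its identifier, forming a chain; the identifier below is illustrative, not a real published model:

```yaml
# Chain to another virtual model by identifier (hypothetical name)
base: another-org/another-virtual-model
```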

Configuration & Metadata

metadataOverrides Optional

Overrides metadata inherited from the base model(s). This helps platforms understand the model's capabilities.

metadataOverrides:
  domain: llm
  architectures:
    - llama
  compatibilityTypes:
    - gguf
    - safetensors
  paramsStrings:
    - 1B
  minMemoryUsageBytes: 1000000000
  contextLengths:
    - 131072
  trainedForToolUse: mixed
  vision: false

domain

The domain type of the model (e.g., llm, embedding).

architectures

Array of model architecture names (e.g., llama, qwen2).

compatibilityTypes

Array of format types the model supports (e.g., gguf, safetensors).

paramsStrings

Human-readable parameter size labels (e.g., 1B, 7B).

minMemoryUsageBytes

Minimum memory, in bytes, required to load the model.

contextLengths

Array of supported context window sizes.

trainedForToolUse

Whether the model supports tool use (true, false, or mixed).

vision

Whether the model supports processing images (true, false, or mixed).

config Optional

Built-in configuration presets for the model, applied either at load time or during operation (runtime).

config:
  operation:
    fields:
      - key: llm.prediction.topKSampling
        value: 20
      - key: llm.prediction.minPSampling
        value:
          checked: true
          value: 0
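
Since presets can also apply at load time, a load section can sit alongside operation. A minimal sketch, assuming load mirrors the operation structure and that llm.load.contextLength is a valid field key:

```yaml
# Hypothetical load-time preset; key name assumed, not confirmed by this spec draft
config:
  load:
    fields:
      - key: llm.load.contextLength
        value: 8192
```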

Customization

customFields Optional

User-configurable options that affect the model's behavior. Each field can trigger effects like changing variables or modifying the system prompt.

customFields:
  - key: enableThinking
    displayName: Enable Thinking
    description: Enable the model to think before answering.
    type: boolean
    defaultValue: true
    effects:
      - type: setJinjaVariable
        variable: enable_thinking

key

Unique identifier for the field.

displayName

Human-readable name shown in UI.

description

Explains the field's purpose.

type

Data type (boolean or string).

defaultValue

Initial value.

effects

Array of effects to apply when the field's value is set (e.g. setJinjaVariable).
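
Since type can also be string, a string-typed field works the same way; this is a hedged sketch with hypothetical key, variable, and default value, not part of any published model:

```yaml
# Hypothetical string-typed custom field; all names are illustrative
customFields:
  - key: responseLanguage
    displayName: Response Language
    description: Language the model should answer in.
    type: string
    defaultValue: English
    effects:
      - type: setJinjaVariable
        variable: response_language
```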

Complete Example

model: qwen/qwen3-8b
base:
  - key: lmstudio-community/qwen3-8b-gguf
    sources:
      - type: huggingface
        user: lmstudio-community
        repo: Qwen-3-8B-GGUF
metadataOverrides:
  domain: llm
  architectures:
    - llama
  compatibilityTypes:
    - gguf
    - safetensors
  paramsStrings:
    - 1B
  minMemoryUsageBytes: 1000000000
  contextLengths:
    - 131072
  trainedForToolUse: mixed
  vision: false
config:
  operation:
    fields:
      - key: llm.prediction.topKSampling
        value: 20
      - key: llm.prediction.minPSampling
        value:
          checked: true
          value: 0
customFields:
  - key: enableThinking
    displayName: Enable Thinking
    description: Enable the model to think before answering.
    type: boolean
    defaultValue: true
    effects:
      - type: setJinjaVariable
        variable: enable_thinking