Skip to content

Latest commit

 

History

History
79 lines (61 loc) · 1.3 KB

File metadata and controls

79 lines (61 loc) · 1.3 KB
title:max.nn.kv_cache
type:module
lang:python
wrapper_class:rst-module-autosummary

max.nn.kv_cache

.. automodule:: max.nn.kv_cache
   :no-members:

.. currentmodule:: max.nn.kv_cache

Cache configuration

.. autosummary::
   :nosignatures:
   :toctree: generated
   :template: autosummary/class.rst

   KVCacheBuffer
   KVCacheParamInterface
   KVCacheParams
   KVCacheQuantizationConfig
   KVConnectorType
   KVCacheMemory
   MultiKVCacheParams
   ReplicatedKVCacheMemory

Cache inputs

.. autosummary::
   :nosignatures:
   :toctree: generated
   :template: autosummary/class.rst

   KVCacheInputs
   KVCacheInputsPerDevice
   BatchCharacteristics
   PagedCacheValues

Attention dispatch

.. autosummary::
   :nosignatures:
   :toctree: generated
   :template: autosummary/class.rst

   AttentionDispatchResolver
   AttnKey
   MHAAttnKey
   MLAAttnKey

Metrics

.. autosummary::
   :nosignatures:
   :toctree: generated
   :template: autosummary/class.rst

   KVCacheMetrics

Functions

.. autosummary::
   :nosignatures:
   :toctree: generated
   :template: autosummary/function.rst

   build_max_lengths_tensor
   compute_max_seq_len_fitting_in_cache
   compute_num_device_blocks
   compute_num_host_blocks
   estimated_memory_size