vllm.v1.spec_decode
Modules:
| Name | Description |
|---|---|
| eagle | |
| extract_hidden_states | |
| medusa | |
| metrics | |
| ngram_proposer | |
| ngram_proposer_gpu | GPU-accelerated N-gram proposer using fully async PyTorch tensor operations. |
| suffix_decoding | |
| utils | |
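
For readers unfamiliar with the term, the general idea behind an N-gram proposer can be sketched in a few lines of plain Python. This is a minimal illustration, not the code in `ngram_proposer` or `ngram_proposer_gpu`: the function name `propose_ngram_draft`, its parameters, and the choice to prefer the most recent match are all assumptions made for this example. It takes the last `ngram_size` tokens of the context, looks for an earlier occurrence of that same sequence, and proposes the tokens that followed it as draft tokens.

```python
from typing import List, Optional


def propose_ngram_draft(
    token_ids: List[int],
    ngram_size: int = 2,
    num_draft_tokens: int = 3,
) -> Optional[List[int]]:
    """Hypothetical helper: propose draft tokens via n-gram lookup in the context.

    Returns the tokens that followed an earlier occurrence of the trailing
    n-gram, or None if there is no earlier occurrence.
    """
    if ngram_size <= 0 or len(token_ids) <= ngram_size:
        return None

    # The n-gram to search for is the tail of the current context.
    suffix = token_ids[-ngram_size:]

    # Scan earlier start positions, most recent first (an assumption of this
    # sketch). The range excludes the position of the suffix itself.
    for start in range(len(token_ids) - ngram_size - 1, -1, -1):
        if token_ids[start:start + ngram_size] == suffix:
            end = start + ngram_size
            return token_ids[end:end + num_draft_tokens]
    return None


context = [5, 6, 7, 8, 9, 5, 6]
draft = propose_ngram_draft(context, ngram_size=2, num_draft_tokens=3)
print(draft)  # [7, 8, 9]
```

In speculative decoding, draft tokens like these are then checked by the target model, which keeps only the prefix it agrees with. A GPU variant would presumably express the same matching with batched tensor operations instead of a Python loop.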