vllm.compilation.monitor
monitor_profiling_run
monitor_profiling_run() -> Generator[None, None, None]
Context manager that times the initial profiling run.
Asserts that no backend compilation occurs during the profiling run (all compilation should have completed before this point).
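The behavior described above (time a block, then check that no backend compilation happened inside it) can be sketched without importing vLLM. This is a minimal illustration, not the code in vllm/compilation/monitor.py: the `counter` object, its `num_backend_compilations` attribute, the `timings` dict, and the name `monitor_profiling_run_sketch` are all hypothetical stand-ins introduced here.

```python
import contextlib
import time
from collections.abc import Generator


class _Counter:
    """Hypothetical stand-in for a counter of backend compilations."""

    num_backend_compilations = 0


counter = _Counter()
timings: dict[str, float] = {}  # hypothetical place to record elapsed time


@contextlib.contextmanager
def monitor_profiling_run_sketch() -> Generator[None, None, None]:
    # Snapshot the counter so any compilation inside the block is detectable.
    before = counter.num_backend_compilations
    start = time.perf_counter()
    yield
    # Reached only if the wrapped block finished without raising.
    timings["profiling_run"] = time.perf_counter() - start
    assert counter.num_backend_compilations == before, (
        "backend compilation occurred during the profiling run"
    )


with monitor_profiling_run_sketch():
    pass  # the profiling run would go here
```

Comparing a before/after snapshot keeps the check independent of how many compilations happened earlier; only compilations that occur inside the `with` block trip the assertion.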
Source code in vllm/compilation/monitor.py
monitor_torch_compile
monitor_torch_compile(
    vllm_config: VllmConfig,
    message: str = "torch.compile took %.2f s in total",
) -> Generator[None, None, None]
Context manager that times torch.compile and manages depyf debugging.
On normal exit, it logs the total compile time using message and then exits depyf. If an exception is raised, it cleans up depyf without logging, because compilation failed.
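The two exit paths can be illustrated with a generic context manager. This is a hedged sketch under stated assumptions, not vLLM's implementation: `_start_debug_session` and `_stop_debug_session` are hypothetical placeholders for entering and leaving a depyf debugging session, the `events` list exists only to make the ordering visible, and the `vllm_config` parameter is omitted.

```python
import contextlib
import logging
import time
from collections.abc import Generator

logger = logging.getLogger("compile_monitor_sketch")
events: list[str] = []  # records the order of actions, for illustration only


def _start_debug_session() -> None:
    """Hypothetical placeholder for entering a depyf debugging session."""
    events.append("debug_enter")


def _stop_debug_session() -> None:
    """Hypothetical placeholder for leaving the debugging session."""
    events.append("debug_exit")


@contextlib.contextmanager
def monitor_compile_sketch(
    message: str = "torch.compile took %.2f s in total",
) -> Generator[None, None, None]:
    start = time.perf_counter()
    _start_debug_session()
    try:
        yield
    except Exception:
        # Failure path: clean up the debug session, skip the timing log.
        _stop_debug_session()
        raise
    else:
        # Success path: log the elapsed time, then leave the debug session.
        logger.info(message, time.perf_counter() - start)
        events.append("logged")
        _stop_debug_session()


with monitor_compile_sketch():
    pass  # compilation would happen here
```

Using `try`/`except`/`else` rather than `finally` keeps the logging strictly on the success path while still guaranteeing cleanup on both paths.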