Torch-TensorRT JIT Backend API
TorchTensorRTJitBackend
aitune.torch.backend.TorchTensorRTJitBackend
Bases: Backend
Torch-TensorRT through torch.compile(backend="torch_tensorrt").
The backend does not use intermediate formats, and the compiled model is not stored.
Initialize TorchTensorRTJitBackend.
Parameters:
-
config (TorchTensorRTJitBackendConfig | None, default: None) – Configuration for torch.compile with torch_tensorrt
Source code in aitune/torch/backend/torch_tensorrt_jit_backend.py
device
property
Get the device of the backend.
Returns:
-
device – The device the module is using.
activate
Activates backend.
After activating, the backend should be ready to do inference.
Source code in aitune/torch/backend/backend.py
build
Build the model with the given arguments.
Building a backend should be idempotent, i.e. it must not cause side effects. A model is not necessarily purely functional and can have internal state (like the KV cache of an LLM). That is why build may call the model on a sample of inputs at most once, so that subsequent calls start from exactly the same state as the first call for the given sample.
After building, the backend should be activated.
Source code in aitune/torch/backend/backend.py
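The build contract above can be illustrated with a toy stateful model (hypothetical names, not the aitune implementation): because every call mutates internal state, the warm-up sample must run at most once, so repeated builds leave the model exactly as a single build would.

```python
class ToyKVCacheModel:
    """Toy model with internal state, like an LLM's KV cache."""

    def __init__(self):
        self.cache = []  # grows as a side effect of every call

    def __call__(self, token):
        self.cache.append(token)  # internal state mutates on each call
        return sum(self.cache)


def build(model, sample):
    """Idempotent build: run the warm-up sample at most once."""
    if not getattr(model, "_built", False):
        model(sample)        # single warm-up call
        model._built = True  # guard against repeated warm-ups
    return model


model = ToyKVCacheModel()
build(model, sample=1)
build(model, sample=1)  # second build is a no-op: state unchanged
```

Without the guard, the second build would grow the cache again and subsequent inference would start from a different state.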
deactivate
Deactivates backend.
After deactivating, the backend cannot be used to do inference.
Source code in aitune/torch/backend/backend.py
deploy
Deploys the backend.
After deploying, the backend is ready to do inference and can no longer be deactivated.
Parameters:
-
device (device | None) – The device to deploy the backend on.
Source code in aitune/torch/backend/backend.py
describe
from_dict
classmethod
Creates a backend from a state_dict.
Source code in aitune/torch/backend/torch_tensorrt_jit_backend.py
infer
Run inference with the given arguments.
Parameters:
-
args (Any, default: ()) – Variable length argument list.
-
kwargs (Any, default: {}) – Arbitrary keyword arguments.
Returns:
-
Any (Any) – The result of the inference.
Source code in aitune/torch/backend/backend.py
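The signature above suggests that infer simply forwards positional and keyword arguments to the underlying module; a minimal sketch of that pattern (hypothetical, not the actual backend code):

```python
class ToyBackend:
    """Minimal sketch of an infer() that forwards args/kwargs."""

    def __init__(self, module):
        self.module = module

    def infer(self, *args, **kwargs):
        # Forward variable positional and keyword arguments unchanged
        return self.module(*args, **kwargs)


backend = ToyBackend(lambda x, scale=1: x * scale)
result = backend.infer(3, scale=2)  # → 6
```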
key
to_dict
Returns the state_dict of the backend.
Source code in aitune/torch/backend/torch_tensorrt_jit_backend.py
TorchTensorRTJitBackendConfig
aitune.torch.backend.TorchTensorRTJitBackendConfig
dataclass
TorchTensorRTJitBackendConfig(compile_config=(lambda: TorchTensorRTConfig(enabled_precisions={float16}))(), fullgraph=False, dynamic_shapes=None, autocast_enabled=False, autocast_dtype=None)
Bases: BackendConfig
Configuration for torch.compile(backend="torch_tensorrt").
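The rendered default `(lambda: TorchTensorRTConfig(enabled_precisions={float16}))()` is how a dataclass default_factory appears in generated docs. A hedged sketch of the same pattern with stand-in types (the `TorchTensorRTConfig` below is a placeholder, and the string `"float16"` stands in for the real dtype):

```python
from dataclasses import dataclass, field


@dataclass
class TorchTensorRTConfig:
    """Stand-in for the real compile config (assumed fields)."""
    enabled_precisions: set = field(default_factory=lambda: {"float16"})


@dataclass
class TorchTensorRTJitBackendConfig:
    """Sketch mirroring the documented defaults."""
    compile_config: TorchTensorRTConfig = field(
        default_factory=lambda: TorchTensorRTConfig(enabled_precisions={"float16"})
    )
    fullgraph: bool = False
    dynamic_shapes: object = None
    autocast_enabled: bool = False
    autocast_dtype: object = None


cfg = TorchTensorRTJitBackendConfig()
```

A default_factory is required here because the default is a mutable object: each config instance must get its own compile_config rather than share one.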
describe
Describes the backend configuration, displaying only changed fields.
Source code in aitune/torch/backend/torch_tensorrt_jit_backend.py
from_dict
classmethod
Convert dict to TorchTensorRTJitBackendConfig.
key
Returns the keys of the backend configuration.
to_dict
Convert TorchTensorRTJitBackendConfig to dict.
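The to_dict/from_dict pair forms a serialization roundtrip; a toy sketch of that pattern using dataclasses.asdict (illustrative only, the real methods may differ):

```python
from dataclasses import asdict, dataclass


@dataclass
class ToyBackendConfig:
    """Stand-in config to illustrate the to_dict/from_dict roundtrip."""
    fullgraph: bool = False
    autocast_enabled: bool = False

    def to_dict(self):
        # Recursively convert dataclass fields to a plain dict
        return asdict(self)

    @classmethod
    def from_dict(cls, state):
        # Rebuild the config from its dict representation
        return cls(**state)


cfg = ToyBackendConfig(fullgraph=True)
restored = ToyBackendConfig.from_dict(cfg.to_dict())
```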
to_json
Saves the backend configuration as JSON to a file.