TorchAO Backend API
TorchAOBackend
aitune.torch.backend.TorchAOBackend
Bases: Backend
Backend that applies TorchAO quantization.
Supported quantizations
- int8wo
- int8dq
- fp8wo
- fp8dq
To use a custom quantization, pass in a quantization config.
Initializes backend.
Parameters:
- config (TorchAOBackendConfig | None, default: None) – The configuration to use.
Source code in aitune/torch/backend/torchao_backend.py
device
property
Get the device of the backend.
Returns:
- device – The device the module is using.
activate
Activates backend.
After activating, the backend should be ready to do inference.
Source code in aitune/torch/backend/backend.py
build
Build the model with the given arguments.
Building a backend should be idempotent, i.e. it should not cause side effects. A model is not necessarily purely functional and can have internal state (such as a KV cache for LLMs). For that reason, build may call the model on a given sample of inputs at most once, so that subsequent calls see exactly the same state as the first call for that sample.
After building, the backend should be activated.
Source code in aitune/torch/backend/backend.py
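The at-most-once rule for sample inputs can be illustrated with a minimal, self-contained sketch. ToyStatefulModel and ToyBackend are hypothetical names and are not part of aitune; the sketch only shows why running the same sample twice during a build would leave a stateful model in a different state, and one possible way to guard against it.

```python
from typing import Hashable, Set


class ToyStatefulModel:
    """Hypothetical model with internal state, standing in for e.g. a KV cache."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, x: int) -> int:
        # Every call mutates the internal state.
        self.calls += 1
        return x + self.calls


class ToyBackend:
    """Hypothetical backend; an assumption for illustration, not the aitune code."""

    def __init__(self, model: ToyStatefulModel) -> None:
        self.model = model
        self._seen: Set[Hashable] = set()

    def build(self, sample: int) -> None:
        # Run each sample at most once; repeating a build with the same
        # sample is a no-op, so the model state does not drift.
        if sample in self._seen:
            return
        self.model(sample)
        self._seen.add(sample)


model = ToyStatefulModel()
backend = ToyBackend(model)
backend.build(1)
backend.build(1)  # same sample again: the model is not called a second time
print(model.calls)
```

Remembering which samples were already used is only one way to satisfy the rule; snapshotting and restoring the model state would be another.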
deactivate
Deactivates backend.
After deactivating, the backend cannot be used to do inference.
Source code in aitune/torch/backend/backend.py
deploy
Deploys the backend.
After deployment, the backend is ready for inference and can no longer be deactivated.
Parameters:
- device (device | None) – The device to deploy the backend on.
Source code in aitune/torch/backend/backend.py
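The rules stated for activate, deactivate and deploy can be read as a small state machine. The following is a rough sketch under that reading; State and ToyLifecycle are hypothetical names, and the error handling is an assumption rather than the behavior of the real Backend class.

```python
from enum import Enum, auto


class State(Enum):
    CREATED = auto()
    ACTIVE = auto()
    INACTIVE = auto()
    DEPLOYED = auto()


class ToyLifecycle:
    """Hypothetical lifecycle model; not the aitune Backend implementation."""

    def __init__(self) -> None:
        self.state = State.CREATED

    def activate(self) -> None:
        # A deployed backend is already ready for inference; keep it deployed.
        if self.state is not State.DEPLOYED:
            self.state = State.ACTIVE

    def deactivate(self) -> None:
        # Assumption: refusing to deactivate a deployed backend raises an error.
        if self.state is State.DEPLOYED:
            raise RuntimeError("a deployed backend cannot be deactivated")
        self.state = State.INACTIVE

    def deploy(self) -> None:
        self.state = State.DEPLOYED

    def infer(self, x: int) -> int:
        # Inference is only allowed while active or deployed.
        if self.state not in (State.ACTIVE, State.DEPLOYED):
            raise RuntimeError(f"cannot run inference in state {self.state.name}")
        return x  # placeholder for the real computation


lifecycle = ToyLifecycle()
lifecycle.activate()
print(lifecycle.infer(3))
lifecycle.deploy()
print(lifecycle.state.name)
```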
describe
from_dict
classmethod
Creates a backend from a state_dict.
Source code in aitune/torch/backend/torchao_backend.py
infer
Run inference with the given arguments.
Parameters:
- args (Any, default: ()) – Variable length argument list.
- kwargs (Any, default: {}) – Arbitrary keyword arguments.
Returns:
- Any – The result of the inference.
Source code in aitune/torch/backend/backend.py
key
to_dict
Returns the state_dict of the backend.
Source code in aitune/torch/backend/torchao_backend.py
TorchAOBackendConfig
aitune.torch.backend.TorchAOBackendConfig
dataclass
Bases: BackendConfig
Configuration for TorchAOBackend.
__post_init__
Post init for TorchAOBackendConfig.
Source code in aitune/torch/backend/torchao_backend.py
describe
Returns the description of the backend.
Source code in aitune/torch/backend/torchao_backend.py
from_dict
classmethod
key
Returns the key of the backend configuration.
Source code in aitune/torch/backend/torchao_backend.py
to_dict
to_json
Saves the backend configuration to a file.
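A to_dict / from_dict / to_json trio like the one listed above typically forms a round trip. The sketch below shows that pattern with a stand-in dataclass. ToyConfig, its quantization field and the path argument of to_json are all assumptions made for illustration, since the real fields and signatures of TorchAOBackendConfig are not given here.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class ToyConfig:
    """Hypothetical stand-in for a backend configuration dataclass."""

    quantization: str = "int8wo"  # assumed field name, for illustration only

    def to_dict(self) -> dict:
        # Convert the dataclass fields into a plain dictionary.
        return asdict(self)

    @classmethod
    def from_dict(cls, data: dict) -> "ToyConfig":
        # Rebuild the configuration from a dictionary produced by to_dict.
        return cls(**data)

    def to_json(self, path: Path) -> None:
        # Save the dictionary form to a file as JSON.
        path.write_text(json.dumps(self.to_dict(), indent=2))


config = ToyConfig(quantization="fp8dq")
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "config.json"
    config.to_json(path)
    restored = ToyConfig.from_dict(json.loads(path.read_text()))

print(restored == config)
```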