ONNX Runtime GenAI is a library built on top of the ONNX Runtime engine that provides optimised inference for generative AI models, particularly LLMs. This component makes ONNX Runtime GenAI usable from 4D through a server CLI and a generic class-based API that reports results via callback functions.