# GitHub – EricLBuehler/mistral.rs: Swift LLM Inference

Mistral.rs accelerates LLM inference with versatility.

Mistral.rs is a speedy and flexible platform designed to enhance inference for large language models. This platform is equipped to work with a variety of devices, incorporating efficient quantization and encourages usage through straightforward applications. Embracing compatibility with the OpenAI API through an HTTP server and Python bindings, Mistral.rs accommodates several models such as Llama, Phi, and Qwen. Furthermore, it integrates accelerators like Apple Metal and CUDA, alongside features that streamline operations like continuous batching and prefix caching. As an open-source initiative, Mistral.rs invites collaboration and code contributions from the community.

Read more: [

Github

](https://github.com/EricLBuehler/mistral.rs)