About
Latest AI News
Free Prompt Course
- Prompting 101
- Prompt Techniques
  - Chain-of-Thought

# GitHub – EricLBuehler/mistral.rs: Swift LLM Inference

Apr 30, 2024

—

Mistral.rs accelerates LLM inference with versatility.

Mistral.rs is a speedy and flexible platform designed to enhance inference for large language models. This platform is equipped to work with a variety of devices, incorporating efficient quantization and encourages usage through straightforward applications. Embracing compatibility with the OpenAI API through an HTTP server and Python bindings, Mistral.rs accommodates several models such as Llama, Phi, and Qwen. Furthermore, it integrates accelerators like Apple Metal and CUDA, alongside features that streamline operations like continuous batching and prefix caching. As an open-source initiative, Mistral.rs invites collaboration and code contributions from the community.