Mistral-7B-v0.1 is a small yet powerful model adaptable to many use cases. Mistral 7B outperforms Llama 2 13B on all benchmarks, has natural coding abilities, and supports an 8k sequence length. It is released under the Apache 2.0 license.
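As a reference point for evaluators, here is a minimal sketch of running Mistral-7B-v0.1 through the Hugging Face transformers library; the model ID is the public checkpoint, while the dtype and device settings are assumptions for a single-GPU setup.

```python
# Minimal sketch: generate text with Mistral-7B-v0.1 via transformers.
# Assumes a GPU with enough memory for half-precision 7B weights.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to reduce memory use
    device_map="auto",          # place layers on available devices
)

# A base model completes text rather than following chat instructions.
inputs = tokenizer("def fibonacci(n):", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```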
BLOOM-560m is a transformer-based language model developed by BigScience, designed to facilitate research in large language models (LLMs). It serves as a pre-trained base model capable of generating human-like text in multiple languages.
Granite-3.1-3B-A800M-Base is a state-of-the-art language model developed by IBM, designed to handle complex natural language processing tasks with high efficiency. This model employs a sparse Mixture of Experts (MoE) architecture, activating roughly 800 million of its 3 billion parameters per token, as the "A800M" in its name indicates.
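Several of the Granite models listed here use the same sparse Mixture of Experts design, so a toy sketch may help clarify it: a router selects the top-k experts for each token, so only a small "active" fraction of the total parameters runs on each forward pass. The sizes and routing scheme below are illustrative assumptions, not IBM's actual configuration.

```python
# Toy sketch of sparse Mixture-of-Experts routing (illustrative sizes).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoE(nn.Module):
    def __init__(self, d_model=256, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)  # scores each expert per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                              # x: (tokens, d_model)
        scores = self.router(x)                        # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1) # keep top-k experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e               # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

moe = SparseMoE()
print(moe(torch.randn(10, 256)).shape)  # torch.Size([10, 256])
```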
Microsoft Azure's Phi 3 model redefines large-scale language model capabilities in the cloud.
BLOOM-1b1 is a multilingual language model developed by the BigScience Workshop, designed to generate human-like text across 48 languages. As a transformer-based model, it utilizes a decoder-only architecture.
BLOOM-1b7 is a transformer-based language model developed by the BigScience Workshop, designed to generate human-like text across 48 languages. As a scaled-down variant of the larger BLOOM model, it offers similar multilingual capabilities at a lower computational cost.
BLOOM-3B is a 3-billion-parameter multilingual language model developed by the BigScience initiative. As a scaled-down version of the larger BLOOM model, it maintains the same architecture and training data as its larger counterpart.
BLOOM-7B1 is a multilingual language model developed by BigScience, designed to generate human-like text across 48 languages. With over 7 billion parameters, it leverages a transformer-based architecture.
Granite-3.1-1B-A400M-Base is a language model developed by IBM's Granite Team, designed to handle extensive context lengths up to 128K tokens. This model is based on a decoder-only sparse Mixture of Experts (MoE) architecture, with roughly 400 million active parameters per token.
Granite-3.2-2B-Instruct is a 2-billion-parameter language model developed by IBM's Granite Team, designed to handle a wide range of instruction-following tasks. Built upon its predecessor, Granite-3.1-2B-Instruct, it adds enhanced reasoning capabilities.
Granite-3.2-8B-Instruct is an 8-billion-parameter AI model fine-tuned for advanced reasoning tasks. Built upon its predecessor, Granite-3.1-8B-Instruct, it has been trained using a combination of permissively licensed open-source data and synthetic data.
Granite-3.3-2B-Instruct is a 2-billion-parameter language model developed by IBM's Granite Team, designed to enhance reasoning and instruction-following capabilities. With a context length of 128K tokens, it can process long documents and extended multi-turn conversations.
Granite-3.3-8B-Instruct is an advanced language model developed by IBM's Granite Team, featuring 8 billion parameters and a 128K context length, and fine-tuned for enhanced reasoning and instruction-following.
Granite-4.0-Tiny-Preview is a 7-billion-parameter fine-grained hybrid mixture-of-experts (MoE) instruction-following model developed by IBM's Granite Team, fine-tuned from the Granite-4.0-Tiny-Base-Preview base model described below.
Granite-4.0-Tiny-Base-Preview is a 7-billion-parameter hybrid mixture-of-experts (MoE) language model developed by IBM's Granite Team. It features a 128,000-token context window and utilizes the Mamba-2 architecture in combination with transformer layers.