Adaptation of Mistral 7B to follow instructions in Spanish

October 26, 2023

Mistral is a 7.3 billion parameter language model that has surpassed the previous open-source state of the art set by Meta's Llama 2, demonstrating performance superior to models several times its size and positioning itself as the most efficient open-source language model to date.

Building on this solid foundation, at Clibrain we have focused on a gap we are especially well prepared to address: optimizing the model for the Spanish-speaking world. A rigorous evaluation process confirms that we have maintained Mistral's exceptional performance while ensuring it functions fully in Spanish.

Overtaking the best open-source models to date

Until this new model appeared, Meta held the top positions in evaluations of open-source language models with Llama 2, released in three sizes (7B, 13B, and 70B).

Mistral 7B surpasses the 13B version of Llama 2 (almost twice its size) across all benchmarks. It also outperforms the 34B version of Llama 1 in code, mathematics, and reasoning.

This demonstrates Mistral 7B's remarkable efficiency: it outperforms models with far more parameters, which means fewer computational resources are needed without compromising quality.

Speed and efficiency at their best

By combining two attention mechanisms, Mistral achieves low latency and the ability to handle longer text sequences (a larger context).

The Sliding Window Attention (SWA) mechanism (Child et al., Beltagy et al.) handles considerably long sequences with ease. Each layer attends only to a fixed window of recent tokens, but because transformer layers are stacked, upper layers can still draw on information from tokens well beyond the window, giving them access to context further back in the sequence.
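
To make the mechanism concrete, here is a minimal sketch in PyTorch (not Mistral's actual implementation) of the boolean mask that sliding window attention applies: each token attends directly only to the most recent `window` positions, and older information reaches it indirectly through the stacked layers.

```python
import torch

def sliding_window_mask(seq_len: int, window: int) -> torch.Tensor:
    """True where query position i may attend to key position j:
    causal (j <= i) and within the window (j > i - window)."""
    i = torch.arange(seq_len).unsqueeze(1)  # query positions (column)
    j = torch.arange(seq_len).unsqueeze(0)  # key positions (row)
    return (j <= i) & (j > i - window)

# With window=4, token 9 attends directly to tokens 6..9 only;
# after k stacked layers its receptive field spans roughly k * 4 past tokens.
print(sliding_window_mask(seq_len=10, window=4))
```

Mistral 7B uses a window of 4,096 tokens, so after only a few layers the effective attention span already covers tens of thousands of positions.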

The Grouped Query Attention (GQA) technique (Ainslie et al.) enables faster inference by letting groups of query heads share a single set of key/value heads. This shrinks the key/value cache and the memory bandwidth needed during decoding, so the model produces predictions more quickly and with fewer resources.
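
The sketch below is a didactic toy (again, not Mistral's code) showing the core idea: many query heads share a few key/value heads, so in this example the cache kept in memory during generation holds 2 heads instead of 8.

```python
import torch

def grouped_query_attention(q, k, v):
    """Toy GQA: q has n_heads query heads, k/v have n_kv_heads < n_heads.
    Each KV head serves a whole group of query heads, shrinking the KV cache."""
    n_heads, n_kv_heads = q.shape[1], k.shape[1]
    group = n_heads // n_kv_heads
    # Expand each KV head across its group of query heads.
    k = k.repeat_interleave(group, dim=1)
    v = v.repeat_interleave(group, dim=1)
    scores = (q @ k.transpose(-2, -1)) / q.shape[-1] ** 0.5
    return scores.softmax(dim=-1) @ v

q = torch.randn(1, 8, 16, 32)  # 8 query heads
k = torch.randn(1, 2, 16, 32)  # only 2 KV heads need to be cached
v = torch.randn(1, 2, 16, 32)
print(grouped_query_attention(q, k, v).shape)  # torch.Size([1, 8, 16, 32])
```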

Together, GQA and SWA allow Mistral 7B to handle sequence lengths of up to 16,000 tokens with low latency while using 50% less memory.

Mistral 7B, now in Spanish

At Clibrain, we have a clear mission: developing artificial intelligence for the more than 600 million Spanish speakers in the world. As with our previous adaptations and releases, we want the Spanish-speaking community to benefit from the latest advances in a technology still dominated by English.

We have therefore fine-tuned Mistral 7B to follow instructions in Spanish, allowing the Spanish-speaking community to interact with the model in its own language.
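
We have not detailed our exact training recipe here, but to give a flavor of the approach, the following is a minimal sketch of supervised fine-tuning with LoRA adapters on top of the instruct model. The dataset file, hyperparameters, and the choice of LoRA itself are illustrative assumptions, not a description of our pipeline.

```python
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

base = "mistralai/Mistral-7B-Instruct-v0.1"
tok = AutoTokenizer.from_pretrained(base)
tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained(base, device_map="auto")

# Train small LoRA adapters instead of all 7B weights.
model = get_peft_model(model, LoraConfig(
    r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM"))

# Hypothetical JSONL file of Spanish instruction/response pairs rendered to text.
data = load_dataset("json", data_files="instrucciones_es.jsonl")["train"]
data = data.map(lambda row: tok(row["text"], truncation=True, max_length=1024))

Trainer(
    model=model,
    args=TrainingArguments("mistral-7b-es", per_device_train_batch_size=2,
                           num_train_epochs=1, learning_rate=2e-4),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
).train()
```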

We maintain the original performance (on par)

Using the MT-Bench evaluation framework (Zheng et al.), we compared the original instruction-tuned model, Mistral 7B Instruct, against our adaptation for following instructions in Spanish.

In this evaluation (MT-Bench in Spanish), our adaptation scores 6.84, compared to the original model's 7.05. However, the original Mistral model may respond in English or other languages when given instructions in Spanish. With our adaptation, the model responds in Spanish 100% of the time while keeping performance on par with the original.

When analyzing the two models' performance across categories (capabilities), our Spanish adaptation stands out in writing, where control of the language is essential.

Resources and download links

The adaptation of Mistral 7B, along with the rest of our open-source models, is available for free at hf.co/clibrain.
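
Once downloaded, the model can be used like any other Hugging Face causal language model. The repo id below is illustrative; check hf.co/clibrain for the exact model name.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "clibrain/mistral-7b-instruct-es"  # illustrative id; see hf.co/clibrain
tok = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, device_map="auto")

prompt = "Explica en dos frases qué es la atención de ventana deslizante."
inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=120)
print(tok.decode(out[0], skip_special_tokens=True))
```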
