The New Era of Mobility: Local SLMs as the Key to Security and Reliability

AI? LLMs?? SLMs!? On-device LLMs??? What are these and how are they going to provide you with this security and reliability we mentioned in the title?

Well, hang on a minute or two. We’re just getting there.

Ok, we bet it's the millionth time you've read the term Artificial Intelligence (AI). And it's no wonder, as AI has been a buzzword for many years. In the last decade, we've witnessed a tremendous increase in operational efficiency, significant advancements across every sector (technical and not so much), and a plethora of innovations. All of it tied directly to Artificial Intelligence's slow, regal, and peaceful pace, which is now turning into a clear, confident trot as it knocks over everything in its path.

The development of AI technology enables greater capability and functionality across mobile devices and personal computers alike. Improved human-computer interaction, prompt engineering, reinforcement learning, the stunning ability to generate content with or without internet access, better performance, and a wide array of new user experiences designed to satisfy the growing public demand: all of it makes us dizzy. But what are these things? What is a base model?

So, as we delve deeper into the AI landscape, trying to figure out what it is, we must acknowledge and understand its distinct forms. And among the emerging trends, generative AI, which is a large subset of AI, has shown immense potential in reshaping industries. Not least the realm of mobile apps. But how does it differ from traditional AI? And what do the acronyms LLM and SLM stand for?

Let's unpack these questions together and see how AI, and SLMs for that matter, can bring more security to your app, and what other wonders there are under AI's hood. Read on to learn not only what AI is, but also what benefits using LLMs could bring to your mobile app, and how their smaller counterparts, SLMs, could be very helpful in a business context.

What is AI and Generative AI?

There have been thousands of definitions already, so we won’t be reinventing the wheel. If you want to read a more technical definition, then just check out the glossary we are preparing for you with the explanations of all the technical terms used in this article. So, what is AI?

Put very simply, AI is a computer trained to think and learn as humans do. Artificial Intelligence is basically about helping machines pick up knowledge, learn from experience, and eventually make decisions based on what they've learned. Moreover, the expectation is that, over time, they will be able to correct their own mistakes.

What we call 'traditional' AI is an elaborate algorithm focused on performing a specific, well-defined task, limited in scope, in an intelligent manner. These 'algorithms' are systems created to respond meaningfully to a particular set of inputs, the training data, if you will. They are able to learn from the data fed into them and then make decisions, or even predictions, based on that data.

Think computer chess. Your opponent knows the rules of the game, is able to predict your moves, and makes its own based on a predefined strategy and various scenarios built into it. The program you're playing against is not inventing any new ways to play chess. Nor should it. It's selecting from a host of strategies it was programmed with. That's an example of such a traditional, simple AI. Voice assistants like Alexa or Siri, Google's search algorithm, and the recommendation engines on Amazon, Netflix, or Spotify are all examples of AI that has not yet evolved into its generative version, a significant step forward. These AIs have been trained to follow specific rules, do a particular job, or act according to scenarios gamed out beforehand. And they do these things fairly well. However, they certainly don't innovate.

Generative AI models, in contrast to traditional AI, do create new stuff; they generate content. You can think of it as the next generation of artificial intelligence: the ultimate inanimate creative artist and inventor. What generative AI does certainly feels like magic at times. It can whip up a brand-new story that reads like something written by Tove Jansson. It can compose a piece of music that sounds like a long-forgotten sonata by Mozart and stirs your emotions without ever having touched an instrument. Designing new clothes, coming up with unusual recipes, finding combinations of medications previously unheard of: almost nothing is beyond the power of GenAI.

A good example of Generative AI would be asking GPT-4 to create a brand-new story after giving it the following starting line: "The Moomin family had been living for some weeks in the valley where they had found their house after the dreadful flood (which is another story)." It will take that line and easily generate a whole story, complete with characters, plot twists, development, and a thrilling climax. The resulting story is unlikely to beat the original 'Comet in Moominland', but AI is getting there.

Generative AI models are trained on a set of data and learn the underlying patterns to generate new data that mirrors the training set. GPT-4, OpenAI’s language prediction model, would be a prime example of generative AI. Trained on millions of pages on the internet, it can produce human-like text that is occasionally almost indistinguishable from a text written by a person. If you add some human feedback to it, the result keeps getting better and better.

What is the main difference between traditional AI and generative AI? Well, it lies in their capabilities and application. Traditional AI is for analyzing data and making predictions. Generative AI goes a step further. It is used to create new data, similar to its training data.

Put differently, traditional AI is a master of pattern recognition, while generative AI is adept at pattern creation. Traditional AI can analyze data and tell you what it sees, but generative AI can use that same data to create something entirely new.

Enter LLMs: large language models. These are the 'star performers' in the world of AI creativity. They are the brains behind the scenes, and they are especially good with words. LLMs have basically read a lot, so they 'know' how to use words fairly well. Unlike the simple AI we described above, which is something of a super-smart assistant for simple and repetitive tasks, GenAI goes way further. It can bring entirely new and original ideas to life, especially when it comes to playing with words.

To describe LLMs in a more technical way, we'll have to add a few more technical terms, so please brace yourself. First, we'll need machine learning, a field of artificial intelligence focused on statistical algorithms that generalize from data. What machine learning is trying to achieve is the ability to analyze data so that, eventually, the system can perform tasks without explicit instruction.

Then comes Natural Language Processing (NLP), a field at the intersection of computer science and linguistics. This is where machine processing of human language takes place, and where advanced language models, which can learn without explicit direction, are created. A language model is a probabilistic NLP model able to perform many tasks previously considered 'human-only', such as translation, grammar correction, and text generation. These models used to rely on purely statistical methods.

With the advent of Neural Networks, this is no longer the case. The preferred approach is now neural networks: computer programs mimicking the neuron structure of the human brain. As these are capable of handling complex pattern recognition, current language models are usually built on neural networks.

Large Language Models are machine learning models that learn from a huge number of data sources and then perform general-purpose language generation. They are still language models, but it is the huge number of parameters learned by the neural network that makes them count as large. In layperson's terms, the model is able to predict the next words from the given input words very well, almost like a human being.
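To make that 'predict the next word' idea concrete, here is a deliberately toy sketch. A real LLM derives a probability for every word in its vocabulary from the context using a neural network trained on billions of examples; the hard-coded numbers below exist purely to illustrate the mechanism.

```python
# Toy illustration of next-word prediction. A real LLM computes a probability
# for every word in its vocabulary from the context; here the distribution is
# hard-coded purely to show the idea.
next_word_probs = {
    "valley": 0.41,
    "forest": 0.22,
    "sea": 0.19,
    "house": 0.18,
}
context = "The Moomin family had been living for some weeks in the"
predicted = max(next_word_probs, key=next_word_probs.get)
print(f"{context} {predicted}")  # -> "... in the valley"
```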

Examples of LLM tasks? Think language translation, chatbots, question answering, and many more. From any input sequence, the model can identify relationships between the words and generate output that fits the instruction.

Take almost any Generative AI product and you'll see that it is powered by LLMs. Big products like ChatGPT, Google's Gemini, and many more use LLMs as the basis of their product.

Now that we’ve figured it out, let’s explore what role Artificial Intelligence plays in modern society.

The Role of Artificial Intelligence in Modern Society

The role AI is playing now is hard to overestimate. According to some of the latest statistics:

  • Next Move Strategy Consulting estimates the value of the market for artificial intelligence (AI) to be more than 100 billion U.S. dollars. And it is expected to grow twentyfold by 2030, up to nearly two trillion U.S. dollars.
  • The AI market is expanding at a CAGR of 38.1% between 2022 and 2030.
  • The global market for artificial intelligence (AI) in transport is projected to grow at a compound annual growth rate (CAGR) of 17.5% from 2020 to 2027.
  • 80% of logistics companies believe that AI will become a critical part of their business within 10 years.
  • 37% of logistics professionals already use AI in their operations.
  • AI could save the logistics industry $1.5 trillion by 2025.
  • By 2025, as many as 97 million people will work in the AI space.
  • 83% of companies claim that AI is a top priority in their business plans.
  • Netflix makes $1 billion annually from automated personalized recommendations.
  • 48% of businesses use some form of AI to utilize big data effectively.
  • 38% of medical providers use computers as part of their diagnosis.

These are examples of AI being extremely important in large and crucial sectors of society’s life—healthcare and logistics. But what about using it on a smaller scale? Can we have AI in the applications we install on our phones? Can we have AI at our fingertips all the time? The answer is ‘yes’!

That's where SLMs can be, and already are, employed.

In our mobile-centric era, AI has seamlessly woven itself into the fabric of our daily lives, making sophisticated technology not just accessible but indispensable to getting the most important things done. It's a testament to how the most impactful innovations are often those that fit snugly into the palm of our hands, quietly shaping our interactions, decisions, and connections with the world around us.

Can language models be used on mobile devices? 

LLMs vs SLMs

Large language models, such as the recent GPT-4, can't really be used on mobile devices, namely smartphones, as they require too much processing power. Does that mean that, due to hardware limitations, using language models on mobile devices is impossible? Can we not do advanced AI tasks on our phones and tablets? Yes, we can. How? You guessed it: SLMs.

Small language models are considerably less resource-intensive than LLMs. SLMs are less complex and have fewer parameters than large language models (LLMs), making them faster and less costly to use. LLMs, such as GPT-4, can definitely generate more advanced and complex responses thanks to training on vast datasets, but they require far more resources and time to operate. Sure, SLMs may be less accurate and versatile, but this can be mended.

In our research we looked at three models, Phi-2, Gemma 2B, and Mistral 7B, assessing how they work on mobiles to see what we can expect and how these AI models can be used in real life on our mobile devices. We also compared how these three perform against the GPT-4 model. This comparison will give a clearer picture of their capabilities.

SLMs we explored

The first SLM we picked was Phi-2, released by Microsoft on December 12, 2023. It's a 2.7 billion-parameter language model with "outstanding reasoning and language understanding capabilities, showcasing state-of-the-art performance among base language models with less than 13 billion parameters". According to Microsoft, this model matches or even outperforms models up to 25x larger, thanks to new innovations in model scaling and training data curation.

The second SLM we chose for our little experiment is Gemma 2B, released by Google in February 2024. Gemma's size makes it an ideal candidate to run on mobile devices and laptops. It comes in a basic pre-trained version and an instruction-tuned version; the latter was fine-tuned to be useful in conversational settings, i.e. to be a chatbot. With only about 2 billion parameters, Gemma 2B is so small compared to other state-of-the-art models that it is often called a prime example of a small language model (SLM).

The third SLM of choice is Mistral AI's Mistral 7B. This one is a whopping 7.3-billion-parameter model that outperforms Meta's Llama 2 on a host of benchmarks.

How we successfully launched an SLM on a mobile device

Interface Selection

We chose llama.cpp for its compatibility with Android, iOS, and PCs, ensuring seamless model communication across platforms.
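On Android and iOS, llama.cpp is compiled into the app as a native library, while on a PC the quickest way to sanity-check a model is through a binding such as llama-cpp-python. Below is a minimal sketch of that desktop path; the model file name, path, and prompt are placeholders, not our exact setup.

```python
# Minimal sketch: loading a quantized GGUF model with llama-cpp-python,
# the Python binding of llama.cpp. File name and path are hypothetical.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/phi-2.Q4_K_M.gguf",  # any GGUF export of Phi-2, Gemma 2B, or Mistral 7B
    n_ctx=2048,    # context window in tokens
    n_threads=4,   # CPU threads; tune to the machine or device
)

result = llm(
    "Summarize in two sentences: Johannes Gutenberg introduced movable type printing to Europe.",
    max_tokens=128,
    temperature=0.7,
)

print(result["choices"][0]["text"])
```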

Model Selection

We focused on models of roughly 7 billion parameters or fewer for broader device compatibility, selecting the popular Mistral 7B and the smaller Phi-2 and Gemma to compare their performance against larger counterparts.

Local Testing

Using llama.cpp and Python Langchain, which is a framework for developing applications powered by large language models, we tested the models locally to understand their configurations and operations better.
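As a rough illustration of what such a local test can look like, here is a sketch using LangChain's LlamaCpp wrapper with a simple summarization prompt. The import paths depend on your LangChain version, and the model path and prompt wording are our own placeholders rather than the exact configuration from our tests.

```python
# Sketch of a local test: LangChain's LlamaCpp wrapper around a GGUF model,
# driven by a reusable summarization prompt template. Illustrative only.
from langchain_community.llms import LlamaCpp
from langchain_core.prompts import PromptTemplate

llm = LlamaCpp(
    model_path="./models/gemma-2b-it.Q4_K_M.gguf",  # hypothetical local path
    n_ctx=2048,
    temperature=0.7,
    max_tokens=256,
)

prompt = PromptTemplate.from_template(
    "Summarize the following text in three sentences:\n\n{text}"
)

chain = prompt | llm  # LangChain Expression Language pipeline
summary = chain.invoke({"text": "Johannes Gutenberg (1398-1468) was a German goldsmith..."})
print(summary)
```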

Simulator Testing

During iPhone simulator tests, we encountered and quickly resolved several configuration issues, especially the ones with complex queries, thanks to the experience gained from local testing.

Real-Device Testing

We successfully ran the models on an actual device, testing pre-prepared prompts and queries to evaluate performance.

Small Language Models – Is bigger always better?

Sure, when it comes to language models, size often matters a lot. But is a bigger size always the best option? Definitely not. Many other aspects matter when making the decision, sometimes more than the sheer number of parameters. When choosing a model, pay attention to the available resources, the budget, the desired or optimal response time, and the complexity and accuracy of the responses you expect from the model.

So which aspects matter? How do you choose the right model for your needs? What’s the difference between LLMs and SLMs? To answer these questions, we made a neat table for you. Let’s take a look at some of the aspects we find worth singling out.

* SLMs and LLMs pitted against each other.

The size of the model impacts computational resources and affects the model's complexity, latency, and storage requirements in your infrastructure. The bigger the number of parameters, the better the performance. At a higher cost and slower speed, though. Fun fact: a 4 GB movie recorded on an iPhone at 4K resolution and 30 frames per second would last about 11 minutes.

Resources are the hardware and infrastructure required for the SLM to function. Security refers to the risk of data leakage. And vision analysis means image-processing quality (e.g., object detection, image classification, and general visual understanding).

So, the benefits of using a local model are clear. When opting for an SLM, users can be sure about the security of their data, since it is stored locally on the device. This local storage means that the data is not transmitted over the internet or stored on external servers, which significantly reduces the risk of data breaches or unauthorized access. However, it should be noted that, at the time of publishing this article, not all phones can handle even a small SLM, so this should be double-checked on a case-by-case basis.

However, it's only a matter of a few years before most devices are powerful enough to run any SLM app smoothly. Until that time, we can implement the functionality on the server, thereby forgoing offline features. Still, even such a compromise is miles ahead of OpenAI's solutions in terms of cost and security. If the model is on our server, we can be sure that the data we send to it will not be used to train a publicly available model.
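As an example of that server-side compromise, llama.cpp ships with a small HTTP server that exposes an OpenAI-compatible endpoint, so the mobile app only needs to send a request to infrastructure you control. A minimal client-side sketch, assuming a hypothetical internal host; the way the server might be started is shown as a comment, and the binary name varies between llama.cpp versions.

```python
# Sketch: calling a self-hosted llama.cpp server instead of a third-party API.
# The server could be started with something along the lines of:
#   ./llama-server -m mistral-7b-instruct.Q4_K_M.gguf --host 0.0.0.0 --port 8080
# Host name, port, and model file are placeholders for your own infrastructure.
import requests

response = requests.post(
    "https://llm.example.internal:8080/v1/chat/completions",
    json={
        "messages": [
            {"role": "system", "content": "You are a concise assistant."},
            {"role": "user", "content": "Summarize this meeting note: ..."},
        ],
        "max_tokens": 200,
        "temperature": 0.7,
    },
    timeout=60,
)
print(response.json()["choices"][0]["message"]["content"])
```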

From Theory to Practice: SLMs and GPT-4 in Business Contexts

To see how Small Language Models stack up against Large Language Models, we used the same prompts in the SLMs we chose and compared the results with what ChatGPT generated for each prompt.

Put simply, prompts are the information, sentences, or questions that you enter into any GenAI tool to get a response from it. Prompts strongly influence the quality of the outputs you receive. After you enter a prompt, the AI model analyzes your input and generates a response based on the patterns it has learned through its training.

So, using the same prompts across all the models we used, we aimed to demonstrate how the lines between large and small models are blurring, especially for concise prompts, showing that SLMs are becoming increasingly competitive.

Using SLMs to summarize messages, meeting notes, or any long text on our mobile phones helps make long texts short and clear. This is really useful for working faster and getting to the important info quickly, which matters a lot in busy work settings.

We selected three AI cases and compared the results using different models. The cases we analyzed are:

  • Summarization 
  • Birthday wishes content generation
  • Leave requests content generation

Summarization

Imagine that before a business meeting, a colleague shared meeting notes with you via chat. Overwhelmed with work, you couldn’t read everything. Just before talking to a client, you quickly used an AI chat feature to summarize your colleague’s message, helping you get the key information needed for the meeting. That’s just one example of the many out there of how AI can make your work more efficient.

The original message we used is as follows:

Johannes Gutenberg (1398 – 1468) was a German goldsmith and publisher who introduced printing to Europe. His introduction of mechanical movable type printing to Europe started the Printing Revolution and is widely regarded as the most important event of the modern period. It played a key role in the scientific revolution and laid the basis for the modern knowledge-based economy and the spread of learning to the masses.
Gutenberg's many contributions to printing are: the invention of a process for mass-producing movable type, the use of oil-based ink for printing books, adjustable molds, and the use of a wooden printing press. His truly epochal invention was the combination of these elements into a practical system that allowed the mass production of printed books and was economically viable for printers and readers alike.
In Renaissance Europe, the arrival of mechanical movable type printing introduced the era of mass communication which permanently altered the structure of society. The relatively unrestricted circulation of information—including revolutionary ideas—transcended borders, and captured the masses in the Reformation. The sharp increase in literacy broke the monopoly of the literate elite on education and learning and bolstered the emerging middle class.

The results we’ve got across the models:

Phi-2:

John Gutenberg (1398–1468), the first modern printer, has a huge impact on society because he invented movable type printing, oil-based ink, adjustable molds, and wooden press, which were the fundamental elements for mass-producing printed books. The Printing Revolution led by him changed the society structure in Renaissance Europe, which allowed for information circulation across borders, spread revolutionary ideas, and supported learning for the masses.

Gemma:

Johannes Gutenberg was a German goldsmith and publisher who introduced printing to Europe. He was a key figure in the Printing Revolution and is widely regarded as the most important event of the modern period. His contributions to printing include the invention of a process for mass-producing movable type, the use of oil-based ink for printing books, adjustable molds, and the use of a wooden printing press.

Mistral 7B:

Johannes Gutenberg invented mechanical movable type printing, which started the Printing Revolution in Europe. He introduced oil-based ink for printing books, adjustable molds, and a wooden printing press. The combination of these elements allowed for mass production of printed books, making it economically viable for printers and readers alike. The invention of mechanical movable type printing permanently altered the structure of society by introducing mass communication, capturing the masses in the Reformation, increasing literacy, and breaking the monopoly of the literate elite on education and learning, which bolstered the emerging middle class

GPT-4:

Johannes Gutenberg, a German goldsmith and publisher, introduced printing to Europe, sparking the Printing Revolution, a pivotal event in the modern period. His inventions in printing encompassed a mass-production process for movable type, the introduction of oil-based ink, adjustable molds, and a wooden printing press. This revolutionary system facilitated economically viable mass production of books. The advent of this technology in Renaissance Europe initiated an era of mass communication, encouraging a surge in literacy rates, the spread of innovative ideas across borders, and bolstering the emerging middle class.
* Summarization of a text done by the SLMs we selected.

This is how AI fares with a very mechanical task. Read on to see how it deals with a more creative one.

Birthday wishes content generation

Processing the existing text is easy, so let’s try doing something more creative. Let’s assume you get a notification on the company messenger that your colleague has a birthday today. Many people struggle with composing wishes. Here, AI can help us by generating personalized birthday greetings based on a few pieces of information we provide about the recipient.

We can't share the exact prompts we used here or elsewhere, as they are the private property of Applandeo, and we are going to use them in many future projects that we'll share with you.

The results we’ve got across the models:

Phi-2:
Dear Bryan, 
It’s hard to find words to express how much I value our friendship! As we celebrate your special day, I’m reminded of all the amazing moments we’ve shared together – inside jokes, heart-to-heart conversations, and countless adventures. I’m thankful to have you as my best buddy! I hope your birthday brings you all the joy and happiness you deserve! May it be filled with love and laughter! Can’t wait for all our future adventures and shared experiences! Cheers to another year of friendship and fun! 
Don’t let time pass you by! 
Your bestie, 
[Your Name]
Gemma:

Hey Bryan,

Happy birthday! It’s a day that deserves a celebratory shout-out! You know, the kind of day that makes you smile ear to ear and laugh until your sides hurt. Remember that time we stumbled across that hidden pizza place in Little Italy? Or how about that epic game of frisbee we had at the park, where you somehow managed to snag the winning throw?

From our spontaneous road trip across the country to just hanging out at your apartment with a cup of coffee and watching movies, you’ve always been there for me. You’re my rock, my confidant, and my biggest cheerleader. Thanks for always being there to pick me up when I stumble and celebrate my successes with genuine joy.

As another year unfolds, I wish you happiness, good health, and countless successes in all your endeavors. May this year bring you more laughter than a clown car parade and more wins than a bingo night! Remember, you’re the luckiest guy alive, so let’s keep spreading some good vibes and making memories that will last a lifetime.

Here’s to another year of friendship, laughter, and unforgettable adventures! Can’t wait to see what this next chapter holds for us, buddy!

Happy birthday, my friend!

With love and warm wishes,

[Your Name]
Mistral 7B:

Dear Bryan,

Wishing you a very happy birthday! As I reflect on our friendship, I am filled with gratitude for all the wonderful moments we have shared together. From our wild adventures in college to our current shared experiences, I am grateful for your constant presence in my life.

As we celebrate this special day, I want to take a moment to acknowledge your many talents and accomplishments. You have always been an inspiration to me and have helped shape me into the person I am today. I hope this year brings you all the happiness, health, and success you deserve.

On a lighter note, I remember the time we got lost on our way to a concert and ended up in some sketchy part of town. We ended up stumbling upon a hilarious comedy club and spent the rest of the night laughing until our bellies hurt! I know you’d appreciate this little story and I hope it brings a smile to your face on this special day.

I look forward to creating many more unforgettable memories with you in the years to come. Here’s to another year of adventures, laughter, and friendship.

Happy Birthday, dear Bryan!

With love,

[Your Name]
GPT-4:

To Bryan – a friend like no other!”

Inside of e-card:

Hey there, Bryan-o!

As another year has rolled around and we’re here to celebrate your special day, it got me thinking about all the wonderful moments we’ve spent together. Remember that crazy road trip? When we tried to use a doughnut as a compass? Or that time we ended up dancing in the rain on what was supposed to be a sun-soaked beach day! Ah, good times!

On your special day, I truly want you to reflect on how amazing you are and how you’ve not only enriched your own life, but the lives of those around you, especially mine. Puns aside (and you know how much I love them), your jokes are the only ‘Bryan-brand humor’ I wouldn’t trade for the world.

As you blow out your candles today (and please, let’s aim for less smoke than last year), I have a heart full of wishes for you – may your coming year be filled with overflowing joy, unstoppable laughter, robust health, and profound success. 

But hey, don’t let it stop there. Keep those socks mismatched, continue to debate whether pineapple belongs on pizza (you’re not winning this one), and never, ever forget how absolutely epic you are!

As we close this chapter and open a new one in your life, it fills me with excitement and joyful anticipation. I can’t wait to fill another year with remarkable stories


Forever Forward,

[Your Name]
* Birthday wishes generated by the SLMs we selected.

Not bad. Now let’s try a leave request.

Leave requests content generation

In one of the mobile applications we've developed, Freequest, we're exploring a proof of concept where users can generate messages for leave requests using SLMs. After selecting some information, we used an SLM to generate an excuse note for our employer to justify our absence. For safety and user-friendliness, the initial version uses preset input choices to prevent security risks like prompt injection or toxic message generation (a minimal sketch of that idea follows below). This shows how SLMs can meet specific business needs, making interactions smoother and processes more automated.

You can see how exactly this feature works in the video below:

*Proof of concept: using SLMs to generate messages for leave requests in the FreeQuest app.
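To show the idea behind those preset inputs, here is a minimal sketch of how a prompt can be assembled from fixed choices only, so that no free-form user text ever reaches the model. The option lists and wording are illustrative; they are not the actual Freequest prompt.

```python
# Sketch: building a leave-request prompt from preset choices only.
# Because the user never types free text, prompt injection has no entry point.
# Options and wording are illustrative, not the real Freequest prompt.
LEAVE_TYPES = {"sick": "sick leave", "vacation": "annual leave", "personal": "a personal day"}
TONES = {"formal": "a formal, polite", "friendly": "a friendly, casual"}

def build_leave_prompt(leave_type: str, tone: str, date_from: str, date_to: str) -> str:
    # Anything outside the preset lists is rejected instead of being passed to the model.
    if leave_type not in LEAVE_TYPES or tone not in TONES:
        raise ValueError("unsupported option")
    return (
        f"Write {TONES[tone]} message to a manager requesting {LEAVE_TYPES[leave_type]} "
        f"from {date_from} to {date_to}. Keep it under 120 words."
    )

prompt = build_leave_prompt("sick", "formal", "2024-03-01", "2024-03-12")
# 'prompt' is then passed to the on-device SLM exactly as in the earlier sketches.
```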

The results across the models: 

Phi-2:
As a software developer at XYZ Inc., I am writing to ask for the flexibility to take some time off work due to physical symptoms. As you may be aware, I am in the process of receiving treatment for flu symptoms and I have been instructed by my doctor to take time off until I am fully recovered. However, I am worried about not being able to complete all my tasks and meet deadlines during my recovery period. 

I am hoping you may be able to offer any support or guidance in order to make my recovery as smooth as possible, and I am happy to provide additional information in order to be granted leave if possible. If it is possible, I am also open to discussing other possible arrangements with you and I am confident I can return to work as normal once I am fully recovered. 

Thank you in advance for your time and understanding!
Gemma:

Dear [Manager Name],

I am writing to inform you that I am unable to come to work from March 1st to March 12th due to illness. I have been experiencing a fever, high temperature, and other symptoms since [Date of Symptom Onset]. While I have been unable to work remotely due to my health, I have been advised by my doctor to rest at home and avoid strenuous activity.

I have attached a copy of my doctor’s recommendation for medical reasons. I understand that this is a busy time for the team, but I have made every effort to minimize my disruption and ensure a smooth handover during my absence.

I have completed any urgent tasks assigned to me and have informed my colleagues of my absence. I am confident that I can catch up on any missed work upon my return.

I would appreciate your understanding and support during this time. Please let me know if you have any questions or if there are any specific arrangements I can make to ensure a seamless transition during my absence. I am available by phone or email if necessary.

Thank you for your consideration.

Sincerely,

[Your Name]
Mistral 7B:

Dear [Manager’s name],

I am writing this request to inform you that I will be taking sick leave from 01.03.2024 to 12.03.2024 due to a serious case of flu and high temperature. As per doctor’s recommendation, I need to rest at home and take some time off from work. Unfortunately, I am unable to work from home as I do not have access to work-related devices.

I apologize for any inconvenience caused by my absence and I assure you that I am taking all necessary measures to make sure my tasks are handled in my absence. I have already communicated my concerns with my team members and delegated my responsibilities accordingly. I have also created detailed handover documents to ensure a smooth transition while I am away from work.

I appreciate your understanding in this matter and I look forward to being back at work as soon as possible. Please let me know if there are any further arrangements or considerations I need to make in order to ensure a seamless transition while I am on sick leave.

Thank you for your time and attention to this matter.

Sincerely,

[Your name]
GPT-4:

Subject: Urgent Sick Leave Request – Software Developer

Dear [Recipient’s Name],

I hope this message finds you well. I regret to inform you that I am not in a state to work as I have been diagnosed with the flu and am experiencing high temperature. The doctor has advised me to rest and recover during this period to ensure my complete well-being. This period spans from the 1st to the 12th of March 2024.

Given these circumstances, as much as I would like to continue my work remotely, I am unable to do so due to my current health condition and the fact that I currently do not have access to my work-related devices. This is a delicate time and my total focus needs to be towards my recuperation.

I deeply regret any inconveniences that my sudden absence might cause to the team and projects. I am committed to mitigating the impact of my absence and am open to any suitable arrangements. To aid the transition during this period, I am more than willing to delegate my tasks and provide guidance, should someone need to temporarily take over my duties.

I apologize for not being able to contribute to our mutual goals and understand that this is a critical period for all of us. Please do respect my predicament and I appeal to you for your understanding and support during this time.

If there are any processes or further communication required from me, kindly let me know. I will aim to ensure a smooth transition as far as possible.

Thank you for your
* Leave requests generated by the SLMs we selected.

Our thoughts—what we make of SLMs in practice

We presented three potential applications of working with text and given information in the field of Generative AI. We used exactly the same prompt in each case. As you could see in the results above, each of the small models performed reasonably well with the given task. However, if we take a look at the birthday wishes generation case, we can see that OpenAI's GPT-4 stands out slightly compared to the small models. Looking closely at the generated wishes, we can't help but notice that GPT-4 added a few facts of its own. In other words, it generated nice sentences but included fictional events in them. That's a case in point of language models having a tendency to 'hallucinate'.

What do we make of the results? Is OpenAI's model worse than small local models? Absolutely not. It's important to remember that we used exactly the same prompt, asking the AI to "Create a warm, personal birthday message suitable for an e-card." If we specifically instruct the language model in the prompt not to add fictional information, the new message will be on par with the rest of the generated wishes.
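For illustration, this is how such a constraint could be added to the prompt; the wording below is our own example, not the prompt used in the experiment.

```python
# Illustrative only: the same request with an explicit "no invented facts" constraint.
base_prompt = "Create a warm, personal birthday message suitable for an e-card."
grounded_prompt = (
    base_prompt
    + " Use only the facts provided below; do not invent shared memories or events.\n"
    + "Facts: the recipient is Bryan, a colleague celebrating his birthday today."
)
```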

In our opinion, this particular case highlights very well the advantage large models enjoy over the smaller ones. Why did we get this error? Well, the difference between simply summarizing an existing text, writing a brief and generic message, and creating original birthday wishes lies in the word "creation." In the first two cases, the result is generated based on the provided data. However, a birthday card is different; it requires more creativity, more input, and more effort. Humans are wired the same way, actually. The more well-read a person is, having stored and processed vast quantities of knowledge, the greater their imagination and creativity. For AI, it works similarly.

The takeaway? Smaller models perform quite well at generating material based on the data provided. In these instances, smaller models do not fall behind large models. Moreover, local SLMs are the key to security and reliability. 

However, in tasks where greater creativity is required, larger models are hands down the best option to choose. Furthermore, it’s now also evident that using the same prompts for all models might not be the correct approach and that sometimes they need to be adapted to the specific model or fine-tuned a bit to get the result you’re looking for.

Diving deep into the pros and cons of our chosen models

Are any of the models we used better than the others? It depends. Do we have any advice on which of them you should choose? No; our role is to showcase their features so that you can make an informed decision yourself. Once you see what each feature represents and understand the impact it has on your results, you'll be able to decide for yourself.

Currently, with the rapid increase in the popularity of AI, more and more companies are publishing their own models. We have chosen those three models that, in our opinion, are some of the most interesting and promising at the moment.

However, it's important to remember that the situation is constantly evolving, and new models that outperform the ones mentioned here may emerge over time. Moreover, you should remember that, in the case of smaller models, exercising additional caution is very important. When reviewing any comparisons, look for independent analyses, as providers can easily present comparative study results in their favor by using prompts specifically tailored to their models. As you could see in our comparison, some models may be more specialized in a narrower field, which can skew the results.

Small Language Models: Overview

*What SLMs have under the hood.

Params, or parameters, refers to the number and type of parameters influencing the model's performance and adaptability. The more parameters a language model has, the more complex the text it can understand or generate. However, it will also require more response time and more computing resources for training and usage.

File Size is the amount of space required on the device; smaller models can be used on a larger number of devices. Community is roughly the number of people who have downloaded the model and are using it now, and hence the impact this model has. This figure also directly influences the model's evolution in aspects like the introduction of new use cases, bug fixing, and improvement of current versions based on community feedback and discussions.
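To get a feel for how parameter count translates into file size, here is a back-of-the-envelope sketch. The numbers are rough approximations that ignore embeddings, metadata, and mixed-precision layers in real quantized model files.

```python
# Rough estimate of on-disk size for a quantized model:
# parameters * bits per weight / 8 bytes, expressed in decimal gigabytes.
def approx_size_gb(params_billions: float, bits_per_weight: float) -> float:
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9

print(approx_size_gb(2.7, 4))    # Phi-2 at ~4-bit quantization  -> ~1.4 GB
print(approx_size_gb(7.3, 4))    # Mistral 7B at ~4-bit          -> ~3.7 GB
print(approx_size_gb(7.3, 16))   # Mistral 7B at fp16            -> ~14.6 GB
```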

Exploring small language model issues

As a result of our experiment, we faced the following issues that are worth further exploration and analysis:

  1. Inaccurate Code and Facts Generation: The models may sometimes produce incorrect code snippets and statements. Users should treat these outputs as suggestions or starting points, not as definitive or accurate solutions.
  2. Unreliable Responses to Instructions: The models we used have not undergone instruction fine-tuning. As a result, they may struggle or fail to adhere to intricate or nuanced instructions provided by users.
  3. Language Limitations: As explored above, due to their size, SLMs are primarily designed to understand standard English. Informal English, slang, or other languages might pose comprehension challenges, leading to potential misinterpretations or errors in responses.
  4. Potential Societal Biases: SLMs are not entirely free from societal biases despite efforts to ensure training data safety. There's a possibility they may generate content that mirrors these societal biases, particularly if prompted or instructed to do so. We urge users to be aware of this and to exercise caution and critical thinking when interpreting model outputs.
  5. Toxicity: Despite being trained with carefully selected data, the model can still produce harmful content if explicitly prompted or instructed to do so.

All in all, you should be mindful that, although smaller models offer many possibilities, they also have limitations. Keep in mind that an SLM's constraints could block business scenarios where an LLM doesn't present such a problem.

Summary

AI is a formidable tool that is capable of way more than simple reading comprehension. It is here to stay with us way into the future, and it will keep redefining our lives and the way we interact with our smartphones and technology in general. As our little experiment has shown, despite certain limitations, AI enables us to do things unheard of even a decade ago. Sure, using larger models has numerous advantages over smaller language models. However, SLMs offer a better user experience in one key respect: we know that any sensitive data is handled appropriately, on the device.

Our benchmarks have shown that SLMs' reasoning capabilities are frequently on par with those of their larger counterparts. The SLMs we tested can run on our smartphones, be integrated into our mobile apps, or be securely hosted on our own servers. For now, proper prompting still requires a lot of fine-tuning to provide the details we seek. We'll have to wait a bit for newer hardware and software that is up to the challenge before we finally get the state-of-the-art results we all seek. The concept of using SLMs in business settings is surely a promising one, so we'll keep abreast of the latest developments in the realm of AI in the meantime.
