Kelan Kennedy
Dec 04 2024
Summative Review
As I reflect on my project, I'm excited to share the culmination of my work on implementing a local large language model server at Hawai‘i Preparatory Academy. My goal has been to create a tailored and sustainable artificial intelligence system that meets the academic needs of my school community.
Throughout my journey, I've explored various approaches to deploying an LLM server, focusing on optimizing both performance and resource use. Initially, I considered using either GPT4All or Msty as a potential solution for my project. I was drawn to GPT4All because it supports Apple Silicon and can operate with a relatively small amount of VRAM, which made it a feasible option for HPA's hardware infrastructure. Additionally, its performance in replicating ChatGPT's functionality positioned it as an attractive choice. On the other hand, Msty offers sophisticated models out of the box, enhanced performance, and an intuitive user interface that provides features like real-time data and simultaneous conversations. However, as I dove deeper into the capabilities and limitations of both options, I began to realize that neither GPT4All nor Msty fully met HPA's needs.
After experimenting with several other possible solutions, I've decided to transition away from all previous development with GPT4All and Msty to a new platform: Open WebUI. It's a game-changer for my project. Open WebUI offers everything I need to create a robust AI service that can be easily integrated into HPA's educational infrastructure. With support for load balancing, it provides a seamless experience similar to what most users are already familiar with from ChatGPT. Moreover, Open WebUI allows system administrators to manage individual user accounts, which is essential for maintaining privacy and customizability. Ultimately, my goal is to deploy an AI service by the time the next semester rolls around, making HPA a pioneer in harnessing the potential of artificial intelligence for education. I'm confident that with Open WebUI as a foundation, I can create a tailored and sustainable LLM server that enhances learning outcomes and fosters innovation within HPA's community.
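To illustrate the load-balancing idea mentioned above, here is a minimal sketch of round-robin backend selection. Open WebUI manages this internally, so none of this code is required in practice; the backend URLs and model name are hypothetical, and the payload follows the general shape of Ollama's chat API, which Open WebUI can sit in front of.

```python
import itertools

# Hypothetical inference backends; in a real deployment these would be
# Ollama instances that Open WebUI balances requests across.
BACKENDS = [
    "http://10.0.0.1:11434",
    "http://10.0.0.2:11434",
]

_rotation = itertools.cycle(BACKENDS)

def next_backend() -> str:
    """Round-robin choice of the next backend to send a request to."""
    return next(_rotation)

def build_chat_request(model: str, prompt: str) -> dict:
    """Chat payload in the general shape an Ollama-style API expects."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

print(next_backend())  # http://10.0.0.1:11434
print(next_backend())  # http://10.0.0.2:11434
print(next_backend())  # back to http://10.0.0.1:11434
```

Because each request simply goes to the next backend in the cycle, adding another server spreads the conversation load without users noticing anything different.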
In parallel with my main project, I've been collaborating with other students from Island Science on a LoRaWAN system designed to monitor fire conditions. Our goal is to fly a group of sensors out by drone, deploy them in remote locations, and then return to collect their data. As it progresses, this project will give me valuable insight into the technical aspects of the Internet of Things and its potential applications in real-world scenarios.
Oct 30 2024
Refining Hosting Options
I have decided to divide my project into two distinct parts. One will focus on processing and embedding local documents, while the other will be dedicated solely to conversations with artificial intelligence. GPT4All will serve as the frontend for the local documents, and Msty will be used for regular AI interactions.
Oct 23 2024
Msty
Msty is a strong alternative to GPT4All because it supports sophisticated models out of the box, offering enhanced performance. Its user interface is more intuitive, providing features like real-time data and simultaneous conversations. However, Msty falls short in handling local data, where GPT4All has a clear advantage.
Oct 17 2024
A Generous Donor
This month, a donor visited the Energy Lab and dropped off a system to be used for my AI project. Its specifications are as follows.
- CPU
- GPU
- RAM
  - 32 Gigabytes
Oct 01 2024
Building a New System
I decided to repurpose existing hardware at the Energy Lab in order to build a new system around the GPU that arrived earlier. I cleaned out an old case and installed two new hard drives, adding a total of over eight terabytes of storage. Then, I configured Ubuntu 24.04.1 LTS alongside all of its necessary drivers.
Sep 27 2024
Nomic Atlas
As part of my project, I am setting up an analysis of local documents for use in an AI model's generation algorithm. Nomic Atlas offers tools that enable fast and efficient evaluation of large datasets. It creates what is known as a data lake, a repository of unstructured information from which insights can be generated quickly.
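The core idea behind this kind of document analysis is that each document is converted into an embedding vector, and documents with similar meaning end up with similar vectors. This toy example does not use Nomic's actual API; the three-dimensional vectors and document names are made up purely to illustrate how cosine similarity compares embeddings.

```python
import math

def cosine_similarity(a, b):
    """Similarity of two embedding vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings" standing in for real document vectors:
physics_notes = [0.9, 0.1, 0.0]
physics_exam  = [0.8, 0.2, 0.1]
lunch_menu    = [0.0, 0.2, 0.9]

print(cosine_similarity(physics_notes, physics_exam))  # close to 1
print(cosine_similarity(physics_notes, lunch_menu))    # close to 0
```

Real embeddings have hundreds or thousands of dimensions, but the comparison works the same way, which is what lets a tool like Atlas cluster and map large document collections.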
Sep 18 2024
Ordering Hardware
Last class, I requested that the following hardware be ordered for my project.
- Apple Mac Studio
  - M2 Ultra
    - 24‑Core CPU
    - 76‑Core GPU
    - 32‑Core Neural Engine
  - 192 Gigabytes of Unified Memory
  - 2 Terabytes of SSD Storage
- ASUS Dual GeForce RTX 4070 SUPER EVO OC Edition 12GB GDDR6X
Sep 10 2024
Brainstorming Names
After much consideration, Dr. Bill and I decided that Tailwind would be the perfect name for my project. It draws on the roots of Hawai‘i and conveys a sense of power.
Sep 04 2024
InspiritAI
As I develop my project to implement a local LLM server at HPA, I have discovered that InspiritAI sets an excellent example of how to integrate AI tools into educational settings. InspiritAI offers a hands-on, project-based approach that aligns with my goal of helping students understand how AI can be tailored to meet specific needs.
Aug 26 2024
Empowering Learning with Tailored and Sustainable Artificial Intelligence
Abstract
At this point in time, large language models, or text-based AIs, are the most prominent AI tools available to students. To my peers, using artificial intelligence is synonymous with navigating to ChatGPT's website and asking it a question. This pattern is a result of its direct application to academic tasks. Text-based AI can write essays, solve mathematical equations, and even answer specific questions related to a course's curriculum. The key issue that I intend to address in this research project is the need for a local LLM server at Hawai‘i Preparatory Academy. While artificial intelligence tools are already widely used by the student population, they lack privacy and customizability. My project seeks to bridge this gap by developing accessible AI resources tailored specifically to academic environments. In doing so, I aim to improve the quality of AI-assisted learning. At the same time, I want to foster a deeper understanding among students and educators about the capabilities, limitations, and ethical implications of artificial intelligence. If successful, my project would crown HPA as the first school in the world to operate its own LLM servers exclusively on solar energy.
Introduction
A large language model is an advanced form of artificial intelligence that is trained on vast amounts of text data to generate human-like responses. It can perform a variety of tasks, such as writing, problem-solving, and language translation. LLMs, like ChatGPT, use deep learning techniques to understand and produce natural language. A parameter is one of the numeric values, typically weights and biases, that define a model's capability to process and understand language. Parameters determine how accurately an LLM can generate text based on the patterns it has learned from training data. The larger the number of parameters, the more complex and capable a model becomes, allowing it to perform increasingly sophisticated tasks. A quant, short for quantization, is a compressed version of a model in which those parameters are stored at reduced numeric precision, trading a small amount of accuracy for a much smaller memory footprint.
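The relationship between parameter count, numeric precision, and memory can be made concrete with simple arithmetic. The sketch below estimates the memory needed just to hold a model's weights; the 7-billion-parameter figure is only an example size, and real servers need additional memory beyond the weights themselves.

```python
def model_memory_gb(params_billions: float, bits_per_param: int) -> float:
    """Rough memory needed to hold a model's weights, in gigabytes."""
    bytes_total = params_billions * 1e9 * bits_per_param / 8
    return bytes_total / 1e9

# A 7-billion-parameter model at different precisions:
fp16 = model_memory_gb(7, 16)  # full 16-bit weights -> 14.0 GB
q4 = model_memory_gb(7, 4)     # 4-bit quant         -> 3.5 GB
print(f"fp16: {fp16:.1f} GB, 4-bit quant: {q4:.1f} GB")
```

This is why quantization matters for a school server: a 4-bit quant of the same model fits in a quarter of the memory, which can be the difference between fitting on available hardware or not.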
Literature Review
There are a few options for hosting large language models locally on a server, each requiring substantial compute power. Although it is possible to run local LLMs using only CPU resources, they rely heavily on GPU performance for optimal results.
- Koboldcpp
  - For a server with an NVIDIA or AMD graphics card that runs Linux, Koboldcpp is a good option. It is easy to install and can act as an API for various applications. It is built on llama.cpp, an open-source inference engine that runs models such as Meta's Llama.
- GPT4All
  - To replicate the experience of ChatGPT, GPT4All works best. Though slightly more complicated to work with, it runs on Apple Silicon. On a high-end Mac Studio, whose unified memory is shared with the GPU, GPT4All can operate with just over one hundred gigabytes of VRAM. This is significantly more than any current NVIDIA or AMD graphics card offers, likely allowing much larger models to run at usable speeds.
- Petals
  - Given the demanding hardware requirements of most large language models, Petals offers an alternative by using a distributed computing framework. It pools the resources of multiple lower-end computers to run a single local model. Adding more systems could increase throughput, but this would need testing.
Methodology
The methodology for this project involves experimenting with several approaches to deploying a local LLM server, focusing on optimizing both performance and resource use. Once a prototype of a large language model is operational, its speed and effectiveness will be evaluated by a small group of users to assess how well it meets the academic needs of HPA. Feedback from tests like these will guide further adjustments that aim to improve usability.
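One concrete way the speed evaluation described above could be measured is generation throughput in tokens per second. This is only a sketch: the commented `server.generate` call is a placeholder for whatever API the chosen platform actually exposes, not a real function.

```python
import time

def tokens_per_second(num_tokens: int, elapsed_seconds: float) -> float:
    """Generation throughput: tokens produced per second of wall-clock time."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return num_tokens / elapsed_seconds

# Timing a (hypothetical) generation call might look like:
# start = time.perf_counter()
# reply_tokens = server.generate(prompt)   # placeholder for the real API
# elapsed = time.perf_counter() - start
# print(tokens_per_second(len(reply_tokens), elapsed))

print(tokens_per_second(256, 8.0))  # 32.0 tokens/s
```

Collecting this number across several prompts and users would give the test group an objective measure to pair with their subjective feedback on usability.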
Budget
Cost-effectiveness is a crucial aspect of my project. I want to ensure that the implementation of a local LLM server at Hawai‘i Preparatory Academy remains both financially feasible and environmentally friendly. By keeping costs manageable, I can maximize return on investment, allowing the benefits of AI to reach a larger portion of the student body and faculty. Leveraging the resources that are already available at HPA is the best strategy to achieve this goal. Taking advantage of the existing hardware and technical expertise on campus will make my project more affordable and promote sustainability.