This tutorial will guide you through the process of building a simple C++ program that performs inference on GGUF LLM models using the llama.cpp framework. We will cover the essential steps involved in loading the model, performing inference, and displaying the results. The code for this tutorial can be found here.
Prerequisites
To follow along with this tutorial, you will need the following:
A Linux-based operating system (native or WSL)
CMake installed
A GCC or Clang toolchain installed
Step 1: Setting Up the Project
Let's start by setting up our project. We will be building a C/C++ program that uses llama.cpp to perform inference on GGUF LLM models.
Create a new project directory; let's call it smol_chat.
Within the project directory, let's clone the llama.cpp repository into a subdirectory called externals. This will give us access to the llama.cpp source code and headers.
mkdir -p externals
cd externals
git clone https://github.com/ggerganov/llama.cpp.git
cd ..
Step 2: Configuring CMake
Now, let's configure our project to use CMake. This will allow us to easily compile and link our C/C++ code with the llama.cpp library.
Create a CMakeLists.txt file in the project directory.
In the CMakeLists.txt file, add the following code:
cmake_minimum_required(VERSION 3.10)
project(smol_chat)
set(CMAKE_CXX_STANDARD 20)
set(CMAKE_CXX_STANDARD_REQUIRED ON)
# build llama.cpp (and its `common` helper library) from the cloned sources
add_subdirectory(externals/llama.cpp)
add_executable(smol_chat main.cpp LLMInference.cpp)
target_include_directories(smol_chat PUBLIC ${CMAKE_CURRENT_SOURCE_DIR})
target_link_libraries(smol_chat PRIVATE llama common)
This code specifies the minimum CMake version, sets the C++ standard and makes it required, adds the cloned llama.cpp repository as a subdirectory so that the llama and common libraries are built alongside our project, declares an executable named smol_chat built from our two source files, adds the current source directory to the include path, and links the llama.cpp libraries to our executable.
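With the configuration in place, the project can be compiled using the standard CMake workflow. These commands are run from the project root; the build directory name is just a convention:
mkdir -p build && cd build
cmake ..
make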
Step 3: Defining the LLM Interface
Next, let's define a C++ class that will handle the high-level interactions with the LLM. This class will abstract away the low-level llama.cpp function calls and provide a convenient interface for performing inference.
In the project directory, create a header file called LLMInference.h.
In LLMInference.h, declare the following class:
#pragma once

#include "llama.h"

#include <string>
#include <vector>

class LLMInference {
public:
    LLMInference(const std::string& model_path);
    ~LLMInference();
    void startCompletion(const std::string& query);
    std::string completeNext();
private:
    llama_model* llama_model_;
    llama_context* llama_context_;
    llama_sampler* llama_sampler_;
    // chat history handed to the chat template
    std::vector<llama_chat_message> _messages;
    // buffer for the chat-template-formatted prompt
    std::vector<char> _formattedMessages;
    // tokens of the most recent prompt (also reused to hold the
    // last sampled token between decode steps)
    std::vector<llama_token> _promptTokens;
    llama_batch batch_;
};
This class has a public constructor that takes the path to the GGUF LLM model as an argument and a destructor that deallocates any dynamically allocated objects. It also has two public member functions: startCompletion, which initiates the completion process for a given query, and completeNext, which fetches the next token in the LLM's response sequence.
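Before implementing these functions, it helps to see how they are meant to be used. Below is a minimal sketch of a main.cpp that drives this interface; the command-line model path and the convention that completeNext returns an empty string at the end of generation are assumptions for illustration (the implementation in Step 4 follows this convention):
#include "LLMInference.h"
#include <iostream>
#include <string>

int main(int argc, char* argv[]) {
    if (argc < 2) {
        std::cerr << "usage: smol_chat <path-to-gguf-model>\n";
        return 1;
    }
    LLMInference llm(argv[1]);
    llm.startCompletion("How do LLMs work?");
    // assumption: completeNext() returns an empty string once the
    // model emits an end-of-generation token
    while (true) {
        std::string piece = llm.completeNext();
        if (piece.empty()) break;
        std::cout << piece << std::flush;
    }
    std::cout << std::endl;
    return 0;
}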
Step 4: Implementing LLM Inference Functions
Now, let's define the implementation for the LLMInference class in a file called LLMInference.cpp.
In LLMInference.cpp, include the necessary headers and implement the class methods as follows:
#include "LLMInference.h"
#include "common.h"
#include
#include
#include
LLMInference::LLMInference(const std::string& model_path) {
llama_load_model_from_file(&llama_model_, model_path.c_str(), llama_model_default_params());
llama_new_context_with_model(&llama_context_, &llama_model_);
llama_sampler_init_temp(&llama_sampler_, 0.8f);
llama_sampler_init_min_p(&llama_sampler_, 0.0f);
}
LLMInference::~LLMInference() {
for (auto& msg : _messages) {
std::free(msg.content);
}
llama_free_model(&llama_model_);
llama_free_context(&llama_context_);
}
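Finally, let's implement startCompletion and completeNext. What follows is a minimal sketch modeled on the llama.cpp example programs: startCompletion appends the query to the chat history, applies the model's chat template, tokenizes the formatted prompt, and prepares the first batch; completeNext decodes the pending batch, samples one token, and returns its text, using an empty string to signal that an end-of-generation token was produced. The llama.cpp signatures used here match the library as of late 2024 and may differ in newer revisions; buffer sizing and error handling are deliberately simplified.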
void LLMInference::startCompletion(const std::string& query)
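{
    // append the user's message to the chat history; the content is strdup'ed,
    // matching the std::free calls in the destructor
    _messages.push_back({ "user", strdup(query.c_str()) });

    // apply the model's chat template to the accumulated messages;
    // for simplicity we assume a context-sized buffer is large enough
    _formattedMessages.resize(llama_n_ctx(llama_context_));
    int len = llama_chat_apply_template(llama_model_, nullptr, _messages.data(),
                                        _messages.size(), true,
                                        _formattedMessages.data(), _formattedMessages.size());
    std::string prompt(_formattedMessages.begin(), _formattedMessages.begin() + len);

    // tokenize the formatted prompt; the character count is a rough
    // upper bound on the number of tokens
    _promptTokens.resize(prompt.size());
    int numTokens = llama_tokenize(llama_model_, prompt.c_str(), prompt.size(),
                                   _promptTokens.data(), _promptTokens.size(),
                                   true, true);
    _promptTokens.resize(numTokens);

    // the first decode call will process the whole prompt
    batch_ = llama_batch_get_one(_promptTokens.data(), _promptTokens.size());
}

std::string LLMInference::completeNext() {
    // run the pending batch through the model
    if (llama_decode(llama_context_, batch_) != 0) {
        return "";
    }
    // sample the next token from the logits of the last decoded position
    llama_token token = llama_sampler_sample(llama_sampler_, llama_context_, -1);
    // an end-of-generation token means the response is complete;
    // an empty string is our sentinel for the caller
    if (llama_token_is_eog(llama_model_, token)) {
        return "";
    }
    // convert the token id to its text piece
    char piece[256];
    int n = llama_token_to_piece(llama_model_, token, piece, sizeof(piece), 0, true);
    if (n < 0) {
        return "";
    }
    // keep the sampled token alive in a member so the next batch can point at it
    _promptTokens = { token };
    batch_ = llama_batch_get_one(_promptTokens.data(), 1);
    return std::string(piece, n);
}
With a main.cpp like the sketch from Step 3 in place, the project builds with the commands from Step 2 and can be run as ./build/smol_chat path/to/model.gguf.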