
Quickstart

info

Cortex.cpp is under active development. If you have any questions, please reach out to us.

Local Installation

Cortex provides a Local Installer that packages all required dependencies, so no internet connection is needed during installation.

Start Cortex.cpp API Server

This command starts the Cortex.cpp API server at localhost:39281.


cortex start
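
A quick way to confirm the server is up is to query it over HTTP. The example below assumes the server exposes the OpenAI-compatible /v1/models listing endpoint on the default port:


curl http://localhost:39281/v1/models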

Pull a Model & Select Quantization

This command downloads a model from a supported model hub, such as Cortex's built-in model registry or a Hugging Face GGUF repository.

It lists the available quantizations, recommends a default, and downloads the quantization you select.


cortex pull llama3.2
cortex pull bartowski/Meta-Llama-3.1-8B-Instruct-GGUF
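
To confirm that a download completed, you can list the models Cortex.cpp manages; this assumes your build includes the models list subcommand (the same subcommand family used by cortex models stop below):


cortex models list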

Run a Model

This command downloads the default GGUF build of the model from the Cortex Hub, starts it, and opens an interactive chat session.


cortex run llama3.2

info

All model files are stored in the ~/cortex/models folder.
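
Because the models are ordinary files on disk, you can inspect that folder directly:


ls ~/cortex/models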

Using the Model

API


curl http://localhost:39281/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3.1:8b-gguf",
    "messages": [
      {
        "role": "user",
        "content": "Hello"
      }
    ],
    "stream": true,
    "max_tokens": 1,
    "stop": [
      null
    ],
    "frequency_penalty": 1,
    "presence_penalty": 1,
    "temperature": 1,
    "top_p": 1
  }'
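
If you prefer a single, non-streaming response, set "stream": false. Assuming Cortex.cpp follows the standard OpenAI response schema, and that jq is installed, the assistant's reply can be extracted like this:


curl -s http://localhost:39281/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3.1:8b-gguf",
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": false
  }' | jq -r '.choices[0].message.content'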

Refer to our API documentation for more details.

Show the System State

This command displays the running models and the system status (RAM, engine, VRAM, uptime).


cortex ps

Stop a Model

This command stops the running model.


cortex models stop llama3.2
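
To load the model again without opening an interactive chat, cortex models start should work as the counterpart to cortex models stop (this assumes your build includes it; otherwise, cortex run llama3.2 reloads the model):


cortex models start llama3.2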

Stop Cortex.cpp API Server

This command stops the Cortex.cpp API server running at localhost:39281.


cortex stop

What's Next?

Now that Cortex.cpp is set up, here are the next steps to explore:

  1. Adjust the folder path and configuration using the .cortexrc file.
  2. Explore the Cortex.cpp data folder to understand how it stores data.
  3. Learn about the structure of the model.yaml file in Cortex.cpp.