Download and Installation
Before installing Sapientia, please ensure your hardware and operating system meet the minimum requirements to guarantee optimal AI model performance.
System Requirements
Running Large Language Model (LLM) inference typically requires significant memory (RAM).
However, Sapientia is engineered to be lightweight and accessible for running LLMs. By default, it uses the efficient Gemma 3n E2B and EmbeddingGemma 300m models. We extend our sincere appreciation to Google DeepMind, the developers of the Gemma family, whose outstanding optimization work allows Sapientia to run smoothly on consumer hardware, from standard laptops to mobile devices.
Consequently, the minimum specifications required are remarkably low.
Hardware Specifications
| Component | Minimum Specifications |
|---|---|
| CPU | Intel Core i5 Gen 8 or equivalent |
| RAM | 8 GB |
| Storage | 10 GB available space (SSD) |
Note: These specifications reflect our internal testing on an Intel Core i5 Gen 8 device with 8 GB of RAM.
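As a quick sanity check against the table above, the sketch below reports detected physical RAM. It is POSIX-only and relies on `os.sysconf` exposing `SC_PHYS_PAGES` (available on Linux; support elsewhere varies), so treat it as an illustration rather than a portable checker:

```python
import os

MIN_RAM_GB = 8  # minimum from the hardware specifications table

def total_ram_gb():
    """Return total physical RAM in GiB via POSIX sysconf (Linux and some Unixes)."""
    pages = os.sysconf("SC_PHYS_PAGES")
    page_size = os.sysconf("SC_PAGE_SIZE")
    return pages * page_size / (1024 ** 3)

if __name__ == "__main__":
    ram = total_ram_gb()
    print(f"Detected {ram:.1f} GiB of RAM")
    if ram < MIN_RAM_GB:
        print("Warning: below the 8 GB minimum; LLM inference may swap heavily.")
```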
OS Compatibility
Sapientia supports modern 64-bit (x64) desktop operating systems.
- Windows: Windows 10 (Build 1903+) and Windows 11.
- Linux: Ubuntu 20.04+, Fedora, and most other modern distributions.
- macOS: Coming Soon.
Download
Select the appropriate installer for your operating system.
Windows
Download from Microsoft Store. Updates are handled automatically by Windows.
macOS
Coming Soon via the App Store.
Linux
Available in .AppImage format (compatible with most distributions) and .deb (for Debian/Ubuntu).
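For reference, a typical command-line flow for each format looks like the following. The filenames are placeholders; substitute the actual name of the file you downloaded:

```shell
# AppImage: mark the downloaded file as executable, then run it directly
chmod +x Sapientia-x86_64.AppImage
./Sapientia-x86_64.AppImage

# .deb (Debian/Ubuntu): install the package, then pull in any missing dependencies
sudo dpkg -i sapientia_amd64.deb
sudo apt-get install -f
```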
Model Setup
Sapientia utilizes a local-first AI architecture, meaning the intelligence runs directly on your device.
Base AI Model and Embedding Model
Sapientia relies on two core components to function:
Base AI Model: This is the primary engine responsible for text generation, reasoning, and understanding user instructions.
Embedding Model: This model converts text into numerical vectors, enabling semantic search capabilities. It allows the AI to "understand" and retrieve context from your local documents (Retrieval-Augmented Generation).
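The retrieval step described above can be sketched with toy vectors. In practice the embedding model produces high-dimensional vectors for each document and for the query; here, hypothetical 3-dimensional vectors stand in for that output, and cosine similarity picks the closest document:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" standing in for the embedding model's real output.
docs = {
    "pets.txt":    [0.9, 0.1, 0.0],
    "finance.txt": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]  # embedding of the user's question

# Retrieve the most semantically similar document for the generation step.
best = max(docs, key=lambda name: cosine(query, docs[name]))
```

The retrieved document's text is then passed to the base model as context, which is the core of Retrieval-Augmented Generation.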
Both models are distributed in the GGUF format, a binary format optimized for fast loading and efficient inference on CPUs and Apple Silicon, with significantly lower memory overhead than traditional formats.
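To illustrate the format, the sketch below reads the fixed-size GGUF header (magic bytes, version, tensor count, metadata key-value count) as laid out in the public GGUF specification. It is a minimal reader for inspection only, not a full parser:

```python
import struct

def read_gguf_header(path):
    """Read the fixed GGUF header fields: magic, version, tensor and metadata counts."""
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != b"GGUF":
            raise ValueError(f"{path} is not a GGUF file")
        (version,) = struct.unpack("<I", f.read(4))    # uint32, little-endian
        (n_tensors,) = struct.unpack("<Q", f.read(8))  # uint64
        (n_kv,) = struct.unpack("<Q", f.read(8))       # uint64
    return {"version": version, "tensors": n_tensors, "metadata_kv": n_kv}
```

Running this against a downloaded model file is a quick way to confirm the file is valid GGUF before loading it.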
Default Models
On first launch of the Sapientia application, we recommend downloading the following optimized default models:
Base Model: Gemma 3n E2B
Embedding Model: EmbeddingGemma 300m
Why these models?
We selected Gemma 3n E2B and EmbeddingGemma 300m from Google DeepMind's Gemma family as the defaults because they offer an exceptional balance of performance and efficiency. They deliver state-of-the-art reasoning and retrieval capabilities while remaining lightweight enough to run smoothly on standard consumer devices (e.g., laptops with 8 GB RAM) without requiring dedicated high-end GPUs.
Post-Installation Steps
After successfully installing the application, the next step is to configure the Runtime Environment. Please proceed to the Setup Runtime page.