Throughput Metrics: TTFT and TPS

Overview

Throughput metrics describe how quickly a model processes input and generates output. Two indicators are monitored: Time To First Token (TTFT) and Tokens Per Second (TPS).

Time To First Token (TTFT)

TTFT is the time that elapses between the model receiving input and producing its first output token. It measures how responsive the system is at the start of a reply.

Why TTFT Matters:

Determines the initial response speed of the application
Influences user perception of system performance
Reflects how efficiently the model is loaded, initialized, and able to process the input prompt
Lower TTFT results in a more responsive user experience
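
To make the definition concrete, here is a minimal sketch of how TTFT could be measured around any token stream. It is not Sapientia's implementation: `measure_ttft` and `fake_stream` are hypothetical names, and the stream is assumed to be a plain Python iterable that yields tokens as they are generated.

```python
import time


def measure_ttft(stream, clock=time.perf_counter):
    """Consume a token stream and return (ttft_seconds, tokens).

    `stream` is assumed to be any iterable that yields tokens as they
    are generated. `clock` is injectable so the function can be tested
    without real delays.
    """
    start = clock()  # the input is considered submitted at this point
    ttft = None
    tokens = []
    for token in stream:
        if ttft is None:
            ttft = clock() - start  # the first token has just arrived
        tokens.append(token)
    return ttft, tokens  # ttft stays None if the stream was empty


def fake_stream(first_delay=0.05, tokens=("Hello", ",", " world")):
    """Hypothetical stand-in for a model: waits, then yields tokens."""
    time.sleep(first_delay)  # simulates work done before the first token
    for token in tokens:
        yield token


if __name__ == "__main__":
    ttft, tokens = measure_ttft(fake_stream())
    print(f"TTFT: {ttft:.3f} s for {len(tokens)} tokens")
```

Because the generator does no work until it is iterated, the simulated delay falls between the start timestamp and the first token, which is the interval TTFT describes.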

Tokens Per Second (TPS)

TPS is the number of tokens the model generates per second once the first token has been produced. It indicates sustained generation speed.

Why TPS Matters:

Determines overall speed in generating complete output
Affects system throughput efficiency
Indicates GPU utilization and model optimization
Higher TPS enables processing more concurrent requests
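
As an illustration, TPS can be derived from the arrival time of each token. The sketch below assumes one reading of the definition above: the tokens produced after the first one, divided by the time elapsed since the first token arrived. The function name `tokens_per_second` is hypothetical, and the exact formula Sapientia uses may differ.

```python
def tokens_per_second(token_times):
    """Return TPS from per-token arrival timestamps (seconds, ascending).

    Assumed formula: tokens generated after the first token, divided by
    the time between the first and the last token. Returns None when the
    rate is undefined (fewer than two tokens, or no elapsed time).
    """
    if len(token_times) < 2:
        return None
    elapsed = token_times[-1] - token_times[0]
    if elapsed <= 0:
        return None
    return (len(token_times) - 1) / elapsed


if __name__ == "__main__":
    # Five tokens; the last arrives 2.0 s after the first.
    times = [1.0, 1.5, 2.0, 2.5, 3.0]
    print(tokens_per_second(times))  # 4 tokens / 2.0 s = 2.0
```

Excluding the first token keeps TPS independent of TTFT, so a slow start and a slow sustained rate show up as separate signals.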

Throughput Chart

The chart plots TTFT and TPS over time. Use it to:

Identify model performance patterns
Detect anomalies or performance degradation
Optimize system configuration based on historical data
Compare performance across usage sessions

Throughput data is updated each time a model inference completes, so the chart reflects current system performance.
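
The same per-inference data can also be checked programmatically. The following is a rough sketch, not part of Sapientia: `ThroughputHistory`, its methods, and the `window` and `factor` defaults are all hypothetical choices. It records one sample after each inference and flags the latest sample when it is much worse than the median of the samples before it.

```python
from dataclasses import dataclass
from statistics import median
from typing import List


@dataclass
class Sample:
    ttft_s: float  # time to first token, in seconds
    tps: float     # tokens per second after the first token


class ThroughputHistory:
    """Hypothetical in-memory record of one sample per inference."""

    def __init__(self) -> None:
        self.samples: List[Sample] = []

    def record(self, ttft_s: float, tps: float) -> None:
        """Append a sample; intended to be called after each inference."""
        self.samples.append(Sample(ttft_s, tps))

    def latest_is_anomalous(self, window: int = 5, factor: float = 2.0) -> bool:
        """Compare the latest sample with the median of earlier ones.

        Flags the sample if its TTFT is more than `factor` times the
        median TTFT, or its TPS is less than the median TPS divided by
        `factor`. Both thresholds are arbitrary illustrative values.
        """
        if len(self.samples) < 2:
            return False  # nothing to compare against yet
        *previous, latest = self.samples[-(window + 1):]
        base_ttft = median(s.ttft_s for s in previous)
        base_tps = median(s.tps for s in previous)
        return latest.ttft_s > base_ttft * factor or latest.tps < base_tps / factor


if __name__ == "__main__":
    history = ThroughputHistory()
    for _ in range(5):
        history.record(ttft_s=0.2, tps=30.0)
    history.record(ttft_s=0.9, tps=12.0)  # a noticeably slower run
    print(history.latest_is_anomalous())
```

A median baseline is used here because a single slow run in the window shifts it less than it would shift a mean.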

©2026 Sapientia Project by GoDiscus. Made by the Sapientia Team.