Nexa Models

NexaQuant: Llama.cpp-Compatible Model Compression with 100%+ Accuracy Recovery

Works with both text and multimodal models and can be deployed on any device

Developer
Model
OmniAudio-2.6B: World's Fastest Audio Language Model for Edge Deployment

Model
Research

Compact model that takes text and audio input, optimized for performance and size on edge devices

OmniVision-968M: World's Smallest Vision Language Model

Model
Research

Pocket-size multimodal model with 9x token reduction for on-device deployment

Octopus v3

Model

Compact (Sub-Billion) Multimodal Action Model for On-Device AI Agents

Squid

Model

Revolutionizing On-Device Language Models for Long Contexts

Octopus v2

Model

On-device 0.5B LLMs: voice or text in, action out, outperforming GPT-4 in function calling

Octo-planner

Model

A 3.8B Model for AI Agent Action Planning with 98%+ Accuracy
