
Nexa Blog

NexaQuant: Llama.cpp-Compatible Model Compression with 100%+ Accuracy Recovery

Works with both text and multimodal models and can be deployed on any device

Developer
Model
Nexa AI 2024 Year Review

News

Nexa AI's 2024 Milestones and Highlights at a Glance

Visit AMD and Nexa AI at CES 2025: Transforming On-Device AI with Multimodal Capabilities

News

Transforming On-Device AI with Multimodal Capabilities

OmniAudio-2.6B: World's Fastest Audio Language Model for Edge Deployment

Model
Research

Compact model with text and audio input, with performance and size optimized for edge devices

For The First Time, You Can Run Qwen2-Audio On Your Device

Developer
News

Run Qwen2-Audio on edge devices with Nexa SDK

AMD: Efficient Local RAG for Document Intelligence

Success Story

End-to-end Local RAG System Powered by AMD Hardware

OmniVision-968M: World's Smallest Vision Language Model

Model
Research

Pocket-size multimodal model with 9x token reduction for on-device deployment

Octopus v3

Model

Compact (Sub-Billion) Multimodal Action Model for On-Device AI Agents

Squid

Model

Revolutionizing On-Device Language Models for Long Contexts

Octopus v2

Model

On-device 0.5B LLM with voice/text input and action output, outperforming GPT-4 in function calling

Octo-planner

Model

A 3.8B Model for AI Agent Action Planning with 98%+ Accuracy

Lenovo: Local Personalized AI Agent

Success Story

Voice-Enabled Personal AI Assistant Run Entirely on Lenovo AI PC

Vivaia: Agentic Workflow for Influencer Marketing

Success Story

AI Agent System Automates Influencer Discovery, Outreach and Campaign Management

PIN AI: Local-Cloud Hybrid Mobile LLM OS

Success Story

LLM OS Empowering Seamless, Private, and Lightning-fast Interaction Across Mobile Apps

Nexa AI x PIN AI

News

Nexa AI Partners with PIN AI to Bring Secure, On-Device AI to Mobile

What can you do with tiny (1B/3B) LLMs in a local RAG system?

Developer

A practical exploration of on-device AI for chatting with documents: from basic Q&A to specialized tasks with LoRA

Nexa SDK Tutorial: A Comprehensive On-Device AI Inference Toolkit

Developer

Run Multimodal AI Models on Your Local Devices

Join 8,000+ developers

Stay up to date with the best in on-device AI