
Nexa Blog

OmniVision-968M: World's Smallest Vision Language Model

Model · Research

Pocket-size multimodal model with 9x token reduction for on-device deployment
AMD: Efficient Local RAG for Document Intelligence

Success Story

End-to-end Local RAG System Powered by AMD Hardware

Octopus v3

Model

Compact (Sub-Billion) Multimodal Action Model for On-Device AI Agents

Squid

Model

Revolutionizing On-Device Language Models for Long Contexts

Octopus v2

Model

On-device 0.5B LLMs: voice/text in, action out; outperforms GPT-4 in function calling

Octo-planner

Model

A 3.8B Model for AI Agent Action Planning with 98%+ Accuracy

Lenovo: Local Personalized AI Agent

Success Story

Voice-Enabled Personal AI Assistant Running Entirely on a Lenovo AI PC

Vivaia: Agentic Workflow for Influencer Marketing

Success Story

AI Agent System Automates Influencer Discovery, Outreach, and Campaign Management

PIN AI: Local-Cloud Hybrid Mobile LLM OS

Success Story

LLM OS Empowering Seamless, Private, and Lightning-Fast Interaction Across Mobile Apps

Nexa AI x PIN AI

News

Nexa AI Partners with PIN AI to Bring Secure, On-Device AI to Mobile

What can you do with tiny (1B/3B) LLMs in a local RAG system?

Developer

A practical exploration of on-device AI for chatting with documents: from basic Q&A to specialized tasks with LoRA

Nexa SDK Tutorial: A Comprehensive On-Device AI Inference Toolkit

Developer

Run Multimodal AI Models on Your Local Devices