Today we're releasing three things: Oculus, a standalone hybrid-reasoning VLM that outperforms systems 10x its size; oceanir-search, semantic search across your analysis history; and Oceanir-Memory, persistent context that learns from every query.
For months we've been working on a fundamental question: how do you build something small enough to run anywhere, smart enough to reason about what it sees, and integrated enough to remember what it's learned? The answer is Oculus and the ecosystem around it.
What we're shipping
Three releases, each designed to work independently and together.
A small model that thinks big
Oculus is a hybrid-reasoning vision language model built on our OO1 Architecture. The key insight behind it is deceptively simple: scale reasoning, not parameters. Instead of building a larger model, we built a smarter one.
Oculus uses two mechanisms that set it apart: thinking traces and perceptive tool calling. Thinking traces let the model reason step-by-step before answering. Perceptive tool calling lets it identify regions of interest in an image, zoom in, and re-examine them at higher resolution — the way an analyst actually works.
"The best vision model isn't the largest one — it's the one that knows where to look and how to think about what it sees."
— OO1 Architecture Paper, 2025
Quick start
Oculus ships as a Python package with a minimal API surface.
from oceanir import Oculus
model = Oculus.from_pretrained("OceanirAI/Oculus-0.1")
# Basic VQA
answer = model.ask("image.jpg", "What is this?")
# With reasoning traces
answer = model.ask("scene.jpg", "Count the people", think=True)
# With focus/zoom for fine details
answer = model.ask("document.jpg", "Read the fine print", focus=True)
# Structured JSON output
result = model.generate("image.jpg", prompt="Describe objects", mode="json")
How Oculus works
Oculus relies on two core mechanisms that can be activated independently or together.
Thinking Traces. When you pass think=True, Oculus generates structured reasoning inside <think>...</think> tags before producing its final answer. This forces the model to decompose visual problems into steps — identifying what it sees, what context matters, and what conclusions follow.
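Because the reasoning arrives inside explicit tags, client code can separate the trace from the final answer with a few lines of parsing. A minimal sketch, assuming a raw model string with at most one <think> block; the helper name and raw-output format here are illustrative, not part of the Oculus API:

```python
import re

def split_thinking(raw: str) -> tuple[str, str]:
    """Separate a <think>...</think> reasoning trace from the final answer.

    Illustrative helper, not part of the Oculus API; assumes at most one
    think block, emitted before the answer.
    """
    match = re.search(r"<think>(.*?)</think>", raw, flags=re.DOTALL)
    if match is None:
        return "", raw.strip()
    trace = match.group(1).strip()
    answer = raw[match.end():].strip()
    return trace, answer

raw = "<think>Two people near the door, one by the window.</think>3"
trace, answer = split_thinking(raw)
print(answer)  # → 3
```

Keeping the trace alongside the answer is what makes the reasoning auditable: you can log it, search it later, or surface it in a review UI.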
Perceptive Focus. When you pass focus=True, Oculus automatically identifies regions of interest in the image, crops and zooms into them, and re-examines those regions at higher resolution. This is particularly effective for fine-grained tasks like reading distant signage or identifying small objects.
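The crop-and-zoom step comes down to simple coordinate bookkeeping: a region of interest (normalized to the image) is padded for context and converted into a pixel crop box. A sketch of that arithmetic, assuming normalized (x0, y0, x1, y1) boxes; Oculus does this internally when focus=True, and this exact helper is our assumption, not its API:

```python
def crop_box(roi, width, height, pad=0.1):
    """Convert a normalized region of interest (x0, y0, x1, y1 in [0, 1])
    into a padded pixel crop box, clamped to the image bounds.

    Illustrative bookkeeping for a focus/zoom step, not the Oculus internals.
    """
    x0, y0, x1, y1 = roi
    # Pad the box by a fraction of its own size so context survives the crop.
    pw, ph = (x1 - x0) * pad, (y1 - y0) * pad
    left = max(0, int((x0 - pw) * width))
    top = max(0, int((y0 - ph) * height))
    right = min(width, int((x1 + pw) * width))
    bottom = min(height, int((y1 + ph) * height))
    return left, top, right, bottom

# A sign occupying the top-right tenth of a 1920x1080 frame:
print(crop_box((0.85, 0.05, 0.95, 0.15), 1920, 1080))  # → (1612, 43, 1843, 172)
```

Re-encoding only this crop at the model's native input resolution is what recovers detail that would be lost when the full frame is downsampled.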
Output modes
Oculus supports seven output modes, each designed for a different class of visual task.
Oculus capabilities
Six core capabilities, all available in a single model.
- Reasoning via thinking traces — Step-by-step decomposition of visual problems before answering, making the model's logic transparent and auditable.
- Perceptive focus (zoom & crop) — Automatic zoom and crop on regions of interest for fine-grained detail extraction.
- Structured outputs (JSON) — Native JSON mode for machine-readable results that integrate directly into downstream pipelines.
- Complex OCR — Multi-language text extraction with spatial awareness, handling curved text, distant signage, and overlapping layers.
- Desktop & UI understanding — Detection and classification of user interface elements, buttons, forms, and navigation structures.
- Edge-ready architecture — Small enough to run on consumer hardware without sacrificing reasoning quality.
Find what you've seen
oceanir-search is semantic search across your entire analysis history. Instead of scrolling through a timeline, describe what you're looking for in natural language.
Every analysis you run is automatically indexed. Search queries are matched against visual features, extracted text, location metadata, and reasoning traces — not just filenames or dates.
Examples of what you can search for:
- "That street with the blue and white tiles in Portugal"
- "The intersection near that pink Art Deco building"
- "Photos from last week that showed rooftop terraces"
- "All analyses where we detected Japanese text"
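Under the hood, queries like these reduce to nearest-neighbor ranking over an embedding index. A toy sketch of that matching step, using hand-made three-dimensional vectors in place of the real multimodal embeddings (visual features, OCR text, location metadata, reasoning traces); the index contents and function names are invented for illustration:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy embeddings standing in for the real multimodal index.
index = {
    "lisbon-street": [0.9, 0.1, 0.0],
    "tokyo-signage": [0.1, 0.9, 0.2],
    "rooftop-terrace": [0.2, 0.1, 0.9],
}

def search(query_vec, k=2):
    """Rank indexed analyses by cosine similarity to the query embedding."""
    ranked = sorted(index, key=lambda name: cosine(query_vec, index[name]),
                    reverse=True)
    return ranked[:k]

print(search([0.8, 0.3, 0.1]))  # → ['lisbon-street', 'tokyo-signage']
```

The point of embedding search is exactly what the examples above show: "blue and white tiles in Portugal" can match an analysis whose filename and date say nothing of the kind.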
Context that persists
Oceanir-Memory is a persistent knowledge store that learns from every analysis you run. The more you use Oceanir, the better it understands your work — recognizing patterns, remembering context, and surfacing connections across sessions.
- Instant recognition — Previously analyzed locations are recognized immediately, with full context from prior sessions.
- Pattern learning — The system learns your analytical patterns and priorities over time.
- Cross-session context — Insights from one session inform future analyses, building a cumulative understanding.
- Privacy-first storage — All memory data is encrypted per-user with AES-256-GCM and never used for model training.
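The cross-session behavior can be pictured as a keyed store that accumulates notes and survives restarts. A minimal JSON-backed sketch, with invented names; the real Oceanir-Memory additionally encrypts per-user records (AES-256-GCM) and does pattern learning, both of which this omits:

```python
import json
import tempfile
from pathlib import Path

class MemoryStore:
    """Minimal JSON-backed sketch of cross-session memory.

    Keys are location identifiers; values accumulate notes across sessions.
    Illustrative only: encryption and pattern learning are omitted.
    """

    def __init__(self, path):
        self.path = Path(path)
        self.data = json.loads(self.path.read_text()) if self.path.exists() else {}

    def remember(self, location, note):
        self.data.setdefault(location, []).append(note)
        self.path.write_text(json.dumps(self.data))

    def recall(self, location):
        # Instant recognition: return any context from prior sessions.
        return self.data.get(location, [])

tmp = Path(tempfile.mkdtemp()) / "memory.json"
store = MemoryStore(tmp)
store.remember("lisbon/alfama", "blue-and-white azulejo tiles on north wall")

# A later session reopening the same file sees the earlier note:
later = MemoryStore(tmp)
print(later.recall("lisbon/alfama"))
```

Because recall is keyed rather than free-form, a previously analyzed location is recognized immediately, with its accumulated context attached.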
Try it now
Oculus-0.1 is available now for selected research pilots. Access is currently provisioned by the Oceanir team.
Private preview. Contact Oceanir for onboarding and integration details.
oceanir-search and Oceanir-Memory are available now to Pro and Enterprise users through the Oceanir platform. No additional setup is required — both features activate automatically with your existing account.
Experience the full stack
Oculus, search, and memory — working together in a single platform.


