# Vision Jobs

A multimodal visual analysis queue — submit an image plus a prompt, get a response from a cloud vision model. Switch between **Ollama Cloud** and **OpenRouter** by changing two environment variables; no code changes needed.

## Architecture

```
Browser
  │
  ▼
Fastify server (port 3000)
  │   openai npm package (OpenAI-compatible client)
  ▼
LLM_BASE_URL (configured in .env)
  ├─ https://ollama.com/v1         → Ollama Cloud (qwen3.5:397b-cloud, etc.)
  └─ https://openrouter.ai/api/v1  → OpenRouter (300+ providers)
```

Both Ollama Cloud and OpenRouter expose an OpenAI-compatible `/v1/chat/completions` endpoint, so the same `openai` npm package talks to both. No proxy or sidecar required.
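
Because both providers speak the same protocol, switching is just a different `baseURL` and `apiKey` on the same client. A minimal sketch (the `clientOptions` helper is illustrative, not the repo's actual code):

```javascript
// Both providers are reached through one OpenAI-compatible client; only
// baseURL and apiKey differ, so switching is a pure .env change.
// `clientOptions` is an illustrative helper, not the repo's actual code.
function clientOptions(env) {
  return {
    baseURL: env.LLM_BASE_URL, // https://ollama.com/v1 or https://openrouter.ai/api/v1
    apiKey: env.LLM_API_KEY,
  };
}

// The returned object is what `new OpenAI(options)` from the `openai`
// package expects.
const options = clientOptions({
  LLM_BASE_URL: "https://openrouter.ai/api/v1",
  LLM_API_KEY: "sk-or-v1-...",
});
```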

## Stack

| Layer | Technology |
|---|---|
| Backend | Node.js · Fastify 5 · `@fastify/websocket` · `@fastify/multipart` |
| LLM client | `openai` npm package (pointed at Ollama Cloud or OpenRouter) |
| Queue | `p-queue` (in-process, no external server) |
| Database | SQLite via Sequelize ORM |
| Frontend | React 18 · Vite · plain CSS |
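
The queue row deserves a note: `p-queue` is just an in-process concurrency cap on promise-returning tasks. A toy sketch of the semantics it provides (not the library's implementation):

```javascript
// Toy illustration of what p-queue provides: run promise-returning tasks
// with a concurrency cap, in-process, no external broker. The app uses the
// real `p-queue` package; this sketch only mirrors its core behavior.
function createQueue(concurrency) {
  let active = 0;
  const waiting = [];
  const next = () => {
    if (active >= concurrency || waiting.length === 0) return;
    active++;
    const { task, resolve, reject } = waiting.shift();
    Promise.resolve()
      .then(task)
      .then(resolve, reject)
      .finally(() => { active--; next(); });
  };
  return {
    // add() resolves with the task's result, like p-queue's queue.add(fn)
    add(task) {
      return new Promise((resolve, reject) => {
        waiting.push({ task, resolve, reject });
        next();
      });
    },
  };
}
```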

## Quick start

**1. Install dependencies**

```bash
npm install
```

**2. Configure your provider**

```bash
cp server/.env.example server/.env
# Edit server/.env — choose Ollama Cloud or OpenRouter (see comments inside)
```

**3. Run in development**

```bash
npm run dev
```

- Frontend → http://localhost:5173
- Backend → http://localhost:3000

**4. Production build**

```bash
npm run build   # builds React into client/dist
npm start       # serves everything from Fastify on port 3000
```

## Provider configuration

Edit `server/.env` and uncomment the block for the provider you want:

### Ollama Cloud

Get an API key at https://ollama.com → account → API keys.
Model IDs are listed at https://ollama.com/search?c=cloud

```bash
LLM_BASE_URL=https://ollama.com/v1
LLM_API_KEY=your-ollama-api-key
LLM_MODEL=qwen3.5:397b-cloud
```

### OpenRouter

Get an API key at https://openrouter.ai/keys.
Model IDs are listed at https://openrouter.ai/models (format: `provider/model`)

```bash
LLM_BASE_URL=https://openrouter.ai/api/v1
LLM_API_KEY=sk-or-v1-...
LLM_MODEL=qwen/qwen3.5-397b-a17b
```

## Project structure

```
vision-jobs/
├── server/
│   ├── index.js            # Fastify entry point
│   ├── routes/jobs.js      # REST + WebSocket routes
│   ├── jobs/queue.js       # p-queue → openai client → provider
│   ├── db/models.js        # Sequelize Job model (SQLite)
│   ├── ws/broadcast.js     # WebSocket fan-out
│   └── .env.example
└── client/
    ├── index.html
    ├── vite.config.js
    └── src/
        ├── App.jsx
        ├── styles.css
        ├── components/
        │   ├── ImageDrop.jsx    # Drag-drop / file picker / camera
        │   └── JobCard.jsx      # Live status + result display
        ├── hooks/
        │   └── useJobSocket.js
        └── lib/
            └── api.js
```

## How it works

1. User drops an image + types a prompt → clicks **Analyze**.
2. `POST /api/jobs` receives the multipart upload, base64-encodes the image, saves a `queued` job to SQLite, and enqueues it via `p-queue`.
3. The queue runner calls the OpenAI-compatible `/v1/chat/completions` endpoint with the image embedded as a `data:` URI in an `image_url` content block.
4. As status changes (`queued → running → done/error`), the server broadcasts `job_update` WebSocket messages to every connected client.
5. React merges updates into the live job list — no polling.
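
Steps 2–3 boil down to payload construction. A sketch, using a hypothetical `buildVisionMessages` helper (the real request is issued by the `openai` client in `server/jobs/queue.js`):

```javascript
// Sketch of the message shape from step 3: the uploaded image is
// base64-encoded and embedded as a data: URI in an `image_url` content
// block, alongside the text prompt. Helper name is illustrative.
function buildVisionMessages(prompt, imageBuffer, mimeType) {
  const dataUri = `data:${mimeType};base64,${imageBuffer.toString("base64")}`;
  return [
    {
      role: "user",
      content: [
        { type: "text", text: prompt },
        { type: "image_url", image_url: { url: dataUri } },
      ],
    },
  ];
}

// Example: a fake 4-byte PNG signature stands in for a real upload.
const messages = buildVisionMessages(
  "What is in this image?",
  Buffer.from([0x89, 0x50, 0x4e, 0x47]),
  "image/png"
);
```

This array is what gets passed as `messages` to `chat.completions.create`, which both providers accept unchanged.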

## Environment variables

| Variable | Default | Description |
|---|---|---|
| `LLM_BASE_URL` | `https://ollama.com/v1` | Provider endpoint. |
| `LLM_API_KEY` | — | **Required.** API key for your chosen provider. |
| `LLM_MODEL` | `qwen3.5:397b-cloud` | Model identifier (format varies by provider). |
| `PORT` | `3000` | Server HTTP port. |
| `JOB_CONCURRENCY` | `3` | Max simultaneous LLM requests. |
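
The defaults above can be applied with a small loader. A sketch (`loadConfig` is a hypothetical helper; the repo may read its `.env` differently):

```javascript
// Applies the defaults from the table above; LLM_API_KEY has no default
// and must be set. Hypothetical helper, shown for illustration only.
function loadConfig(env) {
  if (!env.LLM_API_KEY) throw new Error("LLM_API_KEY is required");
  return {
    baseURL: env.LLM_BASE_URL ?? "https://ollama.com/v1",
    apiKey: env.LLM_API_KEY,
    model: env.LLM_MODEL ?? "qwen3.5:397b-cloud",
    port: Number(env.PORT ?? 3000),
    concurrency: Number(env.JOB_CONCURRENCY ?? 3),
  };
}
```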