Introducing vector-triage

I open-sourced vector-triage over the weekend.

It runs as a GitHub Action and detects duplicate and similar Issues and PRs.

The idea started with a Twitter discussion in which Pieter (@steipete) asked about this kind of feature in another OSS project. That project needed extra third-party dependencies for a vector DB and embeddings. I wanted a simpler path for GitHub-native repos, so I built this around GitHub Models, which are free for minimal usage.

What vector-triage is for

vector-triage is a GitHub Action that:

  • embeds issue/PR content with GitHub Models
  • searches a local SQLite index (vector + FTS)
  • posts one managed triage comment when related items are found

It works for both Issues and Pull Requests.
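
To make the retrieval step more concrete, here is a minimal Python sketch of the vector half of that search. It is not the action's actual code: the table layout, the find_similar helper, the 0.75 threshold, and the toy 3-dimensional vectors are all assumptions for illustration, and the FTS half is left out entirely.

```python
import json
import math
import sqlite3


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def build_index(items):
    """Create an in-memory SQLite table of (number, title, embedding stored as JSON)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE items (number INTEGER PRIMARY KEY, title TEXT, embedding TEXT)"
    )
    db.executemany(
        "INSERT INTO items VALUES (?, ?, ?)",
        [(number, title, json.dumps(vec)) for number, title, vec in items],
    )
    return db


def find_similar(db, query_vec, threshold=0.75, limit=3):
    """Return (number, title, score) rows scoring at or above threshold, best first."""
    scored = []
    for number, title, raw in db.execute("SELECT number, title, embedding FROM items"):
        score = cosine(query_vec, json.loads(raw))
        if score >= threshold:
            scored.append((number, title, score))
    # Sort by score descending, then by number ascending, so ties resolve the same way every run.
    scored.sort(key=lambda row: (-row[2], row[0]))
    return scored[:limit]


# Toy vectors standing in for real embeddings (hypothetical data).
db = build_index([
    (1, "Crash on startup", [1.0, 0.0, 0.0]),
    (2, "App crashes when launched", [0.8, 0.6, 0.0]),
    (3, "Add dark mode", [0.0, 0.0, 1.0]),
])
matches = find_similar(db, [1.0, 0.0, 0.0])
```

Here matches holds items 1 and 2 in that order, and item 3 falls below the threshold. A brute-force scan like this is only reasonable for small indexes; it is shown because it needs nothing beyond the standard library.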

Design goals

I kept the project focused on a few principles:

  • zero-config setup
  • deterministic behavior
  • low comment noise
  • safe defaults for OSS repos

Quick start

Setup is a single workflow file:

name: Triage

on:
  issues:
    types: [opened, edited]
  pull_request_target:
    types: [opened, synchronize]

permissions:
  contents: write
  issues: write
  pull-requests: write
  models: read

jobs:
  triage:
    runs-on: ubuntu-latest
    concurrency:
      group: triage-index
      cancel-in-progress: false
    steps:
      - uses: rizwankce/[email protected]

That is the full setup.

How it behaves

  • First run: it creates the index and does not post noise comments.
  • Later runs: if matches exist, it posts or updates one managed triage comment.
  • If no matches are found, it stays quiet.
  • Recoverable failures are logged as warnings, and the run exits without failing the workflow.
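
The comment rules above can be sketched as a small decision function. This is an illustration under assumptions, not the action's source: the hidden HTML marker used to recognise the managed comment, the plan_comment name, and the (comment_id, body) input shape are all hypothetical.

```python
# Hypothetical hidden marker that identifies the single managed comment.
MARKER = "<!-- triage-bot -->"


def plan_comment(first_run, matches, existing_comments):
    """Decide what to do with the managed triage comment.

    existing_comments is a list of (comment_id, body) pairs already on the issue or PR.
    Returns ("skip", None), ("create", None), or ("update", comment_id).
    """
    # First run only builds the index; no matches means there is nothing to say.
    if first_run or not matches:
        return ("skip", None)
    # Reuse the existing managed comment so the thread never has more than one.
    for comment_id, body in existing_comments:
        if MARKER in body:
            return ("update", comment_id)
    return ("create", None)


action = plan_comment(
    first_run=False,
    matches=[7],
    existing_comments=[(10, "Thanks for the report!"), (11, MARKER + "\nRelated: #7")],
)
```

With the inputs shown, action is ("update", 11): the earlier managed comment is edited in place rather than a second one being posted.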

Built with Codex

I built most of this in a weekend with the help of Codex.

The biggest win was speed of iteration: shaping the retrieval behavior, comment format, and safety model quickly, then hardening with tests and E2E workflow checks.