(TS) Fix PGVector implementation, where vector distance was inverted. #4944

Merged
whysosaket merged 4 commits into mem0ai:main from zegerhoogeboom:fix/pgvector-scoring-inverse
Apr 27, 2026

Conversation

@zegerhoogeboom
Contributor

@zegerhoogeboom zegerhoogeboom commented Apr 23, 2026

The hybrid search pipeline expects semantic scores where higher is better. pgvector's <=> returns cosine distance, where lower is better. Convert it back into a bounded similarity score before returning it.

Linked Issue

No issue opened.

Description

Semantic search should rank the most similar documents highest. With PGVector, the scores were inverted, so the least relevant documents were presumably surfaced first.

With this fix, the most similar documents are returned first.
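
For illustration, the conversion could look roughly like the sketch below. The names `distanceToScore`, `DistanceRow`, and `toRankedResults` are hypothetical and are not taken from the actual diff; the formula mirrors the `GREATEST(0, LEAST(1, 1 - distance))` expression used in the SQL check further down.

```typescript
// Hypothetical helper (illustrative name, not from the PR diff).
// Converts a cosine distance in [0, 2] (lower is better)
// into a similarity score clamped to [0, 1] (higher is better).
function distanceToScore(distance: number): number {
  return Math.max(0, Math.min(1, 1 - distance));
}

// Hypothetical row shape for a query that selects
// `embedding <=> $1 AS distance`.
interface DistanceRow {
  id: string;
  distance: number;
}

// Maps raw rows to scored results and sorts them best-first.
function toRankedResults(rows: DistanceRow[]): { id: string; score: number }[] {
  return rows
    .map((row) => ({ id: row.id, score: distanceToScore(row.distance) }))
    .sort((a, b) => b.score - a.score);
}
```

Clamping to [0, 1] means that orthogonal and opposite vectors both end up with a score of 0, which seems acceptable for ranking because neither should surface as a relevant match.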

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Refactor (no functional changes)
  • Documentation update

Test Coverage

  • I added/updated unit tests
  • I added/updated integration tests
  • I tested manually (describe below)
  • No tests needed (explain why)

I ran .search() in my (proprietary) app and got very bad results. After installing my local build with the fix, the results are relevant, as expected.

Run this SQL to confirm the distance function in PGVector indeed behaves as I'm describing:

  WITH memories(id, embedding, payload) AS (
    VALUES
      ('a', '[1,0,0]'::vector, '{"data":"exactly x-axis"}'::jsonb),
      ('b', '[0.9,0.1,0]'::vector, '{"data":"close to x-axis"}'::jsonb),
      ('c', '[0,1,0]'::vector, '{"data":"y-axis"}'::jsonb),
      ('d', '[-1,0,0]'::vector, '{"data":"opposite x-axis"}'::jsonb)
  )
  SELECT
    id,
    payload->>'data' AS text,
    embedding <=> '[1,0,0]'::vector AS distance,
    GREATEST(0, LEAST(1, 1 - (embedding <=> '[1,0,0]'::vector))) AS score
  FROM memories
  ORDER BY distance ASC;

which outputs:

id  text             distance              score
a   exactly x-axis   0.0                   1.0
b   close to x-axis  0.006116251198662548  0.9938837488013375
c   y-axis           1.0                   0.0
d   opposite x-axis  2.0                   0.0
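
The same numbers can be approximately reproduced without a database. The sketch below assumes cosine distance is defined as 1 minus cosine similarity, which matches the output above; `cosineDistance` is a hand-written helper, not a pgvector or mem0 API. The trailing digits for row b can differ slightly from the table, likely because pgvector stores vector elements in single precision while this code uses doubles.

```typescript
// Hand-written cosine distance: 1 - (a . b) / (|a| * |b|).
function cosineDistance(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / Math.sqrt(normA * normB);
}

// Same vectors as in the SQL above.
const query = [1, 0, 0];
const memories: { id: string; embedding: number[] }[] = [
  { id: "a", embedding: [1, 0, 0] },
  { id: "b", embedding: [0.9, 0.1, 0] },
  { id: "c", embedding: [0, 1, 0] },
  { id: "d", embedding: [-1, 0, 0] },
];

// Distance (lower is better) and the bounded score (higher is better).
const computed = memories.map(({ id, embedding }) => {
  const distance = cosineDistance(embedding, query);
  const score = Math.max(0, Math.min(1, 1 - distance));
  return { id, distance, score };
});

for (const row of computed) {
  console.log(row.id, row.distance, row.score);
}
```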

Checklist

  • My code follows the project's style guidelines
  • I have performed a self-review of my code
  • I have added tests that prove my fix/feature works
  • New and existing tests pass locally
  • I have updated documentation if needed (NA)

@CLAassistant

CLAassistant commented Apr 23, 2026

CLA assistant check
All committers have signed the CLA.

…tter. pgvector's `<=>` returns cosine distance, where lower is better. Convert it back into a bounded similarity score before returning it.
@zegerhoogeboom zegerhoogeboom force-pushed the fix/pgvector-scoring-inverse branch from 205ba19 to 81ccfe9 Compare April 23, 2026 09:08
@kartik-mem0
Contributor

Please sign the CLA and remove the comment from the code to maintain consistency.

@zegerhoogeboom
Contributor Author

The CLA is signed and the comment has been removed.

@kartik-mem0
Contributor

Can you please add a test to verify this claim?

Thanks!

…r) gets turned into score (higher is better) for ranking.
@zegerhoogeboom
Contributor Author

I've added a unit test. It mocks the postgres response and checks the results: with the old code the test fails, and with the fix it succeeds. There's no setup for testcontainers, and adding one feels like scope creep, so the test doesn't run the query against a real postgres database to confirm that the mocked response matches what you actually get. To confirm the behavior, run the SQL in the PR body on any postgres instance.

Hope this test is sufficient.
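
As a rough sketch of the idea behind such a test (not the test that was actually added), the snippet below ranks a mocked response with two scorers. `MockRow`, `rankIds`, and both scorer names are hypothetical, and treating the raw distance as the score is only an assumed stand-in for the pre-fix behavior.

```typescript
// Hypothetical row shape: what a mocked Postgres client might return for
// `SELECT id, embedding <=> $1 AS distance ...`.
interface MockRow {
  id: string;
  distance: number;
}

// Mocked response, using the distances from the SQL output in the PR body.
const mockRows: MockRow[] = [
  { id: "d", distance: 2.0 },
  { id: "c", distance: 1.0 },
  { id: "b", distance: 0.006116251198662548 },
  { id: "a", distance: 0.0 },
];

type Scorer = (distance: number) => number;

// ASSUMPTION: stand-in for the old behavior, where distance is used as the score.
const rawDistanceAsScore: Scorer = (distance) => distance;

// The conversion described in this PR: clamp 1 - distance to [0, 1].
const boundedSimilarity: Scorer = (distance) =>
  Math.max(0, Math.min(1, 1 - distance));

// Ranks rows the way a "higher is better" pipeline would.
function rankIds(rows: MockRow[], scorer: Scorer): string[] {
  return rows
    .map((row) => ({ id: row.id, score: scorer(row.distance) }))
    .sort((x, y) => y.score - x.score)
    .map((entry) => entry.id);
}

const before = rankIds(mockRows, rawDistanceAsScore);
const after = rankIds(mockRows, boundedSimilarity);

console.log(before[0]); // prints "d": the opposite vector ranks first
console.log(after[0]); // prints "a": the exact match ranks first
```

A test built this way only checks the score mapping and the ordering; it still relies on the mocked distances matching what postgres returns.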

@zegerhoogeboom
Contributor Author

@kartik-mem0 Anything else I can do to get this merged?

@kartik-mem0
Contributor

[Image: Screenshot 2026-04-27 at 10 24 47 PM]

Please fix this lint error, as our CI failed here, @zegerhoogeboom.

@zegerhoogeboom
Contributor Author

@kartik-mem0 Sorry about that. Done.

@whysosaket whysosaket merged commit ece7ff6 into mem0ai:main Apr 27, 2026
7 of 8 checks passed
4 participants