20x Speedup: Postgres Vector Similarity Plugin Implemented in Rust

Original link: http://gaocegege.com/Blog/pgvectors

We’ve open-sourced pgvecto.rs, a Postgres vector similarity search plugin written in Rust. Its HNSW algorithm is 20 times faster than pgvector at 90% recall. But speed is only the beginning – pgvecto.rs’ extensible architecture is designed to support new algorithms such as DiskANN. We also hope that with the help of the community, pgvecto.rs can become the standard for Postgres vector similarity search.

Why choose Rust

Unlike most Postgres extensions, which are typically implemented in C, pgvecto.rs is written in Rust. It is built on top of the pgrx framework, which enables developers to write Postgres plugins in Rust, and Rust offers several advantages for a plugin like pgvecto.rs.

First, Rust’s strict compile-time checking guarantees memory safety, avoiding the whole class of memory bugs that plague C extensions. Just as importantly, Rust offers modern development tooling: excellent documentation, package management, and easy-to-understand error messages. This makes it easier for developers to use and contribute to pgvecto.rs than to a large C code base. Rust’s safety and ease of use make it an ideal language for building the next generation of pgrx-based Postgres plugins such as pgvecto.rs.

In addition, the Rust community is far more vibrant than C’s. A thriving community means more developers, and more developers means more potential contributors to help maintain and grow pgvecto.rs with us.

Extensible architecture

Pgvecto.rs has an extensible architecture that makes it easy to add support for new index types. At its core is a set of traits that define the expected behavior of a vector index, such as building, saving, loading, and querying. Implementing a new index simply requires creating a struct for the index type and implementing the required traits. Pgvecto.rs currently provides two built-in index types: HNSW for maximum search speed, and ivfflat for clustering-based approximate search. However, anyone can create additional indexes such as RHNSW, NGT, or custom types tailored to specific use cases. This extensible design lets pgvecto.rs adapt to new vector search algorithms and lets you choose the appropriate index based on your data and performance needs.
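To make the trait-based design concrete, here is a minimal sketch of what such an architecture can look like. The trait name, method signatures, and the brute-force index below are illustrative assumptions, not pgvecto.rs’s actual API:

```rust
// Illustrative sketch only: trait and struct names are hypothetical,
// not the real pgvecto.rs interface.
trait VectorIndex {
    /// Build the index from a set of (id, vector) pairs.
    fn build(&mut self, data: &[(u64, Vec<f32>)]);
    /// Insert a single vector.
    fn insert(&mut self, id: u64, vector: Vec<f32>);
    /// Return the ids of the `k` nearest neighbors to `query`.
    fn search(&self, query: &[f32], k: usize) -> Vec<u64>;
}

// A trivial brute-force index showing how a new index type plugs in:
// define a struct and implement the trait.
struct FlatIndex {
    data: Vec<(u64, Vec<f32>)>,
}

impl VectorIndex for FlatIndex {
    fn build(&mut self, data: &[(u64, Vec<f32>)]) {
        self.data = data.to_vec();
    }

    fn insert(&mut self, id: u64, vector: Vec<f32>) {
        self.data.push((id, vector));
    }

    fn search(&self, query: &[f32], k: usize) -> Vec<u64> {
        // Squared Euclidean distance, matching the l2_ops operator class.
        let mut scored: Vec<(f32, u64)> = self
            .data
            .iter()
            .map(|(id, v)| {
                let d: f32 = v.iter().zip(query).map(|(a, b)| (a - b) * (a - b)).sum();
                (d, *id)
            })
            .collect();
        scored.sort_by(|a, b| a.0.partial_cmp(&b.0).unwrap());
        scored.into_iter().take(k).map(|(_, id)| id).collect()
    }
}
```

An HNSW or ivfflat implementation would provide the same trait with a different internal structure, so the surrounding system needs no changes to support a new algorithm.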

Speed and performance

Benchmarks show that pgvecto.rs offers a huge speed boost over the existing Postgres extension pgvector. In tests, its HNSW index delivered up to 25x the search performance of pgvector’s ivfflat index. The flexible architecture also allows different indexing algorithms to be used, optimized for maximum throughput or precision. We are currently developing a quantized HNSW, so stay tuned!

(Figure: benchmark results)

Persistence

Previous work such as pg_embedding did a good job implementing HNSW indexes, but lacks persistence and proper support for CRUD operations. pgvecto.rs adds both of these core capabilities. It persists vector indexes correctly using WAL (write-ahead logging), and automatically handles index saving, loading, rebuilding, and updating in the background. You get durable indexes that require no external management, while integrating seamlessly with your current Postgres deployment and workflow.
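The core idea of write-ahead logging is that every index mutation is appended to a durable log before being applied, so the index can be reconstructed after a crash by replaying the log. The sketch below is a simplified conceptual model of that technique, not pgvecto.rs’s actual implementation, which builds on Postgres’s own WAL machinery:

```rust
use std::collections::HashMap;

// Conceptual WAL sketch; names and structure are illustrative only.
#[derive(Clone)]
enum LogEntry {
    Insert { id: u64, vector: Vec<f32> },
    Delete { id: u64 },
}

#[derive(Default)]
struct Index {
    vectors: HashMap<u64, Vec<f32>>,
}

impl Index {
    fn apply(&mut self, entry: &LogEntry) {
        match entry {
            LogEntry::Insert { id, vector } => {
                self.vectors.insert(*id, vector.clone());
            }
            LogEntry::Delete { id } => {
                self.vectors.remove(id);
            }
        }
    }
}

// Log first, then apply: if the process crashes after the append,
// the in-memory state can still be recovered from the log.
fn log_and_apply(log: &mut Vec<LogEntry>, index: &mut Index, entry: LogEntry) {
    log.push(entry.clone()); // a durable implementation would fsync here
    index.apply(&entry);
}

// Recovery: rebuild the index by replaying the log from the start.
fn replay(log: &[LogEntry]) -> Index {
    let mut index = Index::default();
    for entry in log {
        index.apply(entry);
    }
    index
}
```

Because replaying the log reproduces the exact sequence of inserts and deletes, the recovered index matches the pre-crash state without any external management.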

Quick start

Suppose you created a table with the following SQL command:

 CREATE TABLE items (id bigserial PRIMARY KEY, emb vector(4));

Here vector(4) denotes the vector data type, and 4 is the dimension of the vector. You can declare a vector column without specifying a dimension, but note that you cannot create an index on a vector column whose dimension is unspecified. Next, you can insert data into the table:

 INSERT INTO items (emb) VALUES ('[1.1, 2.2, 3.3, 4.4]');

To create an index on the emb vector column using squared Euclidean distance, run the following command:

 CREATE INDEX ON items USING vectors (emb l2_ops)
 WITH (options = $$
 capacity = 2097152
 size_ram = 4294967296
 storage_vectors = "ram"
 [algorithm.hnsw]
 storage = "ram"
 m = 32
 ef = 256
 $$);

If you want to retrieve the top 10 vectors closest to the origin, you can use the following SQL command:

 SELECT *, emb <-> '[0, 0, 0, 0]' AS score FROM items ORDER BY emb <-> '[0, 0, 0, 0]' LIMIT 10;

Conclusion

pgvecto.rs offers a new approach to vector retrieval in Postgres. Its Rust implementation and extensible architecture give it important advantages over existing extensions: speed, safety, and flexibility. We’re excited to release pgvecto.rs as an open source project under the Apache 2.0 license, and we can’t wait to see what the community builds on top of it. pgvecto.rs has a lot of room to grow: adding new index types and algorithms, optimizing for different data distributions and use cases, and integrating with existing Postgres workflows.

We encourage you to try pgvecto.rs on GitHub, benchmark it on your workloads, and contribute your own indexing innovations. Let’s work together to make pgvecto.rs the best vector search extension Postgres has ever seen! The potential is huge, and we’re only just getting started. Please join us on our journey to bring vector search capabilities to the Postgres ecosystem. Join our Discord and help improve pgvecto.rs alongside its developers and users!

About us

The mission of our company, TensorChord, is to simplify the process of putting machine learning models into production. Our team has extensive expertise in MLOps engineering, with experience from AWS, TikTok, and open source projects such as Kubeflow and DGL. So if you have any questions about getting your model into production, feel free to reach out to us by joining Discord or emailing [email protected]. We are happy to leverage our background in building MLOps platforms to provide guidance on any part of the workflow from model development to deployment.

Our products and open source projects include:

  • ModelZ: A managed serverless GPU platform for quickly deploying and monitoring your own models on public clouds.
  • OpenModelZ : An open source version of ModelZ that can be deployed anywhere, including your Home Lab, or a personal PC.
  • Mosec : A high-performance model serving framework that provides dynamic batching and CPU/GPU pipelines to make full use of computing resources. It is a simple and faster alternative to NVIDIA Triton.
  • envd : A command-line tool that helps you create container-based AI/ML environments, from development to production. You only need to know Python to use this tool.
  • ModelZ-llm: An OpenAI-compatible API server for LLMs (including LLaMA, Vicuna, ChatGLM, etc.) and embeddings.

License

  • This article is licensed under CC BY-NC-SA 3.0.
  • Please contact me for commercial use.
