Integration: Optimum
High-performance inference using Hugging Face Optimum
Overview
Hugging Face Optimum is an extension of the Transformers library that provides performance optimization tools for training and running models on targeted hardware with maximum efficiency. With Optimum, you can export models from the Hugging Face Model Hub to the ONNX format and run them with ONNX Runtime in your pipelines for significant performance gains.
Installation
pip install optimum-haystack
Usage
Components
This integration introduces two components: OptimumTextEmbedder and OptimumDocumentEmbedder.
To create semantic embeddings for documents, use OptimumDocumentEmbedder in your indexing pipeline. For generating embeddings for queries, use OptimumTextEmbedder.

Below is an example indexing pipeline with InMemoryDocumentStore, OptimumDocumentEmbedder, and DocumentWriter:
from haystack import Document, Pipeline
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.writers import DocumentWriter
from haystack_integrations.components.embedders.optimum import (
    OptimumDocumentEmbedder,
    OptimumEmbedderPooling,
)

document_store = InMemoryDocumentStore(embedding_similarity_function="cosine")

documents = [
    Document(content="I enjoy programming in Python"),
    Document(content="My city does not get snow in winter"),
    Document(content="Japanese diet is well known for being good for your health"),
    Document(content="Thomas is injured and can't play sports"),
]

indexing_pipeline = Pipeline()
indexing_pipeline.add_component(
    "embedder",
    OptimumDocumentEmbedder(
        model="intfloat/e5-base-v2",
        normalize_embeddings=True,
        pooling_mode=OptimumEmbedderPooling.MEAN,
    ),
)
indexing_pipeline.add_component("writer", DocumentWriter(document_store=document_store))
indexing_pipeline.connect("embedder", "writer")
indexing_pipeline.run({"embedder": {"documents": documents}})
License
optimum-haystack is distributed under the terms of the Apache-2.0 license.