SpannerVectorStore(
instance_id: str,
database_id: str,
table_name: str,
embedding_service: langchain_core.embeddings.embeddings.Embeddings,
id_column: str = "langchain_id",
content_column: str = "content",
embedding_column: str = "embedding",
client: typing.Optional[google.cloud.spanner_v1.client.Client] = None,
metadata_columns: typing.Optional[typing.List[str]] = None,
ignore_metadata_columns: typing.Optional[typing.List[str]] = None,
metadata_json_column: typing.Optional[str] = None,
query_parameters: langchain_google_spanner.vector_store.QueryParameters = <langchain_google_spanner.vector_store.QueryParameters object>,
)
Initialize the SpannerVectorStore.
Parameters:
- instance_id (str): The ID of the Spanner instance.
- database_id (str): The ID of the Spanner database.
- table_name (str): The name of the table.
- embedding_service (Embeddings): The embedding service.
- id_column (str): The name of the row ID column. Defaults to ID_COLUMN_NAME ("langchain_id").
- content_column (str): The name of the content column. Defaults to CONTENT_COLUMN_NAME ("content").
- embedding_column (str): The name of the embedding column. Defaults to EMBEDDING_COLUMN_NAME ("embedding").
- client (Optional[Client]): The Spanner client. Defaults to Client().
- metadata_columns (Optional[List[str]]): List of metadata columns. Defaults to None.
- ignore_metadata_columns (Optional[List[str]]): List of metadata columns to ignore. Defaults to None.
- metadata_json_column (Optional[str]): The generic metadata column. Defaults to None.
- query_parameters (QueryParameters): The query parameters. Defaults to QueryParameters().
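A minimal sketch of constructing the store from the parameters above. The helper names `store_kwargs` and `build_store` are hypothetical, as are the example instance, database, and table names; `embedding_service` is assumed to be any LangChain `Embeddings` implementation. The import sits inside `build_store` so the sketch can be loaded without the Spanner package or credentials.

```python
from typing import Any, Dict, Optional


def store_kwargs(
    instance_id: str,
    database_id: str,
    table_name: str,
    metadata_json_column: Optional[str] = None,
) -> Dict[str, Any]:
    """Collect constructor arguments, leaving unset options at their defaults."""
    kwargs: Dict[str, Any] = {
        "instance_id": instance_id,
        "database_id": database_id,
        "table_name": table_name,
    }
    if metadata_json_column is not None:
        kwargs["metadata_json_column"] = metadata_json_column
    return kwargs


def build_store(embedding_service: Any, **kwargs: Any) -> Any:
    """Create the store; needs langchain-google-spanner and valid credentials."""
    # Module path taken from the signature above.
    from langchain_google_spanner.vector_store import SpannerVectorStore

    return SpannerVectorStore(embedding_service=embedding_service, **kwargs)
```

Usage would look like `build_store(my_embeddings, **store_kwargs("my-instance", "my-database", "documents"))`, where all three names are placeholders.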
Methods
_generate_sql
_generate_sql(
dialect,
table_name,
id_column,
content_column,
embedding_column,
column_configs,
primary_key,
secondary_indexes: typing.Optional[
typing.List[langchain_google_spanner.vector_store.SecondaryIndex]
] = None,
)
Generate SQL for creating the vector store table.
Parameters:
- dialect: The database dialect.
- table_name: The name of the table.
- id_column: The name of the row ID column.
- content_column: The name of the content column.
- embedding_column: The name of the embedding column.
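- column_configs: List of tuples containing metadata column information.
- primary_key: The primary key of the table.
- secondary_indexes: Optional list of secondary indexes to create on the table. Defaults to None.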
Returns:
- str: The generated SQL.
_select_relevance_score_fn
_select_relevance_score_fn() -> typing.Callable[[float], float]
The 'correct' relevance function may differ depending on a few things, including:
- the distance / similarity metric used by the VectorStore
- the scale of your embeddings (OpenAI's are unit normed. Many others are not!)
- embedding dimensionality
- etc.
Vector stores should define their own selection-based method of relevance.
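To illustrate why the choice depends on the distance metric, here is a sketch of two common distance-to-relevance mappings. The metric names and the `select_relevance_fn` helper are hypothetical and are not the library's internals; the Euclidean mapping assumes unit-normed embeddings.

```python
import math
from typing import Callable, Dict


def cosine_relevance(distance: float) -> float:
    """Map cosine distance (0 = identical direction) to a score where 1 is best."""
    return 1.0 - distance


def euclidean_relevance(distance: float) -> float:
    """Map Euclidean distance between unit-normed vectors to a score.

    Identical vectors score 1.0; orthogonal unit vectors (distance sqrt(2)) score 0.0.
    """
    return 1.0 - distance / math.sqrt(2)


# Hypothetical metric names used only in this sketch.
RELEVANCE_FNS: Dict[str, Callable[[float], float]] = {
    "cosine": cosine_relevance,
    "euclidean": euclidean_relevance,
}


def select_relevance_fn(metric: str) -> Callable[[float], float]:
    """Pick the mapping that matches the metric the store was queried with."""
    try:
        return RELEVANCE_FNS[metric]
    except KeyError:
        raise ValueError(f"No relevance function for metric {metric!r}") from None
```

Applying the cosine mapping to Euclidean distances (or the reverse) would produce scores that are not comparable across stores, which is the reason the selection is left to each vector store.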
add_documents
add_documents(
documents: typing.List[langchain_core.documents.base.Document],
ids: typing.Optional[typing.List[str]] = None,
**kwargs: typing.Any
) -> typing.List[str]
Add documents to the vector store.
Parameters | | |
---|---|---|
Name | Type | Description |
documents | List[Document] | Documents to add to the vector store. |
ids | Optional[List[str]] | Optional list of IDs for the documents. |

Returns | |
---|---|
Type | Description |
List[str] | List of IDs of the added texts. |
add_texts
add_texts(
texts: typing.Iterable[str],
metadatas: typing.Optional[typing.List[dict]] = None,
ids: typing.Optional[typing.List[str]] = None,
batch_size: int = 5000,
**kwargs: typing.Any
) -> typing.List[str]
Add texts to the vector store index.
Parameters | | |
---|---|---|
Name | Type | Description |
texts | Iterable[str] | Iterable of strings to add to the vector store. |
metadatas | Optional[List[dict]] | Optional list of metadatas associated with the texts. |
ids | Optional[List[str]] | Optional list of IDs for the texts. |
batch_size | int | The batch size for inserting data. Defaults to 5000. |

Returns | |
---|---|
Type | Description |
List[str] | List of IDs of the added texts. |
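The effect of `batch_size` can be pictured with a small chunking helper. This is a sketch of the general batching idea, not the library's actual insert logic; `in_batches` is a hypothetical name.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def in_batches(items: Iterable[T], batch_size: int = 5000) -> Iterator[List[T]]:
    """Yield consecutive lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly shorter, batch.
        yield batch
```

With 12,000 texts and the default of 5000, this yields batches of 5000, 5000, and 2000 items. A smaller `batch_size` may help if a single insert would otherwise be too large for one write.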
delete
delete(
ids: typing.Optional[typing.List[str]] = None,
documents: typing.Optional[
typing.List[langchain_core.documents.base.Document]
] = None,
**kwargs: typing.Any
) -> typing.Optional[bool]
Delete records from the vector store.
Parameters | | |
---|---|---|
Name | Type | Description |
ids | Optional[List[str]] | List of IDs to delete. |
documents | Optional[List[Document]] | List of documents to delete. |

Returns | |
---|---|
Type | Description |
Optional[bool] | True if deletion is successful, False otherwise, None if not implemented. |
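Because the return value has three states, callers should test for `None` explicitly rather than relying on truthiness. A minimal sketch, with hypothetical helper names and `store` assumed to be a `SpannerVectorStore` (or any object with a compatible `delete` method):

```python
from typing import Any, List, Optional


def describe_delete(result: Optional[bool]) -> str:
    """Translate the tri-state result of delete() into a label."""
    if result is None:
        return "not implemented"
    return "deleted" if result else "failed"


def delete_by_ids(store: Any, ids: List[str]) -> str:
    """Delete rows by ID and report the outcome as a label."""
    return describe_delete(store.delete(ids=ids))
```

Writing `if not store.delete(...)` would treat `None` and `False` the same way, which hides the difference between a failed deletion and an unsupported one.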
from_documents
from_documents(
documents: typing.List[langchain_core.documents.base.Document],
embedding: langchain_core.embeddings.embeddings.Embeddings,
instance_id: str,
database_id: str,
table_name: str,
id_column: str = "langchain_id",
content_column: str = "content",
embedding_column: str = "embedding",
ids: typing.Optional[typing.List[str]] = None,
client: typing.Optional[google.cloud.spanner_v1.client.Client] = None,
metadata_columns: typing.Optional[typing.List[str]] = None,
ignore_metadata_columns: typing.Optional[typing.List[str]] = None,
metadata_json_column: typing.Optional[str] = None,
query_parameter: langchain_google_spanner.vector_store.QueryParameters = <langchain_google_spanner.vector_store.QueryParameters object>,
**kwargs: typing.Any
) -> langchain_google_spanner.vector_store.SpannerVectorStore
Initialize SpannerVectorStore from a list of documents.
Parameters | | |
---|---|---|
Name | Type | Description |
documents | List[Document] | List of documents. |
embedding | Embeddings | The embedding service. |
instance_id | str | The ID of the Spanner instance. |
database_id | str | The ID of the Spanner database. |
table_name | str | The name of the table. |
id_column | str | The name of the row ID column. Defaults to ID_COLUMN_NAME ("langchain_id"). |
content_column | str | The name of the content column. Defaults to CONTENT_COLUMN_NAME ("content"). |
embedding_column | str | The name of the embedding column. Defaults to EMBEDDING_COLUMN_NAME ("embedding"). |
ids | Optional[List[str]] | Optional list of IDs for the documents. Defaults to None. |
client | Client | The Spanner client. Defaults to Client(). |
metadata_columns | Optional[List[str]] | List of metadata columns. Defaults to None. |
ignore_metadata_columns | Optional[List[str]] | List of metadata columns to ignore. Defaults to None. |
metadata_json_column | Optional[str] | The generic metadata column. Defaults to None. |
query_parameter | QueryParameters | The query parameters. Defaults to QueryParameters(). |

Returns | |
---|---|
Type | Description |
SpannerVectorStore | Initialized SpannerVectorStore instance. |
from_texts
from_texts(
texts: typing.List[str],
embedding: langchain_core.embeddings.embeddings.Embeddings,
instance_id: str,
database_id: str,
table_name: str,
metadatas: typing.Optional[typing.List[dict]] = None,
id_column: str = "langchain_id",
content_column: str = "content",
embedding_column: str = "embedding",
ids: typing.Optional[typing.List[str]] = None,
client: typing.Optional[google.cloud.spanner_v1.client.Client] = None,
metadata_columns: typing.Optional[typing.List[str]] = None,
ignore_metadata_columns: typing.Optional[typing.List[str]] = None,
metadata_json_column: typing.Optional[str] = None,
query_parameter: langchain_google_spanner.vector_store.QueryParameters = <langchain_google_spanner.vector_store.QueryParameters object>,
**kwargs: typing.Any
) -> langchain_google_spanner.vector_store.SpannerVectorStore
Initialize SpannerVectorStore from a list of texts.
Parameters | | |
---|---|---|
Name | Type | Description |
texts | List[str] | List of texts. |
embedding | Embeddings | The embedding service. |
instance_id | str | The ID of the Spanner instance. |
database_id | str | The ID of the Spanner database. |
table_name | str | The name of the table. |
metadatas | Optional[List[dict]] | Optional list of metadatas associated with the texts. Defaults to None. |
id_column | str | The name of the row ID column. Defaults to ID_COLUMN_NAME ("langchain_id"). |
content_column | str | The name of the content column. Defaults to CONTENT_COLUMN_NAME ("content"). |
embedding_column | str | The name of the embedding column. Defaults to EMBEDDING_COLUMN_NAME ("embedding"). |
ids | Optional[List[str]] | Optional list of IDs for the texts. Defaults to None. |
client | Client | The Spanner client. Defaults to Client(). |
metadata_columns | Optional[List[str]] | List of metadata columns. Defaults to None. |
ignore_metadata_columns | Optional[List[str]] | List of metadata columns to ignore. Defaults to None. |
metadata_json_column | Optional[str] | The generic metadata column. Defaults to None. |
query_parameter | QueryParameters | The query parameters. Defaults to QueryParameters(). |

Returns | |
---|---|
Type | Description |
SpannerVectorStore | Initialized SpannerVectorStore instance. |
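A sketch of calling `from_texts` through a small wrapper. The wrapper name `bootstrap_from_texts` is hypothetical, and the length check on `metadatas` is an assumption of this sketch rather than documented library behavior. `store_cls` is expected to be `SpannerVectorStore`; it is passed in so the sketch does not need the package at import time.

```python
from typing import Any, Dict, List, Optional


def bootstrap_from_texts(
    store_cls: Any,
    texts: List[str],
    embedding: Any,
    instance_id: str,
    database_id: str,
    table_name: str,
    metadatas: Optional[List[Dict[str, Any]]] = None,
) -> Any:
    """Build a store from texts, pairing each text with its metadata dict."""
    if metadatas is not None and len(metadatas) != len(texts):
        raise ValueError("metadatas must have one entry per text")
    return store_cls.from_texts(
        texts=texts,
        embedding=embedding,
        instance_id=instance_id,
        database_id=database_id,
        table_name=table_name,
        metadatas=metadatas,
    )
```

Note that, per the signatures on this page, the factory methods take `query_parameter` (singular) while the constructor takes `query_parameters`.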
init_vector_store_table
init_vector_store_table(
instance_id: str,
database_id: str,
table_name: str,
client: typing.Optional[google.cloud.spanner_v1.client.Client] = None,
id_column: typing.Union[
str, langchain_google_spanner.vector_store.TableColumn
] = "langchain_id",
content_column: str = "content",
embedding_column: str = "embedding",
metadata_columns: typing.Optional[
typing.List[langchain_google_spanner.vector_store.TableColumn]
] = None,
primary_key: typing.Optional[str] = None,
vector_size: typing.Optional[int] = None,
secondary_indexes: typing.Optional[
typing.List[langchain_google_spanner.vector_store.SecondaryIndex]
] = None,
) -> bool
Initialize a new vector store table in Google Cloud Spanner.
Parameters:
- instance_id (str): The ID of the Spanner instance.
- database_id (str): The ID of the Spanner database.
- table_name (str): The name of the table to initialize.
- client (Client): The Spanner client. Defaults to Client(project="span-cloud-testing").
- id_column (Union[str, TableColumn]): The name of the row ID column. Defaults to ID_COLUMN_NAME ("langchain_id").
- content_column (str): The name of the content column. Defaults to CONTENT_COLUMN_NAME ("content").
- embedding_column (str): The name of the embedding column. Defaults to EMBEDDING_COLUMN_NAME ("embedding").
- metadata_columns (Optional[List[TableColumn]]): List of TableColumn entries containing metadata column information. Defaults to None.
- primary_key (Optional[str]): The primary key of the table. Defaults to None.
- vector_size (Optional[int]): The size of the vector. Defaults to None.
- secondary_indexes (Optional[List[SecondaryIndex]]): Secondary indexes to create on the table. Defaults to None.
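One way to choose `vector_size` is to measure the embedding model's output length. The sketch below assumes that `vector_size` is meant to equal the embedding dimensionality, which this page does not state explicitly. `create_table_for` is a hypothetical helper, `store_cls` is expected to be `SpannerVectorStore`, and `embedding` any LangChain `Embeddings` implementation.

```python
from typing import Any


def create_table_for(
    store_cls: Any,
    embedding: Any,
    instance_id: str,
    database_id: str,
    table_name: str,
) -> bool:
    """Create the table with a vector size measured from the embedding model."""
    # Embed a throwaway string once to learn the vector length.
    vector_size = len(embedding.embed_query("dimension probe"))
    return store_cls.init_vector_store_table(
        instance_id=instance_id,
        database_id=database_id,
        table_name=table_name,
        vector_size=vector_size,
    )
```

Measuring the size avoids hard-coding a dimension that would silently go stale if the embedding model were swapped.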
max_marginal_relevance_search
max_marginal_relevance_search(
query: str,
k: int = 4,
fetch_k: int = 20,
lambda_mult: float = 0.5,
pre_filter: typing.Optional[str] = None,
**kwargs: typing.Any
) -> typing.List[langchain_core.documents.base.Document]
Return docs selected using the maximal marginal relevance.
Maximal marginal relevance optimizes for similarity to query AND diversity among selected documents.
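The trade-off between similarity and diversity can be sketched in plain Python. This is a generic illustration of maximal marginal relevance with cosine similarity, not the library's implementation; `mmr_select` is a hypothetical name, and `candidates` plays the role of the `fetch_k` nearest neighbors from which `k` results are picked.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(
    query: Sequence[float],
    candidates: Sequence[Sequence[float]],
    k: int = 4,
    lambda_mult: float = 0.5,
) -> List[int]:
    """Return indices of up to k candidates, chosen greedily by MMR score."""
    selected: List[int] = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:

        def score(i: int) -> float:
            relevance = cosine(query, candidates[i])
            # Penalize closeness to anything already selected.
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With `lambda_mult=1.0` the selection reduces to plain similarity ranking; lower values increasingly favor candidates that differ from those already chosen.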
max_marginal_relevance_search_by_vector
max_marginal_relevance_search_by_vector(
embedding: typing.List[float],
k: int = 4,
fetch_k: int = 20,
lambda_mult: float = 0.5,
pre_filter: typing.Optional[str] = None,
**kwargs: typing.Any
) -> typing.List[langchain_core.documents.base.Document]
Return docs selected using the maximal marginal relevance.
Maximal marginal relevance optimizes for similarity to query AND diversity among selected documents.
max_marginal_relevance_search_with_score_by_vector
max_marginal_relevance_search_with_score_by_vector(
embedding: typing.List[float],
k: int = 4,
fetch_k: int = 20,
lambda_mult: float = 0.5,
pre_filter: typing.Optional[str] = None,
) -> typing.List[typing.Tuple[langchain_core.documents.base.Document, float]]
Return docs and their similarity scores selected using the maximal marginal relevance.
Maximal marginal relevance optimizes for similarity to query AND diversity among selected documents.
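A usage sketch for the scored variant, unpacking the `(Document, float)` tuples. `diverse_hits` is a hypothetical helper and `store` is assumed to be a `SpannerVectorStore`; the guard that `fetch_k` is at least `k` is an assumption of this sketch, based on `k` results being picked from `fetch_k` candidates.

```python
from typing import Any, List, Tuple


def diverse_hits(
    store: Any,
    embedding: List[float],
    k: int = 4,
    fetch_k: int = 20,
    lambda_mult: float = 0.5,
) -> List[Tuple[str, float]]:
    """Run an MMR search by vector and return (text, score) pairs."""
    if fetch_k < k:
        raise ValueError("fetch_k should be at least k")
    results = store.max_marginal_relevance_search_with_score_by_vector(
        embedding, k=k, fetch_k=fetch_k, lambda_mult=lambda_mult
    )
    return [(doc.page_content, score) for doc, score in results]
```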
similarity_search
similarity_search(
query: str,
k: int = 4,
pre_filter: typing.Optional[str] = None,
**kwargs: typing.Any
) -> typing.List[langchain_core.documents.base.Document]
Perform similarity search for a given query.
Parameters | | |
---|---|---|
Name | Type | Description |
query | str | The query string. |
k | int | The number of nearest neighbors to retrieve. Defaults to 4. |
pre_filter | Optional[str] | Pre-filter condition for the query. Defaults to None. |

Returns | |
---|---|
Type | Description |
List[Document] | List of documents most similar to the query. |
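A usage sketch for `similarity_search`. The helper name `search_contents` is hypothetical and `store` is assumed to be a `SpannerVectorStore`. The exact syntax accepted by `pre_filter` is not specified on this page; a call such as `pre_filter="year > 2020"` assumes a SQL-style condition over a hypothetical `year` column.

```python
from typing import Any, List, Optional


def search_contents(
    store: Any,
    query: str,
    k: int = 4,
    pre_filter: Optional[str] = None,
) -> List[str]:
    """Return the text of the k documents most similar to the query."""
    docs = store.similarity_search(query, k=k, pre_filter=pre_filter)
    return [doc.page_content for doc in docs]
```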
similarity_search_by_vector
similarity_search_by_vector(
embedding: typing.List[float],
k: int = 4,
pre_filter: typing.Optional[str] = None,
**kwargs: typing.Any
) -> typing.List[langchain_core.documents.base.Document]
Perform similarity search by vector.
Parameters | | |
---|---|---|
Name | Type | Description |
embedding | List[float] | The embedding vector. |
k | int | The number of nearest neighbors to retrieve. Defaults to 4. |
pre_filter | Optional[str] | Pre-filter condition for the query. Defaults to None. |

Returns | |
---|---|
Type | Description |
List[Document] | List of documents most similar to the query. |
similarity_search_with_score
similarity_search_with_score(
query: str,
k: int = 4,
pre_filter: typing.Optional[str] = None,
**kwargs: typing.Any
) -> typing.List[typing.Tuple[langchain_core.documents.base.Document, float]]
Perform similarity search for a given query with scores.
Parameters | | |
---|---|---|
Name | Type | Description |
query | str | The query string. |
k | int | The number of nearest neighbors to retrieve. Defaults to 4. |
pre_filter | Optional[str] | Pre-filter condition for the query. Defaults to None. |

Returns | |
---|---|
Type | Description |
List[Tuple[Document, float]] | List of tuples containing Document and similarity score. |
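A sketch of consuming the `(Document, float)` tuples. `score_report` is a hypothetical helper and `store` is assumed to be a `SpannerVectorStore`. The results are kept in the order returned, since this page does not say whether a higher or lower score means a closer match.

```python
from typing import Any, List


def score_report(store: Any, query: str, k: int = 4) -> List[str]:
    """Format each hit as '<score>\\t<text>', preserving the returned order."""
    results = store.similarity_search_with_score(query, k=k)
    return [f"{score:.3f}\t{doc.page_content}" for doc, score in results]
```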
similarity_search_with_score_by_vector
similarity_search_with_score_by_vector(
embedding: typing.List[float],
k: int = 4,
pre_filter: typing.Optional[str] = None,
**kwargs: typing.Any
) -> typing.List[typing.Tuple[langchain_core.documents.base.Document, float]]
Perform similarity search by vector with scores.
Parameters | | |
---|---|---|
Name | Type | Description |
embedding | List[float] | The embedding vector. |
k | int | The number of nearest neighbors to retrieve. Defaults to 4. |
pre_filter | Optional[str] | Pre-filter condition for the query. Defaults to None. |

Returns | |
---|---|
Type | Description |
List[Tuple[Document, float]] | List of tuples containing Document and similarity score. |
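The by-vector variant is useful when the query embedding is computed separately, for example to reuse one vector across several searches. A minimal sketch, assuming `embeddings` is a LangChain `Embeddings` implementation (which provides `embed_query`) and `store` is a `SpannerVectorStore`; `search_with_precomputed_vector` is a hypothetical helper name.

```python
from typing import Any, List, Tuple


def search_with_precomputed_vector(
    store: Any,
    embeddings: Any,
    text: str,
    k: int = 4,
) -> List[Tuple[Any, float]]:
    """Embed the text once, then search the store with the resulting vector."""
    vector = embeddings.embed_query(text)
    return store.similarity_search_with_score_by_vector(vector, k=k)
```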