Data Models

This section documents the key data models used throughout the SemaDbAPI.jl package, followed by the API methods that operate on them.

SemaDbAPI.BinaryQuantizerParameters - Type

BinaryQuantizerParameters

Converts vectors to boolean values. Works effectively only if the vector features are already binary or are normally distributed.

BinaryQuantizerParameters(;
    threshold=nothing,
    triggerThreshold=10000,
    distanceMetric=nothing,
)

- threshold::Float64 : Optional initial threshold for binary quantization. If not provided, it is calculated once the trigger threshold is reached.
- triggerThreshold::Float64 : Optional trigger threshold for binary quantization (default 10000).
- distanceMetric::String : The distance metric to use for binary quantization after the vectors are encoded.
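
To make the idea concrete, here is a minimal sketch of threshold-based binary quantization in plain Julia. It does not use SemaDbAPI.jl and does not claim to reproduce SemaDB's implementation. The helpers `binarize` and `hamming`, the sample vectors, the fixed threshold, and the use of Hamming distance to compare encoded vectors are all assumptions made for illustration.

```julia
# Illustrative only: these helpers are hypothetical and not part of SemaDbAPI.jl.

# Encode a vector as bits: a feature becomes true when it exceeds the threshold.
binarize(v::AbstractVector{<:Real}, threshold::Real) = v .> threshold

# Hamming distance: the number of positions where two bit vectors differ.
hamming(a::AbstractVector{Bool}, b::AbstractVector{Bool}) = count(a .!= b)

threshold = 0.0  # assumed fixed threshold, playing the role of `threshold`
points = [[0.9, -0.2, 0.4], [-0.7, 0.1, 0.5], [0.8, -0.3, -0.6]]
encoded = [binarize(p, threshold) for p in points]

println(encoded[1])                       # bits for the first point
println(hamming(encoded[1], encoded[2]))  # differing bits between points 1 and 2
```

Each feature collapses to a single bit, which is why the description above notes that the approach suits features that are already binary or roughly symmetric around the threshold.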

SemaDbAPI.CreateCollectionRequest - Type

CreateCollectionRequest

CreateCollectionRequest(;
    id=nothing,
    indexSchema=nothing,
)

- id::String : The unique identifier of the collection
- indexSchema::Dict{String, IndexSchemaValue} : The schema for the collection. Each property can be indexed with a different type of index.

SemaDbAPI.GetCollectionResponse - Type

GetCollectionResponse

GetCollectionResponse(;
    id=nothing,
    indexSchema=nothing,
    shards=nothing,
)

- id::String : The unique identifier of the collection
- indexSchema::Dict{String, IndexSchemaValue} : The schema for the collection. Each property can be indexed with a different type of index.
- shards::Vector{GetCollectionResponseShardsInner}

SemaDbAPI.IndexSchemaValue - Type

IndexSchemaValue

Defines what the property is and how it is indexed.

IndexSchemaValue(;
    type=nothing,
    vectorFlat=nothing,
    vectorVamana=nothing,
    text=nothing,
    string=nothing,
    stringArray=nothing,
)

- type::String
- vectorFlat::IndexVectorFlatParameters
- vectorVamana::IndexVectorVamanaParameters
- text::IndexTextParameters
- string::IndexStringParameters
- stringArray::IndexStringParameters

SemaDbAPI.IndexStringParameters - Type

IndexStringParameters

Parameters for string indexing.

IndexStringParameters(;
    caseSensitive=false,
)

- caseSensitive::Bool : Whether the string is case sensitive

SemaDbAPI.IndexVectorFlatParameters - Type

IndexVectorFlatParameters

Parameters for flat indexing. Flat indexing is the simplest form of indexing, where the search is exhaustive.

IndexVectorFlatParameters(;
    vectorSize=nothing,
    distanceMetric=nothing,
    quantizer=nothing,
)

- vectorSize::Float64 : The size of the vectors in the collection
- distanceMetric::DistanceMetric
- quantizer::Quantizer

SemaDbAPI.IndexVectorVamanaParameters - Type

IndexVectorVamanaParameters

Parameters for Vamana indexing.

IndexVectorVamanaParameters(;
    vectorSize=nothing,
    distanceMetric=nothing,
    searchSize=75,
    degreeBound=64,
    alpha=1.2,
    quantizer=nothing,
)

- vectorSize::Float64 : The size of the vectors in the collection
- distanceMetric::DistanceMetric
- searchSize::Float64 : Determines the scope of the greedy search algorithm. The higher the value, the more exhaustive the search.
- degreeBound::Float64 : Maximum number of edges of a node in the graph. The higher the value, the denser the graph becomes, which makes the search slower but more accurate.
- alpha::Float64 : Determines how aggressive the edge pruning is. Higher values reduce pruning, lower values make it more aggressive.
- quantizer::Quantizer
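
The interplay of `alpha` and `degreeBound` can be illustrated with a sketch of the pruning step as it is described in the DiskANN paper that introduced Vamana. This is plain Julia written for illustration; whether SemaDB follows this rule exactly is an assumption, and `robust_prune` and the sample points are hypothetical names and data.

```julia
using LinearAlgebra: norm

# Sketch of an alpha-based pruning step (after the Vamana/DiskANN paper).
# Not SemaDB's code: the function name and signature are hypothetical.
function robust_prune(p, candidates, alpha, degree_bound)
    dist(a, b) = norm(a - b)
    # Visit candidates from nearest to farthest relative to p.
    remaining = sort(candidates; by = c -> dist(p, c))
    kept = eltype(candidates)[]
    while !isempty(remaining) && length(kept) < degree_bound
        best = popfirst!(remaining)
        push!(kept, best)
        # Drop candidates that `best` already covers; a larger alpha keeps more of them.
        filter!(c -> alpha * dist(best, c) > dist(p, c), remaining)
    end
    return kept
end

p = [0.0, 0.0]
candidates = [[1.0, 0.0], [2.0, 0.0], [0.0, 3.0]]
println(length(robust_prune(p, candidates, 1.0, 64)))  # edges kept with alpha = 1.0
println(length(robust_prune(p, candidates, 2.5, 64)))  # edges kept with alpha = 2.5
```

With the smaller `alpha`, the point `[2.0, 0.0]` is pruned because `[1.0, 0.0]` already lies in the same direction. The larger `alpha` keeps it, and `degree_bound` caps the number of edges regardless.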

SemaDbAPI.InlineObject - Type

inline_object

InlineObject(;
    message=nothing,
)

- message::String : A message indicating the result of the operation

SemaDbAPI.InlineObject1 - Type

inline_object_1

InlineObject1(;
    error=nothing,
)

- error::String : An error message hopefully describing the problem

SemaDbAPI.InsertPointsResponse - Type

InsertPointsResponse

InsertPointsResponse(;
    message=nothing,
    failedRanges=nothing,
)

- message::String : A message indicating the result of the operation
- failedRanges::Vector{InsertPointsResponseFailedRangesInner} : A list of ranges of points that failed to insert. Each range has a start and an end index. The end index is exclusive. For example, if the range is [0, 2), the first two points failed to insert.
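
Because the end index is exclusive and the example above counts from 0, mapping a failed range onto a Julia vector of points needs a small index shift. The sketch below assumes zero-based start indices, as the example suggests. `FailedRange` is a hypothetical stand-in for `InsertPointsResponseFailedRangesInner` that keeps only the two index fields, and `julia_indices` and `points_to_retry` are illustrative helpers, not part of the package.

```julia
# Hypothetical stand-in for InsertPointsResponseFailedRangesInner (index fields only).
struct FailedRange
    start::Int
    stop::Int   # corresponds to the `end` field; exclusive
end

# Convert a zero-based, end-exclusive range to Julia's one-based, inclusive indices.
julia_indices(r::FailedRange) = (r.start + 1):r.stop

# Collect the points that fall inside any failed range, for example to retry them.
function points_to_retry(points::AbstractVector, ranges)
    idx = reduce(vcat, [collect(julia_indices(r)) for r in ranges]; init = Int[])
    return points[sort!(unique!(idx))]
end

points = ["p1", "p2", "p3", "p4", "p5"]
ranges = [FailedRange(0, 2), FailedRange(4, 5)]
println(points_to_retry(points, ranges))
```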

SemaDbAPI.InsertPointsResponseFailedRangesInner - Type

InsertPointsResponse_failedRanges_inner

InsertPointsResponseFailedRangesInner(;
    shardId=nothing,
    start=nothing,
    var"end"=nothing,
    error=nothing,
)

- shardId::String
- start::Int64
- var"end"::Int64
- error::String

SemaDbAPI.ProductQuantizerParameters - Type

ProductQuantizerParameters

Uses product quantization to reduce the memory footprint of the vectors. It may be slower and less accurate.

ProductQuantizerParameters(;
    numCentroids=256,
    numSubVectors=nothing,
    triggerThreshold=10000,
)

- numCentroids::Float64 : Number of centroids to quantize to. This is the k* parameter in the paper and is often set to 255, giving 256 centroids (including 0). It is limited to a maximum of 256 (uint8) to keep the overhead of this process tractable.
- numSubVectors::Float64 : Number of subvectors / segments / subquantizers to use. This is the m parameter in the paper and is often set to 8.
- triggerThreshold::Float64 : The trigger threshold is the number of points in the collection that will trigger the quantization process. This is to ensure that the quantization process is only triggered when the collection is large enough to benefit from the memory savings.
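
The memory saving can be estimated with simple arithmetic. The sketch below assumes that raw vector components occupy 4 bytes each (Float32), which is an assumption rather than something stated here, and it ignores the storage needed for the centroid codebooks. The function names and the vector size of 768 are hypothetical.

```julia
# Illustrative arithmetic only; the 4-byte component size is an assumption.
raw_bytes(vector_size::Integer; bytes_per_value::Integer = 4) = vector_size * bytes_per_value

# With at most 256 centroids, each subvector is replaced by one UInt8 code.
quantized_bytes(num_sub_vectors::Integer) = num_sub_vectors * sizeof(UInt8)

vector_size = 768       # hypothetical embedding size
num_sub_vectors = 8     # the value the text says is often used
ratio = raw_bytes(vector_size) / quantized_bytes(num_sub_vectors)

println("raw: ", raw_bytes(vector_size), " B, quantized: ",
        quantized_bytes(num_sub_vectors), " B, ratio: ", ratio)
```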

SemaDbAPI.Quantizer - Type

Quantizer

The quantizer applied to the vectors, if any.

Quantizer(;
    type=nothing,
    binary=nothing,
    product=nothing,
)

- type::String
- binary::BinaryQuantizerParameters
- product::ProductQuantizerParameters

SemaDbAPI.Query - Type

Query

A query object used to perform a search. It can contain multiple filters, each with a property and a value. Use _and and _or to combine queries.

Query(;
    property=nothing,
    vectorFlat=nothing,
    vectorVamana=nothing,
    text=nothing,
    string=nothing,
    integer=nothing,
    float=nothing,
    stringArray=nothing,
    _and=nothing,
    _or=nothing,
)

- property::String
- vectorFlat::SearchVectorFlatOptions
- vectorVamana::SearchVectorVamanaOptions
- text::SearchTextOptions
- string::SearchStringOptions
- integer::SearchNumberOptions
- float::SearchNumberOptions
- stringArray::SearchStringArrayOptions
- _and::Vector{Query}
- _or::Vector{Query}

SemaDbAPI.SearchNumberOptions - Type

SearchNumberOptions

Options for searching numbers. The operator determines how the search is performed. The value is a number to search for; endValue is used for range queries.

SearchNumberOptions(;
    value=nothing,
    operator=nothing,
    endValue=nothing,
)

- value::Float64
- operator::String
- endValue::Float64

SemaDbAPI.SearchRequest - Type

SearchRequest

SearchRequest(;
    query=nothing,
    select=nothing,
    sort=nothing,
    offset=0,
    limit=10,
)

- query::Query
- select::Vector{String} : A list of properties to return in the search results. If not provided, all properties are returned.
- sort::Vector{SortOption} : A list of sort options for the search results. The search results are sorted by the first sort option, then the second, and so on.
- offset::Int64 : The number of points to skip in the search results
- limit::Int64 : Maximum number of points to return
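
The combined effect of `sort`, `offset` and `limit` can be mimicked locally. The following sketch works on an in-memory vector of named tuples and is only meant to illustrate the semantics described above, not how the server implements them. `SortKey` is a hypothetical stand-in for `SortOption`, and `sort_and_page` is an illustrative helper.

```julia
# Hypothetical local stand-in for SortOption.
struct SortKey
    property::Symbol
    descending::Bool
end

# Sort by the first key, then the second, and so on, then apply offset and limit.
function sort_and_page(points, sort_keys; offset = 0, limit = 10)
    sorted = collect(points)
    # Apply stable sorts from the last key to the first so the first key dominates.
    for k in reverse(sort_keys)
        sort!(sorted; by = p -> getproperty(p, k.property), rev = k.descending, alg = MergeSort)
    end
    last_idx = min(offset + limit, length(sorted))
    return sorted[(offset + 1):last_idx]
end

points = [(name = "a", year = 2020), (name = "b", year = 2021), (name = "c", year = 2021)]
sort_keys = [SortKey(:year, true), SortKey(:name, false)]
page = sort_and_page(points, sort_keys; offset = 1, limit = 2)
println([p.name for p in page])
```

Skipping `offset` points and then taking at most `limit` gives a simple paging scheme: page n of size `limit` starts at offset `(n - 1) * limit`.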

SemaDbAPI.SearchStringArrayOptions - Type

SearchStringArrayOptions

Options for searching string arrays. The operator determines how the search is performed. The value is an array of strings to search for.

SearchStringArrayOptions(;
    value=nothing,
    operator=nothing,
)

- value::Vector{String}
- operator::String

SemaDbAPI.SearchStringOptions - Type

SearchStringOptions

Options for searching strings. The operator determines how the search is performed. The value is a string to search for.

SearchStringOptions(;
    value=nothing,
    operator=nothing,
    endValue=nothing,
)

- value::String
- operator::String
- endValue::String

SemaDbAPI.SearchTextOptions - Type

SearchTextOptions

Text search options. The value is the text to search for. The weight determines the hybrid search weighting.

SearchTextOptions(;
    value=nothing,
    operator=nothing,
    limit=10,
    filter=nothing,
    weight=1,
)

- value::String
- operator::String
- limit::Float64 : Maximum number of points to search
- filter::Query
- weight::Float64 : The weight of the text search, the higher the value, the more important the text search is.

SemaDbAPI.SearchVectorFlatOptions - Type

SearchVectorFlatOptions

Options for searching vectors with flat indexing.

SearchVectorFlatOptions(;
    vector=nothing,
    operator=nothing,
    limit=10,
    filter=nothing,
    weight=1,
)

- vector::Vector{Float64} : A vector with a fixed number of dimensions
- operator::String
- limit::Float64 : Maximum number of points to search
- filter::Query
- weight::Float64 : The weight of the vector search, the higher the value, the more important the vector search is.

SemaDbAPI.SearchVectorVamanaOptions - Type

SearchVectorVamanaOptions

Options for searching vectors with Vamana indexing. The larger the search size, the longer the search will take.

SearchVectorVamanaOptions(;
    vector=nothing,
    operator=nothing,
    searchSize=75,
    limit=10,
    filter=nothing,
    weight=1,
)

- vector::Vector{Int64} : A vector with a fixed number of dimensions
- operator::String
- searchSize::Float64 : Determines the scope of the greedy search algorithm. The higher the value, the more exhaustive the search.
- limit::Float64 : Maximum number of points to search
- filter::Query
- weight::Float64 : The weight of the vector search, the higher the value, the more important the vector search is.

SemaDbAPI.SortOption - Type

SortOption

Sort options for search results. property is the property to sort by and descending selects the direction to sort in.

SortOption(;
    property=nothing,
    descending=false,
)

- property::String
- descending::Bool

SemaDbAPI.UpdatePointsResponse - Type

UpdatePointsResponse

UpdatePointsResponse(;
    message=nothing,
    failedPoints=nothing,
)

- message::String : A message indicating the result of the operation
- failedPoints::Vector{FailedPointsInner} : A list of points that failed to update. Each point has an id and an error message. For example, if the error is not found, the point does not exist in the collection.

SemaDbAPI.create_collection - Method

Create a new collection

Creates a new collection if it does not already exist. The maximum number of collections per user is restricted based on the plan. Before you can insert and search points, you must create a collection.

Params:

  • create_collection_request::CreateCollectionRequest (required)

Return: InlineObject, OpenAPI.Clients.ApiResponse

SemaDbAPI.delete_collection - Method

Delete a collection

Deletes a collection and all of its points. This operation is irreversible. If you want to delete only some points, use the bulk delete endpoint. If some shards are temporarily unavailable, the operation will still succeed, but some of the data will be deleted in the future.

Params:

  • collection_id::String (required)

Return: InlineObject, OpenAPI.Clients.ApiResponse

SemaDbAPI.delete_point - Method

Delete points by id

Bulk delete points based on id. This endpoint does not check if the points exist. If you attempt to delete a point that does not exist, it will be ignored and included in the failedPoints list.

Params:

  • collection_id::String (required)
  • delete_points_request::DeletePointsRequest (required)

Return: DeletePointsResponse, OpenAPI.Clients.ApiResponse

SemaDbAPI.get_collection - Method

Get the details of a collection

This endpoint attempts to also list the shards currently available in the collection. Some shards may be temporarily unavailable. In that case, you can retry at a future time.

Params:

  • collection_id::String (required)

Return: GetCollectionResponse, OpenAPI.Clients.ApiResponse

SemaDbAPI.insert_point - Method

Insert new points into the collection

This endpoint assumes all points to be inserted are new points and does not check for duplication. To ensure consistency of the database, it is important that you do not insert duplicate points. If you are unsure whether a point exists, you can leave the id field blank and the database will assign a new id. For cosine distance, you must normalise the vectors prior to inserting them.

Params:

  • collection_id::String (required)
  • insert_point_request::InsertPointsRequest (required)

Return: InsertPointsResponse, OpenAPI.Clients.ApiResponse
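
Since vectors must be normalised before insertion when cosine distance is used, a small helper along the following lines may be useful. It is a sketch in plain Julia using only the standard library; `to_unit` is a hypothetical name and not part of SemaDbAPI.jl.

```julia
using LinearAlgebra: norm

# Hypothetical helper: scale a vector to unit length, rejecting zero vectors.
function to_unit(v::AbstractVector{<:Real})
    n = norm(v)
    n > 0 || throw(ArgumentError("cannot normalise a zero vector"))
    return v ./ n
end

unit = to_unit([3.0, 4.0])
println(unit)
println(norm(unit))
```

Rejecting zero vectors explicitly avoids silently producing NaN components, which would otherwise end up in the collection.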

SemaDbAPI.list_collections - Method

List user collections

Returns a list of all collections for the current user. The list is not sorted by any value and the order may change between requests.

Params: none

Return: ListCollectionResponse, OpenAPI.Clients.ApiResponse

SemaDbAPI.search_point - Method

Fast index based search

This endpoint allows searching for points in a collection using the index. The search is based on the index schema of the collection.

Params:

  • collection_id::String (required)
  • search_request::SearchRequest (required)

Return: SearchPointsResponse, OpenAPI.Clients.ApiResponse

SemaDbAPI.update_point - Method

Update existing points with new data

This endpoint allows updating point vectors and metadata. It does not allow updating the point id. If you want to update the id, you must delete the point and insert a new point. The points are required to exist before you can update them. You can check the failedPoints to see which points failed to update and potentially why.

Params:

  • collection_id::String (required)
  • update_points_request::UpdatePointsRequest (required)

Return: UpdatePointsResponse, OpenAPI.Clients.ApiResponse
