Similarity API

This API supports advanced search capabilities that allow clients to find similar documents based on the input.

Clients calling the API must include an authentication header with a valid access token.

Authorization: Bearer {ACCESS_TOKEN}
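As a rough illustration of attaching that header, the sketch below builds a GraphQL POST request with Python's standard library without sending it. The endpoint URL and the `build_request` helper are hypothetical, and the JSON body shape (`{"query": ...}`) assumes the service follows the usual GraphQL-over-HTTP convention.

```python
import json
import urllib.request

# Hypothetical endpoint; substitute the GraphQL URL for your own environment.
ENDPOINT = "https://example.com/graphql"


def build_request(query: str, access_token: str, endpoint: str = ENDPOINT) -> urllib.request.Request:
    """Build (but do not send) a GraphQL POST request carrying the bearer token."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# "{ __typename }" is a placeholder query; pass a similarity query in practice.
request = build_request("{ __typename }", "ACCESS_TOKEN")
# urllib.request.urlopen(request) would send it; omitted here to avoid a live call.
```

Keeping request construction separate from sending makes the header logic easy to check without network access.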

The similarity API lets the client find the nearest documents based on a piece of text and a target field.

query FindNearestDocument {
  similarity(
    input: {
      nearest: { text: "eats carrots and is an animal", field: "post_content" }
    }
  ) {
    total
    docs {
      id
      score
      data
    }
  }
}

Sample response:

{
  "data": {
    "similarity": {
      "total": 2,
      "docs": [
        {
          "id": "rabbit:6",
          "score": 2.5204816,
          "data": {
            "ID": 1,
            "post_content": "Rabbit is happily running",
            "post_date": "11-06-2023T12:33:00",
            "post_status": "publish",
            "post_title": "George the rabbit",
            "post_type": "rabbit"
          }
        },
        {
          "id": "horse:7",
          "score": 2.2204816,
          "data": {
            "ID": 1,
            "post_content": "Horse in a field",
            "post_date": "11-06-2023T12:33:00",
            "post_status": "publish",
            "post_title": "Greg the horse",
            "post_type": "horse"
          }
        }
      ]
    }
  }
}
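A client will usually want just the identifiers and scores out of such a response. The following is a minimal sketch, assuming the response has already been decoded into a Python dictionary with the shape shown above; `ranked_docs` is a hypothetical helper name.

```python
from typing import Any


def ranked_docs(response: dict[str, Any]) -> list[tuple[str, float]]:
    """Return (id, score) pairs from a similarity response, highest score first."""
    docs = response["data"]["similarity"]["docs"]
    pairs = ((doc["id"], doc["score"]) for doc in docs)
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Trimmed copy of the sample response, deliberately listed out of order.
sample = {
    "data": {
        "similarity": {
            "docs": [
                {"id": "horse:7", "score": 2.2204816, "data": {"post_title": "Greg the horse"}},
                {"id": "rabbit:6", "score": 2.5204816, "data": {"post_title": "George the rabbit"}},
            ]
        }
    }
}

print(ranked_docs(sample))  # [('rabbit:6', 2.5204816), ('horse:7', 2.2204816)]
```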

Users can search up to three fields instead of just one and apply a boost to each, provided the fields are indexed with the inference model.

Using a single field is still supported, but not recommended. The API accepts either field or fields, but not both in the same request.

You can use the boosting query to demote certain documents without excluding them from the search results.

  • A boost from 0 to 0.9 decreases the relevance score.
  • A boost of 1.0 does not affect the score; it is the default and can be omitted.
  • A boost from 1.1 to 10.0 increases the relevance score.

At least one field is required. Within each entry, name is a required key and boost is optional.
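These rules can be checked on the client before a request is sent. The sketch below is one way to do so; `validate_nearest` is a hypothetical helper, and treating 0 and 10.0 as hard limits on boost is an assumption inferred from the ranges listed above.

```python
def validate_nearest(nearest: dict) -> list[str]:
    """Return a list of problems with a `nearest` input; an empty list means it looks valid."""
    problems = []
    has_field = "field" in nearest
    has_fields = "fields" in nearest
    if has_field and has_fields:
        problems.append("use either 'field' or 'fields', not both")
    if not has_field and not has_fields:
        problems.append("at least one field is required")
    fields = nearest.get("fields", [])
    if has_fields and not 1 <= len(fields) <= 3:
        problems.append("'fields' must contain between one and three entries")
    for entry in fields:
        if "name" not in entry:
            problems.append("every entry in 'fields' needs a 'name'")
        boost = entry.get("boost", 1.0)  # 1.0 is the documented default
        # Assumption: 0 and 10.0 are hard limits, inferred from the documented ranges.
        if not 0 <= boost <= 10.0:
            problems.append(f"boost {boost} is outside 0 to 10.0")
    return problems


print(validate_nearest({"text": "eats carrots", "fields": [{"name": "post_name", "boost": 0.1}]}))  # []
```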

query FindNearestDocumentWithMultipleFields {
  similarity(
    input: {
      nearest: {
        text: "eats carrots and is an animal"
        fields: [{ name: "post_name" }]
      }
    }
  ) {
    total
    docs {
      id
      score
      data
    }
  }
}

query FindNearestDocumentWithMultipleFieldWithBoost {
  similarity(
    input: {
      nearest: {
        text: "eats carrots and is an animal"
        fields: [
          { name: "post_name", boost: 0.1 }
          { name: "post_content", boost: 10 }
          { name: "post_title", boost: 9 }
        ]
      }
    }
  ) {
    total
    docs {
      id
      score
      data
    }
  }
}

Here’s a sample response showing how the scores change when boosting is applied to the same FindNearestDocument query:

{
  "data": {
    "similarity": {
      "total": 6,
      "docs": [
        {
          "id": "rabbit:6",
          "score": 19.557358,
          "data": {
            "ID": 6,
            "post_content": "",
            "post_date": "2025-06-13T12:44:46",
            "post_date_gmt": "2025-06-13T12:44:46",
            "post_excerpt": "",
            "post_modified": "2025-06-13T12:44:46",
            "post_modified_gmt": "2025-06-13T12:44:46",
            "post_name": "rabbit-1",
            "post_status": "publish",
            "post_title": "Rabbit-1",
            "post_type": "rabbit",
            "post_url": "http://localhost:8000/rabbit/rabbit-1"
          }
        },
        {
          "id": "rabbit:7",
          "score": 14.139838,
          "data": {
            "ID": 7,
            "post_content": "",
            "post_date": "2025-06-13T12:44:47",
            "post_date_gmt": "2025-06-13T12:44:47",
            "post_excerpt": "",
            "post_modified": "2025-06-13T12:44:47",
            "post_modified_gmt": "2025-06-13T12:44:47",
            "post_name": "rabbit-2",
            "post_status": "publish",
            "post_title": "Rabbit-2",
            "post_type": "rabbit",
            "post_url": "http://localhost:8000/rabbit/rabbit-2"
          }
        },
        {
          "id": "page:2",
          "score": 3.59585,
          "data": {
            "ID": 2,
            "author": {
              "user_nicename": "admin"
            },
            "post_content": "This is an example page. It’s different from a blog post because it will stay in one place and will show up in your site navigation (in most themes). Most people start with an About page that introduces them to potential site visitors. It might say something like this:\n\n\n\nHi there! I’m a bike messenger by day, aspiring actor by night, and this is my website. I live in Los Angeles, have a great dog named Jack, and I like piña coladas. (And gettin’ caught in the rain.)\n\n\n\n…or something like this:\n\n\n\nThe XYZ Doohickey Company was founded in 1971, and has been providing quality doohickeys to the public ever since. Located in Gotham City, XYZ employs over 2,000 people and does all kinds of awesome things for the Gotham community.\n\n\n\nAs a new WordPress user, you should go to your dashboard to delete this page and create new pages for your content. Have fun!",
            "post_date": "2025-06-13T12:44:24",
            "post_date_gmt": "2025-06-13T12:44:24",
            "post_excerpt": "",
            "post_modified": "2025-06-13T12:44:24",
            "post_modified_gmt": "2025-06-13T12:44:24",
            "post_name": "sample-page",
            "post_status": "publish",
            "post_title": "Sample Page",
            "post_type": "page",
            "post_url": "http://localhost:8000/sample-page"
          }
        },
        {
          "id": "zombie:4",
          "score": 0.9963488,
          "data": {
            "ID": 4,
            "post_content": "",
            "post_date": "2025-06-13T12:44:45",
            "post_date_gmt": "2025-06-13T12:44:45",
            "post_excerpt": "",
            "post_modified": "2025-06-13T12:44:45",
            "post_modified_gmt": "2025-06-13T12:44:45",
            "post_name": "zombie-1",
            "post_status": "publish",
            "post_title": "Zombie-1",
            "post_type": "zombie",
            "post_url": "http://localhost:8000/zombie/zombie-1"
          }
        },
        {
          "id": "post:1",
          "score": 0.646362,
          "data": {
            "ID": 1,
            "author": {
              "user_nicename": "admin"
            },
            "categories": [
              {
                "name": "Uncategorized",
                "slug": "uncategorized",
                "term_id": 1,
                "term_taxonomy_id": 1
              }
            ],
            "myCustomField": "my custom field value",
            "post_content": "Welcome to WordPress. This is your first post. Edit or delete it, then start writing!",
            "post_date": "2025-06-13T12:44:24",
            "post_date_gmt": "2025-06-13T12:44:24",
            "post_excerpt": "",
            "post_modified": "2025-06-13T12:44:24",
            "post_modified_gmt": "2025-06-13T12:44:24",
            "post_name": "hello-world",
            "post_status": "publish",
            "post_title": "Hello world!",
            "post_type": "post",
            "post_url": "http://localhost:8000/hello-world"
          }
        },
        {
          "id": "zombie:5",
          "score": 0.48280594,
          "data": {
            "ID": 5,
            "post_content": "",
            "post_date": "2025-06-13T12:44:46",
            "post_date_gmt": "2025-06-13T12:44:46",
            "post_excerpt": "",
            "post_modified": "2025-06-13T12:44:46",
            "post_modified_gmt": "2025-06-13T12:44:46",
            "post_name": "zombie-2",
            "post_status": "publish",
            "post_title": "Zombie-2",
            "post_type": "zombie",
            "post_url": "http://localhost:8000/zombie/zombie-2"
          }
        }
      ]
    }
  }
}

The client can also filter documents with a query string, for example to exclude the rabbit post type.

query FindNearestDocumentWithFilter {
  similarity(
    input: {
      nearest: { text: "eats carrots and is an animal", field: "post_content" }
      filter: "NOT post_type:rabbit"
    }
  ) {
    total
    docs {
      id
      score
      data
    }
  }
}

Sample response:

{
  "data": {
    "similarity": {
      "total": 1,
      "docs": [
        {
          "id": "horse:7",
          "score": 2.2204816,
          "data": {
            "ID": 1,
            "post_content": "Horse in a field",
            "post_date": "11-06-2023T12:33:00",
            "post_status": "publish",
            "post_title": "Greg the horse",
            "post_type": "horse"
          }
        }
      ]
    }
  }
}

The similarity API allows you to paginate through results. The following query skips the first 10 documents and returns the next five.

query FindNearestDocumentWithPagination {
  similarity(
    limit: 5
    offset: 10
    input: {
      nearest: { text: "eats carrots and is an animal", field: "post_content" }
      filter: "NOT post_type:rabbit"
    }
  ) {
    total
    docs {
      id
      score
      data
    }
  }
}
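Clients that think in page numbers rather than offsets can derive the two arguments with a little arithmetic. A minimal sketch, with `page_args` as a hypothetical helper and pages counted from 1:

```python
def page_args(page: int, page_size: int = 5) -> dict[str, int]:
    """Translate a 1-based page number into the `limit` and `offset` arguments."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    return {"limit": page_size, "offset": (page - 1) * page_size}


print(page_args(3))  # {'limit': 5, 'offset': 10}
```

Page 3 with a page size of five yields the same limit and offset as the query above.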

The similarity API also lets you set a minimum score. Because nearest-document matches are based on probability, the client can set a minimum score threshold to omit documents that fall below it.

query FindNearestDocumentWithMinScore {
  similarity(
    minScore: 0.8
    input: {
      nearest: { text: "eats carrots and is an animal", field: "post_content" }
    }
  ) {
    total
    docs {
      id
      score
      data
    }
  }
}
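To see what a threshold of 0.8 would do, the sketch below applies the same cut-off on the client to the scores from the boosted sample response earlier in this document. The server does this filtering itself when minScore is set; `apply_min_score` is a hypothetical helper, and treating the threshold as inclusive is an assumption.

```python
def apply_min_score(docs: list[dict], min_score: float) -> list[dict]:
    """Keep documents whose score meets the threshold (assumed inclusive)."""
    return [doc for doc in docs if doc["score"] >= min_score]


# Scores taken from the boosted sample response earlier in this document.
docs = [
    {"id": "rabbit:6", "score": 19.557358},
    {"id": "rabbit:7", "score": 14.139838},
    {"id": "page:2", "score": 3.59585},
    {"id": "zombie:4", "score": 0.9963488},
    {"id": "post:1", "score": 0.646362},
    {"id": "zombie:5", "score": 0.48280594},
]

kept = apply_min_score(docs, 0.8)
print([doc["id"] for doc in kept])  # ['rabbit:6', 'rabbit:7', 'page:2', 'zombie:4']
```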

Similar to the find API, the similarity API supports time decay scoring to boost more recent documents in similarity search results:

query SimilarityWithTimeDecay {
  similarity(
    input: {
      nearest: { text: "eats carrots and is an animal", field: "post_content" }
    }
    options: {
      timeDecay: [
        { field: "post_date", scale: "30d", decayRate: 0.5, offset: "7d" }
      ]
    }
  ) {
    total
    docs {
      id
      score
      data
    }
  }
}

This example applies time decay where:

  • Documents older than 30 days (scale: "30d") will have their similarity scores reduced
  • At 30 days old, the score will be multiplied by 0.5 (decayRate: 0.5)
  • The decay doesn’t start until a document is more than 7 days old (offset: "7d")
  • The decay is applied to the post_date field
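The exact curve the service uses is not stated here. The sketch below is illustrative only: it assumes an exponential curve in which scale is counted from the end of the offset window, which is the convention of Elasticsearch-style decay functions. Under that assumption the multiplier reaches decayRate at offset plus scale, so the real service may produce different numbers. `decay_multiplier` is a hypothetical name.

```python
def decay_multiplier(
    age_days: float,
    scale_days: float,
    decay_rate: float,
    offset_days: float = 0.0,
) -> float:
    """Exponential decay: 1.0 inside the offset window, decay_rate once `scale_days` past it."""
    # Assumption: ages inside the offset window are not decayed at all.
    effective_age = max(0.0, age_days - offset_days)
    return decay_rate ** (effective_age / scale_days)


# Parameters from the example above: scale "30d", decayRate 0.5, offset "7d".
for age in (3, 7, 37, 67):
    print(age, decay_multiplier(age, scale_days=30, decay_rate=0.5, offset_days=7))
```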

You can also apply multiple time decay functions or use different time fields:

query SimilarityWithMultipleTimeDecay {
  similarity(
    input: { nearest: { text: "machine learning", field: "post_content" } }
    options: {
      timeDecay: [
        { field: "post_date", scale: "60d", decayRate: 0.3 }
        { field: "post_modified_gmt", scale: "14d", decayRate: 0.7 }
      ]
    }
  ) {
    total
    docs {
      id
      score
      data
    }
  }
}

The time period formats supported are:

  • d for days (e.g., 30d, 7d)
  • h for hours (e.g., 12h, 24h)
  • m for minutes (e.g., 90m, 30m)
  • s for seconds (e.g., 3600s)
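A client can parse these period strings before sending them, for instance to validate user input. This is a minimal sketch that assumes whole-number amounts only, as in the examples above; `parse_period` is a hypothetical helper.

```python
import re
from datetime import timedelta

# Map each documented suffix to the matching timedelta keyword.
_UNITS = {"d": "days", "h": "hours", "m": "minutes", "s": "seconds"}


def parse_period(value: str) -> timedelta:
    """Convert a period string such as "30d" or "90m" into a timedelta."""
    match = re.fullmatch(r"(\d+)([dhms])", value)
    if match is None:
        raise ValueError(f"unsupported time period: {value!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


print(parse_period("90m"))  # 1:30:00
```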

The similarity API supports geographic filtering to find similar documents within specific geographic areas. Geographic constraints can be applied using either circular areas (radius from a center point) or bounding boxes (rectangular areas).

Find similar documents within a specified distance from a center point:

query SimilarityWithCircleConstraint {
  similarity(
    input: {
      nearest: { text: "coffee shops near downtown", field: "post_content" }
    }
    geoConstraint: {
      circle: { center: { lat: 37.7749, lon: -122.4194 }, maxDistance: "5km" }
    }
  ) {
    total
    docs {
      id
      score
      data
    }
  }
}
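To build intuition for which points a circle constraint covers, the sketch below computes great-circle distance with the haversine formula. How the service measures distance internally is not stated here, so treat this as an approximation; `haversine_km` and `within_circle` are hypothetical helpers.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometers between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_circle(point: dict[str, float], center: dict[str, float], max_km: float) -> bool:
    """True when `point` lies no farther than `max_km` from `center`."""
    return haversine_km(point["lat"], point["lon"], center["lat"], center["lon"]) <= max_km


center = {"lat": 37.7749, "lon": -122.4194}   # center used in the query above
other = {"lat": 37.8044, "lon": -122.2711}    # a second point, roughly 13 km away
print(within_circle(other, center, 5.0))      # False
```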

Find similar documents within a rectangular geographic area:

query SimilarityWithBoundingBoxConstraint {
  similarity(
    input: {
      nearest: { text: "restaurants in the area", field: "post_content" }
    }
    geoConstraint: {
      boundingBox: {
        southwest: { lat: 37.7749, lon: -122.4494 }
        northeast: { lat: 37.8049, lon: -122.3894 }
      }
    }
  ) {
    total
    docs {
      id
      score
      data
    }
  }
}

You can combine multiple circles and bounding boxes using OR logic: a result matches if it falls within any of the specified areas.

query SimilarityWithMultipleGeoConstraints {
  similarity(
    input: { nearest: { text: "local events", field: "post_content" } }
    geoConstraints: {
      circles: [
        { center: { lat: 37.7749, lon: -122.4194 }, maxDistance: "3km" }
        { center: { lat: 37.8044, lon: -122.2711 }, maxDistance: "2mi" }
      ]
      boundingBoxes: [
        {
          southwest: { lat: 37.7849, lon: -122.4094 }
          northeast: { lat: 37.7949, lon: -122.3994 }
        }
      ]
    }
  ) {
    total
    docs {
      id
      score
      data
    }
  }
}

  • Coordinate format: Geographic coordinates must be stored as latitude/longitude pairs in the coordinates field
  • Distance units: Supported distance units include:
    • km (kilometers) - e.g., "5km"
    • mi (miles) - e.g., "10.5mi"
    • m (meters) - e.g., "1000m"
    • ft (feet) - e.g., "500ft"
    • yd (yards) - e.g., "100yd"
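For comparing distances given in different units, a client can normalize them to meters. A minimal sketch using the standard conversion factors; `to_meters` is a hypothetical helper, and the accepted number format (digits with an optional decimal part) is an assumption based on the examples above.

```python
import re

# Standard conversion factors to meters for the documented units.
METERS_PER_UNIT = {"km": 1000.0, "mi": 1609.344, "m": 1.0, "ft": 0.3048, "yd": 0.9144}


def to_meters(distance: str) -> float:
    """Convert a distance string such as "5km" or "10.5mi" into meters."""
    # "mi" is listed before "m" so that miles are not mistaken for meters.
    match = re.fullmatch(r"(\d+(?:\.\d+)?)(km|mi|m|ft|yd)", distance)
    if match is None:
        raise ValueError(f"unsupported distance: {distance!r}")
    amount, unit = match.groups()
    return float(amount) * METERS_PER_UNIT[unit]


print(to_meters("5km"))  # 5000.0
```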

Geographic coordinates must fall within valid ranges:

  • Latitude: -90.0 to 90.0 degrees
  • Longitude: -180.0 to 180.0 degrees

For bounding boxes, the southwest corner must have lower latitude and longitude values than the northeast corner.
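The range and corner-ordering rules above can be checked before a request is sent. A minimal sketch, with `validate_point` and `validate_bounding_box` as hypothetical helpers:

```python
def validate_point(point: dict[str, float]) -> None:
    """Raise ValueError if a point falls outside the valid latitude/longitude ranges."""
    if not -90.0 <= point["lat"] <= 90.0:
        raise ValueError(f"latitude {point['lat']} is outside -90.0 to 90.0")
    if not -180.0 <= point["lon"] <= 180.0:
        raise ValueError(f"longitude {point['lon']} is outside -180.0 to 180.0")


def validate_bounding_box(southwest: dict[str, float], northeast: dict[str, float]) -> None:
    """Raise ValueError unless both corners are valid and southwest is below and left of northeast."""
    validate_point(southwest)
    validate_point(northeast)
    if not (southwest["lat"] < northeast["lat"] and southwest["lon"] < northeast["lon"]):
        raise ValueError("southwest must have lower latitude and longitude than northeast")


# Corners from the bounding box example earlier in this document; no error is raised.
validate_bounding_box(
    {"lat": 37.7749, "lon": -122.4494},
    {"lat": 37.8049, "lon": -122.3894},
)
```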