📗 Rest API v3.3

This is the REST API documentation for calling Indx over HTTP.

Version 3.3. See older versions below

Older versions
☁️ indxRestAPI runs on Azure servers in 🇳🇴 Norway East. The server runs on 🌱 100% renewable energy.

💻 Typical use case

📡 Connect to the endpoint, log in and receive a token

⛓ Insert with an array of documents

🪄 Create Index

🟢 Check ready status

🔎 Search

⛓ Insert one or more documents without re-indexing

🗑 Delete one or more documents without re-indexing

🪄 (re-index if system status says it is required)

All of these functions can be run multiple times, for example to perform incremental loading.

Any variation of this pattern can run continuously over a long period of time to ensure that the index and data are up to date.

🚘 Try indx before creating your own program

To try loading a dataset, go to https://load.indx.co

To search your dataset, go to https://search.indx.co

These projects and other resources can be found on GitHub at https://github.com/IndxSearch

How to use indx Rest API

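📡 Indx Rest API endpoint: https://v33.indx.co/api/

Variables in use

Replace each ${...} placeholder in the examples with your own value. The placeholders sit inside single quotes, so a shell will not expand them as variables.

${API_URL}
the URL to the endpoint, without a trailing slash because the examples append the path after it (https://v33.indx.co/api)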
${USER}
your username (e-mail)
${PASSWORD}
your password
${TOKEN}
your retrieved token string
${CONFIG}
a configuration number; in most cases this should be 100
${DATASET}
the name of your dataset as a string, for example “myDataset”
${DOCUMENT_KEY}
a foreign key for each record

Step by step

🔓 Log in and retrieve your Bearer access token

👩‍💻 If you do not have a username and password, request developer access here

POST https://v33.indx.co/api/Login[?userEmail][&userPassWord]

This returns 200 OK with the bearer token as a string.

curl -X POST \
	'${API_URL}/Login?UserEmail=${USER}&UserPassword=${PASSWORD}' \
	-H 'accept: */*' \
	-d ''
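
The same login call could be prepared from Python. The sketch below only builds the request and does not send it; the function name `build_login_request` and the example credentials are hypothetical, and the parameter names follow the curl example above. Percent-encoding the query matters because e-mail addresses and passwords often contain characters such as `@` or `&`.

```python
from urllib.parse import urlencode
from urllib.request import Request

API_URL = "https://v33.indx.co/api"  # endpoint from this page, no trailing slash


def build_login_request(user: str, password: str) -> Request:
    """Build (but do not send) the POST /Login request."""
    # urlencode percent-encodes characters such as "@" and "&"
    query = urlencode({"UserEmail": user, "UserPassword": password})
    return Request(f"{API_URL}/Login?{query}", data=b"", method="POST",
                   headers={"accept": "*/*"})


# Hypothetical credentials, for illustration only
req = build_login_request("dev@example.com", "p&ss word")
print(req.full_url)
```

Passing the request to `urllib.request.urlopen` would perform the call; the response body is then expected to hold the token string described above.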
⚙️ Create an instance of indx

A heap (an instance with a dataset) is required. Every API call refers to this dataset ID. You can create multiple datasets. The call also takes a configuration number; config number 100 is used for most cases.

PUT https://v33.indx.co/api/Search/{dataSetName}/{configuration}

curl -X PUT \
	'${API_URL}/Search/${DATASET}/${CONFIG}' \
	-H 'accept: */*' \
	-H 'Authorization: Bearer ${TOKEN}'
💡 If you try to create a heap with an ID that is already in use, you will get this error message:
"title": "Unprocessable Entity",
"status": 422,

If you want to reuse the ID, you first need to run DELETE /api/Search/{dataSetName}
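
A client may want to tell the "ID already in use" case apart from other outcomes. The following is a minimal sketch under assumptions: the helper names are hypothetical, the request is built but not sent, and treating any 2xx status as success is an assumption, since only the 422 case is documented here.

```python
from urllib.parse import quote
from urllib.request import Request

API_URL = "https://v33.indx.co/api"  # endpoint from this page, no trailing slash


def build_create_request(token: str, dataset: str, config: int = 100) -> Request:
    """Build (but do not send) PUT /Search/{dataSetName}/{configuration}."""
    path = f"/Search/{quote(dataset, safe='')}/{config}"
    return Request(API_URL + path, method="PUT",
                   headers={"accept": "*/*", "Authorization": f"Bearer {token}"})


def describe_create_status(status: int) -> str:
    """Map an HTTP status code to a short explanation."""
    if status == 422:  # documented: the dataset ID is already in use
        return "dataset ID already in use"
    if 200 <= status < 300:  # assumption: any 2xx means the heap was created
        return "created"
    return f"unexpected status {status}"
```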

📄 Insert data

To load documents into your heap, add data as JSON. A document is the record type used by indx. Each document is identified by documentKey, a 64-bit integer. The text to search goes in documentTextToBeIndexed. The field documentClientInformation holds text that should not be searchable, typically information you need to display or process together with your record; you can store JSON or any other format in it. The integer segmentNumber lets the client add extra information such as line numbers in a book, and can describe which part of a body text a record holds when the text is split into several records.

The recommended maximum length of documentTextToBeIndexed is 80 characters. Longer text should be split into several records that are told apart by segmentNumber. This ensures optimal pattern recognition capabilities.
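
One way to stay within the recommended 80 characters is to split long text on word boundaries before uploading. This is a sketch, not the library's own method: the function name `split_into_segments` is hypothetical, and giving every segment the same documentKey with an increasing segmentNumber is an assumption based on the description above.

```python
import textwrap


def split_into_segments(text: str, document_key: int, client_info: str = "",
                        max_length: int = 80) -> list[dict]:
    """Split text into records of at most max_length characters each."""
    # textwrap.wrap breaks on whitespace and never exceeds the given width
    chunks = textwrap.wrap(text, width=max_length)
    return [
        {
            "deleted": False,
            "documentClientInformation": client_info,
            "documentKey": document_key,  # assumption: segments share one key
            "documentTextToBeIndexed": chunk,
            "segmentNumber": number,
        }
        for number, chunk in enumerate(chunks)
    ]
```

The returned list has the same shape as the array of records accepted by the insert call.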

An array of records

PUT https://v33.indx.co/api/Search/array/{dataSetName}

curl -X PUT \
	'${API_URL}/Search/array/${DATASET}' \
	-H 'accept: */*' \
	-H 'Authorization: Bearer ${TOKEN}' \
	-H 'Content-Type: application/json' \
	-d '[
  {
    "deleted": false,
    "documentClientInformation": "Large airport, KLAX",
    "documentKey": 0,
    "documentTextToBeIndexed": "Los Angeles International Airport KLAX",
    "segmentNumber": 0
  },
  {
    "deleted": false,
    "documentClientInformation": "Medium airport, ENTO",
    "documentKey": 1,
    "documentTextToBeIndexed": "Sandefjord lufthavn Torp ENTO",
    "segmentNumber": 0
  }
]'
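
Building the array in code avoids hand-written JSON. The sketch below assumes hypothetical helper names (`make_document`, `build_insert_request`) and stores a small JSON object in documentClientInformation, which the text above allows; the request is built but not sent.

```python
import json
from urllib.parse import quote
from urllib.request import Request

API_URL = "https://v33.indx.co/api"  # endpoint from this page, no trailing slash


def make_document(key: int, text: str, client_info: dict, segment: int = 0) -> dict:
    """Create one record in the shape shown in the curl example."""
    return {
        "deleted": False,
        # kept as a JSON string; this field is not searchable
        "documentClientInformation": json.dumps(client_info),
        "documentKey": key,
        "documentTextToBeIndexed": text,
        "segmentNumber": segment,
    }


def build_insert_request(token: str, dataset: str, documents: list[dict]) -> Request:
    """Build (but do not send) PUT /Search/array/{dataSetName}."""
    body = json.dumps(documents).encode("utf-8")
    url = f"{API_URL}/Search/array/{quote(dataset, safe='')}"
    return Request(url, data=body, method="PUT", headers={
        "accept": "*/*",
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    })


docs = [
    make_document(0, "Los Angeles International Airport KLAX", {"size": "large"}),
    make_document(1, "Sandefjord lufthavn Torp ENTO", {"size": "medium"}),
]
insert_request = build_insert_request("my-token", "myDataset", docs)
```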
🪄 Index your data

After inserting, you need to run the index call before you can search. This runs asynchronously and can be monitored by checking the system status.

After searching, you should save the dataset for persistence on the server.

GET https://v33.indx.co/api/Search/IndexDataSet/{dataSetName}

curl -X GET \
	'${API_URL}/Search/IndexDataSet/${DATASET}' \
	-H 'accept: text/plain' \
	-H 'Authorization: Bearer ${TOKEN}'
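
Because indexing is asynchronous, a client typically polls the status call until the index is ready. The loop below is a sketch under assumptions: `wait_until_ready` and its parameters are hypothetical, `fetch_status` stands for any function that returns the parsed status JSON, and the ready value 3 is taken from the systemState description in the status section.

```python
import time
from typing import Callable

READY = 3  # systemState documented as "ready to search"


def wait_until_ready(fetch_status: Callable[[], dict], max_polls: int = 60,
                     delay_seconds: float = 1.0,
                     sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll the status until systemState is READY, or give up."""
    for _ in range(max_polls):
        status = fetch_status()
        if status.get("errorMessage"):
            raise RuntimeError(status["errorMessage"])
        if status.get("systemState") == READY:
            return status
        sleep(delay_seconds)  # wait before asking again
    raise TimeoutError("index was not ready after polling")
```

Injecting `sleep` keeps the loop easy to test without real waiting.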
🟢 Check system status

GET https://v33.indx.co/api/Search/{dataSetName}

curl -X GET \
	'${API_URL}/Search/${DATASET}' \
	-H 'accept: text/plain' \
	-H 'Authorization: Bearer ${TOKEN}'

Example response:

{
  "errorMessage": "",
  "documentCount": 123456,
  "indexProgressPercent": 100,
  "invalidArgument": false,
  "invalidHeapId": false,
  "invalidState": false,
  "reIndexRequired": false,
  "searchCounter": 0,
  "secondsToIndex": 0,
  "systemState": 3,
  "timeOfInstanceCreation": "2024-02-27T13:50:10.532Z",
  "timeOfLastIndexBuid": "2024-02-27T13:50:10.532Z",
  "tooLongClientText": false,
  "tooLongSearchText": false,
  "tooManyDocuments": false,
  "unknownConfigurationError": false,
  "version": "3.3.0.0"
}
int DocumentCount returns the number of documents uploaded and indexed.
int IndexProgressPercent returns the progress of the indexing. This can, for example, be tied to a progress bar when indexing large datasets.
bool InvalidHeapId is only relevant for certain configurations and is described separately for each of them.
bool InvalidState returns true if DoIndex is called when DocumentCount == 0, or if the search function is called before DoIndex has been called at least once.
bool ReIndexRequired is returned true when the fraction of documents deleted or inserted since the last indexing exceeds a limit. The limit is set in the configuration. Skipping re-indexing may affect search results.
int SearchCounter returns the number of calls to the search function after occurring
int SecondsToIndex returns the indexing time in seconds
int SystemState returns what state the system is in: 0 = created, not loaded; 1 = loading; 2 = indexing; 3 = ready to search.
DateTime TimeOfInstanceCreation returns the timestamp of the call to the constructor.
DateTime TimeOfLastIndexBuild returns the timestamp of the last indexing call.
💡 The configuration has a maximum length for the search text, a maximum length for the client text, and a maximum number of documents that can be loaded.
bool TooLongSearchText is returned true if the maximum length of the search text is exceeded. If that happens, the text will be truncated. This alarm is not reset until a call to DeleteAll is made.
bool TooLongClientText is returned true if the maximum length of client text is exceeded. If that happens, the text will be truncated. This alarm is not reset until a DeleteAll call is made.
bool TooManyDocuments is returned true if the number of documents that are indexed exceeds the maximum limit defined in configuration. The alarm is reset by deletion and subsequent re-indexing.
bool UnknownConfigurationError is returned true if the instance is created with an invalid configuration number
string Version returns version number of IndxSearchLib
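
The boolean fields above can be checked in one pass after each status call. This sketch uses the camelCase names from the example response; the helper names `raised_alarms` and `needs_reindex` are hypothetical.

```python
# Boolean flags from the example status response that signal a problem
ALARM_FLAGS = (
    "invalidArgument",
    "invalidHeapId",
    "invalidState",
    "tooLongClientText",
    "tooLongSearchText",
    "tooManyDocuments",
    "unknownConfigurationError",
)


def raised_alarms(status: dict) -> list[str]:
    """Return the names of all alarm flags that are set to true."""
    return [flag for flag in ALARM_FLAGS if status.get(flag)]


def needs_reindex(status: dict) -> bool:
    """True when the status asks for a new index call."""
    return bool(status.get("reIndexRequired"))
```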
💾 Save heap for persistence on the server

In order for your dataset to be persistent on the server you will need to save it. This is typically called after insertions or deletions.

PUT https://v33.indx.co/api/Search/{dataSetName}

curl -X PUT \
	'${API_URL}/Search/${DATASET}' \
	-H 'accept: */*' \
	-H 'Authorization: Bearer ${TOKEN}'
🔎 Search query

The search query takes a JSON argument to set up a search. The only required fields when searching are the text to search for and the maximum number of records you want to return. See the advanced query for more options.

POST https://v33.indx.co/api/Search/{dataSetName}

curl -X POST \
	'${API_URL}/Search/${DATASET}' \
	-H 'accept: text/plain' \
	-H 'Authorization: Bearer ${TOKEN}' \
	-H 'Content-Type: application/json' \
	-d '{
  "maxNumberOfRecordsToReturn": 30,
  "queryText": "string"
}'
Advanced query
curl -X POST \
	'${API_URL}/Search/${DATASET}' \
	-H 'accept: text/plain' \
	-H 'Authorization: Bearer ${TOKEN}' \
	-H 'Content-Type: application/json' \
	-d '{
  "applyCoverage": true,
  "keyExcludeFilter": null,
  "keyIncludeFilter": null,
  "logPrefix": "",
  "maxNumberOfRecordsToReturn": 30,
  "removeDuplicates": true,
  "queryText": "string",
  "timeOutLimitMilliseconds": 1000,
  "wordExcludeFilter": null,
  "wordIncludeFilter": null,
  "coverageSetup": {
    "minWordSize": 3,
    "coverWholeQuery": true,
    "coverWholeWords": true,
    "coverFuzzyWords": true,
    "coverJoinedWords": true,
    "coverPrefixSuffix": true
  },
  "numberOfRecordsForAppliedAlgorithm": 500
}'

Required:

string queryText refers to the text (input) to be searched for.
int maxNumberOfRecordsToReturn defines the maximum number of documents to be returned.

Optional:

bool applyCoverage sets whether the search should run the Coverage algorithm after finding pattern recognition matches. It is on by default. Coverage, introduced in v3.1, detects exact and near-exact matches on strings and tokens in the query and lifts them to the top of the result list. It is a collection of algorithms that work together, and it complements the RelevancyRanking algorithm, which runs first. Coverage can detect the complete query string, whole words, split and joined versions of the same word, prefixes and suffixes of words, and words with a minor error. It requires somewhat more CPU resources but still gives a best-in-class response time. The response also includes int coverageBottomIndex, which can be used to truncate the list when the function determines a cut-point.
coverageSetup defines properties for the Coverage algorithm. This only applies when applyCoverage is set to true.
int minWordSize defines the minimum size of a detectable token. The default is 3. Values between 2 and 5 are recommended.
bool coverWholeQuery sets whether to detect the whole search query as a string.
bool coverWholeWords sets whether to detect whole words from the string in the result list. This will look for multiple words.
bool coverFuzzyWords sets whether to detect words with a minor error tolerance.
bool coverJoinedWords sets whether to detect words that are either joined or split up. Both will be returned in the same query.
bool coverPrefixSuffix sets whether to detect incomplete strings as prefix or suffix of a bigger word.
int timeOutLimitMilliseconds sets waiting time in case of overloaded CPU. We recommend 1000 ms.
keyIncludeFilter and keyExcludeFilter are inclusive and exclusive filters based on the foreign key field. See “Search query with key filters” for how to use this.
wordIncludeFilter and wordExcludeFilter are word-based including and excluding filters. They are completely dynamic and based on text that can, for example, be entered by the end user. See “Search query with word filters” for how to use this.
bool removeDuplicates removes all duplicates of documents with the same foreign key. Only the one with the best score value is returned in the results list. If foreign keys are not in use, this can be set to null.
int numberOfRecordsForAppliedAlgorithm sets the number of records the Coverage function works on. The default value is 500, but it can be overridden in cases where you need to retrieve many results in a single query.
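
Generating the query body with a JSON library rules out syntax slips such as trailing commas. The sketch below is one possible shape: `build_search_body` and its keyword arguments are hypothetical, and the defaults mirror the values shown in the advanced query above.

```python
import json


def build_search_body(query_text: str, max_records: int = 30, *,
                      apply_coverage: bool = True, min_word_size: int = 3,
                      timeout_ms: int = 1000) -> str:
    """Return the JSON body for POST /Search/{dataSetName}."""
    if not query_text:
        raise ValueError("queryText is required")
    body = {
        "queryText": query_text,                    # required
        "maxNumberOfRecordsToReturn": max_records,  # required
        "applyCoverage": apply_coverage,
        "timeOutLimitMilliseconds": timeout_ms,
    }
    if apply_coverage:
        # coverageSetup only applies when applyCoverage is true
        body["coverageSetup"] = {
            "minWordSize": min_word_size,
            "coverWholeQuery": True,
            "coverWholeWords": True,
            "coverFuzzyWords": True,
            "coverJoinedWords": True,
            "coverPrefixSuffix": True,
        }
    return json.dumps(body)
```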
🔦 Search results
{
  "illegalHeapNumber": false,
  "invalidArgument": false,
  "invalidState": false,
  "coverageBottomIndex": 0,
  "searchRecords": [
    {
      "metricScore": 204,
      "deleted": false,
      "documentClientInformation": "string",
      "documentKey": 0,
      "documentTextToBeIndexed": "string1",
      "segmentNumber": 0
    },
    {
      "metricScore": 204,
      "deleted": false,
      "documentClientInformation": "string",
      "documentKey": 1,
      "documentTextToBeIndexed": "string1",
      "segmentNumber": 0
    },
    {
      "metricScore": 180,
      "deleted": false,
      "documentClientInformation": "string",
      "documentKey": 2,
      "documentTextToBeIndexed": "string2",
      "segmentNumber": 0
    }
  ],
  "timedOut": false
}
bool illegalHeapNumber is only applicable for special configurations that will be documented separately.
bool invalidArgument returns true if SearchQuery is null.
bool invalidState is returned true when searching before Indexing or during indexing. In this case the search will be unsuccessful.
(new in v3.1) int coverageBottomIndex returns an index at which the result list can be truncated. The value is greater than -1 when the Coverage algorithm is in use and its requirements are met, for example when one or more words or the whole string are matched exactly.
searchRecords is the search result: a list of documents with their scores.
bool deleted is true for a document that has been deleted without the heap having been saved.
byte metricScore is the result of pattern recognition where 255 is the best result, i.e. identical similarity.
bool timedOut is returned true if there are not enough resources to complete the search within the specified time.
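
A client usually reduces the response to the fields it displays. The following sketch makes two choices that are assumptions rather than documented behaviour: it skips records flagged as deleted, and it rescales metricScore to a 0 to 1 similarity by dividing by 255. The function name `summarize_results` is hypothetical.

```python
def summarize_results(response: dict) -> list[dict]:
    """Reduce a search response to key, text and a 0..1 similarity."""
    if response.get("invalidArgument") or response.get("invalidState"):
        raise ValueError("search was not executed; check the query and index state")
    summary = []
    for record in response.get("searchRecords", []):
        if record.get("deleted"):  # client-side choice: hide deleted documents
            continue
        summary.append({
            "key": record["documentKey"],
            "text": record["documentTextToBeIndexed"],
            # 255 is documented as the best possible metricScore
            "similarity": round(record["metricScore"] / 255, 3),
        })
    return summary
```

A caller may also want to check `timedOut` before trusting an empty or short list.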
🔑 Key filters

To filter a query by key, fill the keyExcludeFilter or keyIncludeFilter array in the search query. Both accept 64-bit foreign keys. An excluding filter means “do not include”.

POST https://v33.indx.co/api/Search/{dataSetName}

curl -X POST \
	'${API_URL}/Search/${DATASET}' \
	-H 'accept: text/plain' \
	-H 'Authorization: Bearer ${TOKEN}' \
	-H 'Content-Type: application/json' \
	-d '{
  "maxNumberOfRecordsToReturn": 30,
  "queryText": "string",
  "keyExcludeFilter": null,
  "keyIncludeFilter": [0, 1, 2, 3, 4]
}'
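
Key filters often come from application data, for example a set of record IDs a user is allowed to see. A minimal sketch, assuming a hypothetical helper `build_key_filter_query`; an unused filter is sent as null, as in the example above.

```python
import json
from typing import Iterable, Optional


def build_key_filter_query(query_text: str,
                           include_keys: Optional[Iterable[int]] = None,
                           exclude_keys: Optional[Iterable[int]] = None,
                           max_records: int = 30) -> str:
    """Return a search body with key filters; unused filters become null."""
    def as_list(keys: Optional[Iterable[int]]) -> Optional[list]:
        if keys is None:
            return None
        unique = set(keys)
        if not all(isinstance(key, int) for key in unique):
            raise TypeError("document keys must be integers")
        return sorted(unique)  # drop duplicates, keep a stable order

    return json.dumps({
        "maxNumberOfRecordsToReturn": max_records,
        "queryText": query_text,
        "keyExcludeFilter": as_list(exclude_keys),
        "keyIncludeFilter": as_list(include_keys),
    })
```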
💬 Word filters

WordFilter is a word-based dynamic filter that can be predefined or set up by the user. It can be used to set up both inclusive and exclusive filters.

Each WordFilter entry is a string holding a single word. It is passed as an argument to SearchQuery and can be used either inclusively or exclusively. The word must match exactly.

POST https://v33.indx.co/api/Search/{dataSetName}

curl -X POST \
	'${API_URL}/Search/${DATASET}' \
	-H 'accept: text/plain' \
	-H 'Authorization: Bearer ${TOKEN}' \
	-H 'Content-Type: application/json' \
	-d '{
  "maxNumberOfRecordsToReturn": 30,
  "queryText": "string",
  "wordExcludeFilter": ["secret", "admin"],
  "wordIncludeFilter": null
}'
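
Since each filter entry must be a single word and may come from end-user input, it can help to validate entries before sending them. This is a sketch with a hypothetical helper `clean_filter_words`; treating surrounding whitespace as removable is an assumption.

```python
def clean_filter_words(words: list[str]) -> list[str]:
    """Strip surrounding whitespace and reject entries that are not one word."""
    cleaned = []
    for word in words:
        word = word.strip()
        if len(word.split()) != 1:  # empty, or more than one word
            raise ValueError(f"filter entries must be single words: {word!r}")
        cleaned.append(word)
    return cleaned


# The cleaned list can be used as wordExcludeFilter or wordIncludeFilter
exclude = clean_filter_words([" secret ", "admin"])
```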
🗑️ Delete a single document

Delete a document with a given key.

Remember to save the dataset afterwards for the deletion to be persistent.

DELETE https://v33.indx.co/api/Search/{dataSetName}/{documentKey}

curl -X 'DELETE' \
  '${API_URL}/Search/${DATASET}/${DOCUMENT_KEY}' \
  -H 'accept: */*' \
  -H 'Authorization: Bearer ${TOKEN}'
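
Deleting several documents and then saving can be prepared as one ordered batch. The sketch below only builds the requests; the helper names are hypothetical, and the save call is the PUT /Search/{dataSetName} request from the save section.

```python
from urllib.parse import quote
from urllib.request import Request

API_URL = "https://v33.indx.co/api"  # endpoint from this page, no trailing slash


def _authorized(method: str, path: str, token: str) -> Request:
    """Build a request with the Bearer header used throughout this page."""
    return Request(API_URL + path, method=method,
                   headers={"accept": "*/*", "Authorization": f"Bearer {token}"})


def delete_then_save(token: str, dataset: str, document_keys: list[int]) -> list[Request]:
    """Return one DELETE per key, followed by the save call."""
    name = quote(dataset, safe="")
    batch = [_authorized("DELETE", f"/Search/{name}/{key}", token)
             for key in document_keys]
    batch.append(_authorized("PUT", f"/Search/{name}", token))  # persist the deletions
    return batch
```

After running the batch, the system status shows whether re-indexing is required.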
🛑 Delete instance

This call deletes the entire dataset (instance), including all documents it contains. It takes effect immediately and the data cannot be recovered.

DELETE https://v33.indx.co/api/Search/{dataSetName}

curl -X 'DELETE' \
  '${API_URL}/Search/${DATASET}' \
  -H 'accept: */*' \
  -H 'Authorization: Bearer ${TOKEN}'