⚙️

Core concepts

This note explains what the technology makes possible and how to use it effectively. Working with the system requires a slightly different approach from traditional search systems, which are typically field- and word-based.

🔎
Pattern Recognition search

Indx is based on pattern recognition rather than linguistic models. This means that the search system recognizes fragments of the same pattern. The length and shape of the pattern will also affect recognition.

Pattern recognition is performed on the entire search string that is indexed. This means that the system can, among other things, respond to relationships between hierarchies, unlike traditional search systems that simply index lists of single words.

The system is deliberately designed without concepts of words, punctuation, or language. During indexing, all text is converted into a vector model in the system. Once the text is indexed, the memory occupied by the source text can be released, which gives a low memory footprint. Only the foreign keys are returned as the search result.
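As a rough illustration of matching on character patterns rather than words, the sketch below scores documents by the share of a query's three-character fragments they contain. It is not Indx's actual algorithm or API: the functions `fragments` and `overlap_score`, the fragment length, and the sample data are all assumptions made for this example.

```python
def fragments(text, n=3):
    """Return the set of overlapping character fragments of length n."""
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def overlap_score(query, document, n=3):
    """Fraction of the query's fragments that also occur in the document."""
    query_fragments = fragments(query, n)
    if not query_fragments:
        return 0.0
    return len(query_fragments & fragments(document, n)) / len(query_fragments)


# Hypothetical documents: foreign key -> indexed text, including its hierarchy.
documents = {
    101: "Electronics > Audio > Wireless headphones",
    102: "Electronics > Video > Wireless projector",
}

# The query contains several spelling mistakes and no notion of "words".
scores = {
    key: overlap_score("wireles headfones", text)
    for key, text in documents.items()
}
best_key = max(scores, key=scores.get)
print(best_key, scores)
```

Only the foreign key is needed to report the hit, which mirrors the idea that the source text itself does not have to be kept once the index exists.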

💡
Indx is built with a new approach that eliminates common challenges encountered with traditional search technology, such as stop words.
🔝
Built-in relevancy ranking

Find the most relevant results

Indx uses a built-in relevance ranking system to analyze the frequency of patterns. Rare patterns or combinations of characters will be given a high relevance, while very common patterns (such as "and") will be given a lower relevance. The system compares all indexed patterns with each other to create a vector model, which helps it identify patterns that are frequent or rare and use that to determine relevance.
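The idea of weighting rare patterns above common ones can be sketched with an inverse-frequency weight per character fragment. This is a simplified stand-in and not the Indx ranking model: the weighting formula, the function names, and the sample documents are assumptions chosen for illustration.

```python
import math
from collections import Counter


def fragments(text, n=3):
    """Return the set of overlapping character fragments of length n."""
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def rarity_weights(documents, n=3):
    """Give each fragment a weight that grows as fewer documents contain it."""
    document_count = Counter()
    for text in documents.values():
        document_count.update(fragments(text, n))
    total = len(documents)
    return {
        fragment: math.log(1 + total / count)
        for fragment, count in document_count.items()
    }


def score(query, text, weights, n=3):
    """Sum the weights of the fragments shared by the query and the text."""
    shared = fragments(query, n) & fragments(text, n)
    return sum(weights.get(fragment, 0.0) for fragment in shared)


# Hypothetical documents: foreign key -> indexed text.
documents = {
    1: "salt and pepper",
    2: "bread and butter",
    3: "fish and chips",
    4: "saffron",
}
weights = rarity_weights(documents)
ranking = sorted(
    documents,
    key=lambda key: score("saffron and", documents[key], weights),
    reverse=True,
)
print(ranking)
```

In this sketch the fragment "and" appears in three of the four documents and therefore contributes little, while the fragments of "saffron" appear in only one and dominate the score.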

🌪️
Filters

Advanced and highly dynamic filter functions

Indx search has two types of filters: key-based and word-based. Each type can be set up as either an excluding or an including filter. For both types, it is also possible to set up multiple filters and choose which ones apply to a given search instance.

Key-based filters are defined as a list of foreign keys sent to Indx search. Key-based filters are well suited for documents that have fixed categories. If you want to show results from multiple categories at the same time, you can repeat the search once per category, each time with the filter that corresponds to that category. This is often called faceted search. Categories are usually considered distinct, but this is not a requirement. They may overlap, so a key value may be part of multiple filters.

Word-based filters are specified to the search engine as a word. These filters are automatically updated when documents are inserted or deleted. When such a filter is set up, Indx search can immediately tell how many documents are affected by the filter. Word-based filters enable advanced user interaction because the end user can define which words to include or exclude from the search.

All filter types support the Boolean operations AND and OR. If you have a category "women" and a category "bricklayers", an AND operation will return all "female bricklayers", while an OR operation will return everyone who is a woman or a bricklayer. The example applies to an inclusive filter.
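The Boolean behaviour of key-based filters can be modelled with plain set operations on foreign keys. The sketch below is not the Indx API: the functions `combine` and `apply_filter`, the key values, and the result format of (foreign key, score) pairs are assumptions for illustration.

```python
def combine(filters, operation="AND"):
    """Combine key-based filters (collections of foreign keys) with AND or OR."""
    key_sets = [set(keys) for keys in filters]
    if not key_sets:
        return set()
    if operation == "AND":
        return set.intersection(*key_sets)
    if operation == "OR":
        return set.union(*key_sets)
    raise ValueError(f"unsupported operation: {operation}")


def apply_filter(results, keys, include=True):
    """Keep (include=True) or drop (include=False) results whose key is in keys."""
    return [(key, score) for key, score in results if (key in keys) == include]


# Hypothetical categories expressed as sets of foreign keys.
women = {1, 2, 3}
bricklayers = {3, 4}

# Hypothetical search results as (foreign key, score) pairs.
results = [(1, 0.9), (3, 0.8), (4, 0.6), (5, 0.4)]

female_bricklayers = apply_filter(results, combine([women, bricklayers], "AND"))
women_or_bricklayers = apply_filter(results, combine([women, bricklayers], "OR"))
everyone_else = apply_filter(
    results, combine([women, bricklayers], "OR"), include=False
)
print(female_bricklayers, women_or_bricklayers, everyone_else)
```

Because the categories overlap on key 3, that key belongs to both filters, which is what makes the AND case non-empty.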

🏳️
Language agnostic by design

Not limited by language

Indx Search is designed without language patterns, meaning you can in theory combine several languages in one search.

When working with a language that has unique national characters, we often recommend using the StringReplacer function to standardise them. If the user's native language is known, the StringReplacer configuration, and consequently the index, can be part of a personalization.
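The kind of standardisation described here can be sketched as a simple replacement table applied to both the indexed text and the query. The signature of the real StringReplacer function is not shown in this note, so the function `standardise` and the Norwegian replacement table below are hypothetical.

```python
# Hypothetical replacement table for Norwegian national characters.
NORWEGIAN_REPLACEMENTS = {"æ": "ae", "ø": "o", "å": "a"}


def standardise(text, replacements=NORWEGIAN_REPLACEMENTS):
    """Lower-case the text and replace national characters with plain ones."""
    text = text.lower()
    for source, target in replacements.items():
        text = text.replace(source, target)
    return text


# The same function is applied at indexing time and at query time,
# so both spellings end up as the same pattern.
indexed = standardise("Blåbærsyltetøy")
query = standardise("blabaersyltetoy")
print(indexed, indexed == query)
```

A different table per language is one way the replacement step, and with it the index, could be tailored to a known user language.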

Fault-tolerance

Bigger tolerance for spelling mistakes or noisy data

Many traditional search systems tolerate at most one or two spelling mistakes per word. Indx is designed to be more tolerant of mistakes, making it easier for users to find what they need. This is especially helpful in fields like healthcare, legal services, finance, and customer service, where mistakes can have a high cost.

🪢
Multiple indexes

Handle hundreds of unique datasets - and search across them

The system can handle hundreds of indexes running concurrently. This can be used for personalization, context boundaries, structuring data so that it can be weighted, or other purposes. See the best practices information to understand the scope of this.

Indx has merge functionality with very high performance. Merging is a process where the results from multiple searches in different indexes are combined into one list, which is then sorted by score value.

It is possible to give a different amount of importance to each index. A common use is to combine an index of headlines, which is given more importance, with an index of body text.
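Merging with per-index importance can be sketched as scaling each index's scores by a weight and sorting the combined list. This is only an illustration of the concept: the function `merge`, the multiplicative weighting, and the sample scores are assumptions, not the Indx merge implementation.

```python
def merge(results_by_index, weights):
    """Combine results from several indexes into one list sorted by weighted score."""
    merged = []
    for index_name, results in results_by_index.items():
        weight = weights.get(index_name, 1.0)  # unweighted indexes count as 1.0
        merged.extend((key, score * weight, index_name) for key, score in results)
    return sorted(merged, key=lambda item: item[1], reverse=True)


# Hypothetical per-index results as (foreign key, score) pairs.
results_by_index = {
    "headlines": [(7, 0.6), (9, 0.4)],
    "body": [(7, 0.9), (12, 0.7)],
}

# Headlines are given twice the importance of body text.
merged = merge(results_by_index, {"headlines": 2.0, "body": 1.0})
for key, score, index_name in merged:
    print(key, round(score, 2), index_name)
```

With these weights the headline hit for key 7 ranks above its body hit, even though its raw score was lower.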

🦹
Aliases

Handle variations, synonyms, and duplicates with ease

Indx supports aliases by allowing multiple documents with the same foreign key, but with different indexing text. Alternatively, the client can use its own annotation field as an identifier instead of the foreign key.

The search function has an argument, removeDuplicates, that removes redundant results with the same foreign key. For each foreign key, only the result with the highest score is returned.
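The effect of removeDuplicates on aliased documents can be sketched as keeping the best-scoring hit per foreign key. The function below is a hypothetical client-side model of that behaviour, not the Indx implementation, and the alias texts and scores are invented for the example.

```python
def remove_duplicates(results):
    """Keep only the highest-scoring result for each foreign key."""
    best = {}
    for key, score in results:
        if key not in best or score > best[key]:
            best[key] = score
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


# Hypothetical aliases: three indexed texts share foreign key 42
# ("New York City", "NYC", "Big Apple"), so one query can hit the key several times.
results = [(42, 0.5), (17, 0.7), (42, 0.9), (42, 0.3)]

print(remove_duplicates(results))
```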

Index on the fly

Incredibly fast indexing makes it possible to spin up an index in “no time”

Fast indexing is useful when you want to work with dynamic data that changes based on user behaviour, or simply want to spin up an index when a user logs in to your service.

🛡️
Security and GDPR

Designed to protect your customers' data

Indx runs as a class library or a REST service and does not require any data to be sent to external services. It is also possible to delete the actual dataset once it has been indexed and the pattern network has been generated.