AWS Premier Consulting Partner

Inverted Index Demonstration

Sample Documents
Document 1: "OpenSearch is a fast search engine."
Document 2: "REPLY is a Service Delivery Partner of Amazon for OpenSearch."
The Indexing Pipeline
Document → Tokenize → Filter → Index → Search
Document: Raw text with mixed case and punctuation
Tokenize: Split into words, lowercase, remove punctuation
Filter: Remove common stopwords (is, a, of, for)
Index: Map terms to documents for fast lookup
Search: Find documents instantly using the index
Tokenize: lowercase, remove punctuation, and split into words
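The tokenization step can be sketched in a few lines of Python. This is only an illustration of the three operations named above (lowercase, remove punctuation, split into words), not how OpenSearch's own analyzers are implemented. The function name `tokenize` and the `docs` dictionary are assumptions made for this example, and only ASCII punctuation is stripped.

```python
import string

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def tokenize(text: str) -> list[str]:
    """Lowercase the text, remove punctuation, and split on whitespace."""
    return text.lower().translate(_PUNCT_TABLE).split()


# The two sample documents, keyed by a hypothetical numeric document ID.
docs = {
    1: "OpenSearch is a fast search engine.",
    2: "REPLY is a Service Delivery Partner of Amazon for OpenSearch.",
}

tokens = {doc_id: tokenize(text) for doc_id, text in docs.items()}

for doc_id, doc_tokens in tokens.items():
    print(f"Document {doc_id}: {doc_tokens}")
# Document 1: ['opensearch', 'is', 'a', 'fast', 'search', 'engine']
# Document 2: ['reply', 'is', 'a', 'service', 'delivery', 'partner', 'of', 'amazon', 'for', 'opensearch']
```

Lowercasing before indexing is what lets a query for "opensearch" match the original "OpenSearch" in both documents.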
Filter stopwords: remove common words
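Stopword filtering can be sketched as a simple membership test. The stopword set below contains only the four words listed in the pipeline overview (is, a, of, for); a production stopword list would be longer. The token lists are written out by hand from the sample documents, assuming they were lowercased and stripped of punctuation first, and the name `remove_stopwords` is hypothetical.

```python
# Stopwords taken from the pipeline description: is, a, of, for.
STOPWORDS = frozenset({"is", "a", "of", "for"})


def remove_stopwords(tokens: list[str]) -> list[str]:
    """Keep only the tokens that are not in the stopword set, preserving order."""
    return [token for token in tokens if token not in STOPWORDS]


# Tokens for the two sample documents (assumed output of the tokenize step).
tokens = {
    1: ["opensearch", "is", "a", "fast", "search", "engine"],
    2: ["reply", "is", "a", "service", "delivery", "partner",
        "of", "amazon", "for", "opensearch"],
}

filtered = {doc_id: remove_stopwords(toks) for doc_id, toks in tokens.items()}

for doc_id, terms in filtered.items():
    print(f"Document {doc_id}: {terms}")
# Document 1: ['opensearch', 'fast', 'search', 'engine']
# Document 2: ['reply', 'service', 'delivery', 'partner', 'amazon', 'opensearch']
```

Dropping these words shrinks the index and keeps very common terms, which appear in almost every document, from dominating lookups.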
Build inverted index: term → documents
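Building the term → documents mapping can be sketched with a dictionary of sets. The input below is the filtered term list for each sample document, written out by hand under the assumption that tokenization and stopword removal have already run. The function name `build_inverted_index` is hypothetical, and real search engines store richer postings (positions, frequencies) than the plain document IDs used here.

```python
from collections import defaultdict


def build_inverted_index(doc_terms: dict[int, list[str]]) -> dict[str, list[int]]:
    """Map each term to the sorted list of document IDs that contain it."""
    postings: defaultdict[str, set[int]] = defaultdict(set)
    for doc_id, terms in doc_terms.items():
        for term in terms:
            postings[term].add(doc_id)
    # Sort terms and document IDs so the result is deterministic.
    return {term: sorted(doc_ids) for term, doc_ids in sorted(postings.items())}


# Filtered terms per document (assumed output of the stopword filter).
doc_terms = {
    1: ["opensearch", "fast", "search", "engine"],
    2: ["reply", "service", "delivery", "partner", "amazon", "opensearch"],
}

index = build_inverted_index(doc_terms)

print(f"{'Term':<12}Found in Documents")
for term, doc_ids in index.items():
    print(f"{term:<12}{', '.join(str(d) for d in doc_ids)}")
```

In this sketch only "opensearch" maps to both documents; every other term appears in exactly one of them.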
Search in action: query terms are looked up in the index to filter documents instantly
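A lookup against the index can be sketched as a set intersection over the posting lists of the query terms. The `INDEX` dictionary below is written out by hand from the two sample documents, and the function name `search` and its AND semantics (every query term must match) are assumptions made for this example. The query is lowercased and stripped of stopwords so that it is normalized the same way as the indexed text.

```python
# Inverted index for the two sample documents (term -> document IDs).
INDEX = {
    "amazon": [2],
    "delivery": [2],
    "engine": [1],
    "fast": [1],
    "opensearch": [1, 2],
    "partner": [2],
    "reply": [2],
    "search": [1],
    "service": [2],
}

STOPWORDS = frozenset({"is", "a", "of", "for"})


def search(index: dict[str, list[int]], query: str) -> list[int]:
    """Return IDs of documents that contain every non-stopword query term."""
    terms = [t for t in query.lower().split() if t not in STOPWORDS]
    if not terms:
        return []
    # Start from the first term's postings and intersect with the rest.
    matches = set(index.get(terms[0], []))
    for term in terms[1:]:
        matches &= set(index.get(term, []))
    return sorted(matches)


print(search(INDEX, "OpenSearch"))     # [1, 2]
print(search(INDEX, "fast search"))    # [1]
print(search(INDEX, "amazon engine"))  # []
```

Each query term costs one dictionary lookup, so the work depends on the number of query terms and the size of their posting lists rather than on scanning every document.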