Inverted Index Demonstration
Sample Documents
Document 1: "OpenSearch is a fast search engine."
Document 2: "REPLY is a Service Delivery Partner of Amazon for OpenSearch."
The Indexing Pipeline
Document → Tokenize → Filter → Index → Search
Document: Raw text with mixed case and punctuation
Tokenize: Lowercase, remove punctuation, and split into words
Filter: Remove common stopwords (is, a, of, for)
Index: Map each term to the documents that contain it for fast lookup
Search: Find documents by looking up query terms in the index
Tokenize: lowercase, remove punctuation, and split into words
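Document 1
"OpenSearch is a fast search engine."
↓ Processing
opensearch is a fast search engine
↓ Tokens
opensearch, is, a, fast, search, engine
Document 2
"REPLY is a Service Delivery Partner of Amazon for OpenSearch."
↓ Processing
reply is a service delivery partner of amazon for opensearch
↓ Tokens
reply, is, a, service, delivery, partner, of, amazon, for, opensearch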
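The tokenize step can be sketched in a few lines of Python. This is a minimal illustration, not how OpenSearch's analyzers are implemented: the function name `tokenize` is hypothetical, and it assumes ASCII punctuation and whitespace-separated words.

```python
import string

def tokenize(text: str) -> list[str]:
    """Lowercase, strip ASCII punctuation, then split on whitespace."""
    # Assumption: only ASCII punctuation needs removing for these samples.
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return cleaned.split()

docs = {
    1: "OpenSearch is a fast search engine.",
    2: "REPLY is a Service Delivery Partner of Amazon for OpenSearch.",
}

# One token list per document ID.
tokens = {doc_id: tokenize(text) for doc_id, text in docs.items()}
for doc_id, terms in tokens.items():
    print(doc_id, terms)
```

Because lowercasing happens before indexing, "OpenSearch" in either document ends up as the same term, `opensearch`.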
Filter stopwords: remove common words
Document 1
"OpenSearch is a fast search engine."
↓ Filtering
opensearch, fast, search, engine
Document 2
"REPLY is a Service Delivery Partner of Amazon for OpenSearch."
↓ Filtering
reply, service, delivery, partner, amazon, opensearch
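Stopword filtering amounts to a set-membership check per token. The sketch below assumes the four stopwords named in this demo (is, a, of, for) and starts from already-tokenized input; `STOPWORDS` and `remove_stopwords` are illustrative names, not part of any library.

```python
# Assumption: the stopword list is just the four words used in this demo.
STOPWORDS = {"is", "a", "of", "for"}

def remove_stopwords(tokens: list[str]) -> list[str]:
    """Keep only tokens that are not stopwords, preserving order."""
    return [t for t in tokens if t not in STOPWORDS]

doc1_tokens = ["opensearch", "is", "a", "fast", "search", "engine"]
doc2_tokens = ["reply", "is", "a", "service", "delivery", "partner",
               "of", "amazon", "for", "opensearch"]

doc1_terms = remove_stopwords(doc1_tokens)
doc2_terms = remove_stopwords(doc2_tokens)
print(doc1_terms)
print(doc2_terms)
```

Using a set keeps each lookup constant-time, which matters once the stopword list grows beyond a handful of words.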
Build inverted index: term → documents
Document 1
"OpenSearch is a fast search engine."
Document 2
"REPLY is a Service Delivery Partner of Amazon for OpenSearch."
↓ Inverted Index
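| Term | Found in Documents |
|---|---|
| amazon | Document 2 |
| delivery | Document 2 |
| engine | Document 1 |
| fast | Document 1 |
| opensearch | Document 1, Document 2 |
| partner | Document 2 |
| reply | Document 2 |
| search | Document 1 |
| service | Document 2 |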
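Building the term → documents mapping can be sketched as a dictionary from each term to the IDs of the documents containing it. This is a simplified in-memory illustration that starts from the filtered terms of the two sample documents; `build_inverted_index` is a hypothetical helper, and a real search engine stores much more per term (positions, frequencies).

```python
from collections import defaultdict

def build_inverted_index(docs: dict[int, list[str]]) -> dict[str, list[int]]:
    """Map each term to the sorted list of document IDs that contain it."""
    postings: dict[str, set[int]] = defaultdict(set)
    for doc_id, terms in docs.items():
        for term in terms:
            postings[term].add(doc_id)
    # Sort terms alphabetically and document IDs numerically for stable output.
    return {term: sorted(ids) for term, ids in sorted(postings.items())}

filtered_docs = {
    1: ["opensearch", "fast", "search", "engine"],
    2: ["reply", "service", "delivery", "partner", "amazon", "opensearch"],
}

index = build_inverted_index(filtered_docs)
for term, doc_ids in index.items():
    print(f"{term}: {doc_ids}")
```

Only `opensearch` appears in both documents, so it is the only term whose list holds two IDs.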
Search in action: filter documents and terms instantly by looking up the query in the index
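Lookup against the index can be sketched as follows. The exact matching rule of the interactive filter is not stated here, so prefix matching in `filter_terms` is an assumption; both function names are hypothetical, and the index literal is the one derived from the two sample documents.

```python
index = {
    "amazon": [2], "delivery": [2], "engine": [1], "fast": [1],
    "opensearch": [1, 2], "partner": [2], "reply": [2],
    "search": [1], "service": [2],
}

def search(index: dict[str, list[int]], query: str) -> list[int]:
    """Exact-term lookup: normalize the query like the indexed terms."""
    return index.get(query.strip().lower(), [])

def filter_terms(index: dict[str, list[int]], typed: str) -> dict[str, list[int]]:
    """Type-to-filter: keep terms starting with the typed text (assumed rule)."""
    prefix = typed.strip().lower()
    return {term: ids for term, ids in index.items() if term.startswith(prefix)}

print(search(index, "OpenSearch"))  # [1, 2]
print(filter_terms(index, "se"))    # {'search': [1], 'service': [2]}
```

The exact lookup is a single dictionary access regardless of how much text was indexed, which is why the index makes search feel instant; the prefix filter in this sketch scans all terms and is fine only for a small vocabulary.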