Let's build a Full-Text Search engine

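This is a repost (see the original article). The original explains how to build a full-text search engine, implemented in Go. I had planned to translate it and write a Java implementation along the way, but the translation came out stiff, so I am posting the original text instead, together with a link to my Java version (Java implementation).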

Full-Text Search is one of those tools people use every day without realizing it. If you ever googled “golang coverage report” or tried to find “indoor wireless camera” on an e-commerce website, you used some kind of full-text search.

Full-Text Search (FTS) is a technique for searching text in a collection of documents. A document can refer to a web page, a newspaper article, an email message, or any structured text.

Today we are going to build our own FTS engine. By the end of this post, we’ll be able to search across millions of documents in less than a millisecond. We’ll start with simple search queries like “give me all documents that contain the word cat” and we’ll extend the engine to support more sophisticated boolean queries.
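To make the simplest query concrete, here is a minimal sketch of “give me all documents that contain the word cat” as a plain linear scan. It is only a baseline for illustration, not the engine this post builds and not a route to the sub-millisecond target: every query reads every document. The names `containsWord` and `search` and the sample documents are assumptions made for this example.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// containsWord reports whether text contains term as a whole word,
// ignoring case. A word is a maximal run of letters and digits.
// (Hypothetical helper, named for this sketch only.)
func containsWord(text, term string) bool {
	words := strings.FieldsFunc(text, func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsNumber(r)
	})
	for _, w := range words {
		if strings.EqualFold(w, term) {
			return true
		}
	}
	return false
}

// search returns the indexes of all documents that contain term.
// It scans every document, so the cost grows with the total text size.
func search(docs []string, term string) []int {
	var hits []int
	for i, doc := range docs {
		if containsWord(doc, term) {
			hits = append(hits, i)
		}
	}
	return hits
}

func main() {
	// Made-up sample documents.
	docs := []string{
		"The cat sat on the mat.",
		"A catalog of wireless cameras.",
		"Dogs and cats rarely agree.",
	}
	fmt.Println(search(docs, "cat")) // prints [0]
}
```

Only the first document matches: “catalog” and “cats” contain the letters c-a-t but are different words. Matching whole words rather than substrings avoids the false hit on “catalog”, but it also misses “cats”.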

Note

The most well-known FTS engine is Lucene (along with Elasticsearch and Solr, which are built on top of it).
