Sunday, December 26, 2010

Semantic Suggestion

The weakest point of a keyword query interface for structured information is vocabulary mismatch: the words users choose differ from the vocabulary the search system accepts. In any given domain there are simply too many natural language expressions for a single concept, and a search system cannot understand all of them.

Even a very restricted domain contains too many concepts, and the search system usually understands only a very small subset of them. As a result, most user queries fail. This phenomenon is called overshooting the capabilities of the search system.

NaverLab Semantic Movie Search provides semantic query suggestion to mitigate the overshooting problem. Semantic auto-completion also helps, but it works only for a single object, whereas semantic suggestion works for multi-keyword queries.

Auto-completion is activated while the user types the first word in the search box. Once the first word is complete and a space is inserted, semantic suggestion is activated. The semantic suggestions are multi-keyword queries that have answers in the database. The suggested queries are not plain text but object queries.
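The switch between the two helpers can be sketched roughly as follows. This is only an illustration in Python, not the NaverLab implementation: the object list and the function names `auto_complete` and `active_helper` are hypothetical, and keeping suggestion active after the second word is an assumption.

```python
# Hypothetical toy list of objects the system knows (not from the original service).
KNOWN_OBJECTS = ["gladiator", "glory", "godfather"]


def auto_complete(prefix):
    """Return known objects that start with the typed prefix (single-object completion)."""
    p = prefix.lower()
    return [obj for obj in KNOWN_OBJECTS if obj.startswith(p)]


def active_helper(text):
    """Decide which helper applies to the current content of the search box."""
    typed = text.lstrip()
    if typed == "":
        return "none"
    # No space yet: the user is still typing the first word.
    if " " not in typed:
        return "auto_completion"
    # A space has been inserted after the first word, so multi-keyword
    # suggestion takes over (assumed to stay active for later words too).
    return "semantic_suggestion"


print(active_helper("glad"))        # still typing the first word
print(active_helper("gladiator "))  # first word completed, space inserted
print(auto_complete("gl"))
```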

The following is a sequence of semantic suggestions starting from "gladiator". When a space is inserted after "gladiator", two-word queries are suggested. Selecting one of the suggested queries and inserting a space brings up three-word queries. Queries of up to five words are suggested.
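The step-by-step extension can be illustrated with a small Python sketch. It is a simplification under stated assumptions: the records, the keywords in them, and the names `has_answer` and `suggest` are all hypothetical, and a query is treated as a plain keyword list rather than the object queries the real system uses. Only the five-word limit and the rule that suggestions must have answers in the database come from the description above.

```python
MAX_QUERY_WORDS = 5  # suggestions stop at five-word queries

# Hypothetical toy database: each record is the set of keywords it can answer for.
RECORDS = [
    {"gladiator", "director", "actor", "2000", "award"},
    {"gladiator", "actor", "soundtrack"},
    {"alien", "director", "1979"},
]


def has_answer(query):
    """A query has an answer if at least one record contains all of its keywords."""
    wanted = set(query)
    return any(wanted <= record for record in RECORDS)


def suggest(query):
    """Extend the query by one keyword, keeping only extensions that have answers."""
    if not query or len(query) >= MAX_QUERY_WORDS or not has_answer(query):
        return []
    candidates = set().union(*RECORDS) - set(query)
    return [query + [word] for word in sorted(candidates) if has_answer(query + [word])]


# Step 1: a space after "gladiator" yields two-word queries.
two_word = suggest(["gladiator"])
print(two_word)
# Step 2: picking one of them and inserting a space yields three-word queries.
print(suggest(["gladiator", "soundtrack"]))
```

Because every suggestion is checked against the records before it is offered, the user cannot be led to a query that overshoots what the toy database can answer.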

Auto-completion
(step 1) Two-word query suggestions
(step 2) Three-word query suggestions
(step 3) Four-word query suggestions
