The search engine

SPIP includes its own search engine, which is deactivated by default. Once an administrator activates it on the configuration page, the engine searches the various types of content stored in the database: articles, sections, news items, keywords and authors.

The general principle

There are two main ways of using a search engine. The first is to simply search the existing storage (HTML files, database, etc. depending on the type of site). The second is to index the content in the database. The indexing solution was used until SPIP 2.0. From SPIP 3.0 onwards, content is searched for directly without any prior indexing.

In the case of SPIP, we are obliged to use PHP and MySQL, as for the rest of the software. This does not allow us to create a very high-performance engine, whether in terms of speed, of relevance, or of various enrichments (indexing of documents outside the site, creation of semantic fields enabling more refined searches, etc.).

The advantage of the internal engine, however, is that it makes it possible to manage the display of the search results using the same methods (templates) as for the rest of the SPIP pages, and to do so within the same visual environment.

The search simply splits the search text into its individual words and applies the same filter as during indexing: words of three letters or fewer (except acronyms) are removed, and transliteration is applied.
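The filtering described above can be sketched in a few lines of Python. This is only an illustration, not SPIP's actual PHP code: the function names are hypothetical, an acronym is assumed to be a word written entirely in upper case, and transliteration is assumed to mean stripping accents and lowering case.

```python
import unicodedata

MIN_LENGTH = 4  # words of three letters or fewer are dropped


def transliterate(word: str) -> str:
    """Strip diacritics so that accented and unaccented spellings compare equal."""
    decomposed = unicodedata.normalize("NFKD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def is_acronym(word: str) -> bool:
    # Assumption: an acronym is a word of 2+ letters written entirely in upper case.
    return len(word) >= 2 and word.isalpha() and word.isupper()


def prepare_query(text: str) -> list[str]:
    """Split a query into words, drop short non-acronyms, transliterate the rest."""
    words = []
    for raw in text.split():
        word = raw.strip(".,;:!?\"'()[]")
        if not word:
            continue
        if len(word) < MIN_LENGTH and not is_acronym(word):
            continue
        words.append(transliterate(word).lower())
    return words
```

With these assumptions, a query such as "la main rouge" is reduced to the two words "main" and "rouge", since "la" is too short and is not an acronym.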

For each piece of content searched, the score of each word is then retrieved, and the scores are added together to give a total score. Finally, the results are generally displayed in decreasing order of score ({!par points}, or equivalently {par points}{inverse}), i.e. in decreasing order of relevance (although this remains at the discretion of whoever codes the search loops and tags).
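As a rough illustration of this summing and sorting step, here is a minimal Python sketch. The score table, its values and the function name are hypothetical and chosen only for the example; SPIP itself computes its points in PHP and SQL.

```python
from collections import defaultdict

# Hypothetical per-word scores: word -> {content id: points}
WORD_SCORES = {
    "main": {1: 8, 2: 3},
    "rouge": {1: 5, 3: 4},
}


def rank(query_words, word_scores):
    """Add up the points of every query word for each content, best total first."""
    totals = defaultdict(int)
    for word in query_words:
        for content_id, points in word_scores.get(word, {}).items():
            totals[content_id] += points
    # Highest total first; ties are broken by content id to keep the order stable.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))
```

Calling rank(["main", "rouge"], WORD_SCORES) puts content 1 first, because it collects points for both words, which mirrors what sorting a loop by decreasing points does in a template.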

The search does not offer Boolean operators; the implicit operator is roughly a logical "OR". However, the articles found are displayed in an order that favours the results containing the most words spelled exactly as in the query. For example, a query on "la main rouge" will highlight articles containing "main" and "rouge", far ahead of articles containing only "maintenance" or "rouget": these will appear, but further down the ranking.
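The behaviour described above (an implicit "OR" in which exact words count for much more than partial matches) can be sketched as follows in Python. The two weights and the function names are assumptions made for the example and are not the values SPIP uses.

```python
import re

EXACT_POINTS = 10   # assumed weight for a whole-word match
PARTIAL_POINTS = 1  # assumed weight for a match inside a longer word


def score(text, query_words):
    """Score a text: exact words earn far more points than partial matches."""
    tokens = re.findall(r"\w+", text.lower())
    total = 0
    for word in query_words:
        for token in tokens:
            if token == word:
                total += EXACT_POINTS
            elif word in token:
                total += PARTIAL_POINTS
    return total


def search(articles, query_words):
    """Implicit OR: keep every article matching at least one word, best first."""
    scored = [(title, score(body, query_words)) for title, body in articles.items()]
    hits = [(title, points) for title, points in scored if points > 0]
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))
```

With the query words "main" and "rouge", an article containing "La main rouge" comes well ahead of one containing only "maintenance" and "rouget", while an article matching neither word is left out altogether.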

Nor does the search engine index the content of files added to articles.

Advanced search engine

The default search engine is rather limited. Fortunately there are alternatives to provide a better search:

  • The Fulltext plug-in, which relies on MySQL’s FULLTEXT mode to provide finer-grained indexing and support for Boolean operators ("+wine -white").
  • The Indexer plug-in, which interfaces SPIP with the Sphinx search engine.

Author Mark Published : Updated : 26/06/23
