The database structure is a fairly simple one. Certain conventions have been implemented that can be readily identified just be reading this document. For example, most SPIP editorial objects are indexed within an auto-incrementing integer, uniformly named to follow the id_object convention, and which is declared as the primary key of its corresponding table.
NB: this article is becoming somewhat dated and may be out-of-date in respect of certain fields and tables. It should not be considered as a reference guide, but simply as assistance towards understanding the underlying operations of SPIP. If you feel inclined to contribute to the documentation project by updating, please feel free to do so!
- Each section is identified by its unique id_rubrique.
- id_parent is the id_rubrique of the section that is the parent of this current section (or zero if the section is located at the root level of the site hierarchy).
- titre, descriptif, texte are used for the title, description and text for the section.
- id_secteur is the id_rubrique of the section at the top of the section hierarchy contains this section. A section belongs to another section which belongs to another section... up until you reach a section located at the site root; it is this last section that determines the id_secteur. This pre-calculated value makes it possible to speed up certain calculations for the public space (in fact, SPIP’s news items are classified only by their sector, and not by their position further down in the section hierarchy).
- maj is an internal field automatically updated by MySQL which contains the last date that the record was updated in the table.
- export, id_import are fields reserved for future functionalities.
- Each article is identified by its unique id_article.
- id_rubrique indicates which section the article belongs to.
- id_secteur indicates which sector that section belongs to (refer to the previous discussion to explain the difference between section and sector).
- titre, surtitre, soustitre, descriptif, chapo, texte, ps contain the article’s title, top-title, sub-title, description, stand first text, body text and post-scriptum text.
- date is the article’s publication date (if the article is not yet published, then this field shows the creation date).
- date_redac is the date of earlier publishing if you have enabled this field for your site, otherwise it is equal to "0000-00-00".
- statut is the current status of the article : prepa (being prepared (edited)), prop (proposed for publication), publie (published), refuse (refused), poubelle (in the dustbin).
- accepter_forum: is used to manually control if that article accepts forum messages (by default set to yes).
- maj: same meaning as for the sections table (date last updated).
- export is a field reserved for future functionalities.
- images is a field containing the list of images used by the article in a special format. This field is automatically generated by spip_image.php3.
- visites and referers are used by the statistics about the site’s articles. The first is the number of times that article has been viewed in the public space; the second contains an extract of the hashes from the different referrers as a way to find out the number of distinct referrers. See inc-stats.php3.
- each author is identified by its unique id_auteur.
- nom, bio, nom_site, url_site, pgp are used respectively for the author’s name, a short biography, an email address, a web site name and URL and the author’s private PGP key. All of which can be modified freely by the authors themselves.
- email, login are the author’s subscription email address and login ID. These are only modifiable by an administrator.
- pass is the MD5 hash of the author’s password.
- htpass is the encrypted value (i.e. generated by crypt()) of the password for use by .htpasswd.
- statut is the author’s status: 0minirezo (administrator), 1comite (editor), 5poubelle (in the dustbin), 6forum (subscribed to forums, when those are being managed in "by subscription" mode).
- maj has the same meaning as for the previous tables above.
News items: spip_breves
- Each news item is identified by its unique id_breve.
- id_rubrique is the section 8actually, the sector) that the nes item is classified under.
- titre, texte, lien_titre, lien_url are used for the title, text, and link title and URL associated with the news item.
- date_heure is the date of the news item.
- statut is the status of the news item: prop (proposed for publication), publie (published), refuse (refused).
- maj: the same as for the previous tables above.
- Each keywords is identified by its unique id_mot.
- The type of the keyword is the type, or group, that has been chosen for that keyword. By defining multiple types, we are able to define several mutually independent classifications (e.g. "subject", "era", "country", "source"...).
- titre, descriptif, texte are used for the title, description and text of the keyword.
- maj: same as for the previous tables above.
Syndicated sites: spip_syndic
- each syndicated site is identified by its unique id_syndic.
- id_rubrique and id_secteur define the location in the hierarchy for where the site’s contents will be inserted.
- nom_site, url_site, descriptif are the name, URL address and description of the syndicated site.
- url_syndic is the URL address of the dynamic file used to retrieve syndicated content (often this is just the url_site followed by backend.php3).
Syndicated articles: spip_syndic_articles
- Each syndicated article is identified by its unique id_syndic_article.
- id_syndic refers to the syndicated site where the article came from.
- titre, url, date, lesauteurs are used for the title, URL, date and authors of the syndicated article.
Forum messages: spip_forum
- Each forum message is identified by its unique id_forum.
- The object to which the forum is attached is provided by its corresponding id_rubrique, id_article or id_breve. By default, all of these values are zero.
- The parent message (being the message to which this current message is responding to) is identified by id_parent. If the message is not responding to any previous message (a first level forum question), then this value is equal to zero.
- titre, texte, nom_site, url_site are used for the title, text, and name and URL address of the attached link.
- auteur and email_auteur are the name and email address declared by the author. For subscribed forums, these fields are not forcibly the same as the data stored in the author’s record (i.e. in the spip_auteurs table).
- id_auteur identified the message author for subscription forums.
- statut is the message status: publie (published and visible in the public zone), prive (a response to an article in the private space), privrac (written in the internal forums in the private space), off (deleted or to be validated, depending on whether forum moderation is set to prior or post moderation modes).
- ip is the author’s IP address for the public forums.
- maj has the same meaning as for the other tables above.
- id_article identifies the article to which the petition is associated (only one petition per article).
- email_unique, site_obli, site_unique, message are boolean variables that define the configuration of the petition: the email address of signatories must be unique, the Web site address is compulsory, it must be unique, a message accompanying signatures is authorised (yes or no).
- texte is the text of the petition.
- maj has the same meaning as for the other tables above.
Petition signatures: spip_signatures
- Each signature is identified by its unique id_signature.
- id_article identifies the article to which the petition is attached.
- nom_email, ad_email, nom_site, url_site are the name, email address and web site as declared by the signatory.
- message is a message that may have been entered by the signatory.
- statut is the status of the signature: publie (accepted), poubelle (deleted); any other value provides the validation key that is used for the email confirmation.
- maj has the same meaning as for the previous tables.
Relationships between SPIP objects
These tables do not generate any content, but are used to map the relationships between the other objects in the other tables.
- spip_auteurs_articles specifies the relationship between authors and articles. If an id_auteur is linked to an id_article, this means that the author in question either wrote or co-wrote that article (there can be several authors for a single article, and of course an author may write several articles).
- spip_mots_articles defines the referencing relationships between articles and keywords.
The spip_meta table is a critical key component. It contains data pairs (nom< (name)/tt>, valeur (value)) indexed by the name as primary key; these pairs are used to store various data such as configuration parameters of the site, the installed version number, etc.
The spip_forum_cache table is used to modify the cache system to accommodate the immediacy of the forums. For each cache file giving rise to a query on the spip_forum table, the spip_forum_cache table will store the query parameters (article, section, news items and possible forum parent message). When a message is posted, the message parameters are compared with those already existing in spip_forum_cache, and for each match found the cache file specified in the table is erased. In this way, messages do not require the ordinary recalculation of the page into which they are inserted in order to appear on the public site.
Indexing (search engine)
Six tables are used by the search engine. They are divided into two categories.
The indexing dictionary: spip_index_dico
Every word encountered during indexing is stored in this table, as well as the first 64 bits of its MD5 hash. This is the word that works as the primary key, thereby making it possible to perform very fast enquiries on the start of a word; we then retrieve the hash(s) matching the query in order to perform the real search in the indexing tables.
The other indexing tables: spip_index_*
These five tables manage indexing for the different types of objects: articles, sections, news items, authors and keywords. One record per word and object is stored. Each record contains the word’s hash (actually only the most significant 64 bits of the MD5 hash), the identifier of the indexed object (e.g. the id_article for an article), and the number of points associated with the indexing of that word for the object. This number of points is calculated on the number of times that words appears weighted by the field that contains those appearances: one occurrence in the title of an article assigns more points than an occurrence in the text body of that article.
The indexing mechanism is explained in more detail in this article: The search engine.