Databases within SPIP

Everything that SPIP does with databases

SPIP can be seen as a tool for the presentation in a MIME format (HTML, RSS, ICS ...) of data extracted from a database. One of the new features in SPIP 2.0 is the ability to combine on a single page information originating from several SQL databases, accessed from various different servers and quite separate from each other in ass respects. SPIP takes charge of managing the various connections and table declarations without imposing any programming requirement on its users. This article aims to describe all of the database operations that are offered in this latest release of SPIP.

Minimum installation

First up, we’ll look at the all the possible scenarios for installing SPIP with a single database.

During the installation of SPIP, it will test the local PHP environment and offer, wherever possible, a choice from several types of SQL database server (currently MySQL, PostgreSQL or SQLite), which all offer the same functionalities for SPIP’s use. At this stage, it is also necessary to supply the Internet address of the SQL server, a connection identifier for that server and its associated password. These data are generally entered on the installation set-up form. However, if you want to implement Multi-site provisioning with a shared SPIP kernel, you can specify one or more of these values in the configuration file called mes_options.php. SPIP will then not ask for them, which will mean they don’t need to be entered and you will not be obliged to provide them to others who use your shared server installation. Here are the PHP constants to be defined to achieve this:

_INSTALL_SERVER_DB type of SQL server (Mysql or PG; character case is unimportant)
_INSTALL_HOST_DB Internet name of the SQL server (e.g.: localhost)
_INSTALL_USER_DB an SQL server user name
_INSTALL_PASS_DB the corresponding password

SPIP then connects to the selected server using the connection identifiers. When it succeeds, it checks to see if the user has the right to create databases, or only the right to view those that already exist and select one of them, or perhaps only has the right to use a database with the same name as the login ID. You can just as easily specify beforehand in the mes_options.php file what the name of the database is that should be used by declaring the PHP constant:

_INSTALL_NAME_DB database name

SPIP then continues with the installation by creating its SQL tables (or for re-installations, checking that those that exist are usable). These tables all start with a table prefix (by default equal to spip_) and may be specified in the mes_options.php file in two ways:

the global variable $table_prefix
the constant _INSTALL_TABLE_PREFIX

This table prefix is what makes it possible to write abbreviated loops in the templates such as:

<BOUCLE1(ARTICLES)....

where the SQl name of the table is in fact spip_articles.

SPIP next requests to create the first user of the site by supplying the name and password (unrelated to those for the SQL server) or by offering to delegate the authentication tasks to an LDAP server. After this declaration, the minimum installation procedure has finished, and the next page displayed if the normal back-end login page which will request those same identifiers as have just been declared. Those connection credentials have been saved into a file which by default it called config/connect.php.

Installation supplements

Now we illustrate how to specify to SPIP that it should use several databases, either on the same server or on different servers, and even on different physical servers.

Once the site has been installed, select the sub-menu option site maintenance from the configuration menu shown at the top of the private zone. This will bring up a page with three tabs; click on the right-hand one called Declare another database. You will then be offered a form almost identical to the one that appeared during installation: it asks you to specify the SQL server type, its Internet address, a valid username and its password. If this connection is successful, SPIP will ask for name of a database, listing those that already exist if it can do so. IF the nominated database exists, it will create an additional connection file which will be stored by default in the config directory and will be given the same name as the specified database.

When completed, SPIP returns back again to this form, and lists the existing connection files, otherwise known as the accessible databases. SPIP then offers to declare yet another additional database if you want one: there are no limits on the total number of declared databases.

It is useful to declare these databases here because it enables you to use the standard template notation for formatting data from these supplementary databases. Suppose there is a connection file called B.php, then you can access a database on that server for a table called T with the following template:

<BOUCLE1(B:T) />#TOTAL_BOUCLE<//B1>

which will return the total number of records in that table. In addition, SPIP requests the SQL server to describe that table for its use, which allows SPIP to find out what all of its fields are and also possible its primary key. If the loop T has a primary key called id and a field called fieldname, then the template:

<BOUCLE1(B:T){id}>#FIELDNAME</BOUCLE1>

will return the fieldname value for the unique row matching the id provided in the template’s context.

Lastly, it is possible to apply a template by requesting all of its loops to use another database than the default one, by adding into the page’s URL parameters one for the connect variable specifying the desired database name. For example http://mysite.com?connect=another_site will apply the standard sommaire (home page) template for the mysite.com site on to the database specified by the config/another_site.php connection file stored in the installation directory for my_site.com.

It is recommended to respect the case of the connection file name whenever you use it in a loop or as a connect parameter: there is no uppercase / lowercase conversion applied by SPIP (but note that some file systems will do this by themselves).

Remarks: Accessing supplementary databases is exclusively in read-only mode; in particular, if a database is a forum for example (whether administered by a SPIP system or not), it will remain impossible to post into this forum other than through its own original site. Removing this restriction is currently a topic under review.

Mixed installations

The template segments above therefore rely on the existence of a connection file called B.php. An important point to note is that the main connection file and those of the supplementary databases all have the same format, which specifically enables one SPIP site to directly use the connection file from another SPIP site in order to view its tables. One method of proceeding is to simply copy the standard connect.php file from site B and save it as B.php in the config directory of site A.

A more intelligent method, for sites that share the same SPIP installation as well, is to have a single config directory for all of the sites, and to name the connection file for site A not as connect.php but as A.php, and the same for site B etc. Setting things up this way means that as soon as a site installed in SPIP, it becomes known as a supplementary database for all of the other sites that share the same SPIP installation, and it will be able to view all of the other sites that already use the same config directory as it does as just so many other available databases. To arrive at this operational optimum using implicit declarations, you must take great care to assign the correct values for the _DIR_CONFIG and _FILE_CONNECT_INS constants. Lower down in this article there is an example of a shared kernel installation offering this functionality as well as several others.

For a site installed on a remote SQL server, just copying its connection file to the local site may be sufficient in theory. However, many ISP hosts refuse to allow their SQL servers to be accessed by machines outside of their local network. You must also consider that often the "name" indicated for the SQL server upon installation is localhost, which is a relative address to the HTTP server, whereas a remote server will require an absolute address. In summary, SPIP can operate in such a network architecture, but a certain minimum knowledge, and possibly access authorisations, are required for the sub-network topology employed is almost certainly indispensable.

Supplementary database under SPIP

Whenever a connection file is one for another site running under SPIP (whether it be a copy or the original accessible with shared installations), it contains the indication that that site is running under SPIP (the global variable spip_connect_version has a value within it). In this case, the templates of the main site will be applied with a special behaviour:

-  loops with abbreviated names will also be interpreted as abbreviated in the supplementary database, viz: <BOUCLE1(B:ARTICLES)... (or <BOUCLE1(ARTICLES)... with a URL specifying connect=B.) will reference the spip_articles table in database B, or more specifically the prefixarticles table where prefix is that used by site B (this prefix is specified in the connection file);

-  inside a loop, the #URL_ tags (refer to Using personalised URLs) will use the URL type of the principal site, and not that of the remote site, and the URL generated will reference the principal site with connect=remote_site; this strategy makes it possible to navigate through the remote site database without needing to change from the templates specified for the principal site, i.e. to test the appearance that this set of template files and its type of URL create on the remote site without modifying the installation of that remote site or working with a redundant copy of it.

-  SPIP’s typographical shortcuts for #URL tags that ma appear in the remote database fields are also interpreted in this same manner: [->art3681] will correctly reference article 3681 on the remote database, but will do so using a URL that will supply the page formatting provided by the template files for the principal site.

This way of working therefore makes it possible to present other SPIP databases without even having to write any templates for its pages, since the table name of the local database are the same as those of the remote site. On the other hand, a database that is not run under SPIP needs formatting templates that explicitly reference the tables of that other database and its table fields. To respond to this need, the site administrators use a special procedure: when they enter into their browser the URL of their site followed by ?page=table:table where table is the name of a database table, SPIP will automatically construct a special template for that table, allowing you to examine its contents is a very easy to use interface. The template generates is displayable using the squelette (template) link at the bottom of the page. It is then possible to copy it with the mouse (click on the top of column to high the line numbering) and to them improve on it by using a suitable code editor.

Note that this automated production from SPIP allows you to (just) operate without any predefined template.

Back-ups and Merges

Using the Site maintenance sub-menu, SPIP has always provided access to two tools for backing up and restoring the local database (see Backing up your data). Since SPIP 2.0, it has been possible to install a backup made from an old site into the SPIP installation with a higher version number; however, such a scenario requires memory space which is not always made available, and it is also recommended to try updating the original site, then to make a new back-up, and then finally to re-import it on the new site.

On the other hand, up until version 1.8, a back-up was a total affair, just like the restore. After several attempts based on the status of a restricted administrator, which finally ended up being inappropriate, SPIP 2.0 now offers a new functionality that is in closer harmony with partial back-ups and merge restorations.

The back-up form offers to restrict the backup to a given section. The file produced will contain the articles, sites, news items and sub-sections contained within the section indicated, as well as the keywords and keyword groups used by all of these records. The backup name will by default by the name of the section, but it can be overridden. Creating the backup files may take such a long time that the HTTP server disconnects you for appearing to be disconnected for to long an inactive period. Simply reload the page: SPIP will find out where it was up to and will continue on with its task. The large number of files that you might temporarily see appear in a sub-directory of the tmp directory is quite normal, since the are linked to such a restart situation being anticipated as a result of disconnection.

Quite symmetrically, the restore form now offers two new functionalities. The first is the standard replacement of the current database with the backed up database. The second consists of considering the backup database as an incomplete database that can be added to the already installed database. As there may be conflicts between the installed database and the incomplete database, as regards the numbering of its tables (for articles, news items etc), SPIP starts by renumbering the records of the incomplete database by assigning them numbers immediately following the highest such numbers used in the installed database. In a second pass, SPIP effectively imports the renumbered records, but during the first pass, it compared the import records with those of the installed database and ignores any duplicates as defined by the following rules:

-  if there exists in the installed database a group of keywords with the same name as a group of keywords to be imported, then that group is ignored;

-  if a keyword of a keyword group that was not imported has the same name in a keywords group that has the same name as its original group, then that keyword is ignored;

-  if there exists in the installed database a sector section that has the same name as a sector section to be imported, then that section is ignored;

-  if a section which belongs to a parent section that was not imported has the same name as its parent, then that sub-section is ignored; the same applies to any articles, news items and sites referenced in that section;

-  if there exists in the installed database a document that has the same name as a document to be imported and they have the same size, then that document is ignored.

Summarising all that more succinctly, the merge operation is the union of the two databases, with the content of the installed base taking priority over the imported one as regards duplicates (for example, for two sectors with the same name, the texte and descriptif fields of the already installed database will be retained and the others will be ignored).

As regards attached documents, these will be imported as remote documents, which dispenses with listing and copying the section of the IMG/ directory concerned by the partial backup. We can then use, for each document, the local copy button to liberate ourselves from the original site. If that one has already disappeared or if it has access restrictions that can not be overcome by the receiving site, then the original IMG/ directory will need to be installed on an accessible URL, and then provided in the last data field of the import form.

We note the possibility of reducing any status values of publisheddown to proposed, which makes it possible to apply a case-by-case publication policy despite the mass import.

A complete example

The shared kernel installation of SPIP shown below makes it possible to have just a single (_DIR_CONNECT) configuration directory and a single (_DIR_DUMP) backup directory. Each site can then provide previews of the databases of each of the other sites, and can use their backups for merging with its own databases. There is also only a single directory for the (_DIR_AIDE) online help directory, since SPIP imports it step by step from the spipnet server, and a single directory for the Document Type Definitions that The integrated XML validator will search in on the W3C site and other (_DIR_XML).

The access and logging files are also collected into just two (_DIR_CHMOD and _DIR_LOG) directories.

Therefore also created at the root directory will be: config/connect, config/chmod, tmp/log, tmp/dump, tmp/xml and tmp/aide. The first one will contain A.php for connection to database A, B.php etc.

if ( preg_match(',/([a-zA-Z0-9_-]*)[/?],',$_SERVER['REQUEST_URI'],$r)) {
	if (is_dir($e = _DIR_RACINE . 'Ajouts/' . $r[1]. '/')) {
		$cookie_prefix = $table_prefix = $r[1];

		define('_SPIP_PATH', 
		_DIR_RACINE. 'Ajouts/' . $table_prefix  . '/dist/:' .
		_DIR_RACINE .'Ajouts/' . $table_prefix  . '/:' .
		_DIR_RACINE .'dist/:' .
		_DIR_RACINE .'dist/javascript/:' .
		_DIR_RESTREINT);

		$pi = $e . _NOM_PERMANENTS_INACCESSIBLES;
		$pa = $e . _NOM_PERMANENTS_ACCESSIBLES;
		$ti = $e . _NOM_TEMPORAIRES_INACCESSIBLES;
		$ta = $e . _NOM_TEMPORAIRES_ACCESSIBLES;

		$pig = _DIR_RACINE . _NOM_PERMANENTS_INACCESSIBLES;
		$tig = _DIR_RACINE . _NOM_TEMPORAIRES_INACCESSIBLES;

		define('_DIR_DUMP', $tig . 'dump/');
		define('_DIR_AIDE', $tig . 'aide/');
		define('_DIR_CACHE_XML', $tig . "xml/");
		define('_DIR_LOG', $tig . 'log/');
		define('_DIR_CONNECT', $pig . 'connect/');
		define('_DIR_CHMOD', $pig . 'chmod/');
		define('_FILE_CONNECT_INS', $table_prefix);
		define('_FILE_CHMOD_INS', $table_prefix);
		define('_FILE_LOG_SUFFIX',
			 '_' . $table_prefix . '.log');

		$GLOBALS['test_dirs'] =
		  array($pa, $ta, $ti, $pig, $tig,
		_DIR_DUMP, _DIR_LOG, _DIR_CONNECT, _DIR_CHMOD);

		spip_initialisation($pi, $pa, $ti, $ta);
	}
)

Historical note. In its earliest versions, SPIP only provided access to certain fields in a single database (which it created upon its first installation). All the same, it was possible to simulate several SPIP databases with just one (but which did not recognise the existence of each other) by prefixing the table names of each simulated database with a specific prefix.

SPIP 1.8, 1.8.1 and the new template compiler opened access to all the fields of all the databases accessible through the HTTP server. But this possibility would require writing PHP files that imitated the standard files, a step which was far from intuitive. In addition, SPIP did not distinguish between the ideas of server types and databases: in order to address several databases on the same server, it was necessary to duplicate these files and modify them. This architecture was far from satisfactory and explains why these functionalities were never actually documented, even though several SPIP extensions made liberal use of them on the quiet.

Author Mark Published : Updated : 10/07/23

Translations : català, English, français, Türkçe