Woogle4MediaWiki/Admin manual

From TeamWeaverWiki

Revision as of 10:10, 28 April 2010

Installation

Before starting, it is important to understand that Woogle4MediaWiki can be run in two different modes:

  • WoogleNative - which is purely PHP-based - you can only search and index your Wiki as such, but no data outside the Wiki
  • WoogleRemote - which connects to an Integrated Search backend - you can search any kind of data indexed by the backend (including e.g. file systems, SVN resources etc.)

If you are in a hurry, there is a less elaborate quick install guide for WoogleNative.

Prerequisites

  • Server
    • Woogle4MediaWiki requires MediaWiki in a version >= 1.11.0 (see also compatibility notes)
    • Note: If you are using PHP 5.3.0 or later, compatibility issues arise with older versions of MediaWiki (< 1.14.1) and Woogle (< 1.0-RC2). Make sure to work with a recent version.
    • For using WoogleNative,
      • we strongly recommend installing the PCRE and mbstring extensions in your PHP runtime environment (although they are not strictly necessary)
      • you will need a database user with CREATE TABLE privileges (or the database root account)
    • For WoogleRemote you need an installed Integrated Search backend, and some configuration data (backend URL, repoId, pushIndexAuthKey; c.f. for push indexing). A network connection from the MediaWiki server to the Integrated Search backend is required.
  • Client
    • Woogle will most probably work with any browser which is supported by MediaWiki itself (see also compatibility notes)
    • Some advanced features of Woogle (autocomplete, red link popups, result annotation) make use of Ajax and might not work with old browsers. However, the Woogle core features should not be affected by this.

Installation steps

  • Download the appropriate distribution file. There are distributions for WoogleRemote, WoogleNative and one including both.
  • Extract the content of the Woogle distribution ZIP-file to [Your MediaWiki directory]/extensions/Woogle
  • Grant write permissions for the executing user on [Your MediaWiki directory]/extensions/Woogle/logs and on [Your MediaWiki directory]/extensions/Woogle/addons/Native/index, if you are using the Native addon
  • At the end of the file [Your MediaWiki directory]/LocalSettings.php add the following line: require_once("$IP/extensions/Woogle/Woogle.php");
  • Proceed with configuration
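
The file-system steps above can also be scripted. The following is a minimal sketch, assuming a POSIX shell, that the ZIP file has already been extracted, and that the script runs as the user who executes MediaWiki (typically the web server user). The function name woogle_post_extract is hypothetical and not part of Woogle, and creating missing directories is an assumption that goes beyond the steps listed above.

```shell
# Hypothetical helper, not shipped with Woogle.
# Usage: woogle_post_extract /path/to/mediawiki
woogle_post_extract() {
    mw_dir="$1"
    woogle_dir="$mw_dir/extensions/Woogle"
    settings="$mw_dir/LocalSettings.php"
    include_line='require_once("$IP/extensions/Woogle/Woogle.php");'

    # Refuse to continue if this does not look like a MediaWiki directory.
    [ -f "$settings" ] || return 1

    # Assumption: create the directories if the ZIP did not contain them.
    mkdir -p "$woogle_dir/logs" "$woogle_dir/addons/Native/index" || return 1

    # Grant the owning user write access
    # (the index directory is only needed for the Native addon).
    chmod -R u+w "$woogle_dir/logs" "$woogle_dir/addons/Native/index" || return 1

    # Append the include line only if it is not already present.
    if ! grep -qF "$include_line" "$settings"; then
        printf '%s\n' "$include_line" >> "$settings"
    fi
}
```

Calling the function twice leaves a single include line in LocalSettings.php, so it is safe to re-run after an upgrade.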

Configuration

General configuration

To configure Woogle, add a statement WoogleConfig::set('parameter', 'value'); to LocalSettings.php for each configuration parameter you want to change. These statements have to be below the inclusion of Woogle.php.

The default values plus description of all parameters can be found in [Your MediaWiki directory]/extensions/Woogle/includes/WoogleConfig.php. Do not modify any values there. Alternatively, you can see all configuration values on the special page [MediaWiki URL]/index.php/Special:WoogleConfig (e.g. Special:WoogleConfig).

Open the file [Your MediaWiki directory]/extensions/Woogle/WoogleConfig.php and scroll down to the WoogleConfig class to configure Woogle. You can call [MediaWiki URL]/Special:WoogleConfig (which requires a WikiSysop user) in your browser to see configured values at runtime.

Example snippet for LocalSettings.php with most important settings:

require_once("$IP/extensions/Woogle/Woogle.php");

// Optional Woogle Configuration
//WoogleConfig::set('core', false);					// set false to completely deactivate Woogle
//WoogleConfig::set('replace', false);				// set false to use MediaWiki built-in search from MediaWiki search box (Woogle is only used when directly called from Special:Woogle then)
//WoogleConfig::set('advanced', false);				// set false to disable Woogle:-Namespace and thus enable Special:Woogle only
//WoogleConfig::set('redLinkInfo', false);			// set false to disable JavaScript popups for red links
//WoogleConfig::set('limitAccessToGroups', true);		// set true to restrict Woogle usage to groups defined in $groups below
//WoogleConfig::set('groups', array('sysop'));		// groups access is limited to if $limitAccessToGroups
//WoogleConfig::set('clickTracking', false);			// set false to avoid tracking of user's result clicks (for statistical purposes)

Woogle creates additional MediaWiki namespaces. If you are using other MediaWiki extensions which define namespaces (such as Semantic MediaWiki), you have to set a starting index of free namespace IDs for Woogle, e.g. by setting $wooNamespaceIndex = 110. The implicit default value is 100.


Each value is documented in the comment to its right. Typically, there is no need to change most of these values.

For WoogleRemote, you need to make additional settings (see below).

Specific settings for WoogleRemote

Default settings are documented in [Your MediaWiki directory]/extensions/Woogle/addons/Remote.php. As with the general settings, do not edit this file directly, but add statements to LocalSettings.php.

Example snippet for LocalSettings.php with most important settings:

require_once("$IP/extensions/Woogle/Woogle.php");

// Woogle Configuration
WoogleConfig::set('remoteBaseServiceUrl', 'http://octopus13.fzi.de:9999/teamweaverIS-backend/services/'); // set the respective backend URL here
WoogleConfig::set('remoteAuthKey', 'secret');  // key for querying, only required if the backend is set to securityEnabled = true;
WoogleConfig::set('indexGroups', array('group1', 'group2'));
WoogleConfig::set('indexRepository', '123'); // numeric id for indexing (repoId) - as configured in the backend repo_config.xml
WoogleConfig::set('remotePushIndexAuthKey', 'secret2'); // key for indexing - as configured in the backend repo_config.xml

As you can see, some parameters depend on configuration choices related to the integrated search backend configuration (see also repo_config.xml).

Final steps

  • Open MediaWiki and see if the system runs properly (if problems occur - see the "dealing with problems" section below!).
  • Open [MediaWiki URL]/index.php/Special:WoogleConfig in your browser (which requires a WikiSysop user) and check the "status" section on top
    • For WoogleNative, click the button "Create database tables". If Woogle reports "You need db root access to create tables", your MediaWiki database user does not seem to have CREATE TABLE privileges (see prerequisites). You may either grant these rights via the database interface, or follow "Database administration via script" (see the Additional topics section below)
    • If everything looks fine on the "status" section, you are ready to create the search index for the existing pages (afterwards, all page changes will update the index automatically)
      • Select "(Re-)Create Index (Direct)" to create the index immediately, within one HTTP session. For large Wikis, this will block your browser and probably even result in a script timeout (depending on your server configuration)
      • Thus you may alternatively select "(Re-)Create Index (Jobs)", which will create tasks in your MediaWiki job queue for indexing each page. The job queue usually (depending on your MediaWiki configuration) processes a handful of jobs with each user request. Thus, for a large Wiki, it will take considerable time until all indexing jobs have been processed. If your Wiki has low traffic, it might take even longer. You can already search during this process, but you might not receive full results. See "Indexing via script" under "Additional topics" on how to do batch indexing.
      • For larger wikis (> 500 pages) we strongly recommend using the command line script php extensions/Woogle/maintenance/WoogleReIndexAll.php [options...] for index creation! (see below for details)
  • Open [MediaWiki URL]/index.php/Special:Woogle in your browser to check if everything works

Uninstalling Woogle

  • Note that Woogle can be disabled by any user individually via his/her MediaWiki user preferences (Special:Preferences).
  • Woogle can be deactivated by adding WoogleConfig::set('core', false); or by commenting out all Woogle statements in LocalSettings.php. This will do no harm - the MediaWiki system will immediately fall back to the built-in search.
  • If you want to physically remove Woogle, do the following:
    • For WoogleNative, visit [MediaWiki URL]/index.php/Special:WoogleConfig (which requires a WikiSysop user) and click the button "Remove database tables". If you do not have sufficient permissions to do so, try "Database administration via script" (see the Additional topics section below)
    • Comment out or remove all Woogle statements in LocalSettings.php
    • Remove Woogle directory from [Your MediaWiki directory]/extensions/
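
Commenting out the Woogle statements can be done by hand or with a small script. The sketch below is one possible approach, assuming a POSIX shell with awk. The function name woogle_comment_out is hypothetical and not part of Woogle, and the script only handles statements that fit on a single line, so review the result before removing the extension directory.

```shell
# Hypothetical helper, not shipped with Woogle.
# Usage: woogle_comment_out /path/to/mediawiki/LocalSettings.php
woogle_comment_out() {
    settings="$1"
    [ -f "$settings" ] || return 1

    # Keep a backup so the change can be reverted.
    cp "$settings" "$settings.bak" || return 1

    # Prefix every line that mentions Woogle with "# " (a valid PHP comment),
    # unless the line is already commented out with "#" or "//".
    awk '
        /Woogle/ && $0 !~ /^[ \t]*(#|\/\/)/ { print "# " $0; next }
        { print }
    ' "$settings.bak" > "$settings"
}
```

The original file is kept as LocalSettings.php.bak; copying it back restores the previous configuration.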

Additional topics

Dealing with problems

If problems occur, consult the FAQ first.

If you see a blank page after enabling Woogle, adding the following code to your LocalSettings.php might help to track down the problem:

# Error reporting (only useful for debugging)
error_reporting( E_ALL );
ini_set( 'display_errors', 1 );
$wgShowExceptionDetails = true; 
$wgShowSQLErrors = true;

Database administration via script

If your MediaWiki database user does not have sufficient privileges to add/delete tables, you have to use the command line script instead of conveniently using the buttons at Special:WoogleConfig.

To do so, you need to configure a database user with suitable privileges in the file AdminSettings.php in the root of your MediaWiki directory (cf. [1] and [2]).

Afterwards, you may call php extensions/Woogle/addons/Native/maintenance/WoogleNativeDb_setup.php [options...], where the options are:

  • --user <dbuser> Database user account to use for changing DB layout. If not set, the credentials in AdminSettings.php are used.
  • --password <dbpassword> Password for user account to use. (Instead of custom password.)
  • --delete Delete Woogle database tables. If not selected, this script tries to create them.
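
As an illustration, the options above can be combined in a small wrapper. This is a sketch only, assuming a POSIX shell and that it is run from the MediaWiki root directory. The function name woogle_db_setup and the DRY_RUN variable are hypothetical conveniences, not part of Woogle; the script path and the three options are the ones listed above.

```shell
# Hypothetical wrapper, not shipped with Woogle.
# Usage: woogle_db_setup create|delete [dbuser] [dbpassword]
# Set DRY_RUN=1 to print the command instead of executing it.
woogle_db_setup() {
    action="$1"
    db_user="$2"
    db_password="$3"

    set -- php extensions/Woogle/addons/Native/maintenance/WoogleNativeDb_setup.php

    # Without --user, the script uses the credentials in AdminSettings.php.
    if [ -n "$db_user" ]; then
        set -- "$@" --user "$db_user" --password "$db_password"
    fi

    # Without --delete, the script tries to create the tables.
    if [ "$action" = "delete" ]; then
        set -- "$@" --delete
    fi

    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

For example, woogle_db_setup create uses the AdminSettings.php credentials, while woogle_db_setup delete root secret removes the tables with an explicit account. Note that a password passed on the command line may be visible to other users of the machine in the process list.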

Indexing via script

Especially for larger Wikis, it makes sense to run indexing tasks from the command line instead of the web browser. Two scripts can help with this task:

  • php extensions/Woogle/maintenance/WoogleReIndexAll.php [options...] is our recommended approach, since it will rebuild the index without any further prerequisites. The following options can be set:
    • -f to force indexing without 10 sec waiting delay
    • --server=http://www.mywikihost.org - server name (e.g. http://www.mywikihost.org) - without trailing slashes - to prevent Woogle from storing Wiki page URLs with http://localhost in the index. Alternatively set $wgServer explicitly in LocalSettings.php (cf. [3]).
    • --conf=../../../LocalSettings2.php - this is only necessary (but then particularly useful) if you would like to create an index without touching the productive Wiki system (which is configured in LocalSettings.php). You may just make a copy of LocalSettings.php and only include Woogle in this configuration.
  • If you chose to "index as jobs" in Special:WoogleConfig, Woogle will add small indexing jobs (one job for each Wiki page) to the MediaWiki job system. In a large Wiki, this might produce a huge job queue which takes a while to fill the index, since in MediaWiki, a small fraction of jobs is processed with every user request. However, MediaWiki provides a script /maintenance/runJobs.php to process the job queue from the command line. Follow the MediaWiki manual concerning this option.
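
A typical re-indexing call can be wrapped as follows. This is a minimal sketch, assuming a POSIX shell and that it is run from the MediaWiki root directory. The function name woogle_reindex and the DRY_RUN variable are hypothetical and not part of Woogle; the script path and the options -f, --server and --conf are the ones described above.

```shell
# Hypothetical wrapper, not shipped with Woogle.
# Usage: woogle_reindex <server-url> [alternative-settings-file]
# Set DRY_RUN=1 to print the command instead of executing it.
woogle_reindex() {
    # Strip trailing slashes, since --server expects none.
    server=$(printf '%s' "$1" | sed 's:/*$::')
    conf="$2"

    # -f skips the 10 second waiting delay.
    set -- php extensions/Woogle/maintenance/WoogleReIndexAll.php -f "--server=$server"

    # Optionally index with a separate settings file (e.g. a copy of LocalSettings.php).
    if [ -n "$conf" ]; then
        set -- "$@" "--conf=$conf"
    fi

    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

For example, woogle_reindex http://www.mywikihost.org/ rebuilds the index with page URLs based on http://www.mywikihost.org. If you used the job-based indexing instead, the equivalent batch step is to run MediaWiki's maintenance/runJobs.php from the same directory, as described in the MediaWiki manual.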

Modify or translate UI text

Woogle user interface text is modularized to language files located in /extensions/Woogle/languages/. Edit these files to adapt texts. To create translations for other languages, copy an existing file (probably WoogleLangEnglish.php) and translate the content. Please contribute such translations back to us!

Modify the user interface

Woogle UI can be configured using several configuration parameters, which are documented in WoogleConfig.php.

Besides that, many visual aspects of Woogle are captured in CSS files which you may customize.

Instrumentation addon

Woogle provides an addon for scientific evaluation (called "instrumentation") which captures user activities in a log file. The addon also allows modifying certain Woogle (UI) settings for different groups of users and provides a confirmation screen for obtaining users' explicit consent to data logging.

Instrumentation mode can be enabled by setting $config['instrumentation'] in extensions\Woogle\addons\Instrumentation\Instrumentation.php to "true". Further settings can be changed there as well.