Woogle4MediaWiki/Privacy

From TeamWeaverWiki

Woogle4MediaWiki stores data about its usage in order to improve the search experience. This page informs users about the data that Woogle logs.

Data captured during regular operations

To help identify missing information in the Wiki, Woogle captures the search queries and search result clicks of its users.

However, the data logged by Woogle differs from data logged by most other search engines as follows:

  • Most popular (web) search engines keep track of each user's individual search queries and result clicks. A separate log entry, including a timestamp and a user id, is stored for each request. For example, if two users search for "test" and click the first result ("www.test.com"), this will typically result in four log items.
  • Woogle maintains an aggregated log of queries and clicks. This means that there is not one log entry for each user action, but one log entry for every distinct query and one for every distinct click target (URL). For example, if two users search for "test", this yields only one log item.
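
The difference between the two logging styles can be illustrated with a minimal Python sketch. It is not Woogle's actual implementation; the event tuples and the names per_event_log and aggregated_log are assumptions made for this example only.

```python
from collections import Counter

# Hypothetical event stream: two users search for "test" and each
# clicks the first result. Tuples are (user id, kind, value).
events = [
    ("u1", "query", "test"),
    ("u1", "click", "www.test.com"),
    ("u2", "query", "test"),
    ("u2", "click", "www.test.com"),
]

# Per-event logging: one log item per user action.
per_event_log = list(events)

# Aggregated logging: one log item per distinct query or click target,
# holding only a count of how often it occurred.
aggregated_log = Counter((kind, value) for _user, kind, value in events)

print(len(per_event_log))   # 4
print(len(aggregated_log))  # 2
```

In the aggregated form, the order of actions and the link between a specific query and a specific click are no longer recoverable.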


In particular, Woogle stores the following for each search query or click target:

  • The query string or the click target URL, respectively
  • The timestamps of the first and the last query or click, and an average timestamp over all queries or clicks
  • The total number of query executions or clicks, and the number of distinct users issuing the query or clicking the result
  • To calculate the number of distinct users, Woogle stores the anonymized user id of each user. Each anonymized user id is stored only once, i.e. once more than one user has executed a search, there is no way to see how often a particular user executed it.
  • For search queries, Woogle also keeps the average number of result pages browsed
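
The fields listed above can be sketched as a small Python data structure. This is only an illustration under stated assumptions: the class AggregateEntry, its field names, and the record method are hypothetical and do not reflect the actual Aggquery or Aggclick table schema.

```python
from dataclasses import dataclass, field


@dataclass
class AggregateEntry:
    """One aggregated log item (hypothetical layout, not Woogle's schema)."""
    key: str                 # query string or click target URL
    first_ts: float = 0.0    # timestamp of the first query/click
    last_ts: float = 0.0     # timestamp of the last query/click
    avg_ts: float = 0.0      # running average of all timestamps
    total: int = 0           # total number of executions/clicks
    user_ids: set = field(default_factory=set)  # each id stored only once

    def record(self, user_id: str, ts: float) -> None:
        """Fold one query or click into the aggregate."""
        if self.total == 0:
            self.first_ts = self.last_ts = ts
        else:
            self.first_ts = min(self.first_ts, ts)
            self.last_ts = max(self.last_ts, ts)
        # Incremental mean: no individual timestamps are kept.
        self.avg_ts += (ts - self.avg_ts) / (self.total + 1)
        self.total += 1
        # A set keeps one copy per user, so per-user counts are not stored.
        self.user_ids.add(user_id)

    @property
    def distinct_users(self) -> int:
        return len(self.user_ids)


entry = AggregateEntry("test")
entry.record("u1", 100.0)
entry.record("u2", 200.0)
entry.record("u1", 300.0)
print(entry.total, entry.distinct_users, entry.avg_ts)  # 3 2 200.0
```

After the three calls, the entry shows that two distinct users ran the query three times in total, but not which of them ran it twice.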

Note that this is less data than logged by typical web server software and other search engines.

Note that Woogle also allows users to disable all query/click logging for themselves in the "Woogle" tab on Special:Preferences. Logging can also be disabled globally by the Wiki administrator through a configuration parameter. However, we strongly discourage disabling query/click logging, since doing so makes core features of Woogle useless.

The particular data structures (database table schemas) for keeping this log data are documented at Aggquery table (query log) and Aggclick table (click log).

Data usage

The stored data is currently used for three purposes:

  • To show how often, and by how many users, the title of a "red link" was searched for. This information is shown in a popup window attached to red links (see also Special:WoogleHelp#help_kns).
  • To show how often, and by how many users, a particular query was executed before. This information is shown on the search result page (see screenshots on Special:WoogleHelp).
  • To show the most popular searches and unsuccessful searches on Special:WoogleStatistics (currently visible only to wiki sysops).

Data captured during scientific evaluation (optional)

In general, one has to distinguish between the standard operational deployment of Woogle (described above) and a deployment under scientific observation. During scientific observation, a specific add-on must be installed and configured, which captures additional data for an evaluation period. See Woogle4MediaWiki/Instrumentation for details.

This page was last modified on 26 April 2010, at 15:33.