Woogle4MediaWiki/Instrumentation addon

From TeamWeaverWiki

Revision as of 11:03, 6 August 2011 by Happel (Talk | contribs)

Purpose

The purpose of the instrumentation addon is to carry out (usability) experiments in MediaWiki. In a nutshell, different wiki users receive different experimental treatments (e.g., a different text or image) based on a certain hypothesis or design goal (e.g., improving the click-through rate). The addon logs their responses and activity after they receive the treatment. Statistical analysis of the resulting data can help researchers or designers evaluate their hypothesis or design goal (cf. http://en.wikipedia.org/wiki/A/B_testing).

This addon provides a set of core features for this purpose (detailed description below):

  • Configuring experimental groups & treatments
  • Assignment of users to groups
  • Instantiating experimental treatments at runtime
  • Logging user activities
  • Covering data protection and research ethics topics such as user pseudonymization & informed consent

Reusability

The instrumentation addon was designed as part of Extension:Woogle4MediaWiki to help research its usage.

However, its general mechanism can be easily adapted for other purposes/extensions as well.

Technically, one could use the whole Woogle4MediaWiki extension and disable all of its original features in order to use only the instrumentation part. If this proves useful for many users, the instrumentation functionality could also be extracted from Woogle into an independent extension (interested developers should contact us).

Features

Defining evaluation groups

The core of the instrumentation addon is the definition of the different experimental groups (see instrumentationGroups in the function woogleInstConfig in Instrumentation.php). An arbitrary number of groups is possible. For evaluation purposes, the typical setup is to treat one ("control") group with a standard setting, and to treat further groups with modifications based on different experimental conditions. A simple example would be testing whether a visual advertisement banner results in more clicks than a standard text link. See also A/B testing.

The evaluation groups of the instrumentation addon allow for arbitrary modifications. In the context of Woogle, the addon makes use of Woogle's configuration capabilities: many options of the Woogle extension can be configured via WoogleConfig.php and the corresponding user preferences.

The evaluation groups can be understood as pre-defined, grouped sets of user preferences which are assigned to different users. That is, for each evaluation group defined in instrumentationGroups, any WoogleConfig.php parameter can be set to a certain value for that group, resulting in a different user experience (a different experimental treatment) for each group.

While this is the current default setup, the instrumentation addon is only loosely coupled to WoogleConfig.php. Arbitrary other ways of implementing different treatments are possible.
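The idea of a group as a named set of configuration overrides can be sketched as follows. This is an illustration in Python, not the extension's PHP code; the parameter names (show_icons, result_limit) and group names are hypothetical and do not come from WoogleConfig.php.

```python
# Hypothetical defaults standing in for WoogleConfig.php parameters.
DEFAULT_CONFIG = {"show_icons": False, "result_limit": 10}

# Hypothetical counterpart of instrumentationGroups: each group lists
# only the parameters it overrides; the control group overrides nothing.
INSTRUMENTATION_GROUPS = {
    "control": {},
    "icons": {"show_icons": True},
    "icons_long_list": {"show_icons": True, "result_limit": 20},
}


def effective_config(group):
    """Return a copy of the defaults with the group's overrides applied."""
    config = dict(DEFAULT_CONFIG)
    config.update(INSTRUMENTATION_GROUPS[group])
    return config
```

A user in the "icons" group would then see icons while keeping the default result limit, whereas the control group sees the unmodified defaults.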

Assigning users to evaluation groups

The assignment of users to the configured evaluation groups (see above) takes place in the function assignInstrumentationGroup() (in Instrumentation.php). Currently, only random assignment is implemented, but the function can easily be modified.

For logged-in users, the assignment is made only once and remains stable across site visits.
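A minimal sketch of random but stable assignment, written in Python for illustration. It is not the code of assignInstrumentationGroup(); the group names and the dictionary that stands in for persistent storage are assumptions.

```python
import random

GROUPS = ["control", "treatment_a", "treatment_b"]  # hypothetical group names


def assign_group(stored_assignments, user_id, rng=random):
    """Return the user's group, drawing one at random on first contact.

    stored_assignments is a dict standing in for persistent storage;
    once a user has an entry, the same group is returned on every visit.
    """
    if user_id not in stored_assignments:
        stored_assignments[user_id] = rng.choice(GROUPS)
    return stored_assignments[user_id]
```

Replacing rng.choice with another strategy (for example, balancing group sizes) would change the assignment policy without touching the callers.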

Pseudonymization

For logging purposes, each user is assigned an anonymized id (pseudonym).

It is implemented in WoogleUtil::getAnonUserID() as follows:

  • for anonymous users: their hashed IP address
  • for logged in users: their RandomUserID generated by Woogle or, if this is null, their hashed MediaWiki user name (RandomUserId = md5(rand()))

For logged in users, the pseudonym is stored in the MediaWiki database and will remain consistent unless the user resets it via user preferences. The idea is that the connection between user name and pseudonym is not visible to the evaluator; only the wiki sysop can make this connection.
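The selection rules above can be sketched in Python as follows. This is not the implementation of WoogleUtil::getAnonUserID(); the function names are hypothetical, and the use of MD5 for the IP address and user name is an assumption (the page only states MD5 for the random id).

```python
import hashlib
import secrets


def md5_hex(text):
    """Hex digest of the UTF-8 encoded text (MD5 assumed, see lead-in)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def new_random_user_id():
    """Counterpart of md5(rand()): the hash of a random number."""
    return md5_hex(str(secrets.randbits(64)))


def anon_user_id(ip_address, user_name=None, random_user_id=None):
    """Pick the pseudonym according to the rules described above."""
    if user_name is None:            # anonymous visitor
        return md5_hex(ip_address)
    if random_user_id is not None:   # logged in, random id already stored
        return random_user_id
    return md5_hex(user_name)        # logged in, no random id available
```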

Informed consent

Users can optionally be presented with a customizable information dialog which explains the goal of the study and requires them to explicitly decide whether or not to participate.

Informed consent is only available for logged in users, not for anonymous users. It is realized by the special page WoogleSpecialStudy.

User preferences

Users may also disable (or re-enable) their participation in Preferences.

Logging user activities

By "logging targets" we mean user activities that should be logged by the instrumentation addon. In general, it can log arbitrary activities.

The usual workflow is as follows:

  1. Decide which user activity should be logged
  2. Find a place in the MediaWiki or extension code that is suitable for determining whether a user carries out that activity
  3. Add instrumentation logging code (i.e., call InstrumentationLogger; see below)
  4. Later: analyze the log files written

Example:

  1. We want to log successful login activities
  2. The UserLoginComplete hook seems to be a good place to determine user logins
  3. We register for this hook and call InstrumentationLogger (actual example in here).

Due to their modular nature, MediaWiki hooks have turned out to be a good place to add logging code, but the instrumentation addon is not limited to them.
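The hook-based pattern from the example can be illustrated with a small stand-alone Python sketch. It does not use MediaWiki's real hook API; register_hook, run_hooks and log_event are hypothetical stand-ins, and the in-memory list replaces the actual log file.

```python
from collections import defaultdict

hooks = defaultdict(list)   # hook name -> registered handlers
log_lines = []              # stand-in for the instrumentation log


def register_hook(name, handler):
    """Attach a handler to a named hook."""
    hooks[name].append(handler)


def run_hooks(name, *args):
    """Call every handler registered for the hook (done by the host system)."""
    for handler in hooks[name]:
        handler(*args)


def log_event(pseudonym, code):
    """Stand-in for the call to InstrumentationLogger."""
    log_lines.append((pseudonym, code))


def on_user_login_complete(pseudonym):
    """Logging target: record a successful login under its code."""
    log_event(pseudonym, "user_login")


register_hook("UserLoginComplete", on_user_login_complete)

# The host system fires the hook after a successful login.
run_hooks("UserLoginComplete", "3f2a")
```

The logging code stays in one small handler, so adding or removing a logging target does not require changes to the code that fires the hook.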

The instrumentation addon comes with a number of pre-defined logging targets (defined in Instrumentation.php), which can easily be extended:

  • User logging in (code: user_login)
  • User editing new page (redlink)
  • User creating a watch (watch_created)
  • User saving an article (article_save)
  • User saves/changes preferences (pref_save)
  • User receives search result with some results (mwsearch_results)
  • User receives search result with no results (mwsearch_noresults)
  • Woogle4MediaWiki-specific logging targets
    • User executes Woogle search (woogle_search_query)
    • User clicks Woogle search result (woogle_result_click)
    • Icons displayed to the user on Woogle search results (woogle_search_icons)
    • Icons displayed to the user on red links (woogle_redlink_icons)

A logging target should at least log its code (see values above) to allow analysis. Additional parameters may be concatenated to the log string, separated with ";".

The actual logging is implemented in WoogleInstrumentationLogger. In this implementation, the log is text-based (CSV-formatted). Each line contains a timestamp, the pseudonymized user id (see above) and an arbitrary string defined by each logging target itself.
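A log line of this shape could be produced as sketched below. This Python example is an illustration only, not the code of WoogleInstrumentationLogger; the ISO 8601 timestamp format and the comma as CSV delimiter are assumptions.

```python
import csv
import io
from datetime import datetime, timezone


def format_log_line(pseudonym, code, *params, now=None):
    """Build one CSV line: timestamp, pseudonym, target-defined string.

    The target-defined string starts with the logging code; additional
    parameters are appended, separated by ";".
    """
    now = now or datetime.now(timezone.utc)
    payload = ";".join([code, *params])
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([now.isoformat(), pseudonym, payload])
    return buffer.getvalue()
```

Using a CSV writer rather than plain string concatenation means that a parameter containing the delimiter is quoted instead of breaking the column layout.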

Data analysis

The instrumentation framework does not currently include any automated data analysis. The usual workflow is to import the CSV-formatted log files into statistical analysis tools such as R or SPSS, or into any spreadsheet application.
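As a first step before a statistical tool is involved, the log can be summarized with a few lines of script. The Python sketch below counts events per logging code; the three-column layout (timestamp, pseudonym, target-defined string) follows the description above, while the timestamp format and the sample lines are made up for illustration.

```python
import csv
import io
from collections import Counter


def count_events(log_file):
    """Count log lines per logging code (the part before the first ';')."""
    counts = Counter()
    for _timestamp, _pseudonym, payload in csv.reader(log_file):
        code = payload.split(";", 1)[0]
        counts[code] += 1
    return counts


# Made-up sample log; a real log would be opened with open(path, newline="").
sample = io.StringIO(
    "2011-08-06T11:03:00,abc,user_login\n"
    "2011-08-06T11:04:10,abc,woogle_search_query;wiki\n"
    "2011-08-06T11:05:42,def,woogle_search_query;hooks\n"
)
print(count_events(sample))
```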

Installation

The code for the instrumentation addon can be obtained from SVN (https://waves1.fzi.de/svn/waves/trunk/Woogle4MediaWiki/addons/Instrumentation/).

Instrumentation mode can be enabled by setting $config['instrumentation'] to "true" in extensions\Woogle\addons\Instrumentation\Instrumentation.php.

By default, only users who log in with their own account are targeted by the instrumentation. Setting instrumentationAnon to true will also log anonymous users.
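How the two settings interact can be sketched as a small decision function. This Python illustration mirrors the description above rather than the extension's actual code; the function name is hypothetical, and it assumes that both settings default to off.

```python
def is_instrumented(config, logged_in):
    """Decide whether a request should be logged.

    config is a dict standing in for the addon's $config array.
    """
    if not config.get("instrumentation", False):
        return False  # instrumentation mode is switched off entirely
    # Logged-in users are always targeted; anonymous users only on request.
    return logged_in or config.get("instrumentationAnon", False)
```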