Developer Guide


Machines
  Creating
  Training
  Testing
  Classifying Single Instance
Message Observers
  Creating
  Registering
  Setting Message Target
Machine Options
  Option Value
  Option Type
  Option Description
  List Option Names
Debugging
Torch Format

This developer guide should give a general idea of how to make use of the MozTorch project.

Machines

The mtIMachine encapsulates most of the MozTorch functionality.

Creating

To create a machine:

  var machine = Components.classes["@pdx.edu/moztorch/mtMachine" + MACHINE_NAME].
    createInstance( Components.interfaces.mtIMachine );
where MACHINE_NAME is Knn, Svm, NBayes, etc. See the list of components.

Training

To train the machine:

  machine.batchTrain(FILE_NAME);
where FILE_NAME is the name of a file in Torch format. To view results of the training/testing phases (e.g., error rate), see the Message Observers section.

Testing

To test the machine (after training it):

  machine.batchTest(FILE_NAME);
where, again, FILE_NAME is the name of a file in Torch format.
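
Results of both phases are only visible through a message observer (see Message Observers), so it helps to attach one before training starts. The following is a minimal sketch of that ordering, not part of MozTorch itself: trainAndTest, stubMachine, stubObserver, and the file names are hypothetical, and the stub only records calls so the snippet can be tried outside Mozilla.

```javascript
// Attach the observer first so messages from both phases reach it,
// then train and test. Uses only the logger attribute, batchTrain,
// and batchTest described in this guide.
function trainAndTest(machine, observer, trainFile, testFile) {
  machine.logger = observer;
  machine.batchTrain(trainFile);
  machine.batchTest(testFile);
}

// Hypothetical stand-ins; replace with a real machine and observer.
var calls = [];
var stubMachine = {
  logger: null,
  batchTrain: function (file) { calls.push("train:" + file); },
  batchTest: function (file) { calls.push("test:" + file); }
};
var stubObserver = {};

trainAndTest(stubMachine, stubObserver, "train.dat", "test.dat");
```

With a real machine, pass the object returned by createInstance in place of stubMachine.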

Classifying Single Instance

An instance vector is a single row of unlabelled data. For example, if the task is to predict whether a message is spam, the instance vector holds the measurements of that message (e.g., how many times "mortgage" or "viagra" appears in it). If the machine was trained with a file that had M attributes, then the instance vector should have M-1 elements (see Torch Format for more information).
  // assuming machine trained on dataset with 2 feature attributes
  var inst = [];
  inst[0] = 0.1;
  inst[1] = 0.2;
To predict the class label of the instance:
  var cls = machine.classify(inst.length, inst);
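
For the spam example above, the instance vector could be built by counting terms in the message. This is only an illustrative sketch: termCounts is a hypothetical helper, not something shipped in mtlib.js, and the choice of terms must match the feature attributes the machine was trained on.

```javascript
// Count case-insensitive, non-overlapping occurrences of each term.
// Returns one count per term, in the same order as the terms array.
function termCounts(message, terms) {
  var text = message.toLowerCase();
  var counts = [];
  for (var i = 0; i < terms.length; i++) {
    var term = terms[i].toLowerCase();
    var count = 0;
    if (term.length > 0) {
      var pos = text.indexOf(term);
      while (pos !== -1) {
        count++;
        pos = text.indexOf(term, pos + term.length);
      }
    }
    counts.push(count);
  }
  return counts;
}

var features = termCounts(
  "Low mortgage rates! Refinance your mortgage today.",
  ["mortgage", "viagra"]
);
// features is [2, 0]; with a trained machine it would be passed as:
//   var cls = machine.classify(features.length, features);
```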

Message Observers

To get debug information from your machine, you'll need a message observer. These objects collect messages sent from various sources and print them to a target.

Creating

To create a message observer:

  var msgObserver = Components.classes["@pdx.edu/moztorch/mtMsgObserver;1"].
    createInstance( Components.interfaces.mtIMsgObserver );

Registering

To receive messages from the created machine, the message observer must be registered with it. To register the message observer, use the machine's logger attribute:
  machine.logger = msgObserver;

Setting Message Target

This message observer prints to the JavaScript console by default. To override where messages are printed, you must access the underlying JS object directly (for more info, see JavaScript XPCOM Components Status under the "Unwrapping" section).

  var wrappedObj = msgObserver.wrappedJSObject;

Next, set the new message target using the logWindow property. The message observer will then write all messages to this object's value property.

  wrappedObj.logWindow = document.getElementById("some-text-box");
This is useful for printing log messages to a text box, for example.

Machine Options

Each machine supports its own set of options, such as the number of neighbors in the K-nearest neighbor algorithm, and the number of classes in the target attribute.

Option Value

To query an option's value:

  var value = getMachineOptionValue(machine, OPTION_NAME);
Note: This method lives in mtlib.js.

Alternatively, if the type of the option is known then the value can be queried directly from the machine:

  var real_value = machine.getROption(NAME_OF_REAL);
  var int_value = machine.getIOption(NAME_OF_INT);
  var bool_value = machine.getBOption(NAME_OF_BOOL);

Option Type

To query an option's type:
  var type = machine.getOptionType(OPTION_NAME);
where the type value will be one of mtIMachine.OPT_TYPE_* (one of REAL, INT, or BOOL).
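
The type constants and the typed getters can be combined to read an option whose type is not known in advance. The sketch below assumes the constants are named OPT_TYPE_REAL, OPT_TYPE_INT, and OPT_TYPE_BOOL; readOption, fakeTypes, and fakeMachine are hypothetical, and the numeric constant values and option names in the stub are made up for the example.

```javascript
// Look up the option's type, then call the matching typed getter.
// "types" is the object holding the OPT_TYPE_* constants
// (in Mozilla this would be Components.interfaces.mtIMachine).
function readOption(machine, types, name) {
  var type = machine.getOptionType(name);
  if (type === types.OPT_TYPE_REAL) { return machine.getROption(name); }
  if (type === types.OPT_TYPE_INT) { return machine.getIOption(name); }
  if (type === types.OPT_TYPE_BOOL) { return machine.getBOption(name); }
  throw new Error("Unknown option type for " + name);
}

// Hypothetical stand-ins so the sketch can be tried outside Mozilla.
var fakeTypes = { OPT_TYPE_REAL: 0, OPT_TYPE_INT: 1, OPT_TYPE_BOOL: 2 };
var fakeMachine = {
  getOptionType: function (name) {
    return { rate: 0, neighbors: 1, verbose: 2 }[name];
  },
  getROption: function () { return 0.5; },
  getIOption: function () { return 3; },
  getBOption: function () { return true; }
};

var neighbors = readOption(fakeMachine, fakeTypes, "neighbors");
```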

Option Description

Finally, each option has a (possibly empty) help string giving its description. To query an option's description:
  var description = machine.getOptionHelp(OPTION_NAME);

List Option Names

To see which options a given machine supports, request an enumerator over its list of option names:

  var optNames = machine.getOptionNameEnum();
Alternatively, query the option names as an array:
  var names = machineOptionsToArray(machine);
Note: This method lives in mtlib.js.
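
If you work with the enumerator directly, it has to be walked by hand. This guide does not say which enumerator interface getOptionNameEnum() returns, so the sketch below is an assumption: it accepts either a hasMore()/getNext() pair (string enumerator style) or a hasMoreElements()/getNext() pair (nsISimpleEnumerator style). enumToArray and fakeEnum are hypothetical names, and the option names are made up.

```javascript
// Drain an enumerator into a plain array, whichever of the two
// "has more" method names it exposes.
function enumToArray(e) {
  var hasMore = typeof e.hasMore === "function" ? "hasMore" : "hasMoreElements";
  var items = [];
  while (e[hasMore]()) {
    items.push(e.getNext());
  }
  return items;
}

// Hypothetical stand-in for the object returned by getOptionNameEnum().
function fakeEnum(values) {
  var i = 0;
  return {
    hasMore: function () { return i < values.length; },
    getNext: function () { return values[i++]; }
  };
}

var optionNames = enumToArray(fakeEnum(["k", "n_classes"]));
```

With a real nsISimpleEnumerator, each element returned by getNext() may also need a QueryInterface call before it can be used as a string.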

Debugging

The MozTorch scripts include a method for dumping the contents of JavaScript variables in a tab-formatted fashion. To format a variable (e.g., a 3D array, object containing other objects, etc.):
  var str = stringify(OBJECT, NAME);
where OBJECT is the object/array/whatever to format and NAME is the optional name to print with the object.
Note: This method lives in mtlib.js.

Torch Format

Given integers N and M, the format of a Torch data file is the following:

N M
frame 1 of M real numbers
...
frame N of M real numbers
where N is the number of examples and M is the number of attributes of each example.
Training data is made up of M-1 feature attributes and 1 class attribute. All elements of the dataset are real valued.
Note: For more information see section 5.3.1 of the Torch tutorial.
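
As an illustration of the layout above, the following sketch turns an array of rows into Torch-format text. toTorchFormat is a hypothetical helper, not part of MozTorch. It assumes values are separated by single spaces and, for the sample data, that the class attribute is the last value in each row. Neither detail is stated in this guide, so check the Torch tutorial before relying on them.

```javascript
// Serialize rows as Torch text: a header line "N M", then one
// line per example holding its M values.
function toTorchFormat(rows) {
  var n = rows.length;
  var m = n > 0 ? rows[0].length : 0;
  var lines = [n + " " + m];
  for (var i = 0; i < n; i++) {
    if (rows[i].length !== m) {
      throw new Error("Row " + i + " has " + rows[i].length +
                      " values, expected " + m);
    }
    lines.push(rows[i].join(" "));
  }
  return lines.join("\n") + "\n";
}

// Two examples, each with two feature values and a class label (M = 3).
var torchText = toTorchFormat([
  [0.1, 0.2, 1],
  [0.7, 0.9, 0]
]);
```

Writing the resulting string to disk is left out here, since file access depends on the environment the script runs in.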
