Introduction to Mesquite's modular architecture
(updated August 2000)
The Mesquite system includes the basic class libraries (primarily in the package mesquite.lib), and a series of modules (subclasses of mesquite.lib.MesquiteModule). While the basic library includes many classes that serve as windows, user interface objects, and so on, perhaps the dominant theme of Mesquite's architecture is the interaction of the modules. The modules serve as the primary organizers in Mesquite. The modules may not themselves perform all calculations and control all user interface elements, but when they don't, they at least supervise the objects that do. This page has three parts: Modules and the Employee Tree, Benefits of modularity, and Challenges of modularity.
A simple introduction to Mesquite modularity, intended for users, is given at How Mesquite works.
Modules and the Employee Tree
All modules are subclasses of the class MesquiteModule. The core application that gets Mesquite up and running is itself a subclass of MesquiteModule, and thus the core application is also a module. It is a subclass of MesquiteTrunk.
When the core Mesquite module (the "trunk") starts up, it finds all of the available modules. (It looks into the subdirectories of the mesquite directory.) It temporarily instantiates each of the MesquiteModule classes. From each, it gathers information, including its name and the precise subclass of MesquiteModule it represents. This information is stored in a special vector.
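The startup scan described above can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the class names (ScanModule, ScanModuleInfo, TrunkScanSketch), the use of Supplier factories in place of a real directory scan, and the shortcut of treating a module's direct superclass as its duty class. It is not Mesquite's actual API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-ins for illustration; these are not Mesquite's real classes.
abstract class ScanModule {
    abstract String getName();
}

// A "duty class": the job a concrete module can be hired to do.
abstract class ScanDrawTree extends ScanModule { }

class ScanDiagonalTree extends ScanDrawTree {
    String getName() { return "Diagonal tree"; }
}

class ScanCircularTree extends ScanDrawTree {
    String getName() { return "Circular tree"; }
}

// What the trunk keeps about a module once the temporary instance is discarded.
class ScanModuleInfo {
    final String name;
    final Class<?> moduleClass;
    final Class<?> dutyClass;

    ScanModuleInfo(String name, Class<?> moduleClass, Class<?> dutyClass) {
        this.name = name;
        this.moduleClass = moduleClass;
        this.dutyClass = dutyClass;
    }
}

class TrunkScanSketch {
    // Stands in for scanning the subdirectories of the mesquite directory.
    static List<Supplier<ScanModule>> installedFactories() {
        List<Supplier<ScanModule>> factories = new ArrayList<>();
        factories.add(ScanDiagonalTree::new);
        factories.add(ScanCircularTree::new);
        return factories;
    }

    static List<ScanModuleInfo> scan(List<Supplier<ScanModule>> factories) {
        List<ScanModuleInfo> infos = new ArrayList<>();
        for (Supplier<ScanModule> factory : factories) {
            ScanModule temp = factory.get();   // temporary instance, used only for its metadata
            // Simplification: the direct superclass is treated as the duty class.
            infos.add(new ScanModuleInfo(temp.getName(), temp.getClass(),
                    temp.getClass().getSuperclass()));
        }
        return infos;
    }
}
```

Calling TrunkScanSketch.scan(TrunkScanSketch.installedFactories()) yields one record per module, and an employer could later consult those records without keeping any module instance alive.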
Modules are "hired" by other modules as employees to perform certain tasks. Typically a module will decide whether to hire a prospective employee module according to the particular subclass the module belongs to. These subclasses are referred to as duty classes, since each subclass represents a particular job or duty an employee could perform. For instance, one subclass of MesquiteModule is DrawTree, which extends the base class MesquiteModule in having a method createTreeDrawing(), which returns a special tree drawing object. The module responsible for the tree window hires a module that coordinates tree drawing, which in turn looks for a DrawTree module. There may be multiple DrawTree modules available -- one may make square trees ("phenograms"), another diagonal trees ("cladograms"), another circular trees, another may plot the nodes of the tree in a three-dimensional space, and so on. The fact that a module is a subclass of DrawTree guarantees to any employer that it will perform a task in a predictable way (that is, it will have the appropriate methods). The coordinating module could choose any of these to hire. The DrawTree module chosen in turn might hire a module to help it assign node locations.
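A minimal sketch of hiring by duty class follows. The classes are toy stand-ins invented for this illustration (DutyModule, DutyDrawTree, HiringSketch, and the rest); only the idea of a DrawTree duty class with a createTreeDrawing() method comes from the text, and here that method returns a plain string where the real one returns a tree drawing object.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-ins for illustration; not the real mesquite.lib classes.
abstract class DutyModule {
    abstract String getName();
}

// Duty class: any subclass promises to supply createTreeDrawing().
abstract class DutyDrawTree extends DutyModule {
    abstract String createTreeDrawing();   // assumption: a String stands in for the drawing object
}

class SquareTreeSketch extends DutyDrawTree {
    String getName() { return "Square tree"; }
    String createTreeDrawing() { return "square drawing"; }
}

class DiagonalTreeSketch extends DutyDrawTree {
    String getName() { return "Diagonal tree"; }
    String createTreeDrawing() { return "diagonal drawing"; }
}

// A module with a different duty; it must never be hired to draw trees.
class TaxonNamesSketch extends DutyModule {
    String getName() { return "Basic Draw Taxon Names"; }
}

class HiringSketch {
    static List<DutyModule> available() {
        List<DutyModule> modules = new ArrayList<>();
        modules.add(new TaxonNamesSketch());
        modules.add(new SquareTreeSketch());
        modules.add(new DiagonalTreeSketch());
        return modules;
    }

    // Returns the first available module belonging to the requested duty class, or null.
    static <T extends DutyModule> T hire(Class<T> duty, List<DutyModule> available) {
        for (DutyModule candidate : available) {
            if (duty.isInstance(candidate)) {
                return duty.cast(candidate);
            }
        }
        return null;
    }
}
```

Because the employer asks only for the duty class, it can call createTreeDrawing() on whatever is returned without knowing which concrete drawing module was chosen.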
An example: at one point while Mesquite is running, the employee tree for the basic tree window module might look as follows:
Basic Tree Window Maker
    Basic Tree Draw Coordinator
        Diagonal tree
            Node Locations (standard)
        Basic Draw Taxon Names
    Stored Trees
    Trace Character
        Stored Characters
        Parsimony Ancestral States
            Parsimony Irreversible
            Parsimony Linear
            Parsimony Ordered
            Parsimony Unordered
        Shade states
The module Basic Tree Window Maker has three employees: Basic Tree Draw Coordinator, Stored Trees, and Trace Character. The first takes care of drawing the tree; the second supplies a tree for the tree window. The third, Trace Character, is active because the user has requested it to trace a character on the tree. In response to the request, the tree window module hired it as an employee. The Basic Tree Draw Coordinator has two employees, Diagonal tree (which draws "cladogram" shaped trees) and Basic Draw Taxon Names. Trace Character has three employees, Stored Characters (which supplies characters from a data matrix in a file), Parsimony Ancestral States (which reconstructs states at the nodes), and Shade states (which colors the branches to reflect the reconstructed states).
Modules therefore cooperate in a tree-like arrangement of employees. The root or trunk of the whole tree is the core Mesquite class, which is a subclass of MesquiteTrunk, which in turn is a subclass of MesquiteModule. Each instance of a module is a branch in this tree of employees.
Typically there will be alternative modules to perform any given task, and thus if some modules fire the employees they have and hire some alternatives, we might arrive at the following:
Basic Tree Window Maker
    Basic Tree Draw Coordinator
        Basic Draw Taxon Names
        Circular tree
            Node Locations (circle)
    Simulated Trees
        Harding Branching
    Trace Character
        Parsimony Ancestral States
            Parsimony Irreversible
            Parsimony Linear
            Parsimony Ordered
            Parsimony Unordered
        Simulated Characters
            Simple evolve characters
            Tree of context
        Label sets
The tree drawing module is now Circular tree. The source of trees is no longer Stored Trees, but now Simulated Trees, which in turn hires tree simulating modules to supply trees. Trace Character is no longer using Stored Characters to get data from a file, but rather is using as its character source Simulated Characters, which in turn hires modules to simulate characters evolved on the current tree. Trace Character is also using a different module, Label sets, in place of Shade states to display the results.
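The employee tree, and the firing and hiring that change it, can be sketched with a small data structure. The class names (EmployeeNode, EmployeeTreeSketch) and the methods hire, fire, and outline are hypothetical and are not Mesquite's API; module names are taken from the examples above, and placing Node Locations (standard) under Diagonal tree is an assumption based on the remark that a DrawTree module may hire a helper to assign node locations.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical node in a tree of employees; not a Mesquite class.
class EmployeeNode {
    final String name;
    final List<EmployeeNode> employees = new ArrayList<>();

    EmployeeNode(String name) { this.name = name; }

    // Hires a new employee and returns it, so that it can hire in turn.
    EmployeeNode hire(String employeeName) {
        EmployeeNode employee = new EmployeeNode(employeeName);
        employees.add(employee);
        return employee;
    }

    // Fires the named direct employee (and with it, its own employees).
    boolean fire(String employeeName) {
        return employees.removeIf(e -> e.name.equals(employeeName));
    }

    // Renders the tree as an indented outline, two spaces per level.
    String outline() {
        StringBuilder sb = new StringBuilder();
        outline(sb, 0);
        return sb.toString();
    }

    private void outline(StringBuilder sb, int depth) {
        for (int i = 0; i < depth; i++) sb.append("  ");
        sb.append(name).append('\n');
        for (EmployeeNode e : employees) e.outline(sb, depth + 1);
    }
}

class EmployeeTreeSketch {
    // Builds the first example tree (parsimony model modules omitted for brevity).
    static EmployeeNode example() {
        EmployeeNode window = new EmployeeNode("Basic Tree Window Maker");
        EmployeeNode coordinator = window.hire("Basic Tree Draw Coordinator");
        coordinator.hire("Diagonal tree").hire("Node Locations (standard)");
        coordinator.hire("Basic Draw Taxon Names");
        window.hire("Stored Trees");
        EmployeeNode trace = window.hire("Trace Character");
        trace.hire("Stored Characters");
        trace.hire("Parsimony Ancestral States");
        trace.hire("Shade states");
        return window;
    }
}
```

Swapping the tree source then amounts to window.fire("Stored Trees") followed by window.hire("Simulated Trees").hire("Harding Branching"); the rest of the tree is untouched.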
Benefits of modularity
From a programmer's point of view:
- Some of the benefits of modularity are merely inherited from its basis in object-oriented programming, such as the isolation of different pieces of code to make programming and debugging easier.
- The fact that all modules derive from a single class and interact in similar ways means that a programmer can learn a shared set of conventions that apply to interactions between these objects, which would not be the case when dealing with a diversity of objects deriving from different classes.
From a biologist's point of view:
- New analyses and user interfaces can be added to the system merely by adding new modules, thus allowing for an expanding set of features.
- There is great flexibility in the system, because there can be a series of alternative modules to serve a particular function, and the user can often interchange them on the fly.
- Because any given analysis usually involves several components, the number of possible analyses rises considerably more than linearly as new modules are written. If a calculation involves a tree, a character, and a means of display of results, then the number of different possible analyses is the product of the numbers of tree source modules, character source modules, and display modules.
Often it seems that in programming, the number of features rises linearly with the number of hours of effort, while the number of bugs rises considerably more than linearly. The object-orientation, combined with the multiplicative rise in possible analyses as modules are added, makes Mesquite behave closer to the reverse: the number of features rises considerably more than linearly with programming effort, while the number of bugs rises linearly.
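The multiplicative rise can be made concrete with a small worked example. The symbols T, C, and D (numbers of tree source, character source, and display modules) and the counts used are introduced here purely for illustration.

```latex
N_{\text{analyses}} = T \times C \times D,
\qquad
3 \times 4 \times 2 = 24
\;\longrightarrow\;
3 \times 4 \times 3 = 36
```

With these made-up counts, writing a single additional display module adds 12 possible analyses rather than one.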
Challenges of modularity
Although modularity has many benefits, it presents some considerable challenges. For instance, such a fluid system with different modules installed at different times, or with different modules in use at different times, presents many opportunities to confuse the user. In Mesquite, some of the challenges turned out to be unexpectedly easy to overcome. Other challenges appear more menacing, and it remains to be seen how users will adapt to a modular system. Here are some of the challenges, beginning with a fairly simple one. To understand the solutions to some of these challenges, you should be aware that many Mesquite objects accept text strings as commands, and indeed much of the user interface interacts with the objects by sending such commands to them.
- Where to place menu items needed by a module? Many modules need to have a way to interact with the user, and yet with so many modules operating it wouldn't make sense to have each one create a window to which could be tied the module's menu items and buttons. Mesquite modules can request menu items, but where are they to appear? It turns out that the tree-like structure of module employment provides an easy solution that also accords (in most circumstances) with user expectations.
Most modules do not own (i.e. supervise) a window, but some do. Thus, windows are scattered at various points along the employee tree of modules. Each window has a unique menu bar with menus and menu items. The rules for composition of the window's menu bar explain where a module's menu items will appear. The menu bar of a window will include the menus and menu items of:
  - All of the employers of the module that owns the window (i.e. the module's employer module, that employer's employer, and so on, back to the Mesquite trunk module).
  - The window's module.
  - Any employees of the window's module that don't themselves own windows, and their employees, and so on, along the tree of modules until a module owning a window is encountered. That module, and its descendant employees, have their menu items appearing in their window's menu bar instead.
This set of rules means that a module's menu items will appear in the menu bar of that module's window if it owns a window. Otherwise, the menu items will fall back through the employee tree to the menu bar of the window of the nearest employer with a window. This is appropriate, because the user will understand that such options, belonging to an employee of the module owning the window, are options that belong to the display shown by that window. The menu items will also appear in any windows that belong to direct descendent-employees, which also is appropriate, because if they change the status of the module, it could have an effect on the output shown by the descendants (employees) as well.
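The three rules for composing a menu bar can be sketched as a short tree walk. The classes MenuModule and MenuBarSketch, and the method contributors, are hypothetical names used only for this illustration; the sketch returns module names where a real implementation would collect menus and menu items.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical module record; not a Mesquite class.
class MenuModule {
    final String name;
    final boolean ownsWindow;
    MenuModule employer;                                  // null for the trunk
    final List<MenuModule> employees = new ArrayList<>();

    MenuModule(String name, boolean ownsWindow) {
        this.name = name;
        this.ownsWindow = ownsWindow;
    }

    MenuModule hire(String employeeName, boolean employeeOwnsWindow) {
        MenuModule employee = new MenuModule(employeeName, employeeOwnsWindow);
        employee.employer = this;
        employees.add(employee);
        return employee;
    }
}

class MenuBarSketch {
    // Names of the modules whose menu items appear in the menu bar of windowOwner's window.
    static List<String> contributors(MenuModule windowOwner) {
        List<String> result = new ArrayList<>();
        // Rule 1: every employer of the window's module, back to the trunk.
        for (MenuModule m = windowOwner.employer; m != null; m = m.employer) {
            result.add(0, m.name);                        // insert at front so the trunk comes first
        }
        // Rule 2: the window's own module.
        result.add(windowOwner.name);
        // Rule 3: descendants, stopping wherever another window owner is met.
        addWindowlessDescendants(windowOwner, result);
        return result;
    }

    private static void addWindowlessDescendants(MenuModule module, List<String> result) {
        for (MenuModule employee : module.employees) {
            if (employee.ownsWindow) {
                continue;                                 // it and its descendants use their own menu bar
            }
            result.add(employee.name);
            addWindowlessDescendants(employee, result);
        }
    }
}
```

For a tree window module that employs Trace Character (no window) and some other window-owning module, the tree window's menu bar would list the trunk, the tree window module, Trace Character, and Trace Character's windowless employees, while the other window's menu bar would list the trunk, the tree window module, and that window's own subtree.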
- How to inform the user what can be done? (i.e. how to present relevant documentation to the user?) A fixed manual can't be written for Mesquite because its features vary as different modules are installed or uninstalled in the system. In addition, even were the set of installed modules fixed, the user interface will change (e.g. menu items will come or go) depending on what modules are currently participating in an analysis. Mesquite has aids to help the user with this confusing situation:
Each window has an information bar which allows the user to select different panels to display what modules are currently involved in a calculation, with links to the manual of each module whose manual (a web page) was found by the Mesquite system. This information bar also allows the user to learn about what the modules do, what their current parameters are, and what are the citations (authors, version, etc.) for the modules.
Mesquite has an automatic documentation system which composes web pages appropriate to the current state of Mesquite. Some of these web pages apply to the current installation of Mesquite (for instance, summarizing the installed modules and the commands available to control each of them). One of these pages applies to the precise situation facing the user at the moment: a window with its current analyses, display, and menu items. The user can request this page (the Menus & Controls Explanations page) to be composed at any time. It includes explanations for all of the menu items of the foremost window (Mesquite determines these by finding the command that the menu item sends, and the explanation for that command), as well as explanations for buttons and other user-interface objects within the window. It is fairly easy for the programmer to support this auto-documentation; the primary obligation is to embed explanation strings in the command handler methods of Commandable objects.
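One possible way to embed explanation strings in a command handler is sketched below. It assumes a checker object that records each explanation as a side effect of comparing command names, so that documentation can be harvested by running the handler with a name that matches nothing. The classes CommandCheckerSketch and TreeWindowCommandsSketch, the command names, and the explanations are all invented for illustration and should not be read as Mesquite's actual mechanism.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that compares command names and remembers the explanations it sees.
class CommandCheckerSketch {
    final Map<String, String> harvested = new LinkedHashMap<>();

    boolean compare(String explanation, String expected, String given) {
        harvested.put(expected, explanation);   // record the explanation on every comparison
        return expected.equals(given);          // false when given is null
    }
}

// Hypothetical commandable object with two made-up commands.
class TreeWindowCommandsSketch {
    int currentTree = 0;

    void doCommand(String commandName, CommandCheckerSketch checker) {
        if (checker.compare("Goes to the next tree", "nextTree", commandName)) {
            currentTree++;
        } else if (checker.compare("Goes to the previous tree", "previousTree", commandName)) {
            currentTree--;
        }
    }

    // Composes documentation by passing a null command name: no branch runs,
    // but every comparison is reached and its explanation recorded.
    static String explain(TreeWindowCommandsSketch target) {
        CommandCheckerSketch checker = new CommandCheckerSketch();
        target.doCommand(null, checker);
        StringBuilder page = new StringBuilder();
        for (Map.Entry<String, String> entry : checker.harvested.entrySet()) {
            page.append(entry.getKey()).append(": ").append(entry.getValue()).append('\n');
        }
        return page.toString();
    }
}
```

The appeal of this arrangement is that the explanation sits on the same line as the code that handles the command, so the two are unlikely to drift apart.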
- How to inform the user what has been done? This is perhaps the most vexing challenge, for the solutions to date in Mesquite are not entirely satisfactory. A user planning to present an analysis in a publication needs to know exactly what was done in the analysis. Given the interactivity of Mesquite, and given the number of options presented by it (in part because of its modularity), the user could find him or herself in a situation where various options were explored, but he or she might be uncertain exactly what was done, and thus what are the assumptions and data behind the current results. There are several features that can help:
The parameters panel of the window, available via the window's information bar, shows a summary of the parameter settings of the modules involved in the window. These parameters might indicate what character matrix or tree is being used in a calculation, what rate is being assumed in a stochastic model, and so on.
The snapshot panel of the window is perhaps the best guide to the current situation, but it is difficult to interpret as it is written in Mesquite's scripting language. It shows the set of commands that would need to be given to the module that owns the window to return the window (with its analyses and display) to its current state. (How Mesquite composes this snapshot is explained under "How to save the current state of an analysis?", below.) A similar snapshot is saved to the file whenever a file is saved, so that on reopening the file the user is returned to the same situation, with windows open and analyses active, as when the file was saved. If a user wanted to document the analyses done for a publication, he or she could make available such saved files containing snapshots of the analyses performed.
The log window records commands given by the user, including those provoked by selection of menu items or buttons. This, however, would be difficult to parse through to reconstruct a long exploratory session of using Mesquite.
- How to script a system with unpredictable components? Mesquite is scriptable, but how to send commands to the ancestral state reconstruction module employed by the trace character module employed by the tree window module? The solution is for modules to have commands that return their employee modules. Thus, the tree window module can be sent a command querying it for its employee, the trace character module. The command returns a reference to the module. Because the scripting language uses the variable "It" to refer to the object returned by the previous command, the next line in the script can indicate that subsequent commands be directed to "It". Then, the trace character module can be sent a command querying for its employee, the ancestral state reconstruction module. It will be returned in "It", and then subsequent commands can be directed to it. In general, it is expected that an object that itself is Commandable will include among its commands some that return references to any Commandable objects (employee modules, windows, etc.) that it supervises.
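The use of "It" to reach nested employees can be sketched with a toy interpreter. The command words used here (getEmployee, tell It, endTell, and the two setter commands) are a simplified, hypothetical syntax chosen for this illustration, and the classes ScriptModule and ItScriptSketch are not part of Mesquite.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical commandable module; not a Mesquite class.
class ScriptModule {
    final String name;
    final Map<String, ScriptModule> employees = new LinkedHashMap<>();
    final List<String> received = new ArrayList<>();   // ordinary commands this module was sent

    ScriptModule(String name) { this.name = name; }

    ScriptModule hire(String employeeName) {
        ScriptModule employee = new ScriptModule(employeeName);
        employees.put(employeeName, employee);
        return employee;
    }

    // "getEmployee X" returns the employee named X; any other command is only recorded.
    Object doCommand(String command) {
        String prefix = "getEmployee ";
        if (command.startsWith(prefix)) {
            return employees.get(command.substring(prefix.length()));
        }
        received.add(command);
        return null;
    }
}

class ItScriptSketch {
    static void run(ScriptModule start, String... lines) {
        Deque<ScriptModule> targets = new ArrayDeque<>();
        targets.push(start);
        Object it = null;                                // object returned by the previous command
        for (String line : lines) {
            if (line.equals("tell It")) {
                if (!(it instanceof ScriptModule)) {
                    throw new IllegalStateException("It does not refer to a module");
                }
                targets.push((ScriptModule) it);         // direct later commands to It
            } else if (line.equals("endTell")) {
                targets.pop();                           // return to the previous target
            } else {
                it = targets.peek().doCommand(line);
            }
        }
    }
}
```

A script can then walk down the employee tree one level at a time: query for an employee, direct commands to "It", query that employee for its own employee, and so on, without knowing in advance which concrete modules are in use.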
- How to save the current state of an analysis? On saving a file, Mesquite attempts to save the current state of analyses so that when the user reopens the file, he or she is returned to the same situation that he or she left. This may appear a difficult task given the many modules and other objects (windows, etc.) involved. In fact, the solution has been fairly simple. On saving the file, Mesquite visits a module, querying it for what set of commands would, when sent to the module, return it to its current state from its default state. In this "snapshot" set of commands are included commands for the module that will return its employee modules. (Some of these will be the commands telling the module to hire a particular module as an employee; such a command will return a reference to the module hired.) After each such command, Mesquite inserts the command that directs subsequent commands to the employee returned (see the comments under "How to script a system with unpredictable components?"). Then, Mesquite queries that employee module for the commands that would return it to its current state. Then, it is queried for its employees, and so on. In this way, Mesquite traverses through the entire tree of employee modules harvesting a script of commands that would return the system to its current state. This "snapshot" script is saved in the file, and when the file is reread, it is executed, returning the system to its former state. Such snapshots are what is shown in the snapshot panel of windows, and are also used to clone windows.
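The harvesting of a snapshot script by traversing the employee tree can be sketched as a simple recursion. The class SnapshotModule and the command words (hire, tell It, endTell, and the example settings) are hypothetical simplifications for illustration, not Mesquite's scripting syntax or API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical module that can describe how to rebuild its own state; not a Mesquite class.
class SnapshotModule {
    final String name;
    final List<String> ownCommands = new ArrayList<>();        // commands restoring this module's settings
    final List<SnapshotModule> employees = new ArrayList<>();

    SnapshotModule(String name) { this.name = name; }

    SnapshotModule hire(String employeeName) {
        SnapshotModule employee = new SnapshotModule(employeeName);
        employees.add(employee);
        return employee;
    }

    // Records a setting as the command that would reproduce it from the default state.
    SnapshotModule set(String command) {
        ownCommands.add(command);
        return this;
    }

    // Harvests the script that would return this module and all its employees to their current state.
    List<String> snapshot() {
        List<String> script = new ArrayList<>();
        appendSnapshot(script);
        return script;
    }

    private void appendSnapshot(List<String> script) {
        script.addAll(ownCommands);
        for (SnapshotModule employee : employees) {
            script.add("hire " + employee.name);   // on replay, this would return the hired module
            script.add("tell It");                 // direct the following commands to that module
            employee.appendSnapshot(script);       // the employee contributes its own commands
            script.add("endTell");
        }
    }
}
```

Each module only needs to know how to describe itself and which employees it has; the recursion assembles the whole script, which is why the task turns out simpler than it first appears.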
© W. Maddison & D. Maddison 1999-2000