Metaobject Protocols

In Fall of 2006 I did a small project on Metaobject Protocols for my CS 331 class. Here lie my notes which may perhaps be useful to others. I hope to expand them into something more useful over time.

Background

Object Protocols

An object protocol is a set of methods and specification of the interactions between the methods which provide some generic behavior (e.g. of a sequence) that are then implemented by classes which conform to the protocol (e.g. a vector or list). In most object systems a class contains both the methods which implement a protocol and the data used by the implementation. The intent is to emulate state machines which pass messages between each other.

CLOS Way of OO

The Common Lisp Object System (CLOS) is different. It separates the data and method concepts into classes and generics. A class contains data fields only, and a generic has methods specialized for certain types attached to it. This seems a bit weird at first, but is significantly more powerful as it encourages complete encapsulation through its use of classes primarily for method specialization rather than for state storage.

Classes for Scratch Data and Types

In CLOS classes store data in slots (which are the same as data members). Encapsulation is not provided; any bit of code can use slot-value to access or set the value of a slot. This may seem odd at first, but encapsulation is of questionable importance as the slots are meant only to be used by the protocol defined around the class.

Classes are defined with defclass

(defclass name (superclasses ...)
  ((slot-name :accessor slot-accessor ...)
   ...)
  (class-options ...))

(defclass example ()
  ((foo :accessor foo-of :initform 5)))

(defclass example-child (example)
  ((bar :accessor bar-of :initform (list 1 2 3))))

Slot definitions have several options; the above example shows only the :accessor and :initform options which are the most commonly used. :accessor generates an accessor for the slot (e.g. if you have an instance of example you can (setf (foo-of some-example-instance) 'some-value) to set and (foo-of some-example-instance) to access the value). :initform provides a default initial value for the slot as a symbolic expression to be evaluated when an instance is created in the lexical environment of the class definition.

Generics with Methods that Implement Protocols

Generics are like normal functions in Lisp, but they only provide a lambda list (parameter list). Methods are added to the generic which specialize on the types of their parameters and provide an implementation. This allows writing rich layered protocols which can enable selective modification of individual facets with minimal code.

(defgeneric generic (parameters ...)
  (options) ...)

(defmethod generic-name ((parameter type) parameter ...)
  "documentation string"
  body)

(defgeneric foo (bar baz quux)
  (:documentation "Process the baz with the quux capacitor to make the
foo widget fly into the sky at warp speed"))

(defmethod foo ((bar example) baz (quux capacitor))
  (launch bar (process-with quux baz)))

A method lambda list differs from a normal lambda list only in that it can specify the type of the parameter using the notation (name type). Note also that methods can specialize on the types of every argument and not just the first one. This is quite powerful for reasons outside of the scope of this presentation.

Limitations of Default Language Behavior

The behavior of a language is a compromise between many competing issues that attempts to be as generally useful as possible so that most applications will have no issue with the default behavior. There are, however, certain applications that could be cleanly written with minor modifications to the behavior of the language, but would be impossible or quite difficult to write otherwise.

Slot Storage

Most languages choose to preallocate storage for all of the slots of an instance. Now imagine a contact database that stores information about people in slots of a class. There may be dozens of slots, but often many of them will be left blank. If slot storage is preallocated much memory will be wasted and the database may not be able to fit into the memory of the hardware it must run on (perhaps for financial reasons, huge datasets, etc.).

To save memory the author of the contact database must implement his own system to store properties and allocate them lazily. This represents a fair bit of effort, and would implement a system that differed from the existing slot system of classes only regarding slot storage.

It would be useful if there were a way to customize slot allocation in instances. The customizations would be minor and require overriding only the initial allocation behavior and the behavior of the first assignment to the slot. It is a a trivial problem in a language that allows customization of these behaviors.

Design Patterns

Design Patterns are generalized versions of common patterns found in programs. Many of them are merely methods to get around deficiencies in the language, and can be quite messy to implement in some languages. Ideally a pattern would be subsumed by the language, but real world constraints require language standards to remain fairly static.

Metasoftware

Some types of programs could be written easily if the language were customizable but are nearly impossible to write when it is not.

Runtime Generated Classes

Say you wanted to write a video game where players could create their own objects, attach behaviors to the objects, and perhaps mix different objects together to create new ones. When you abstract the problem this looks just like an object system! Wouldn't it be nice if your program could create new classes and methods on the fly portably?

Object Inspection

Imagine you were developing a complicated program with many different objects that interacted in fairly complex ways. A tool to inspect the structure of objects while debugging would be quite useful, but in a traditional language would be impossible to implement portably. This could force you to purchase a certain compiler implementation which provided an inspector, and even then would likely not be customizable.

This problem can be generalized to apply to most debugging tools; it would be useful to write such tools portably because users of the language and not the compiler need to debug software. Sharing infrastructure would result in better tools (more developers), and save the man-years of wasted effort that comes with having to rewrite unportable tools from scratch multiple times.

Metaobject Protocols

Limited/Generalized Internals of the Implementation

A Metaobject Protocol (MOP) is a generalized and limited subset of the underlying language implementation. It is limited to allow multiple implementation strategies; this, along with careful design, is essential because programming language research is ever advancing and new techniques for creating more reliable and faster implementations are still being discovered.

This subset of the implementation is exported as a set of methods on metaobjects. Thus the language is implemented in itself. The system can then be customized using the extension and overriding features of the language itself.

Classes of MOPs

Reflective

A reflective MOP provides an interface to information about the running system. It exposes class relationships, the methods attached to a generic, etc. A reflective MOP often provides some functionality for creating new classes at runtime. Smalltalk was one of the first languages to expose a reflective MOP.

Example: Object Inspector

(defgeneric example-inspect (instance)
  (:documentation "Simple object inspector using CLOS MOP"))

(defmethod example-inspect ((instance t))
  (format t "Simple Object~% Value: ~S~%" instance))

(defmethod example-inspect ((instance standard-object))
  (let ((class (class-of instance)))
    (format t "Class: ~S, Superclasses: ~S~%"
            (class-name class)
            (mapcar #'class-name
                    (class-precedence-list class)))
    (let ((slot-names (mapcar #'slot-definition-name
                              (class-slots class))))
      (format t "Slots: ~%~{ ~S~%~}" slot-names)
      (inspect-loop slot-names instance #'example-inspect))))

(defun inspect-loop (slots instance inspector)
  (format t "Enter slot to inspect or :pop to go up one level: ")
  (finish-output)
  (let* ((slot (read))
         (found-slot (member slot slots)))
    (cond (found-slot
           (funcall inspector (slot-value instance slot))
           (funcall inspector instance))
          ((eq slot :pop) t)
          (t
           (format t "~S is invalid. Valid slot names: ~S~%"
                   slot
                   slots)
           (inspect-loop slots instance inspector)))))

Example: Runtime Generated Classes and Methods

Intercessory

Intercessory MOPs allow the user to customize language behavior by implementing methods which override certain aspects of the language behavior. This class of MOPs are what make MOPs especially powerful. No longer must a problem be restructured to fit the implementation language; the underlying language can be reshaped to fit the task at hand, and obfuscation of the intended structure of the application can be avoided.

Example: Lazily Allocated Slots

Example: Observer Design Pattern

A simple implementation of the observer pattern is under 100 lines, and the user level code requires only a single line of code to make any existing class observable.

In a language lacking a MOP, implementing the observer pattern requires modifying every accessor of a class to explicitly invoke any observers, and necessitates the addition of a mixin class to the class hierarchy. The fact that an object can be observed is a meta property of the class, and forcing it to be implemented at the application level dirties the inheritance hierarchy and adds unnecessary meta details to the program.

;;; This metaclass adds a slot to instances which use it, and so the
;;; system is defined in its own package to avoid name conflicts
(defpackage :observer
  (:use :cl :c2mop)
  (:export observable register-observer unregister-observer))

(in-package :observer)

;;; Metaclass
(defclass observable (standard-class)
  ()
  (:documentation "Metaclass for observable objects"))

(defmethod compute-slots ((class observable))
  "Add a slot for storing observers to observable instances"
  (cons (make-instance 'standard-effective-slot-definition
                       :name 'observers
                       :initform '(make-hash-table)
                       :initfunction #'(lambda () (make-hash-table)))
        (call-next-method)))

(defmethod validate-superclass ((class observable)
                                (super standard-class))
  t)

(defun register-observer (instance slot-name key closure)
  (register-observer-with-class (class-of instance)
                                instance
                                slot-name
                                key
                                closure))

(defun unregister-observer (instance slot-name key)
  (unregister-observer-with-class (class-of instance)
                                  instance
                                  slot-name
                                  key))

(defun get-observers (instance slot-name)
  (get-observers-with-class (class-of instance)
                            instance
                            slot-name))

(defun add-observer-table (instance slot-name)
  (setf (gethash slot-name (slot-value instance
                                       'observers))
        (make-hash-table)))

(defgeneric register-observer-with-class (class instance slot-name key closure))
(defgeneric unregister-observer-with-class (class
                                            instance
                                            slot-name
                                            key))

(defmethod register-observer-with-class ((class observable)
                                         instance
                                         slot-name
                                         key
                                         closure)
  (setf (gethash key
                 (or (gethash slot-name
                              (slot-value instance 'observers))
                     ;; Lazily add observer hash tables
                     (add-observer-table instance slot-name)))
        closure))

(defmethod unregister-observer-with-class ((class observable)
                                           instance
                                           slot-name
                                           key)
  (remhash key (gethash slot-name
                        (slot-value instance 'observers))))

(defmethod get-observers-with-class ((class observable)
                                     instance
                                     slot-name)
  (gethash slot-name (slot-value instance 'observers)))

(defmethod (setf slot-value-using-class) :before (new-value
                                                  (class observable)
                                                  instance
                                                  slot)
  (let ((slot-name (slot-definition-name slot)))
    (if (not (eq 'observers slot-name))
        (let ((observers
               (get-observers instance (slot-definition-name slot))))
          (if observers
              (maphash #'(lambda (key observer)
                           (funcall observer
                                    (if (slot-boundp instance slot-name)
                                        (slot-value instance slot-name)
                                      nil)
                                    new-value))
                       observers))))))

Violation of Encapsulation?

A MOP may seem like a violation of encapsulation by revealing some implementation details, but in reality a well designed protocol does not reveal anything which was not already exposed. Implementation decisions affect users, and some of these details do leak through to higher levels (e.g. the memory layout of slots). Implicit in the protocol specification are these implementation details, and the MOP merely makes this limited subset available for customization.

A MOP makes it possible to customize certain implementation decisions that do not radically alter the behavior of the base language. The conceptual vocabulary of the system retains its meaning, and so code written in one dialect can interact with code written in another without knowing that they speak different ones.

MOP Design Principles

Layered Protocol

A layered protocol design is good for both meta and normal object protocols, and enables a combinatorial explosion of customizations to the protocol.

Top Level Must Call Lower Level Methods

The top level methods of a layered protocol are required to call certain lower level methods to perform some tasks. This both makes it easier to customize the top level methods (which perform very broad tasks) by providing some pieces of implementation for the programmer, and enables more customization by opening up the replacement of lower level functions as a way to alter a small detail of the high level behavior.

Lower Level Methods are Easier to Customize

The lower level methods of a MOP are limited in scope and can be implemented easily. Often the desired changes to language behavior are minor, and having methods that perform simple tasks which are often customized reduces the effort required to extend the system.

Functional Where Possible

Functional protocols are preferred for MOPs (and object protocols in general). Functional protocols open up several optimizations for the implementation without burdening the user of the protocol.

Memoization

Memoization is the process of saving the results of a function call for future use. This avoids expensive recomputation of values which have not changed (recall that a true function will always return the same result when given the same arguments).

A functional MOP can be optimized easily by exploiting this property to memoize the return values of calls to expensive operations. A MOP must be be very fast to avoid making programs unusably slow, and memoization is able to give an appreciable speedup in many cases without a significant burden on memory usage.

Constant Shared Return Values

Disallowing modification of values returned by protocol methods allows the implementation to return large data structures by reference to avoid expensive copying without having to do expensive data integrity checks or copying.

Procedural Only Where Necessary

Some operations like method invocation are inherently stateful and so must use a procedural protocol. There is no benefit to be gained from using a functional protocol, and indeed an attempt would result in obtuse code that severely restricted the implementation. Do note that only a very small part of method invocation is stateful (the actual call), and most of it can be implemented functionally (e.g. computing the discriminating function).

Real World

UCW and Arnesi

Arnesi uses the CLOS MOP to implement methods which are transparently rewritten into continuation passing style. This allows their execution to be suspended at certain points and resumed later. UCW builds on top of this to support a web framework where the statelessness of http is hidden from the user; displaying a page suspends the execution of the current continuation, and resumes it upon submission. The user level code is completely unaware of this.

CLSQL

CLSQL uses the reflective part of the CLOS MOP to map Common Lisp data types into SQL types, and the intercessory protocol for slot allocation to map slots onto database columns or sql expressions (for implementing relational slots).

Elephant

Elephant uses the CLOS MOP to transparently store any class to disk and handle paging between the disk store and memory efficiently without user intervention.

Sources and Further Reading

Sources

The Art of the Metaobject Protocol

Kiczales, Gregor et al. MIT Press 1991

Highly recommended reading even if you plan to never implement a MOP or use the CLOS one. The design principles it recommends are quite useful.

CLOS MOP Specification

Specification of the MOP for CLOS defined in The Art of the Metaobject Protocol.

Metaobject Protocols: Why We Want Them and What Else They Can Do

A short overview of MOP design principles followed by three example metaobject protocols for Scheme.

Why Are Black Boxes so Hard to Reuse?

Transcription of a talk on the benefits of open implementations of software. It first discusses several problems that black box software implementations pose, and then presents existing solutions. It shows how the existing solutions are insufficient, and then provides metaobject protocols as a solution to most of the problems.

Software

Closer to MOP

Compatibility layer that attempts to present the Art of the Metaobject Protocol MOP specification properly in as many Common Lisp implementation as possible.

Last Modified: January 21, 2013