Atomized

Old McCarthy Had a Form

What is EIEIO?

EIEIO is an Emacs Lisp implementation of (most of) the Common Lisp Object System (CLOS), which is a library for writing object-oriented Lisp. Both CLOS and EIEIO are implemented as normal Lisp code, with no special support from the runtime.

Why OOP Lisp?

Different problems require different solutions, which is why the most successful programming languages are multi-paradigm. There’s no single approach that’s optimal for every program, so choosing the best fit for your problem yields better results than cramming everything into the same model. OOP’s strengths make it a good choice for many medium- to large-sized programs needing encapsulation, abstraction, and/or extensibility.

My goal with this is for Emacs Lisp programmers to recognize cases where EIEIO is a good fit, and think to themselves "aha! there’s a thing for that!", then get on with writing their program instead of creating tooling that reimplements part of what EIEIO already does.

The CLOS Model

If you’ve used OOP languages before, you may find the CLOS approach surprising. It’s object-oriented, but very different from popular OOP languages like Java, Python, Ruby, C++, etc.

Classes

Classes are similar to other OOP languages: they define how a structure encapsulates data. Unlike other languages, CLOS classes only have fields ("slots," in CLOS parlance) — no methods.

Here’s an example of a base class for an EMMS player. It only has one slot, a boolean to indicate if it’s playing or not.

(defclass form/emms-player ()             ; parent class/es
  ((playing :type boolean :initform nil)) ; slot (field) definitions
  :abstract t)                            ; EIEIO extension

The :abstract option is an EIEIO extension, but it’s a good one. Similar to Java, an abstract class may be extended, but cannot be instantiated by itself. Abstract classes aren’t always necessary, but in this case the class represents a lower type bound.

Once the base class is defined, it can be extended. Here’s one which can hold an MPV implementation of the form/emms-player:

(defclass form/emms-player-mpv (form/emms-player) ; Child of form/emms-player
  ())                                             ; no additional slots
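To see the abstract/concrete distinction in action, here is a minimal sketch. The demo/ names are hypothetical stand-ins for the player classes above, used so the snippet can be evaluated on its own.

```elisp
(require 'eieio)

;; Hypothetical stand-ins for the abstract player and its MPV child.
(defclass demo/player ()
  ((playing :type boolean :initform nil))
  :abstract t)

(defclass demo/player-mpv (demo/player)
  ())

;; Calling the abstract class's constructor signals an error,
;; so this form returns the symbol `refused'.
(condition-case nil
    (demo/player)
  (error 'refused))

;; An instance of the child is also an instance of the parent,
;; so this form returns non-nil.
(object-of-class-p (demo/player-mpv) 'demo/player)
```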

Instantiation

Defining the class also defines a constructor function with the same name, so getting an object is as simple as:

(form/emms-player-mpv)

This is the normal way of creating objects, but if you’re doing more dynamic programming and need to instantiate different classes based on a symbol value, make-instance makes that easier:

(let ((class 'form/emms-player-mpv))
  (make-instance class))

The #s reader macro will evaluate to an equivalent object, which means objects can be printed and read as normal:

(list #s(form/emms-player-mpv nil)
      (read "#s(form/emms-player-mpv nil)"))

Accessing Slots

Once you have a class instance, you can access the slots in three ways.

The simplest are the oref and oset macros. Since they’re terse and don’t evaluate the slot argument, they can be convenient in some cases.

(let ((p (form/emms-player-mpv)))
  (progn (oset p playing t)
	 (oref p playing)))

The slightly more verbose slot-value function takes an evaluated slot name, making it a good choice if you need to store the slot symbol in a variable.

(let ((slot 'playing))
  (slot-value (form/emms-player-mpv) slot))

Both styles are setf-able:

(let ((p (form/emms-player-mpv)))
  (list
   (progn (setf (slot-value p 'playing) t)
	  (slot-value p 'playing))
   (progn (setf (oref p playing) nil)
	  (oref p playing))))

But the nicest way is the with-slots macro, which is a bit like a destructuring let-bind. The first argument is a list of slot symbols, which behave like ordinary variables within the body of the macro. The bound symbols are also setf-able:

(with-slots (playing) (form/emms-player-mpv)
  (list playing (progn (setf playing t) playing)))

Slot Initialization

There are three ways to set the initial value of a slot.

The class’ :initform (if specified) is the default if no other value is provided. You’re not limited to constants or static defaults — the form is evaluated at object construction time. So you can do things like initialize a slot with the current time:

(defclass form/now ()
  ((time :initform (current-time-string))))

(list
 (oref (form/now) time)
 (progn (sleep-for 1)
	(oref (form/now) time)))

If the slot has an :initarg option set, it specifies the keyword used to initialize that slot when the object is created.

(defclass form/now ()
  ((time :initform (current-time-string)
	 :initarg :time)))

(list (slot-value (form/now :time "about half past noon") 'time)
      (oref (make-instance 'form/now :time "a quarter till 1") time))

The third way is to define an initialize-instance method, which is useful for cases where initialization depends on the value of multiple slots. But I haven’t covered generic functions and methods yet, so let’s do that now.

Generic Functions

CLOS implements dynamic dispatch from a generic function to a method. The generic function is like an interface: it only specifies the function name and argument list. CLOS doesn’t have explicit, named interfaces.

This example shows some generic functions that would work with an EMMS player object. EMMS doesn’t use CLOS, but these are the same functions it needs to define a player backend.

(cl-defgeneric form/emms-playablep (player track))
(cl-defgeneric form/emms-start     (player track))
(cl-defgeneric form/emms-stop      (player))

Methods & Specialization

Methods are implementations of generic functions. In the same way that a Java interface can have many classes implementing it, a generic function may have many methods. Methods are never called directly; code calls the generic function (just like any other function), and EIEIO dynamically dispatches to an appropriate method. The method definitions include specializers which tell CLOS when to invoke them.

Defining a method will implicitly create the generic function, if it hasn’t already been defined.

Building on the EMMS example, here are some fake implementations of the EMMS player functions:

;; Method applies when PLAYER arg is an instance of form/emms-player-mpv (or a subclass)
(cl-defmethod form/emms-playablep ((player form/emms-player-mpv) track)
  (not (eql :unplayable track)))       ; MPV can play almost anything!

(cl-defmethod form/emms-start     ((player form/emms-player-mpv) track)
  "Started")

(cl-defmethod form/emms-stop      ((player form/emms-player-mpv))
  "Stopped")

In this example, the first argument contains the specializer, which says that the player argument must be an instance of form/emms-player-mpv (or an instance of a subclass of it).

Even though you can dispatch on subclass-of-type, you aren’t limited to it. You can:

  • Dispatch on primitive type: (x integer)
  • Dispatch on value: (x (eql 5))
  • Dispatch on properties of multiple arguments: ((x integer) (y float))
  • Fallback, default dispatch: (x t)
  • Define your own: see cl-generic-generalizers
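A small sketch of several of these specializers working together may help. The generic function demo/classify and its return strings are made up for illustration; only the specializer forms come from the list above.

```elisp
(require 'cl-lib)
(require 'cl-generic)

;; Hypothetical generic function, used only to show which method wins.
(cl-defgeneric demo/classify (x y)
  "Return a string naming the method that handled X and Y.")

;; Primitive type on the first argument.
(cl-defmethod demo/classify ((_x integer) _y)
  "integer, anything")

;; A specific value on the first argument.
(cl-defmethod demo/classify ((_x (eql 5)) _y)
  "five, anything")

;; Types of two arguments at once.
(cl-defmethod demo/classify ((_x integer) (_y float))
  "integer, float")

;; Fallback when nothing more specific applies.
(cl-defmethod demo/classify ((_x t) _y)
  "fallback")

(list (demo/classify 5 'a)
      (demo/classify 6 'a)
      (demo/classify 6 1.0)
      (demo/classify "six" 'a))
;; → ("five, anything" "integer, anything" "integer, float" "fallback")
```

The most specific applicable method is chosen each time, so the order in which the methods are defined doesn't matter.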

Qualifiers

CLOS uses a qualifier to support four different kinds of methods. The examples so far have had empty qualifiers, which makes them the primary method; these are equivalent to methods or functions in other languages. In addition to those, CLOS has:

  • :before. Evaluated before the primary method(s).
  • :after. Evaluated after the primary method(s).
    • Both before/after are for side effects only; their return values are discarded.
  • :around. Evaluated around all other method types. Their return value is the return value of the function; they may return the primary method’s value or substitute their own.

Both the primary and around methods can choose to run the rest of the methods, by calling cl-call-next-method, or choose not to, returning their own value instead.

If you’ve done much Emacs Lisp programming, those last three types will be very familiar: they replicate Emacs’ built-in function advice. But, where advice runs globally on every function call, qualified methods may be composed into an EIEIO class hierarchy. This allows freely mixing in new behavior to existing code, without need for guards.
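Qualifiers also give a natural home to the third slot-initialization approach mentioned earlier. As a rough sketch, an :after method on initialize-instance runs once the default initialization has filled in the slots from their initargs and initforms, so it can derive one slot from others. The demo/clip class and its slots are hypothetical.

```elisp
(require 'eieio)

;; Hypothetical class: `duration' is derived from `start' and `end'.
(defclass demo/clip ()
  ((start    :initarg :start :initform 0)
   (end      :initarg :end   :initform 0)
   (duration :initform nil)))

;; Runs after the default initialization, so `start' and `end'
;; already hold their final values here.
(cl-defmethod initialize-instance :after ((clip demo/clip) &rest _slots)
  (with-slots (start end duration) clip
    (setf duration (- end start))))

(oref (demo/clip :start 10 :end 25) duration)
;; → 15
```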

Multiple Inheritance

In addition to the single inheritance in the previous examples, EIEIO supports full multiple inheritance. This is a very powerful mechanism which allows deeper separation of concerns than most languages, where extremely generic code may be composed to get the precise behavior desired.

For example, here’s an implementation of a logging class. It’s completely separated from any other concerns: it only logs, and only knows about logging.

(defclass form/logger ()
  ((messages :initform nil)))

(cl-defmethod form/log ((logger form/logger) format-string &rest args)
  (with-slots (messages) logger
    (push (apply #'format format-string args) messages)))

(cl-defmethod form/logs ((logger form/logger))
  (with-slots (messages) logger messages))

(cl-defmethod form/latest-log ((logger form/logger))
  (car (form/logs logger)))

(let ((l (form/logger)))
  (form/log l "Hello, %s!" "world")
  (form/latest-log l))

Building on that, a generic logging EMMS player class can be defined, which adapts the logger class to the EMMS player interface.

The methods use the three novel qualifiers: :before emits a log prior to a player starting, :after emits a log after the player stops, and :around logs whether a track was playable.

(defclass form/emms-logging-player (form/logger)
  ())

(cl-defmethod form/emms-start :before ((player form/emms-logging-player) track)
  (form/log player "Playing track: `%s'" track))

(cl-defmethod form/emms-stop :after ((player form/emms-logging-player))
  (form/log player "Stopped"))

(cl-defmethod form/emms-playablep :around ((player form/emms-logging-player) track)
  (let ((playable (cl-call-next-method player track)))
    (prog1 playable
      (form/log player "Track `%s' is%s playable" track (if playable "" "n't")))))

With those two classes defined, they can be composed into a logging MPV player, by defining a new class which extends a concrete MPV player class and a logger.

(defclass form/emms-logging-player-mpv (form/emms-logging-player form/emms-player-mpv)
  ())

(let ((ml (form/emms-logging-player-mpv)))
  (list (list (form/emms-playablep ml :unplayable) (form/latest-log ml))
	(list (form/emms-playablep ml :foo) (form/latest-log ml))
	(list (form/emms-start ml :foo) (form/latest-log ml))
	(list (form/emms-stop ml) (form/latest-log ml))))

Note how generic this code is:

  • form/logger only knows about logging: it doesn’t know that it’s logging EMMS actions.
  • form/emms-player-mpv only knows about playing, and doesn’t have any logging code at all.
  • form/emms-logging-player only cares about logging actions, and knows nothing about the specific player backend. It can be composed with any player backend to extend its functionality.
  • form/emms-logging-player-mpv only combines classes, it has no special knowledge of what they do.

The combination of multiple inheritance and before/after/around methods allows extremely rich & flexible functionality. Around methods mean you can add new functionality like memoization or performance profiling to existing code, without having to change it at all.
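As an illustration of the memoization idea, here is a minimal sketch. All of the demo/ names are hypothetical: demo/squarer stands in for existing code, and demo/memoizing is a mixin that knows nothing about what it caches.

```elisp
(require 'eieio)

(defvar demo/compute-calls 0
  "How many times the primary `demo/compute' method has run.")

;; The "existing code": a class whose method does the real work.
(defclass demo/squarer () ())

(cl-defgeneric demo/compute (obj n)
  "Compute a result for N using OBJ.")

(cl-defmethod demo/compute ((_obj demo/squarer) n)
  (cl-incf demo/compute-calls)
  (* n n))

;; The mixin: each instance gets its own cache table.
(defclass demo/memoizing ()
  ((cache :initform (make-hash-table :test #'eql))))

;; Only call the next method on a cache miss.
(cl-defmethod demo/compute :around ((obj demo/memoizing) n)
  (with-slots (cache) obj
    (let ((hit (gethash n cache :demo/miss)))
      (if (eq hit :demo/miss)
          (puthash n (cl-call-next-method) cache)
        hit))))

;; Compose the two without touching either one.
(defclass demo/memoizing-squarer (demo/memoizing demo/squarer) ())

(let ((sq (demo/memoizing-squarer)))
  (setq demo/compute-calls 0)
  (list (demo/compute sq 12) (demo/compute sq 12) demo/compute-calls))
;; → (144 144 1)
```

The second call is served from the cache, so the primary method runs only once.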

Structures

If you only need to encapsulate data, there’s also cl-defstruct. Structs don't support multiple inheritance, but they perform better than classes. Since generic functions can dispatch on any type, structs can still participate in interface abstractions.
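For instance, a struct can be used as a method specializer just like a class. This is a small sketch with made-up names (demo/track and demo/summary).

```elisp
(require 'cl-lib)
(require 'cl-generic)

;; A plain struct; defines `make-demo/track' and the slot accessors.
(cl-defstruct demo/track title seconds)

(cl-defgeneric demo/summary (thing)
  "Return a one-line description of THING.")

;; Methods can specialize on the struct type.
(cl-defmethod demo/summary ((track demo/track))
  (format "%s (%ds)"
          (demo/track-title track)
          (demo/track-seconds track)))

(demo/summary (make-demo/track :title "Intro" :seconds 42))
;; → "Intro (42s)"
```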

Nice Properties

There are a bunch of things I like about how this works.

It feels like normal Lisp. There’s limited new syntax, and you interact with the system by calling functions, as usual.

If you only need to encapsulate data, you can use only classes or structs. If you don't need a bunch of interface abstraction, you can pass objects to normal functions. If you only need interface abstraction, you can write generic functions without using classes or structs. This à la carte approach lets you use only the tools you need, instead of pulling in unnecessary baggage or boxing you into a particular model.

Multiple inheritance means you get type punning. An instance of form/emms-logging-player-mpv is both a logger and an EMMS player, and can be passed to any code which expects either.

An upshot of that is that third-party code can be adapted to any interface without needing to modify it. Whereas Java-type systems may require building an adapter class (which then isn't an instance of the wrapped type), CLOS doesn't have this problem.

Dynamic dispatch is fully controlled by the methods, rather than the generic functions. A programmer who creates an interface may not be able to foresee all the ways it needs to be used. Methods declaring their own dispatch conditions means that if you’re implementing an interface, you’re guaranteed to be able to express the dispatch conditions you need.

Because the calling convention is (function argA argB…), there’s no possibility for a null pointer exception, because there’s nothing to be null; the generic function always exists.

And because you can define a method specialized to a nil argument, you can have the same nil punning as in the rest of Lisp.
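A quick sketch of that last point: specializing a method on the null type gives the generic function a defined behavior for nil. The demo/logger class and demo/latest-log function here are hypothetical, loosely modeled on the logger above.

```elisp
(require 'eieio)

(defclass demo/logger ()
  ((messages :initarg :messages :initform nil)))

(cl-defgeneric demo/latest-log (logger)
  "Return the most recent message held by LOGGER.")

(cl-defmethod demo/latest-log ((logger demo/logger))
  (car (oref logger messages)))

;; With no logger at all, quietly return nil instead of signaling.
(cl-defmethod demo/latest-log ((_logger null))
  nil)

(list (demo/latest-log (demo/logger :messages '("hi")))
      (demo/latest-log nil))
;; → ("hi" nil)
```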

Practical Examples

Now that the model's laid out, how can it be put to use? I think EIEIO solves a lot of very practical problems.

Example: transmission.el

Here’s some abbreviated code from Emacs’ Transmission torrent client:

(defcustom transmission-host "localhost")
(defcustom transmission-service 9091)
(defcustom transmission-rpc-path "/transmission/rpc")
(defcustom transmission-rpc-auth '(:username "transmission"
					     :password "trance mission"))

For this to work, all four variables have to be set, and set consistently. Annoyingly, the customization interface sorts them alphabetically, by variable name, instead of grouped logically, by function. So they end up strewn all over the buffer.

Instead, these could be expressed as an EIEIO class:

(defclass form/transmission-connection ()
  ((host :type string :initarg :host
	 :initform "localhost"
	 :custom string
	 :documentation "Hostname or IP address to connect to.")
   (port :type integer :initarg :port :initform 9091
	 :custom integer :label "Port"
	 :documentation "TCP port on host to connect to.")
   (path :type string :initarg :path :initform "/transmission/rpc"
	 :custom string
	 :label "Path"
	 :documentation "Path to Transmission RPC endpoint on HOST:PORT.")
   (username :type string :initarg :username :initform "transmission"
	     :custom string :label "Username"
	     :documentation "User to authenticate as")
   (password :initarg :password :initform nil
	     :custom (choice
		      (const :tag "Fetch from auth-source" nil)
		      (string :tag "Password"))
	     :label "Password"
	     :documentation "Password for USERNAME"))

  :documentation "A class encapsulating a Transmission connection.")

This has two benefits. Firstly, Emacs’ global namespace is less polluted. With four variables, they all need the transmission- prefix, which is kludgey. Using a class allows for a single global variable, bound to an object, with the data neatly tucked into its slots.

Secondly, Emacs’ customization interface knows how to edit EIEIO objects. If you load eieio-custom, and specify the variable holding the object as type 'object, like this:

(require 'eieio-custom)

(defcustom form/transmission-conn (form/transmission-connection)
  "The Transmission connection."
  :type 'object
  :tag "Transmission connection")

Then the individual slots show up as separate (but grouped) customizable fields, making it very easy to edit, and also to express invariants about validity, making the configuration process much smoother.
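Code that consumes the settings also gets simpler, since it receives one object rather than reading several globals. The following is only a sketch: demo/connection is a trimmed-down, hypothetical version of the class above, and both the demo/connection-url helper and the http scheme are assumptions for illustration, not part of transmission.el.

```elisp
(require 'eieio)

;; Hypothetical, trimmed-down connection class.
(defclass demo/connection ()
  ((host :type string  :initarg :host :initform "localhost")
   (port :type integer :initarg :port :initform 9091)
   (path :type string  :initarg :path :initform "/transmission/rpc")))

(defun demo/connection-url (conn)
  "Build an RPC URL from CONN.  The http scheme is an assumption."
  (with-slots (host port path) conn
    (format "http://%s:%d%s" host port path)))

(demo/connection-url (demo/connection :host "nas.local"))
;; → "http://nas.local:9091/transmission/rpc"
```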

Example: sql.el

Here’s the definition of the PostgreSQL backend for the built-in sql.el:

(postgres                               ; from sql-product-alist
 :name "Postgres"
 :font-lock sql-mode-postgres-font-lock-keywords
 :sqli-program sql-postgres-program
 :sqli-options sql-postgres-options
 :sqli-login sql-postgres-login-params
 :sqli-comint-func sql-comint-postgres
 :list-all ("\\d+" . "\\dS+")
 :list-table ("\\d+ %s" . "\\dS+ %s")
 :completion-object sql-postgres-completion-object
 :prompt-regexp "^[[:alnum:]_]*=[#>] "
 :prompt-length 5
 :prompt-cont-regexp "^[[:alnum:]_]*[-(][#>] "
 :statement sql-postgres-statement-starters
 :input-filter sql-remove-tabs-filter
 :terminator ("\\(^\\s-*\\\\g\\|;\\)" . "\\g"))

There’s a similar entry for every supported backend. The keywords on the left are common, shared among all backends, and the values on the right are the backend-specific implementations. Some are simple values, while others are functions.

This is a very clear example of an OOP interface abstraction. It’s already OOP, but it’s expressed with an ad-hoc system instead of a formal one.
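To make the comparison concrete, here is a rough sketch of how a sliver of that entry might look with generic functions. None of this is how sql.el actually works: the demo/ classes and functions are hypothetical, and the reading of :list-all as a (normal . enhanced) pair is an assumption.

```elisp
(require 'eieio)

;; Hypothetical base class shared by every backend.
(defclass demo/sql-product () ()
  :abstract t)

(defclass demo/sql-postgres (demo/sql-product) ())

;; The shared keywords become generic functions...
(cl-defgeneric demo/sql-name (product)
  "Return the display name of PRODUCT.")

(cl-defgeneric demo/sql-list-all (product &optional enhanced)
  "Return the command that lists all objects in PRODUCT.")

;; ...and the backend-specific values become methods.
(cl-defmethod demo/sql-name ((_product demo/sql-postgres))
  "Postgres")

(cl-defmethod demo/sql-list-all ((_product demo/sql-postgres) &optional enhanced)
  (if enhanced "\\dS+" "\\d+"))

(let ((pg (demo/sql-postgres)))
  (list (demo/sql-name pg)
        (demo/sql-list-all pg)
        (demo/sql-list-all pg t)))
```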

Example: Modes

Emacs supports defining modes based on other modes’ behaviors:

(define-derived-mode shell-mode comint-mode "Shell" ...)
(define-derived-mode inferior-python-mode comint-mode "Inferior Python" ...)

What would it look like to have major modes implemented as EIEIO classes? Instead of the special-case define-derived-mode, new modes could extend others with the standard EIEIO mechanism.

Following this line of thought, what about minor modes? They’re very similar to mixins, adding new functionality to an existing mode. What if, instead of enabling minor modes with hooks, they were classes that got mixed in with major modes?

Conclusion

I think the CLOS/EIEIO model is extraordinarily powerful, much more so than other OOP languages I’ve worked with. It’s amazing that Emacs Lisp has this rich functionality out of the box, and it's a testament to Lisp's power that a whole new paradigm can be implemented without modifying the runtime.

This is an expanded version of my EmacsConf 2021 talk, Old McCarthy Had a Form.