Blog Closed

This blog has moved to Github. This page will not be updated and is not open for comments. Please go to the new site for updated content.

Sunday, February 28, 2010

Proposal to Change find_method

Austin Hastings, as part of his Kakapo project (which I now have a commit bit to!) has started creating a mock object framework. We were talking about how to implement expected method calls, so I took a look at the find_method VTABLE of the Object PMC for some inspiration. What I saw was absolutely horrible, so I promptly created a branch to fix it. However, the more I looked and edited, the bigger I found the problems to be. I'll talk more about Kakapo in another post.

When I do code like this:

$P0 = new ['Foo']
$P0.'Bar'()

What is really happening is something similar to this:

$P0 = new ['Foo']
$P1 = find_method $P0, 'Bar'
callmethodcc $P0, $P1

Internally, the find_method opcode calls the VTABLE_find_method function on the given object. The object itself is expected then to walk the method resolution order (MRO) of it's inheritance hierachy to find a suitable method and return it. Along the way, the Object PMC needs to completely violate the encapsulation of the Class PMC to gather information about the MRO and then to search the list of methods in the Class for an entry with the given name. In short version, the C code from Object.find_method looks like this:

int num_classes = VTABLE_elements(interp, class->all_parents);
int i;
for (i = 0; i < num_classes; i++) {
cur_class = VTABLE_get_pmc_keyed_int(interp,class->all_parents, i);
if (VTABLE_exists_keyed_str(interp, class->methods, name))
return VTABLE_get_pmc_keyed_str(interp, class->methods, name);
}

So Object reads the attributes of it's Class PMC directly, and manually traverses the MRO looking for the proper method. This causes a few problems. First, as a mostly stylistic point, this completely breaks encapsulation. We can't make a change to the MRO or the method storage and lookup mechanism in Class without likewise changing the behavior in Object.

Second point, since Object needs to know how to traverse the MRO and lookup methods, and requires intimate internal knowledge of the classes in the MRO, we are extremely limited in the types of objects that can be in the inheritance hierarchy. That is, we can't define our own metaobject types, we must use Class or PMCProxy, or a subclass thereof (and a careful reading of the code suggests that even subclasses will not work). This seems to be a remarkable limitation when you consider some of the diverse high-level languages that Parrot aims to support.

One thing I tried to do was create a find_method VTABLE in the Class PMC, and then delegate traversal of the MRO to Class instead of Object. This helped improve encapsulation greatly, but created another problem: Now I couldn't call methods on Class itself. Here's example code that broke:

$P0 = getclass 'Foo'
$P0.'add_vtable_override'("bar")

What we want to do is call a method on the class object itself, but what we end up doing is finding a method on objects of that type, and then trying to call that method on the class object. Problems.

Let's recap some issues:
  1. Find_method searches for a method to use on a given invocant
  2. The Class type has methods that need to be accessible through find_method
  3. Object has to break encapsulation and monkey around in Class's internals, which means we can only use Class objects, and objects strictly isomorphic to Class (like PMCProxy) in an MRO
  4. We cannot delegate the method lookup operation to the Class object, where it arguably belongs.
With these things in mind, I had an idea that I sent to the list which aims to fix all this: Create a new VTABLE function that searches for a method in a metaobject, instead of searching for the method on the invocant (like find_method does now). In terms of PIR, I'm thinking of enabling this kind of sequence:

$P0 = new ['Foo']
$P1 = getclass 'Foo'
$P2 = find_class_method 'Bar'
callmethodcc $P0, $P2

I don't want to remove find_method or change it in any way. But what I want to have is a way to delegate method lookup to the Class object as well. I think we will find that when we have a way to delegate lookup to the Class object that we will use it much more frequently and to greater effect than we use find_method now. I also think we will find that find_method can eventually be deprecated entirely, but that's another issue for another time.

One other problem that I failed to mention above is that every class has it's own completely linearized resolution order. So if Foo is a Bar, and Bar is a Baz, the Foo class has the MRO ("Foo", "Bar", "Baz"), Bar would have the MRO ("Bar", "Baz"), and Baz would have the MRO ("Baz"). Asking the Foo Class object for a method "Frobulate" would look in Bar, which would ask Baz. Then, Foo would move to the next item in it's MRO, Baz, and ask it. The net result is that Baz would be queried twice, since the Foo Class item doesn't know necessarily that Baz is in Bar's MRO, and Bar doesn't know that it is being queried from Foo (maybe Bar was being queried directly). So what we need is some kind of way to keep track of the MRO up front, and avoid re-defining the search MRO for each new delegation.

I think we could solve this issue if we defined a new VTABLE like this:

VTABLE PMC * find_class_method(STRING *name, PMC *mro_iterator)

In this conception, SELF would be the metaobject currently being searched, name would be the string name of the method to find, and mro_iterator would be an iterator object for the MRO list. When we do the PIR code:

$P0 = getclass "Foo"
$P1 = find_class_method $P0, "Frobulate"

The first call to the Foo class object would be VTABLE_find_class_method("Frobulate", NULL). Foo would then create an iterator over it's MRO (removing itself from the front of the list to avoid direct recursion) and passing that MRO iterator to Bar, which then calls the next item on the list (Baz). This has a few major advantages which are not necessarily obvious up front: Any object that defines find_class_method can be inserted into the MRO. This includes things that aren't really classes like Roles, Mixins, extension methods, and even autoloaders. Second, we gain more flexibility to modify the MRO of a class, because that class (and it's super-classes) can add additional search parents to the iterator as needed. We would also gain the ability to have more manual control over the MRO, because we could add a find_class_method_p_p_s_p op variant that also takes an existing MRO iterator. This would enable us to better implement something like a super() call, where we take the MRO iterator, manually pop the top item off it, and then call find_class_method with it. I've got several bonus points available to whoever can explain how to call a method in a super class when it's overridden in the subclass, without having to hard-code in the name of the parent class. With the new VTABLE and a new op, this becomes trivial.

So that's my idea for method lookups. I've sent a mail to the list with the idea, and I'm going to raise the idea at #ps if I can make it to the meeting. I think it has a lot of merit, enables a few cool new abilities and doesn't take away any existing functionality. I would like to hear any other ideas, but I'm becoming convinced that this one is a winner.

2 comments:

  1. Actually, Kakapo already does super(), without hard-coding the method name. :)

    ReplyDelete
  2. Ah yes, I see that code gem now. Of course, Austin didn't explain it!

    What Kakapo does is exactly what I suggest Parrot should be doing internally: He gets the MRO from the Class, iterates it himself, and searches for the method with the given name manually.

    Considering that super() is such a common and necessary operation in most object-oriented lanuages, it's surprising that it takes this much effort to do it. Parrot should be managing the MRO automatically (with, of course, plenty of flexibility to plug in custom schemes).

    ReplyDelete

Note: Only a member of this blog may post a comment.