Thursday, April 27, 2006

Published methods

Normally not thought of (or used) as an object-oriented feature, published methods rely on RTTI to enable runtime lookup of methods by using a string with the method name. This is used extensively by the IDE and VCL when you are writing event handlers at design time.

When you create a new event handler (by double-clicking on an empty event value in the Object Inspector) or when you associate an event property with an existing method (by using the drop-down or even manually typing in the method name), the IDE ensures that the method’s parameters match the parameters of the event type. Likewise, when you assign a method to an event property in code, the compiler performs a compile-time check that the parameters and calling conventions agree.

At run-time there are no such parameter checks. All design-time assigned events are stored in the .DFM file simply by using the method name string. When a .DFM is loaded at run-time, the method is looked up using the TObject.MethodAddress function – see TReader.FindMethod in Classes.pas for the streaming details.
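To make the lookup concrete, here is a minimal sketch of resolving a handler by name and calling it as an event. It is not the actual TReader.FindMethod code, only an illustration of the same idea; THandlers and ButtonClick are names made up for this example.

```delphi
program LookupByName;
{$APPTYPE CONSOLE}
uses Classes, SysUtils;

type
  {$M+}
  THandlers = class            // hypothetical class standing in for a form
  published
    procedure ButtonClick(Sender: TObject);
  end;
  {$M-}

procedure THandlers.ButtonClick(Sender: TObject);
begin
  Writeln('ButtonClick called');
end;

var
  Handlers: THandlers;
  Method: TMethod;
  Event: TNotifyEvent;
begin
  Handlers := THandlers.Create;
  try
    // Look the method up by its name string, as the streaming system does
    Method.Code := Handlers.MethodAddress('ButtonClick');
    Method.Data := Handlers;   // the instance becomes Self in the handler
    if Method.Code <> nil then
    begin
      // No signature check happens here: we simply trust that the
      // method is compatible with TNotifyEvent
      Event := TNotifyEvent(Method);
      Event(nil);
    end;
  finally
    Handlers.Free;
  end;
end.
```

Note that the cast to TNotifyEvent is exactly the unchecked step: nothing verifies that the method found really takes a single Sender parameter.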

The TObject.MethodAddress function works its magic by scanning through compiler-generated tables, each known as a Method Table (or Published Method Table, as I prefer, to reduce the possible confusion with the virtual and dynamic method tables).
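As a small illustration of that table being consulted in both directions, the sketch below looks up an address by name with MethodAddress and then maps the address back to a name with TObject.MethodName. TDemo and its two methods are hypothetical names used only for this example.

```delphi
program RoundTrip;
{$APPTYPE CONSOLE}

type
  {$M+}
  TDemo = class                // hypothetical demo class
  published
    procedure First;
    procedure Second;
  end;
  {$M-}

procedure TDemo.First;
begin
  Writeln('First');
end;

procedure TDemo.Second;
begin
  Writeln('Second');
end;

var
  Addr: Pointer;
begin
  // Name -> address
  Addr := TDemo.MethodAddress('Second');
  Writeln('Found:   ', Addr <> nil);
  // Address -> name, using the same published method table
  Writeln('Name:    ', TDemo.MethodName(Addr));
  // A name that is not in the table yields nil
  Writeln('Missing: ', TDemo.MethodAddress('Third') = nil);
end.
```

Both functions are class methods, so no instance is needed for the lookup itself.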

Enabling RTTI
By default, extended run-time type information (RTTI) for a class is disabled. In contrast to .NET, where metadata is generated for all members, in Delphi RTTI is only generated for published members when a class is compiled with the compiler directive {$M+} enabled, or when it inherits from a class that was compiled in $M+ mode (such as TPersistent, TComponent etc). The long-name alternative to {$M+} is {$TYPEINFO ON}. For the purposes of this discussion, I’ll call such classes MPlus classes; all other classes are MMinus classes.

In addition to explicit published members, all members of an MPlus class at the top of the class declaration that have no explicit visibility specifier are treated as published. For MMinus classes, these members are public. This is why all the component field and event handler declarations at the top of form units are published (TForm is an MPlus class).
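The inheritance rule can be checked with a short sketch. It assumes nothing beyond what is described above; TPlain and TStreamable are hypothetical names, and neither class is compiled with {$M+} itself.

```delphi
program InheritMPlus;
{$APPTYPE CONSOLE}
uses Classes;

type
  {$M-}
  TPlain = class                    // MMinus: descends directly from TObject
    procedure Hello;                // no visibility specifier, so public
  end;

  TStreamable = class(TPersistent)  // MPlus: inherited from TPersistent
    procedure Hello;                // no visibility specifier, so published
  end;

procedure TPlain.Hello;
begin
  Writeln('TPlain.Hello');
end;

procedure TStreamable.Hello;
begin
  Writeln('TStreamable.Hello');
end;

begin
  // Only the published method can be found by name
  Writeln('TPlain.Hello found:      ', TPlain.MethodAddress('Hello') <> nil);
  Writeln('TStreamable.Hello found: ', TStreamable.MethodAddress('Hello') <> nil);
end.
```

The first lookup should report FALSE and the second TRUE, since only TStreamable picks up the $M+ state from its ancestor.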

The compiler allows publishing object and interface reference fields, properties of most types, and methods. We’ll mainly focus on published methods in this article. Let’s write a little test program to exercise published members and MPlus and MMinus classes.

program TestMPlus;
{$APPTYPE CONSOLE}
uses Classes, SysUtils, TypInfo;

type
  {$M-}
  TMMinus = class
    // No visibility specifier: expected to be public here
    DefField: TObject;
    property DefProp: TObject read DefField write DefField;
    procedure DefMethod;
  published
    PubField: TObject;
    property PubProp: TObject read PubField write PubField;
    procedure PubMethod;
  end;
  {$M+}
  TMPlus = class
    // No visibility specifier: expected to be published here
    DefField: TObject;
    property DefProp: TObject read DefField write DefField;
    procedure DefMethod;
  published
    PubField: TObject;
    property PubProp: TObject read PubField write PubField;
    procedure PubMethod;
  end;

procedure TMMinus.DefMethod; begin end;
procedure TMMinus.PubMethod; begin end;
procedure TMPlus.DefMethod; begin end;
procedure TMPlus.PubMethod; begin end;

procedure DumpMClass(AClass: TClass);
begin
  Writeln(Format('Testing %s:', [AClass.ClassName]));
  // FieldAddress is an instance method, so a throwaway instance is
  // created for each call (and, in this short test, never freed)
  Writeln(Format('DefField=%p', [AClass.Create.FieldAddress('DefField')]));
  Writeln(Format('DefProp=%p', [TypInfo.GetPropInfo(AClass, 'DefProp')]));
  Writeln(Format('DefMethod=%p', [AClass.MethodAddress('DefMethod')]));
  Writeln(Format('PubField=%p', [AClass.Create.FieldAddress('PubField')]));
  Writeln(Format('PubProp=%p', [TypInfo.GetPropInfo(AClass, 'PubProp')]));
  Writeln(Format('PubMethod=%p', [AClass.MethodAddress('PubMethod')]));
  Writeln;
end;

begin
  DumpMClass(TMMinus);
  DumpMClass(TMPlus);
  readln;
end.

A compiler quirk
The purpose of this test program is to verify that RTTI is generated for default and published visibility for MPlus classes and that RTTI is not generated for MMinus classes. We have two classes TMMinus and TMPlus that have identical members but are compiled in different $M modes. We would expect TMMinus to have no RTTI for its members, and TMPlus to have RTTI for all its members.

The DumpMClass routine writes out raw pointer values for the RTTI of the fields, properties and methods in each class. When we run this program, we get this surprising result:

Testing TMMinus:
DefField=00000000
DefProp=00000000
DefMethod=00000000
PubField=008C0A78
PubProp=00000000
PubMethod=00412898

Testing TMPlus:
DefField=008C0AA0
DefProp=00412852
DefMethod=0041289C
PubField=008C0AF4
PubProp=00412874
PubMethod=004128A0

As expected the TMPlus class has RTTI for all of its six members, proving that $M+ enables RTTI and that the default visibility for MPlus classes is published. The strange thing about this result is that the TMMinus class declared with TYPEINFO disabled still has RTTI for two of its members, the explicitly published field and method. This reality contradicts the documentation, which says:

A class cannot have published members unless it is compiled in the {$M+} state or descends from a class compiled in the {$M+} state. Most classes with published members derive from TPersistent, which is compiled in the {$M+} state, so it is seldom necessary to use the $M directive.

This is probably a compiler bug. Notice that the published property didn’t get any RTTI. From the docs (“A class cannot have published members”) it sounds like one should expect a compile-time error (or at least a warning) when compiling an MMinus class with a published section. But we get neither.

The default visibility of class members is documented like this:

Members at the beginning of a class declaration that don't have a specified visibility are by default published, provided the class is compiled in the {$M+} state or is derived from a class compiled in the {$M+} state; otherwise, such members are public.

This matches what we saw in our experiment. Luckily, MMinus class methods and fields with no visibility specifier don’t generate spurious RTTI. In weird cases, you might accidentally have members (fields and methods) in a published section of an MMinus class; these will have RTTI generated for them even if you never intended to use it for anything.

Using published methods polymorphically
While it should probably be viewed as a hack, you can use published methods to implement a very flexible, late-bound polymorphic dispatch mechanism. It is very flexible because the caller and the callee do not have to know about each other or use a common interface. The caller needs to know the name, parameters and calling convention of the method it wants to call, and the callee has to implement this as a published method with the correct name, parameters and calling convention.

To override an existing published method, a descendant class needs to define a new published method with the same name. Because the lookup by name searches the most derived class first, this works like a polymorphic lookup.

Let’s look at a simple example:

program TestPolyPub;
{$APPTYPE CONSOLE}
uses Classes, SysUtils, TypInfo, Contnrs;

type
  {$M+}
  TParent = class
  published
    procedure Polymorphic(const S: string);
  end;
  // TChild redeclares the published method, which "overrides" it
  // as far as lookup by name is concerned
  TChild = class(TParent)
  published
    procedure Polymorphic(const S: string);
  end;
  // Unrelated to TParent and TChild (apart from TObject)
  TOther = class
  published
    procedure Polymorphic(const S: string);
  end;

procedure TParent.Polymorphic(const S: string);
begin
  Writeln('TParent.Polymorphic: ', S);
end;

procedure TChild.Polymorphic(const S: string);
begin
  Writeln('TChild.Polymorphic: ', S);
end;

procedure TOther.Polymorphic(const S: string);
begin
  Writeln('TOther.Polymorphic: ', S);
end;

function BuildList: TObjectList;
begin
  Result := TObjectList.Create;
  Result.Add(TParent.Create);
  Result.Add(TChild.Create);
  Result.Add(TOther.Create);
end;

type
  // The implicit Self parameter is made explicit in the procedural type
  TPolymorphic = procedure (Self: TObject; const S: string);

procedure CallList(List: TObjectList);
var
  i: integer;
  Instance: TObject;
  Polymorphic: procedure (Self: TObject; const S: string);
begin
  for i := 0 to List.Count-1 do
  begin
    Instance := List[i];
    // Separate assign-and-call
    Polymorphic := Instance.MethodAddress('Polymorphic');
    if Assigned(Polymorphic) then
    begin
      Polymorphic(Instance, IntToStr(i));
      // Alternative syntax:
      TPolymorphic(Instance.MethodAddress('Polymorphic'))(Instance, IntToStr(i));
    end;
  end;
end;

begin
  CallList(BuildList);
  readln;
end.

Here we first define three classes – each with a published method named ‘Polymorphic’ that takes a single string parameter (in addition to the implicit Self parameter) and that uses the default register calling convention. Two of the classes inherit from each other and the TChild class in practice overrides the Polymorphic it inherits from TParent. The TOther class is totally unrelated to the two other classes (well, they all inherit from TObject), but its Polymorphic method can be called “virtually” anyway.

Then we build a heterogeneous list of objects containing instances of each of the three classes. This list is passed to CallList that finds and calls the published Polymorphic method of each instance in the list. The Delphi language has no built-in syntax to call a published method through a name string, so we must manually assign the result of Instance.MethodAddress to a procedural variable and then call through the variable. Alternatively we can combine the operations into a single statement that type casts the MethodAddress result into the correct procedural type and calls through the result. Both syntaxes are demonstrated above.

An interesting feature of calling published methods is that you can check at runtime if a specific method is available for an instance or not. This way you can use published methods to implement optional behavior or callbacks. For instance, a generic streaming system could optionally call published BeginStreaming and EndStreaming methods before and after streaming an object instance. Only classes that need to perform special actions would actually implement the methods. Published methods could even be used as a kind of poor-man’s attributes.
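The optional-callback idea could look roughly like the sketch below. Everything in it is hypothetical: TSimple, TFancy, TStreamingHook, CallIfPresent and StreamObject are made-up names, and the hook signature (a parameterless procedure using the register calling convention) is an assumption the caller and callee must share.

```delphi
program OptionalCallbacks;
{$APPTYPE CONSOLE}
uses SysUtils;

type
  {$M+}
  TSimple = class              // implements no hooks at all
  end;

  TFancy = class               // opts in by publishing the hook methods
  published
    procedure BeginStreaming;
    procedure EndStreaming;
  end;
  {$M-}

  // Assumed hook signature: only the implicit Self parameter
  TStreamingHook = procedure (Self: TObject);

procedure TFancy.BeginStreaming;
begin
  Writeln('  TFancy.BeginStreaming');
end;

procedure TFancy.EndStreaming;
begin
  Writeln('  TFancy.EndStreaming');
end;

// Calls the named published method if the instance has one
procedure CallIfPresent(Instance: TObject; const AName: string);
var
  Hook: TStreamingHook;
begin
  Hook := Instance.MethodAddress(AName);
  if Assigned(Hook) then
    Hook(Instance);
end;

// Stand-in for a generic streaming routine
procedure StreamObject(Instance: TObject);
begin
  CallIfPresent(Instance, 'BeginStreaming');
  Writeln('Streaming ', Instance.ClassName);
  CallIfPresent(Instance, 'EndStreaming');
end;

var
  Simple: TSimple;
  Fancy: TFancy;
begin
  Simple := TSimple.Create;
  Fancy := TFancy.Create;
  try
    StreamObject(Simple);      // no hooks are found, none are called
    StreamObject(Fancy);       // both hooks are found and called
  finally
    Fancy.Free;
    Simple.Free;
  end;
end.
```

The streaming routine never needs to know TFancy or share an interface with it; the only contract is the method name and the assumed signature.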

The main disadvantage of this technique is that there is no compile-time or run-time checking of method signatures. If you call a method whose calling convention, parameter types or number of parameters differ from what you expect, “interesting” things (crashes, corruption) can happen at run-time.

7 comments:

Anonymous said...

Very interesting post!

You say, "the callee has to implement this as a published method with the correct name, parameters and calling convention." Is it possible to check the method signature as well in a similar way - in other words, if one class has a method:
procedure Polymorphic(const S: string);
and another class has a method
procedure Polymorphic(const I: integer);
to recognise they're different?

If so, it would be an interesting hack to use this to implement Objective C-like late bound message passing, where if the message handler / method you're trying to call doesn't exist, it calls a default 'not implemented' method. (This is a language feature I really like the look of, it would be an interesting change to Delphi to implement something like this officially.)

Hallvards New Blog said...

"Is it possible to check the method signature"

No. Unfortunately, the compiler currently does not store information about the parameters or calling convention anywhere.

If the method you are calling has different parameters than you expect, the program will probably crash. That's why this is currently a hack.

More details will follow in a later article.

Anonymous said...

Now, I have seen Delphi decompilers, and they DO manage to work out parameters to published methods, so it has to be SOMEWHERE.

Hallvards New Blog said...

AFAIK, the parameter info is not currently stored in the RTTI-info.

The decompilers probably do this by combining the class' RTTI info with the event properties that are assigned to these published methods in the .DFM files.

By looking at the .DFM they can find properties of the form and components that point to the published methods in the form. By using TypInfo structures to reflect on these event properties, they can find the parameters of the event property, and thus the compatible published method.

You *can* do the same at runtime, by scanning all event properties of the form and owned components, checking if any of them points to the published method you are interested in. Then you can get the parameters of the event.

The relevant structures from TypInfo are:

TMethodKind = (mkProcedure, mkFunction, mkConstructor, mkDestructor,
  mkClassProcedure, mkClassFunction, mkClassConstructor, mkOperatorOverload,
  { Obsolete }
  mkSafeProcedure, mkSafeFunction);
...
TParamFlag = (pfVar, pfConst, pfArray, pfAddress, pfReference, pfOut);
{$EXTERNALSYM TParamFlag}
TParamFlags = set of TParamFlag;
...
PTypeData = ^TTypeData;
TTypeData = packed record
  case TTypeKind of
    ...
    tkMethod: (
      MethodKind: TMethodKind;
      ParamCount: Byte;
      ParamList: array[0..1023] of Char
     {ParamList: array[1..ParamCount] of
        record
          Flags: TParamFlags;
          ParamName: ShortString;
          TypeName: ShortString;
        end;
      ResultType: ShortString});
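As a rough sketch of how the packed ParamList above might be walked, the program below dumps the parameters of an event type. It assumes the layout shown in the comment inside TTypeData (a one-byte flags set followed by two length-prefixed short strings per parameter) as found in the 32-bit compilers of this era; later versions may append more data. DumpMethodType is a hypothetical helper name.

```delphi
program DumpEventParams;
{$APPTYPE CONSOLE}
uses Classes, TypInfo;

// Prints the method kind and the parameters described by the RTTI
// of a method pointer (event) type
procedure DumpMethodType(Info: PTypeInfo);
var
  Data: PTypeData;
  P: PAnsiChar;                // byte-wise cursor into ParamList
  i: Integer;
  Flags: TParamFlags;
  ParamName, TypeName: ShortString;
begin
  if Info^.Kind <> tkMethod then
    Exit;
  Data := GetTypeData(Info);
  Writeln(Info^.Name, ' (',
    GetEnumName(TypeInfo(TMethodKind), Ord(Data^.MethodKind)), ')');
  P := @Data^.ParamList;
  for i := 1 to Data^.ParamCount do
  begin
    // Each entry: Flags, then ParamName and TypeName stored as
    // short strings that occupy only Length + 1 bytes each
    Move(P^, Flags, SizeOf(Flags));
    Inc(P, SizeOf(Flags));
    ParamName := PShortString(P)^;
    Inc(P, Length(ParamName) + 1);
    TypeName := PShortString(P)^;
    Inc(P, Length(TypeName) + 1);
    if pfVar in Flags then
      Write('  var ')
    else if pfConst in Flags then
      Write('  const ')
    else
      Write('  ');
    Writeln(ParamName, ': ', TypeName);
  end;
  // For functions, the result type name follows the last parameter
  if Data^.MethodKind = mkFunction then
    Writeln('  result: ', PShortString(P)^);
end;

begin
  DumpMethodType(TypeInfo(TNotifyEvent));
end.
```

This only describes the event type. Connecting it to a specific published method still requires finding an event property whose value points at that method, as described above.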

Maybe another article about this later...?

Anonymous said...

Ya, that would be interesting.

Anonymous said...

great posts keep going

Hallvards New Blog said...

http://hallvards.blogspot.com/2006/05/hack-10-getting-parameters-of.html



Copyright © 2004-2007 by Hallvard Vassbotn