Capabilities
I remembered another important consideration after writing yesterday's blog post: security. Say that you have a feature that lets you dynamically import a module yet restrict the imports that it can access:
safe_modules = ['eve.data.*', 'eve.math', 'eve.regexp']
plugins = list_dir('plugins') -> filter(ext(?, 'eve'), ?)
-> map(dynamic_import(?, safe_modules), ?)
If the only methods that are accessible are those that are explicitly imported, then you know that no code in these modules may access other, unsafe functionality. But if multimethods silently import other implementations, there's no guarantee that one of the multimethods in eve.data.sequence doesn't have an implementation in unsafe.file_access that it can use to stomp on the local hard disk. Security auditing just got much more difficult: instead of simply looking at the data you're passing in to the plugin and ensuring that safe_modules really are safe, you've got to examine every module that subtypes a type in safe_modules.
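To make the whitelist idea concrete, here is a rough Python analogue. It is only a sketch under assumptions: `make_restricted_import` and `run_plugin` are hypothetical names, the patterns are matched with `fnmatch`, and swapping out `__import__` like this is nowhere near a real sandbox in Python.

```python
import builtins
import fnmatch


def make_restricted_import(safe_patterns):
    """Return an __import__ replacement that only admits whitelisted module names."""
    real_import = builtins.__import__

    def restricted_import(name, globals=None, locals=None, fromlist=(), level=0):
        # Reject any module whose dotted name matches none of the safe patterns.
        if not any(fnmatch.fnmatchcase(name, pat) for pat in safe_patterns):
            raise ImportError(f"module {name!r} is not in the safe list")
        return real_import(name, globals, locals, fromlist, level)

    return restricted_import


def run_plugin(source, safe_patterns):
    """Execute plugin source with the restricted import hook; return its namespace."""
    plugin_builtins = dict(vars(builtins))
    plugin_builtins["__import__"] = make_restricted_import(safe_patterns)
    namespace = {"__builtins__": plugin_builtins}
    exec(source, namespace)
    return namespace
```

Note that the hook only checks import statements written in the plugin itself. Whatever the whitelisted modules import on their own is not checked, which is the same auditing gap that silently imported multimethod implementations would open.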
Method Dispatch
One of the comments on the other post pointed out that Dylan solved many of these multiple-dispatch problems in a language intended for production use. Dylan's method dispatch is similar to Cecil's: it compares each argument for specificity, and if two argument positions conflict, then the methods are ambiguous and don't participate in method dispatch.
Dylan's approach also handles keyword arguments: since it makes no reference to argument position, the arguments don't have to be positional.
If I stick with multimethods, I'll probably go with something like this. Dylan's approach is actually more complicated than I need, because they allow multiple inheritance and so need to worry about monotonic linearizations and such. I should be able to use simple subtyping rules on the declared method types, straight out of TAPL.
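A minimal Python sketch of that kind of per-argument specificity check, assuming single inheritance and using `isinstance`/`issubclass` as the subtyping rule. The names `dispatch` and `AmbiguousDispatch` are made up for illustration, and the sketch simply raises an error when no single method wins rather than modelling exactly what Dylan or Cecil do with the ambiguous methods.

```python
class AmbiguousDispatch(TypeError):
    """Raised when no applicable method is most specific in every position."""


def is_applicable(signature, args):
    # A method applies when every argument is an instance of the declared type.
    return len(signature) == len(args) and all(
        isinstance(arg, typ) for arg, typ in zip(args, signature)
    )


def at_least_as_specific(sig_a, sig_b):
    # sig_a wins or ties against sig_b in every argument position.
    return all(issubclass(a, b) for a, b in zip(sig_a, sig_b))


def dispatch(methods, *args):
    """methods is a list of (signature, function) pairs; signatures are tuples of types."""
    applicable = [(sig, fn) for sig, fn in methods if is_applicable(sig, args)]
    if not applicable:
        raise TypeError("no applicable method")
    for sig, fn in applicable:
        # The winner must be at least as specific as every other applicable method.
        if all(at_least_as_specific(sig, other) for other, _ in applicable):
            return fn(*args)
    raise AmbiguousDispatch("argument positions conflict; no single most specific method")
```

With methods declared on `(Dog, Animal)` and `(Animal, Dog)`, a call with two `Dog` arguments is ambiguous under this rule, because each method is more specific in a different position.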
Dropping multimethods entirely?
I'm thinking that it may be best to drop multimethods entirely and rely on a very Pythonic single-dispatch object system. Obviously it'll be a bit different since objects are immutable, but languages like Python, Java, Smalltalk, and Ruby have basically figured this out and worked out most of the kinks. I'm reminded of GvR's Language Design is Not Just Solving Puzzles - there's a point where each additional feature you add increases complexity rather than reducing it.
So, we can treat methods as simple fields of records where the first argument is the record itself. Same as in OO C, or Python. Maybe have some syntactic sugar to make it easy. This approach has already been well-tested, it's "good enough", and it lets me get to the more interesting work of defining libraries.
There's one issue with this: in my Python programming, I frequently start prototyping using built-in data structures (dicts, tuples) and standalone functions, and then later refactor it to use classes. The prototyping stage is necessary, because it's usually not clear what fields are necessary, what functions are necessary, and how those functions should be organized. But the process of refactoring introduces a big discontinuity that usually requires that the program be rewritten entirely, because it changes the calling syntax to access everything that becomes a class.
I'd like to be able to transparently "harden" a program, making the classes and data structures more well-defined and organized without having to touch every single place that I use them.
The existing equivalence between function calls and record field access is one attempt at this, but it's limited to being able to compute fields from the existing data in the record. A good start - no need for the @property built-in with this - but it falls down horribly when you need additional arguments.
Some code will help. What I want is for this:
def foo(self, arg1, arg2): ...
foo(obj, x, y)
To be equivalent to this:
class Foo:
    def foo(self, arg1, arg2): ...
obj.foo(x, y)
By "equivalent" I mean that the method calls look exactly the same, so you could first convert the method definition to the class form, have your program still work, and then convert each individual call to the obj.method syntax at your leisure.
Bonus points if the resulting syntax looks like the one for function pipelining; I really dislike the idea of introducing two different function call syntaxes like I suggested yesterday.
The obvious solution is to overload the function call syntax so that if the function isn't found, it looks for a matching field in the first argument and then executes that. Then the method invocation syntax could just be syntactic sugar.
Another neat (though potentially confusing) effect of this is that local function definitions shadow method calls, so you can locally specialize particular method calls. I'm a bit worried about name conflicts with this, though.
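Here is a small Python sketch of that lookup rule, assuming records are modelled as plain dicts whose method fields take the record as their first argument, and that the visible functions live in a dict called `scope`. The `call` helper is hypothetical; in the real language this would be built into the call syntax.

```python
def call(name, scope, receiver, *args):
    """Call `name` as a function if one is in scope, else as a field of the first argument."""
    if name in scope:
        # An ordinary (possibly local) function shadows any method of the same name.
        return scope[name](receiver, *args)
    field = receiver.get(name) if isinstance(receiver, dict) else None
    if field is None:
        raise NameError(f"no function or field named {name!r}")
    # Methods are plain fields that receive the record itself as their first argument.
    return field(receiver, *args)
```

A call like `call("scale", {}, point, 3)` falls through to the `scale` field of `point`, while passing a scope that defines its own `scale` runs that function instead, which is the shadowing behaviour described above.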
There are two problems with this: it leaves no way to get the actual bound method, and the precedence rules are wrong if it needs to coincide with function pipelining. The first problem can be worked around easily enough using partial application: if obj->foo(arg1, arg2) is syntactic sugar for foo(obj, arg1, arg2), then obj->foo(?, ?) is syntactic sugar for {| x, y | foo(obj, x, y)}, which is exactly what you'd use a bound method for. This is also more flexible, since you can bind any one of the arguments instead of just the message receiver.
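The `?` placeholder can be approximated in Python to see how this plays out. This is a sketch only: `HOLE` stands in for `?`, and `partial_holes` is a hypothetical helper, not something from the standard library.

```python
class _Hole:
    """Marker for an argument position to be filled in later (the `?` above)."""

    def __repr__(self):
        return "?"


HOLE = _Hole()


def partial_holes(fn, *template):
    """Return a function that fills each HOLE in `template`, left to right."""
    n_holes = sum(1 for slot in template if slot is HOLE)

    def filled(*args):
        if len(args) != n_holes:
            raise TypeError(f"expected {n_holes} arguments, got {len(args)}")
        supplied = iter(args)
        # Replace holes in order; keep the pre-bound arguments where they are.
        full = [next(supplied) if slot is HOLE else slot for slot in template]
        return fn(*full)

    return filled
```

Under these assumptions, `partial_holes(foo, obj, HOLE, HOLE)` plays the role of the bound method `obj->foo(?, ?)`, and `partial_holes(foo, HOLE, x, HOLE)` binds the middle argument instead of the receiver.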
The second one is potentially a problem. Normally, you'd expect:
obj->method1(a, b)->method2(c, d)
to invoke method1 on obj, then invoke method2 on the result. But with the existing -> definition, it parses to:
method2(method1(obj, a, b), c, d)
Oh wait, that's correct. So the confusing part is if you were expecting to use partial application to set up a pipeline, like the example at the beginning:
safe_modules = ['eve.data.*', 'eve.math', 'eve.regexp']
plugins = list_dir('plugins') -> filter(ext(?, 'eve'), ?)
-> map(dynamic_import(?, safe_modules), ?)
This would parse to map(filter(list_dir('plugins'), ext(?, 'eve'), ?) ...), which is nonsense. Perhaps the fix is just to allow people to parenthesize expressions? Combining this with sane methods (like map & filter being methods of sequence types), the above example would look like this:
safe_modules = ['eve.data.*', 'eve.math', 'eve.regexp']
plugins = list_dir('plugins') -> filter(? -> ext('eve'))
-> map(dynamic_import(?, safe_modules))
Not so bad. The big test is what happens if you call them in an order that's not bound to the receiver:
FILTERS = [ext(?, 'svn'), starts_with(?, '.'), ends_with(?, '~'), ext(?, 'swp')]
def all_files(dir):
    list_dir(dir) -> (fold(FILTERS, map -> flip, ?))
I think that'd work, and it doesn't look so bad, but it's probably best just to try it out and see rather than work it all out in my head.