
When to define a new generic and methods vs simple functions?


R has a somewhat special object-oriented ecosystem in which methods belong not to classes but to generic functions. This can be seen as a more functional object-oriented framework that plays really well with the language. The logic of both main class frameworks, S3 and S4 (and, I think, the forthcoming S7), makes a lot of sense for functions that one could expect to work on a wide variety of objects, if not every type, such as plot, print or summary. However, when a function is heavily tailored to a specific kind of object, one can easily end up writing a generic and a default method only to get to the actual class method of interest. Something along these lines:

foo <- function (...) {
  structure(
    list(...),
    class = 'foo'
  )
}

bar <- function (x, ...) {
  UseMethod('bar')
}

# methods take '...' so their signatures stay consistent with the generic
bar.default <- function (x, ...) {
  cat('nothing to do here, just a formality...\n')
  invisible()
}

bar.foo <- function (x, ...) {
  # really interesting stuff done with 'foo'
}

There are lots of cases in between these two extremes, where the method one wants to define could clearly make sense for other classes or for base types, but it's not clear to me exactly where this stops making sense and it becomes better to just write a simple function that expects a specific kind of object.

I've been writing a package for a really specific domain (geochronology) that as such is well suited for class definitions and methods. I currently have a mixture of properly defined generics + methods and also non-generic functions that will only accept an object of a certain class as input. I'm starting to fear this will come back to bite me sometime in the future and would love to hear some thoughts on the topic from more experienced R programmers in this collective.
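A minimal sketch of the non-generic alternative mentioned above, for comparison with the generic version. The name `bar_foo` and its placeholder body are assumptions made purely for illustration; the only point is that the class check moves from method dispatch into the function itself.

```r
# Constructor from the example above, repeated so the block is self-contained.
foo <- function(...) {
  structure(list(...), class = "foo")
}

# Hypothetical non-generic counterpart of bar(): it accepts only 'foo'
# objects and fails early with an explicit message for anything else.
bar_foo <- function(x) {
  if (!inherits(x, "foo")) {
    stop("`x` must be an object of class 'foo'", call. = FALSE)
  }
  length(unclass(x))  # placeholder for the really interesting stuff
}

n <- bar_foo(foo(a = 1, b = 2))
err <- tryCatch(bar_foo(1:3), error = function(e) conditionMessage(e))
```

Here `n` is the number of elements stored in the object, and `err` holds the error message raised for the non-'foo' input.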


8 replies

77265391
0

This is just a personal take, but the choice between a generic + 1 method and a non-generic function is essentially a question of defensive programming for package owners.

If you can foresee that some user will eventually ask you to make this function work for another kind of object, and you are willing to cater to this user, then write a generic + 1 method. If, however, there is no conceptual way this function will be connected to another kind of object, then stick to non-generics.

It's the same philosophy as hardcoding values vs. adding them as function parameters. If you can see, conceptually, a reason to use another number, add it as a parameter.

For example, no one expects sqrt(x) to raise x to any power other than 1/2. Changing this value would change the operation conceptually; it would no longer be a square root. A function called root(), on the other hand, would invite a parameter rootindex, since one can imagine different roots being taken.
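As a rough sketch of that contrast: `root()` and `rootindex` are the hypothetical names used above, not functions from base R, and the default of 2 is an assumption chosen so that `root(x)` agrees with `sqrt(x)`.

```r
# Hypothetical root(): the index is a parameter because other roots are
# conceptually meaningful. sqrt() has no such parameter because changing
# the exponent would change what the function is.
root <- function(x, rootindex = 2) {
  x^(1 / rootindex)
}

square_root <- root(16)     # agrees with sqrt(16)
cube_root   <- root(27, 3)  # third root of 27
```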

Finally, defensive programming may be skipped simply for lack of time. The generic + 1 method is a tool to save time in future expansions, at the cost of taking longer to program now. Is this a fair trade at this moment in your life? Do you have time to spare, or are you busy and would rather take the gamble that users will not ask for future modifications?

77267198
2

There are other object-oriented systems for R that don't dispatch methods on some signature, for example proto, Q7, aoos, and possibly more. These all use object$method(args) syntax, which has the nice property of not cluttering the namespace with generics. Q7 looks interesting with its emphasis on composition over inheritance...

77357801
4

Do you have a link for aooi? I’ve never heard of it, it’s not on CRAN or GitHub, and Google finds nothing. And I can’t believe you didn’t mention R6 in this category. ;-)

77358336
1

aoos, not aooi, typo fixed in the edit. https://cran.r-project.org/web/packages/aoos/index.html

77530250
0

Just to add base R's (often forgotten) reference classes. The doc: ?setRefClass.
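A minimal sketch of what that looks like, assuming a made-up class `Sample` with an `ages` field (none of these names come from the thread). The methods live on the object and are called as object$method(args), so no new generic is created.

```r
library(methods)

# Hypothetical reference class: fields and methods are declared together.
Sample <- setRefClass(
  "Sample",
  fields = list(ages = "numeric"),
  methods = list(
    add_age = function(age) {
      ages <<- c(ages, age)  # modifies the object in place
      invisible(.self)
    },
    mean_age = function() {
      mean(ages)
    }
  )
)

s <- Sample$new(ages = c(10, 20))
s$add_age(30)
avg <- s$mean_age()
```

Note that `s` is mutated by `add_age()` without reassignment, which is the main behavioural difference from S3 and S4 objects.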

77748991
0

Nobody has ever been particularly happy about R's object systems, and I don't think anyone ever will be. At my graduate school, Lumley and Gentleman presented on this quite a bit. Obviously, R takes a lot of inspiration from Lisp, and S3 is most in the spirit of a Lisp object system. What that means to me is that classes and methods are more like a matrix, which establishes some intuitive types when inheritance is not such a big component of doing the work. S3 passes a flat-faced duck test pretty well.

S3 methods obviously can't dispatch on more than one argument. A perfectly reasonable method like combine might be desired to combine datasets or vectors, or you might combine models, i.e. pool their respective data and fit a common model. S3 isn't that smart. S4 obviously handles inheritance better as well; that is to say that my class might be mallard, which is a type of duck, so if I'm checking my walk, I'd be curious whether it walks like a mallard first, but if mallards' walks aren't special, maybe its walk will eventually reveal that it is, in fact, a duck. Anyway, R has bent my thinking over so many years that I don't try to broach such a problem. Whereas I often think about generics, methods and new objects in day-to-day work, I rarely think about inheritance.
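To make the combine case concrete, here is a rough S4 sketch in which the method is chosen from the classes of both arguments. The generic `combine` and the three method bodies are assumptions for illustration only, not an existing API.

```r
library(methods)

# Hypothetical S4 generic that dispatches on both x and y.
setGeneric("combine", function(x, y) standardGeneric("combine"))

# Two numeric vectors: concatenate.
setMethod("combine", signature("numeric", "numeric"),
          function(x, y) c(x, y))

# Numeric plus character: coerce explicitly, then concatenate.
setMethod("combine", signature("numeric", "character"),
          function(x, y) c(as.character(x), y))

# Two data frames: stack the rows.
setMethod("combine", signature("data.frame", "data.frame"),
          function(x, y) rbind(x, y))

v  <- combine(c(1, 2), 3)
ch <- combine(1, "a")
df <- combine(data.frame(a = 1), data.frame(a = 2))
```

The second argument alone decides between the first two methods, which is the part a plain S3 generic does not do through UseMethod().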

For package development, we are less frequently trying to define methods with far-reaching implications than we are trying to define new objects and methods to handle them. The lme4 package is a perfect example. This package is about the merMod object and all the ways you can plot, summarize, confint, and anova any object of class merMod.

77778317
1

Eh. S3 is generally alright, despite some problems. The fact that it only supports single dispatch is not a problem: almost no other mainstream OO language supports native multiple dispatch, and this is generally unproblematic. Where multiple dispatch is really needed (e.g. your combine example), there are established solutions (pattern matching, or the visitor design pattern). Native multiple dispatch support is cute but generally unnecessary and not really worth the complexity cost it incurs.

And I don’t get your mallard example: S3 can model that just fine, no need for S4.
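A minimal sketch of the mallard case in plain S3, assuming a made-up `walk` generic and a made-up `teal` class: the class attribute is an ordered vector, dispatch tries each entry in turn, and NextMethod() hands over to the parent class.

```r
# Hypothetical generic and methods, for illustration only.
walk <- function(x, ...) UseMethod("walk")

walk.duck <- function(x, ...) "waddles like a duck"

walk.mallard <- function(x, ...) {
  # mallard-specific behaviour first, then defer to the duck method
  paste("mallard check, then", NextMethod())
}

mallard <- structure(list(), class = c("mallard", "duck"))
teal    <- structure(list(), class = c("teal", "duck"))

walk_mallard <- walk(mallard)  # uses walk.mallard, which calls walk.duck
walk_teal    <- walk(teal)     # no walk.teal exists, so walk.duck is used
```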

77842370
0

It seems like the issue is not R or its class systems, but rather trying to create an overgeneralized solution based on a hypothetical future need that might never come to fruition.

While not thinking about future generalisation at all is also bad, you shouldn't overengineer your solution to handle all possible hypothetical future problems.

https://softwareengineering.stackexchange.com/questions/361605/should-the-solution-be-as-generic-as-possible-or-as-specific-as-possible

Functions that are tailored to a specific class make sense.

Functions that are generics and can work for a specific class also make sense.

Functions that could work for a particular set of classes (with `foo.default = \(x) stop("Unimplemented")`) also make sense.
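A rough sketch of that third option, assuming a generic `foo` that is implemented for only two base types (the method bodies are placeholders). The default method exists only to fail with a clear message.

```r
# Hypothetical generic supported for a fixed set of classes.
foo <- function(x, ...) UseMethod("foo")

# Anything without its own method ends up here and gets an explicit error.
foo.default <- function(x, ...) {
  stop("foo() is not implemented for class '", class(x)[1], "'",
       call. = FALSE)
}

foo.numeric   <- function(x, ...) x * 2
foo.character <- function(x, ...) toupper(x)

num <- foo(2)
chr <- foo("a")
err <- tryCatch(foo(TRUE), error = function(e) conditionMessage(e))
```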