8 replies

R has a somewhat special object-oriented ecosystem in which methods belong not to classes but to generic functions. This can be seen as a more functional object-oriented framework that plays really well with the language. The logic of both main class frameworks, S3 and S4 (and, I think, the forthcoming S7), makes a lot of sense for functions that one could expect to work on a variety of object types, if not all of them, such as plot, print or summary. However, when a function is heavily tailored to a specific kind of object, one can easily end up writing a generic and a default method only to get to the actual class method of interest. Something along these lines:

foo <- function (...) {
  structure(
    list(...),
    class = 'foo'
  )
}

bar <- function (x, ...) {
  UseMethod('bar')
}

# methods take the same arguments as the generic, including '...'
bar.default <- function (x, ...) {
  cat('nothing to do here, just a formality...\n')
  invisible(x)
}

bar.foo <- function (x, ...) {
  # really interesting stuff done with 'foo'
}

There are lots of cases in between these two extremes, where the method one wants to define could clearly make sense for other classes or for base types, but it's not clear to me how to exactly define the point where this stops making sense and it's better to just write a simple function that expects a specific kind of object.
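For contrast with the generic above, here is a minimal sketch of the other extreme: a plain function that only accepts a 'foo' object. The name bar_foo and its placeholder body are hypothetical, chosen just for illustration.

```r
# Constructor as in the question.
foo <- function (...) {
  structure(list(...), class = 'foo')
}

# Hypothetical non-generic function: no generic and no default method,
# only an explicit check on the class of the input.
bar_foo <- function (x) {
  if (!inherits(x, 'foo')) {
    stop("'x' must be an object of class 'foo'", call. = FALSE)
  }
  # placeholder for the really interesting stuff: count the elements
  length(unclass(x))
}

bar_foo(foo(a = 1, b = 2))
```

Calling bar_foo() on anything that is not a 'foo' fails immediately with an explicit error, which is roughly what a stop() in bar.default would give, without adding a generic to the namespace.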

I've been writing a package for a really specific domain (geochronology) that as such is well suited for class definitions and methods. I currently have a mixture of properly defined generics + methods and also non-generic functions that will only accept an object of a certain class as input. I'm starting to fear this will come back to bite me sometime in the future and would love to hear some thoughts on the topic from more experienced R programmers in this collective.

oop r r-s3 r-s4

created Oct 6, 2023 at 18:53

Coy

JMenezes

This is just a personal take, but the choice between generic + 1 method, vs non-generic is essentially defensive programming for package owners.

If you can foresee that some user will eventually ask you to make this function work for another object, and you are willing to cater to this user, then make a generic + 1 method. If, however, there is no conceptual way this function will be connected to another object, then stick to non-generics.

It's the same philosophy between hardcoding values vs. adding them as function parameters. If you have seen, conceptually, a reason to use another number, add it as a parameter.

For example, no one expects that sqrt(x) will raise x to anything other than 1/2. Changing this value would change the operation conceptually; it would no longer be a square root. A function called root(), on the other hand, would invite a parameter rootindex, since one can imagine different powers being used.
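As a rough sketch of that idea, the root() function below is hypothetical (it is not part of base R) and exposes rootindex as a parameter that defaults to the square root:

```r
# Hypothetical root(): the index is a parameter because other
# values still make conceptual sense for a "root".
root <- function (x, rootindex = 2) {
  x^(1 / rootindex)
}

root(16)      # behaves like sqrt(16)
root(27, 3)   # cube root
```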

Finally, defensive programming may be skipped simply for lack of time. The generic + 1 method is a tool to save time in future expansions, at the cost of taking longer to program now. Is this a fair trade at this moment in your life? Do you have time to spare, or are you busy and would rather take the gamble that users will not ask for future modifications?

Oct 10 at 11:30

Spacedman

There are other object-oriented systems for R that don't use method-despatch on some signature, for example proto, Q7, aoos, and possibly more. These all use object$method(args) syntax, which nicely doesn't clutter the namespace with generics. Q7 looks interesting with its emphasis on composition over inheritance...

Oct 10 at 15:42, edited Oct 25 at 9:38

Konrad Rudolph

Do you have a link for aooi? I’ve never heard of it, it’s not on CRAN or GitHub, and Google finds nothing. And I can’t believe you didn’t mention R6 in this category. ;-)

Oct 25 at 8:25

Spacedman

aoos, not aooi, typo fixed in the edit. https://cran.r-project.org/web/packages/aoos/index.html

Oct 25 at 9:39

nicola

Just to add the base (and often forgotten) reference classes. The doc: ?setRefClass.
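For readers who have not used them, a minimal sketch of a reference class with the object$method(args) syntax follows. The class name Sample, its field and its methods are hypothetical, picked only to illustrate setRefClass.

```r
library(methods)

# Hypothetical reference class: methods live on the object,
# so no generics are added to the namespace.
Sample <- setRefClass(
  'Sample',
  fields = list(ages = 'numeric'),
  methods = list(
    add_age = function (age) {
      ages <<- c(ages, age)   # modifies the object in place
      invisible(.self)
    },
    mean_age = function () {
      mean(ages)
    }
  )
)

s <- Sample$new(ages = c(10, 20))
s$add_age(30)
s$mean_age()
```

Note that, unlike S3 and S4 objects, reference class objects are mutable: add_age() changes s itself rather than returning a modified copy.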

Nov 22 at 13:14

AdamO

Nobody has ever been particularly happy about R's object systems, and I don't think anyone ever will be. At my graduate school, Lumley and Gentleman presented on this quite a bit. Obviously, R takes a lot of inspiration from Lisp, and S3 is most in the spirit of a Lisp object system. What that means to me is that classes and methods are more like a matrix, which establishes some intuitive types when inheritance is not such a big component of the work. S3 passes a flat-faced duck test pretty well.

S3 methods obviously can't dispatch on the class of more than one argument. A perfectly reasonable method like combine might be desired to combine datasets, or vectors, or you might combine models, i.e. pool their respective data and fit a common model. S3 isn't that smart. S4 obviously handles inheritance better as well; that is to say, my class might be mallard, which is a type of duck, so if I'm checking my walk, I'd be curious whether it walks like a mallard first, but if mallards' walks aren't special, maybe its walk will eventually reveal that it is, in fact, a duck. Anyway, R has bent my thinking over so many years that I don't try to broach such a problem. Whereas I often think about generics and methods and new objects in day-to-day work, I rarely think about inheritance.
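To make the combine example concrete, here is a minimal S4 sketch in which the method is selected by the classes of both arguments. The generic and its two methods are assumptions made up for illustration, not code from any package.

```r
library(methods)

# Hypothetical generic that dispatches on both 'x' and 'y'.
setGeneric('combine', function (x, y, ...) standardGeneric('combine'))

# Two vectors: concatenate.
setMethod('combine', signature('numeric', 'numeric'),
          function (x, y, ...) c(x, y))

# Two datasets: stack the rows.
setMethod('combine', signature('data.frame', 'data.frame'),
          function (x, y, ...) rbind(x, y))

combine(c(1, 2), c(3, 4))
combine(data.frame(a = 1), data.frame(a = 2))
```

With no method defined for a mixed pair such as a numeric vector and a data frame, that call fails with an error instead of silently picking one of the two methods.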

For package development, we are less frequently trying to define methods with far reaching implications than we are trying to define new objects and methods to handle them. The lme4 package is a perfect example. This package is about the merMod object and all the ways you can plot, summarize, confint, and aov any object of class merMod.

Jan 2 at 23:08

Konrad Rudolph

Eh. S3 is generally alright, despite some problems. The fact that it only supports single dispatch is not a problem: almost no other mainstream OO language supports native multiple dispatch, and this is generally unproblematic. Where multiple dispatch is really needed (e.g. your combine example), there are established solutions (pattern matching, or the visitor design pattern). Native multiple dispatch support is cute but generally unnecessary and not really worth the complexity cost it incurs.

And I don’t get your mallard example: S3 can model that just fine, no need for S4.
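A minimal sketch of how S3 could model it, assuming a hypothetical walk generic and a class vector that lists mallard before duck:

```r
walk <- function (x, ...) UseMethod('walk')

walk.duck <- function (x, ...) {
  'waddles like a duck'
}

# Mallard-specific behaviour first, then fall through to the duck method.
walk.mallard <- function (x, ...) {
  duck_walk <- NextMethod()
  paste('mallard first, then', duck_walk)
}

mallard <- structure(list(), class = c('mallard', 'duck'))
teal    <- structure(list(), class = c('teal', 'duck'))

walk(mallard)  # uses walk.mallard, which delegates to walk.duck
walk(teal)     # no walk.teal exists, so dispatch falls back to walk.duck
```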

Jan 8 at 9:21

Colombo

It seems that your issue is not R or its class systems, but rather trying to create an overgeneralized solution to a hypothetical future problem that might never come to fruition.

While not thinking about future generalisation at all is also bad, you shouldn't overengineer your solution to handle all possible hypothetical future problems.

https://softwareengineering.stackexchange.com/questions/361605/should-the-solution-be-as-generic-as-possible-or-as-specific-as-possible

Functions that are tailored to a specific class make sense.

Functions that are generics and can work for a specific class also make sense.

Functions that could work for a particular set of classes (with `foo.default = \(x) stop("Unimplemented")`) also make sense.
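A minimal sketch of that last case, assuming a hypothetical area generic that supports only two classes and fails loudly for everything else:

```r
area <- function (x, ...) UseMethod('area')

# Anything outside the supported set of classes gets an explicit error.
area.default <- function (x, ...) {
  stop("area() is not implemented for class '", class(x)[1], "'",
       call. = FALSE)
}

area.circle <- function (x, ...) pi * x$r^2
area.square <- function (x, ...) x$side^2

area(structure(list(side = 2), class = 'square'))
area(structure(list(r = 1), class = 'circle'))
```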

Jan 18 at 20:59

When to define a new generic and methods vs simple functions?
