Add API for exposing codec HW capabilities (hardwareAccelerated) #21

Closed
henbos opened this issue Nov 25, 2019 · 12 comments

Comments

@henbos
Collaborator

henbos commented Nov 25, 2019

This is a follow-up to PR w3c/webrtc-pc#2356 / Issue w3c/webrtc-pc#2355. See previous discussion.

TL;DR:

  • Knowing whether a codec has hardwareAccelerated support allows smarter negotiation by the application, which increases the odds of improving user experience (performance, fan speed, battery usage, etc).
  • The proposal to add this to RTCRtpCodecCapabilities (RTCRtpSender/RTCRtpReceiver.getCapabilities()) did not go through due to privacy concerns (exposing HW information increases fingerprinting surface and getCapabilities() is synchronous and not behind a permissions prompt) and due to webrtc-pc spec having "Feature Freeze" (the spec is attempting to graduate to Proposed Recommendation).

Let's continue the discussion here @snyderp @drkron

@henbos
Collaborator Author

henbos commented Nov 25, 2019

With regards to this information being available elsewhere, @drkron said:

I think that what's exposed through MediaCapabilities API and WebGL is the biggest concern here, however they are outside of the WebRTC specification and cannot be expected to be solved within this forum.

Starting a stream just to determine if there's HW acceleration support for that particular codec is a bit far fetched, especially since the methods mentioned above are available.

I think that a good trade off right now is to keep the spec as is (unless there's a clear evidence that something is being misused) and be careful when adding new stuff.

An argument can be made that just because information has been exposed in the past, that doesn't give us a free pass to expose it again. Personally, I don't know where to draw the line.

If this information is exposed behind an asynchronous API we could both measure its usage and mitigate the exposure by adding a prompt if this is to be considered privacy sensitive, without harming existing getCapabilities() usages.

How does one determine how privacy sensitive something is, or what a good prompt is? @foolip @aboba @jan-ivar @youennf @alvestrand

@drkron

drkron commented Nov 25, 2019

I propose to add two promise-based functions to RTCPeerConnection:

  • getEncoderCapabilities()
  • getDecoderCapabilities()
Both would return a sequence of RTCRtpCodecCapability entries, where RTCRtpCodecCapability is extended with the hardwareAccelerated field only if one of these promise-based functions is called (i.e., not for the existing static functions).

If these would be added to RTCRtpSender/RTCRtpReceiver rather than RTCPeerConnection, you would first have to initiate the connection which is what this proposal tries to avoid.
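
To make the shape of this proposal concrete, here is a minimal sketch. Neither function exists in any shipped browser API; the interface names, the `userAgentAllows` hook, and the sample codec data are all assumptions made for illustration, and the functions are written as free functions rather than RTCPeerConnection members so the sketch runs on its own.

```typescript
// Hypothetical shapes only: this models the proposal, not a real API.
interface CodecCapability {
  mimeType: string;
  clockRate: number;
}

// The proposed extension: the extra field exists only on results
// returned by the promise-based call.
interface AsyncCodecCapability extends CodecCapability {
  hardwareAccelerated: boolean;
}

// Stand-in for what a user agent knows internally (made-up sample data).
const platformEncoders: AsyncCodecCapability[] = [
  { mimeType: "video/VP8", clockRate: 90000, hardwareAccelerated: false },
  { mimeType: "video/H264", clockRate: 90000, hardwareAccelerated: true },
];

// Synchronous path, analogous to the existing static getCapabilities():
// it never includes the hardwareAccelerated field.
function getCapabilitiesSync(): CodecCapability[] {
  return platformEncoders.map(({ mimeType, clockRate }) => ({ mimeType, clockRate }));
}

// Asynchronous path: the user agent can run a policy check or prompt
// (modelled by the hypothetical userAgentAllows hook) before resolving.
async function getEncoderCapabilities(
  userAgentAllows: () => Promise<boolean>,
): Promise<AsyncCodecCapability[]> {
  const allowed = await userAgentAllows();
  return allowed ? platformEncoders.map((c) => ({ ...c })) : [];
}
```

Because the result arrives through a promise, the user agent has room to defer or withhold the answer without changing how the existing synchronous callers behave.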

@youennf
Contributor

youennf commented Nov 25, 2019

It seems hard to explain to the user what this is all about so a prompt is probably difficult to implement.
Why not handle this within Media Capabilities? It seems to belong there.

@drkron

drkron commented Nov 25, 2019

This would give the browser a possibility to ask the user, it doesn't have to. It could also be an automated process where the browser decides that a particular web site is suspicious and therefore returns an empty list.

@pes10k

pes10k commented Nov 25, 2019

It could also be an automated process where the browser decides that a particular web site is suspicious

This kind of black-box magic should not be in web standards, unless the "how to determine if a site is suspicious" logic is also in the standard.

In general, I'm strongly against "allow the harm but notify it occurred" or "add more prompts at the per feature granularity level" approaches.

Perhaps it would be more helpful to think through the types of scenarios where this information would be useful (in terms of user goals / flows) and to think of the highest place where this data, along with other hardware-capability-revealing data, could all be gated behind a single, user-understandable permission or action? (I appreciate the WG has done so in many other cases, but I'm suggesting thinking about where / how this could be folded into those efforts.)

@aboba
Contributor

aboba commented Feb 21, 2020

@drkron Presumably the reason to add promise-based methods getEncoderCapabilities to the RTCRtpSender and getDecoderCapabilities to the RTCRtpReceiver would be to deal with the issue of resource availability. That is, within the current static getCapabilities method, indicating support for hardware acceleration does not provide any performance guarantees. The hardware resources might be available when they are negotiated, or they might have already been consumed by another browser tab or another application, in which case a software implementation will be used instead, with potentially degraded performance.

However, it is not clear to me how a promise-based method actually deals with the resource competition issue. Are we assuming that the browser controls access to (some? all?) of the acceleration resources, and therefore can ensure that they are available at the time that the promise is resolved? How does the application actually reserve the resources? Does this have to be done with Promise.all to prevent a race where the resource is advertised, but then consumed prior to being claimed?

@henbos
Collaborator Author

henbos commented Feb 21, 2020

However, it is not clear to me how a promise-based method actually deals with the resource competition issue.

It doesn't. And that's not why the proposal changed into using a promise.

This was in response to privacy issues. An asynchronous API has the possibility of gating its return value behind a permissions prompt, should we want to.

This issue is not about tackling the existing privacy concerns in general, but about not leaking more device info in a synchronous API.

However if this API actually had a prompt, you would be less likely to want to use it for legitimate purposes, since the permissions prompt would likely be too confusing for real users.

@henbos
Collaborator Author

henbos commented Feb 21, 2020

Not knowing what to do here, this issue has not progressed.

@pes10k

pes10k commented Feb 22, 2020

This issue in PrivacyCG might be of interest privacycg/proposals#9

Basically we know there are

  1. situations where sites need privacy-risking information to debug
  2. this information doesn't seem necessary in the common case
  3. having per feature prompts is not a nice UX

Having a global "debug mode" in the browser seems like a way to cut a bunch of these knots, including possibly the ones in this issue

(the above is super early days discussion, and doesn't at this point represent any common opinion, but I'm offering it as another way to possibly solve these problems / clashes, in an ongoing way)

@henbos
Collaborator Author

henbos commented Feb 24, 2020

This is not just debugging information.

WebRTC requires the application to negotiate which codecs to use. The more encoder/decoder capability information available to the application, the smarter it can be at making decisions. It would not necessarily have to know about HW/SW - what the application really wants to know is which encoder is performant. "hardwareAccelerated" would be the low-hanging fruit used as a heuristic for "isPerformant". A more sophisticated API might express these things in terms of expected performance metrics in various setups. But implementing the sophisticated version of the API would be impossible without running various performance benchmarks prior to invoking the API.

In some cases, HW accelerated vs SW only could be the difference between an HD experience and a VGA experience.

Today the application is left to aimlessly guess unless it remembers its own "benchmarking" from previous application experiences, but trial-and-error would likely be worse than trying out the HW accelerated option in most cases.

This is a lot more complicated than that, but the point is, this is useful information to the application - it's not debugging only.
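
As a rough illustration of the negotiation use case, the sketch below reorders a codec list so that accelerated codecs come first. The `hardwareAccelerated` field is the hypothetical one discussed in this issue, and the `Codec` interface and `preferAccelerated` helper are names invented for the example.

```typescript
// Hypothetical codec description: hardwareAccelerated is the field
// proposed in this issue and is not part of any shipped capability object.
interface Codec {
  mimeType: string;
  hardwareAccelerated?: boolean;
}

// Move codecs flagged as hardware accelerated to the front while keeping
// the original relative order otherwise (Array.prototype.sort is stable
// in ES2019 and later).
function preferAccelerated(codecs: readonly Codec[]): Codec[] {
  const rank = (c: Codec): number => (c.hardwareAccelerated === true ? 1 : 0);
  return [...codecs].sort((a, b) => rank(b) - rank(a));
}

const ordered = preferAccelerated([
  { mimeType: "video/VP8", hardwareAccelerated: false },
  { mimeType: "video/H264", hardwareAccelerated: true },
  { mimeType: "video/VP9" },
]);
console.log(ordered.map((c) => c.mimeType));
```

In a browser, an ordering like this is the kind of input an application could hand to RTCRtpTransceiver.setCodecPreferences() before creating an offer, assuming the entries are real capability objects.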

@pes10k

pes10k commented Feb 25, 2020

@henbos Ah, thank you for clarifying, my error!

So it sounds like a "debug" or "analytics" mode wouldn't solve the problem here. There are still the same privacy harms / implications in the spec though. I'm happy to continue trying to help the group figure out a solution there, to address the PING blocker.

(as a side note, the WebRTC family of specs is, as you know, enormous and dense. I've done my best to familiarize myself with how all the specs work together, but if you get the sense that I'm repeatedly confused or misunderstanding the system we've designed, maybe it would be helpful for me to join a call of yours and we could discuss more there?)

@aboba
Contributor

aboba commented Apr 19, 2022

Closing since Media Capabilities API seems to have met this need.
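
For readers landing here, a minimal sketch of how an application might use a Media Capabilities style query to choose an encoder. The `EncodingProbe` interface and `pickEncoder` helper are names invented for this example. The probe is passed in as a parameter so the sketch runs outside a browser; in a browser, `navigator.mediaCapabilities` queried with type "webrtc" is expected to fit this shape, but treat that as an assumption to verify against the Media Capabilities spec.

```typescript
// Local, simplified types so the sketch does not depend on DOM typings.
interface VideoConfig {
  contentType: string;
  width: number;
  height: number;
  bitrate: number;
  framerate: number;
}

interface CapabilityInfo {
  supported: boolean;
  smooth: boolean;
  powerEfficient: boolean;
}

interface EncodingProbe {
  encodingInfo(config: { type: "webrtc"; video: VideoConfig }): Promise<CapabilityInfo>;
}

// Query every candidate, then prefer a power-efficient codec, then a
// smooth one, then any supported one. Returns undefined if none is supported.
async function pickEncoder(
  probe: EncodingProbe,
  candidates: string[],
  base: Omit<VideoConfig, "contentType">,
): Promise<string | undefined> {
  const results = await Promise.all(
    candidates.map(async (contentType) => ({
      contentType,
      info: await probe.encodingInfo({ type: "webrtc", video: { ...base, contentType } }),
    })),
  );
  const supported = results.filter((r) => r.info.supported);
  const best =
    supported.find((r) => r.info.powerEfficient) ??
    supported.find((r) => r.info.smooth) ??
    supported[0];
  return best?.contentType;
}
```

Note that `powerEfficient` plays the role this thread wanted from "hardwareAccelerated": it describes the expected outcome rather than naming the hardware directly.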
