
ai-generated attribute considered? #2046

Open

mbgower opened this issue Sep 20, 2023 · 10 comments

@mbgower commented Sep 20, 2023

Description of bug or feature request

This is not so much a feature request as a question about the need for, and potential of, such a feature...

We're seeing many designs emerge in which pieces of data are generated by AI, and that provenance is surfaced visually to users.

This indication isn't always made with icons; it may be flagged through visual styles that are more difficult to provide a text equivalent for -- border styling, colour, etc.

As an example, consider the following 6-row, 7-column table, which lists system events: their name, description, type, timestamp, tags, confidence and actions. The bottom two rows have a little AI icon in the lower corner of their first cells, and the rows are visually treated with a mauve gradient to indicate that they represent issues generated by AI; the final column offers a Review action for a user to confirm their validity.

[Screenshot: the event table described above, with the bottom two AI-generated rows highlighted]

Please set aside design considerations with this example, and treat it only as illustrative of the larger problem -- visual treatments are being used to identify the ai-generated data.

I'm wondering if it's feasible or desirable to explore an ARIA attribute of aria-ai-generated that could be applied to surface such data or page elements programmatically (somewhat like aria-required). I realize the data could already come from anyone (or anything), and there's nothing intrinsically special about AI-generated content. But my sense is that this will become a pronounced enough subset of information, with specific considerations surrounding its usage and treatment, that it's worth asking the question.
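As a purely illustrative sketch (this attribute does not exist in any spec -- the name and value come only from this proposal, and the cell contents are invented), one of the table rows above might be marked up something like:

```html
<!-- Hypothetical: aria-ai-generated is not a real ARIA attribute -->
<tr aria-ai-generated="true">
  <td>Anomalous login burst</td>
  <td>Multiple failed sign-ins followed by a success</td>
  <!-- ...type, timestamp, tags, confidence... -->
  <td><button>Review</button></td>
</tr>
```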

Also, an AT user may NOT want the noise of having this surfaced all the time. If it is indicated programmatically in a way that I can suppress with my AT -- similar to how I treat some text styling -- I can turn the cues on when necessary and keep the noise down at other times, instead of suffering the diverse efforts of authors to surface it.

Will this require a change to CORE-AAM?

If unknown, leave blank.

Will this require a change to the ARIA authoring guide?

If unknown, leave blank.

@scottaohara (Member)

Linking this related HTML issue, since it probably makes the most sense not to have this be an ARIA-only feature: whatwg/html#9479

@TheRealRitMan

This is a good use case for the issue you linked to. Your question about whether this information is highlighted could be handled via settings: the user could choose whether to highlight AI-generated content, and it could be very simple to identify AI-generated content without disrupting anything else.

@tombrunet commented Sep 25, 2023

@scottaohara Re-writing that -- I mis-posted, and then lost the comment. Anyway, the HTML issue you linked seems to have gone down a strange path because it was originally intended just as an SEO flag to indicate that AI-generated content existed anywhere in the page. They then tried to hack in container-level identification, which doesn't make sense in a meta tag. And maintaining that list of CSS selectors for a dynamic page would be complex and error-prone.

I can't really see this working except as an HTML or ARIA property, but I'm not sure how that would map through the accessibility APIs.

@mbgower (Author) commented Sep 26, 2023

@mcking65 do you have any thoughts on this?
I'm seeing some use of what people are calling a slug (a little "AI" symbol) on data portions that are ai-generated. Those are relatively easy to solve, IMO.
The challenge I'm seeing is with visual effects where there may be a collection of ai-generated data, say a whole section of a page that encompasses a number of paragraphs and their contents. In such a case there is a visual indicator that does not necessarily map to headings or any traditional content structure. I also see similar visual effects where, due to space or whatever, it is felt the slug cannot be used.

@jnurthen (Member)

I'm not questioning that this is happening, but IMO adding AI to these is really just adding noise. Is the fact that the information is AI-generated really relevant? Is this different from a non-ML-based algorithm being used? Or from the information being created by crowdsourcing? Isn't this really just a flag that the data source is potentially untrusted?
Yes -- this should be flagged in the UI, but I don't see an ARIA attribute as the correct way to do it -- it should be done through good UI design so the information is available to everyone.

@mbgower (Author) commented Sep 26, 2023

I agree that meta information about data source is not new. But I think there are a few new considerations here, primarily due to the speed of adoption and the number of novel uses AI may be put to.

First, there are a number of use cases for why someone may flag information as AI-generated. They can include a need to verify, a need to indicate source, and a need to take the data with "a grain of salt". I'm positive we'll encounter lots more in the coming months. Depending on the user and use case, different ai-generated data may be considered more or less worthy of note.

Let's imagine two potential outcomes if we count on UI design alone to address this.

  1. A designer, fearful of not surfacing AI generation to all users, adds granular indications throughout content.
  2. A designer, regarding the AI source as primarily marketing noise, treats it as unimportant information. Checking the language of Non-text Contrast, they decide that styling on ai-generated text is not required to identify components or states, and give it only a very slight visual indicator, say a grey 10 stroke.

Neither of these scenarios seems great to me, but neither would surprise me. Thinking about where either could be identified as an accessibility failure -- and what criteria would be used to fail them -- is a bit of an interesting exercise. Scenario 1 could be a real verbosity nightmare, depending on how it was implemented. But it seems to me that scenario 2 is the more likely to happen, with a failure against Info and Relationships.

In such a situation, what would a team use today to indicate the information represented by this light grey treatment? Someone suggested aria-describedby, but that tends to be appended to the end of the string. How would you apply it to sentences? Paragraphs? Whole nested structures of ai-generated information? Is the user going to be forced to listen through each chunk, waiting until the end to find out whether it was ai-generated? Would they be able to determine the method of review necessary to properly locate and understand the references?

Would a team propose shoving spans around each of the grey areas and putting in some kind of flag? Would they decide to shove hidden text at the front? The end? Both?

Would a team try to use a bunch of regions to flag each chunk of ai-treated information?

I don't really regard any of those as elegant. If some kind of aria attribute isn't going to be used, what recommendations/suggestions do people have for dealing with this?
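For concreteness, here is roughly what the aria-describedby workaround above might look like; the visually-hidden class and the note wording are assumptions, and AT support for announcing descriptions on arbitrary containers varies:

```html
<!-- One shared, visually hidden note referenced from each AI-generated chunk -->
<p id="ai-flag" class="visually-hidden">AI-generated content</p>

<section aria-describedby="ai-flag">
  <p>First paragraph of AI-generated analysis...</p>
  <p>Second paragraph of AI-generated analysis...</p>
</section>
```

Even where the description is announced, it typically comes at the container boundary (if at all), which is exactly the ordering and verbosity problem described above.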

@mbgower changed the title from "ai-generated state considered?" to "ai-generated attribute considered?" Sep 27, 2023
@jnurthen self-assigned this Sep 28, 2023
@cookiecrook (Contributor)
Some comments, but mostly I have questions.

Screen readers may defer to image alt text on the assumption that it's human-curated, or otherwise better than the screen reader's ability to generate automated descriptions, but sometimes the ML-generated description from the screen reader is much better than the ML-generated alt text that the website embedded in the alt attribute...

So I'd encourage those considering this "ML-generated flag" feature to consider ways to distinguish which parts of an element are ML-generated. In the case of an HTML <img> element, there are several translatable attributes: alt, title, aria-label... Does the flag apply to all of them? What about an SVG's contents? Would there be a way to distinguish if the SVG contents were ML-generated, but the alt text were not?

Or a way to distinguish if the image rendering itself were ML-generated, but the alt text for the image was written by a human? What about hybrid content such as written prose that is generated by a machine and edited by a human, or vice versa? Sports scores, weather forecasts, etc.
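To make that ambiguity concrete, consider two hypothetical images (filenames and alt text invented for illustration) that a single boolean flag could not tell apart:

```html
<!-- Pixels ML-generated; alt text written by a human -->
<img src="harbor.png" alt="Watercolor of a harbor at dusk">

<!-- Pixels from a human photographer; alt text ML-generated -->
<img src="team.jpg" alt="Four people seated around a conference table">
```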

Likewise, what's the value of applying this at a page or section level? Most major web sites have some "help" from ML algorithms... Should they mark the whole page as ML, even though humans are involved? Might this lead to a scenario like California's Prop 65 (~"Warning: everything is known to cause cancer.") where the warning is so ubiquitous that it no longer provides the intended value?

I'm sure that's not the end of my questions on the matter, but I'll end by echoing others that, if considered, this should NOT be an ARIA-only or accessibility-only thing, so I'd recommend closing this particular ARIA issue and reopening in another context... Perhaps the W3C TAG?

@mbgower (Author) commented Sep 29, 2023

Thanks for the questions.

My primary concern surrounds ai-generated text, not images. In regard to the translatable attributes you cite, obviously alt only exists for images. The aria-label attribute cannot be put on most containers for text, and title seems like a problematic, touch- and keyboard-inaccessible way of trying to accommodate the need.

I am not proposing a page-level marker. The HTML issue referenced by @scottaohara has that as its starting point. It's nothing I'm focused on.

I'm also not concerned with the matter of deciding what to identify. In the scenarios I'm encountering, the business cases are dictating that; the designers are being requested to visually indicate that this part and this part are ai-generated.

Since opening this issue, I've realized that we don't have time to wait for anything from the W3C on this. Instead, I'm doing what you suggested and trying to address it with good design -- pushing hard against using visual styling alone as the indicator for this content -- and I'm confident I've been pulled in early enough in the design phase that we will come up with a good, accessible approach.

But just because we've decided we don't have time to wait for new attributes to help solve this doesn't mean the need isn't there. I suspect similar discussions are going on in a lot of organisations, and accessibility considerations may not be given as much focus.

I'll leave someone else to carry this forward to TAG or whatever group seems likely to tackle. From my perspective, this issue can be closed.


Postscript

> In the case of an HTML <img> element...

As you note, the problem already seems tractable for an image, because the element has an alt attribute that allows an author to include any text they want about the image.

So, in the situation where an alt is ai-generated, it could simply be a question of appending ", ai-generated description" to the end of whatever alternative text was generated, and the attribution to a non-human generator has been accomplished. Or, in your second scenario, "ai-generated image and description", etc. It would be nice if the solution were a little more elegant (affording the AT user the ability to turn it off, say).
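That is, something as simple as the following (the alt text here is invented for illustration):

```html
<img src="chart.png"
     alt="Bar chart of quarterly revenue, ai-generated description">
```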

I will say that this digression into images demonstrates one of those future use cases I'm talking about. If AI is going to generate a short description, there's no reason it couldn't generate a longer description too. It would be handy if an attribute existed that allowed both short and long descriptions to be captured, so that users could get more detail on demand. Verbosity tends to come in factors of three, so maybe even an alt-medium too? Insane to bring up long descriptions again, I know, but there ya go.

It speaks to my point about how ai may factor into existing and future designs. Hopefully someone is thinking about architecture that can ensure the benefits are available to all users.

@cookiecrook (Contributor) commented Sep 29, 2023

@mbgower wrote:

> we don't have time to wait for anything from w3c on this.

For your initial use case, yes, appending "(AI)" or similar to the content or alt text seems fine.

> this digression into images demonstrates one of these future use cases I'm talking about.

Images are the more interesting scenario to me, because it could be an objective marker that alternative text for a related resource (the image) was generated... This could mean the AT could ignore the generated alt text completely (user pref, of course) and generate its own description (VoiceOver's automated image descriptions are better than any others I've seen).

@mbgower (Author) commented Sep 29, 2023

Yep, going forward I suspect that the AT, not the author, is going to solve the image description problem.

But this move toward using visual styling to indicate the context of chunks of text is a different problem, one I anticipate becoming an accessibility challenge if it's done incorrectly and authors lack an obvious way to remediate it.
