Proposal: Meta Tag for AI Consent Management #9334
Comments
Why should the author explicitly choose Doesn't this sentence duplicate the existing |
These are great points! Thank you for pointing these out. I had considered just proposing I also agree that this is more a question of organization than technology. The details of implementation aren't important to me so much as agreeing on a standard for establishing consent specifically in the case of model training and search. Perhaps this can indeed be handled using a |
By the way, if it is meant to carry roughly the same force as DNT, you can move the proposal to the Microformats Wiki (which will be officially recognized as a specification), or take it to WICG. Also, bikeshedding: something like |
Thank you! |
No problem. What I suggested in the comment above is a move away from metadata in favor of a link type. You can, of course, write a specification and send it to MetaExtensions, but this is a chore, and “However, a new metadata name should not be created in any of the following cases: If the name is for something expected to have processing requirements in user agents; in that case it ought to be standardized” might be applicable, given that crawlers are also user agents in some sense. So |
wow that's awesome |
FYI, DeviantArt and SketchFab came up with |
I believe that bots can also crawl non-HTML resources, such as source code or images. Wouldn't it be better to define this in (or on top of) the robots.txt protocol? |
@myakura, no, because there will be countless crawlers in the future, and the author cannot be made responsible for keeping track of them. In addition, no one wants to limit crawling in this case, only the use of the collected content. |
One consideration here is that crawlers would need to download each individual page to find out if it has an The
This could be hosted under a well-known URI such as |
Introduction
With the rapid growth of artificial intelligence, and especially of machine learning models trained on web data, the question of consent to data usage has become more pressing than ever. Currently, there is no standard way for website owners to state whether or not they consent to AI models using their content for training or crawling. This proposal seeks to address this by introducing a new HTML meta tag called `ai-consent`.
The Proposed Solution
I propose the introduction of an HTML meta tag named `ai-consent`. This tag would have a `content` attribute with the following possible values:
- `all`: The website owner consents to the use of their content for both AI model training and live search operations.
- `search-only`: The website owner consents to the use of their content for live search operations only, provided the source website is cited by the AI agent. They do not consent to the use of their content for AI model training.
- `none`: The website owner does not consent to the use of their content by AI for any purpose.
The tag would appear in the `<head>` of an HTML document. For example:
Use Cases and Examples
Below are some examples of how the `ai-consent` tag could be used:
Considerations
This proposal introduces a method for website owners to manage consent regarding AI data usage and is similar in intent to the `noindex` robots meta directive. However, the tag itself does not enforce the consent. It would be the responsibility of AI creators and operators to respect and enforce these tags, which might not happen short of robust regulation. Additionally, the proposed tag would need to be included in popular web crawlers' whitelists of meta tags.
Conclusion
The proposed `ai-consent` meta tag provides a standard method for website owners to express their consent for AI data usage. It would promote transparency and respect for website owners' data preferences, contributing to a more ethical web environment for AI.
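To make the proposal concrete, here is a minimal sketch of how a crawler might read the tag and map the three proposed values to permissions. It assumes the tag is written as `<meta name="ai-consent" content="...">` inside `<head>`, as described above. The function names `read_ai_consent` and `may_use` are hypothetical, and treating a missing or unrecognized value as "no signal" is an assumption, since the proposal does not define a default.

```python
from html.parser import HTMLParser

# The three values proposed in this issue; anything else is treated as unrecognized.
KNOWN_VALUES = {"all", "search-only", "none"}


class AIConsentParser(HTMLParser):
    """Records the content of the first <meta name="ai-consent"> found inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.value = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag == "head":
            self.in_head = True
        elif tag == "meta" and self.in_head and self.value is None:
            attr = dict(attrs)
            if (attr.get("name") or "").strip().lower() == "ai-consent":
                self.value = (attr.get("content") or "").strip().lower()

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def read_ai_consent(html):
    """Return 'all', 'search-only', 'none', or None if the tag is absent or unrecognized."""
    parser = AIConsentParser()
    parser.feed(html)
    return parser.value if parser.value in KNOWN_VALUES else None


def may_use(html, purpose):
    """Return True/False for purpose 'training' or 'search', or None when there is no signal.

    Assumption: no tag means no stated preference. The citation condition
    attached to 'search-only' is not checked here.
    """
    value = read_ai_consent(html)
    if value is None:
        return None
    if purpose == "training":
        return value == "all"
    if purpose == "search":
        return value in {"all", "search-only"}
    raise ValueError("purpose must be 'training' or 'search'")


page = '<html><head><meta name="ai-consent" content="search-only"></head></html>'
print(read_ai_consent(page))      # search-only
print(may_use(page, "training"))  # False
```

This sketch only honors tags that appear after an explicit `<head>` start tag, so documents that omit it yield no signal. As noted in the considerations, nothing here enforces consent: it only shows what a cooperating crawler could check.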