Unclear interaction between property-microformat collapsing and implied properties #66

Open
JKingweb opened this issue Jul 17, 2023 · 2 comments

Comments

@JKingweb

JKingweb commented Jul 17, 2023

The general parsing rules state:

  • if that child element itself has a microformat ("h-*" or backcompat roots) and is a property element, add it into the array of values for that property as a { } structure, add to that { } structure:
    • value:
      • if it's a p-* property element, use the first p-name of the h-* child
      • else if it's an e-* property element, re-use its { } structure with existing value: inside.
      • else if it's a u-* property element and the h-* child has a u-url, use the first such u-url
      • else use the parsed property value per p-*,u-*,dt-* parsing respectively
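
Read literally, the quoted rule could be sketched as follows. This is a minimal Python model, not code from any parser: `nested_value`, `explicit`, `e_structure`, and `fallback` are hypothetical names, and `explicit` holds only properties found through explicit `p-name` / `u-url` class names. Falling through to `fallback` when a p-* element's child has no explicit `p-name` is an assumption, since the quoted text does not say what happens in that case.

```python
from typing import Dict, List, Optional


def nested_value(
    prefix: str,
    explicit: Dict[str, List[str]],
    fallback: str,
    e_structure: Optional[Dict[str, str]] = None,
) -> str:
    """Pick the `value` of a nested microformat, following the quoted rule literally."""
    if prefix == "p" and explicit.get("name"):
        # p-*: first explicit p-name of the h-* child
        return explicit["name"][0]
    if prefix == "e" and e_structure is not None:
        # e-*: re-use the value already present in the e-* structure
        return e_structure["value"]
    if prefix == "u" and explicit.get("url"):
        # u-*: first explicit u-url of the h-* child
        return explicit["url"][0]
    # Otherwise: the value from regular p-*/u-*/dt-* parsing (assumed fall-through)
    return fallback


print(nested_value("p", {"name": ["Example"]}, "text content"))  # Example
print(nested_value("p", {}, "text content"))                     # text content
```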

Read strictly, this excludes the implied name and url properties (technically they are not p-* or u-* properties), even though they would be suitable values. In the following example, the parent's name property therefore has the value ABBA, whereas the child's own name is C:

<div class="h-parent">
  <div class="p-name h-child">
    <div>
      A<abbr title="C">BB</abbr>A
    </div>
  </div>
</div>

Current parser behaviour:

C: PHP, JavaScript, Go, Rust, Haskell, Ruby
ABBA: Python
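
The split comes down to whether the child's implied name counts. A rough sketch of the two readings for the example above, using hypothetical names (`parent_name_value`, `include_implied`) and hand-written inputs rather than real parser output:

```python
from typing import List


def parent_name_value(
    explicit_names: List[str],
    parsed_names: List[str],
    text_content: str,
    include_implied: bool,
) -> str:
    """Choose the parent's name value under the strict or the lenient reading."""
    candidates = parsed_names if include_implied else explicit_names
    # With no candidate name, fall back to regular p-* parsing (the text content).
    return candidates[0] if candidates else text_content


# The child has no explicit p-name; its implied name comes from the abbr title.
explicit_names: List[str] = []
parsed_names = ["C"]
text_content = "ABBA"

print(parent_name_value(explicit_names, parsed_names, text_content, include_implied=False))  # ABBA
print(parent_name_value(explicit_names, parsed_names, text_content, include_implied=True))   # C
```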

@gRegorLove
Member

Since the parser has already recursed and parsed the child element at that point, I wonder if these lines should be changed to use the parsed properties from the child.

This line:

if it's a p-* property element, use the first p-name of the h-* child

Could become:

if it's a p-* property element, use the parsed name property of the h-* child

  • If the parsed name property is a { } structure, use its value property
  • Else use the first value in the name array

And so on for the other prefixes.

I think this is what php-mf2 does in practice. I wonder what the other parsers do.

A php-mf2 example with odd usage of e-name to demonstrate the above:

<div class="h-feed">
  <article class="p-x-articles h-entry">
    <h1 class="e-name"><b>Lorem ipsum</b></h1>
  </article>
</div>

"type": [
    "h-feed"
],
"properties": {
    "x-articles": [
        {
            "type": [
                "h-entry"
            ],
            "properties": {
                "name": [
                    {
                        "html": "<b>Lorem ipsum</b>",
                        "value": "Lorem ipsum"
                    }
                ]
            },
            "value": "Lorem ipsum"
        }
    ]
}
@JKingweb
Author

JKingweb commented Jul 17, 2023

> Since the parser has already recursed and parsed the child element at that point, I wonder if these lines should be changed to use the parsed properties from the child.
>
> This line:
>
> if it's a p-* property element, use the first p-name of the h-* child
>
> Could become:
>
> if it's a p-* property element, use the parsed name property of the h-* child
>
>   • If the parsed name property is a { } structure, use its value property
>   • Else use the first value in the name array
>
> And so on for the other prefixes.
>
> I think this is what php-mf2 does in practice.

This seems pretty sensible to me, though I think your text is incorrect. I suspect you meant something more like this:

if it's a p-* property element and the element's microformat has at least one name property, use the first name property of the h-* child as follows:

  • If the first name property is a { } structure, use its value property
  • Else use the first name property as parsed

The language is a bit tortured, unfortunately, but I think it expresses the spirit of your proposal accurately.
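
As a rough Python sketch of that wording (the function name `p_value_from_child` and the `fallback` parameter are made up for illustration; `fallback` stands in for whatever regular p-* parsing of the element would produce):

```python
from typing import Any, Dict


def p_value_from_child(child: Dict[str, Any], fallback: str) -> str:
    """Value for a p-* property element that is also an h-* root."""
    names = child.get("properties", {}).get("name", [])
    if not names:
        # No name property at all: use regular p-* parsing instead.
        return fallback
    first = names[0]
    if isinstance(first, dict):
        # A { } structure (e.g. from e-name): use its value property.
        return first["value"]
    # Otherwise use the first name property as parsed.
    return first


entry = {
    "type": ["h-entry"],
    "properties": {"name": [{"html": "<b>Lorem ipsum</b>", "value": "Lorem ipsum"}]},
}
print(p_value_from_child(entry, "unused fallback"))  # Lorem ipsum
```

For the php-mf2 example earlier in the thread this yields "Lorem ipsum", matching the value shown in that output.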

> I wonder what the other parsers do.

Modifying the test so that it is instead:

<div class="h-feed">
  <article class="p-x-articles h-entry">
    Fall through <h1 class="e-name"><b>Lorem ipsum</b></h1>
  </article>
</div>

  • Go falls through to the "use regular p- processing" step, which I believe is the correct behaviour per the current text
  • JavaScript, Rust, Haskell, and Ruby use the entire name structure of the child
  • Python transcludes the name structure into the child microformat so that it has both value and html keys as siblings of properties
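
To make the three behaviours concrete, here is a sketch of the structure each one would presumably produce for the nested h-entry. The shapes are reconstructed from the descriptions above, not captured parser output; the function names and the `text_content` argument (including its exact whitespace) are assumptions.

```python
from typing import Any, Dict

Child = Dict[str, Any]


def fall_through(child: Child, text_content: str) -> Child:
    """Go: ignore the child's e-name and use regular p-* parsing of the element."""
    return {**child, "value": text_content}


def whole_structure(child: Child) -> Child:
    """JavaScript, Rust, Haskell, Ruby: use the child's entire first name structure."""
    return {**child, "value": child["properties"]["name"][0]}


def transclude(child: Child) -> Child:
    """Python: copy the name structure's keys in as siblings of `properties`."""
    return {**child, **child["properties"]["name"][0]}


child = {
    "type": ["h-entry"],
    "properties": {"name": [{"html": "<b>Lorem ipsum</b>", "value": "Lorem ipsum"}]},
}

print(fall_through(child, "Fall through Lorem ipsum")["value"])
print(whole_structure(child)["value"])
print(sorted(transclude(child)))
```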