Make WordPress Core

Changeset 53754

Timestamp: 07/21/2022 09:09:56 PM
Author: audrasjb
Message:

Formatting: Normalize to Unicode NFC encoding before converting accent characters in remove_accents().

This changeset adds Unicode sequence normalization from NFD to NFC via the normalizer_normalize() PHP function, which is available when the recommended intl PHP extension is installed.

This fixes an issue where NFD characters were not properly sanitized. It also provides a unit test for NFD sequences (alternate Unicode representations of the same characters).

Props NumidWasNotAvailable, targz, nacin, nunomorgadinho, p_enrique, gitlost, SergeyBiryukov, markoheijnen, mikeschroder, ocean90, pento, helen, rodrigosevero, zodiac1978, ironprogrammer, audrasjb, azaozz, laboiteare, nuryko, virgar, dxd5001, onnimonni, johnbillion.
Fixes #24661, #47763, #35951.
See #30130, #52654.
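
The normalization step described in the message can be sketched roughly as follows. This is a minimal illustration, not the committed code: the helper name `example_normalize_to_nfc()` is hypothetical, and the guard assumes the intl extension may be absent, in which case the input is returned unchanged.

```php
<?php
/**
 * Hypothetical helper (not part of WordPress): convert a UTF-8 string to
 * Unicode NFC when the intl extension is available.
 */
function example_normalize_to_nfc( $string ) {
	// intl is recommended but optional, so check that both functions exist.
	if ( function_exists( 'normalizer_is_normalized' )
		&& function_exists( 'normalizer_normalize' )
	) {
		// Both functions default to NFC (Normalizer::FORM_C).
		if ( ! normalizer_is_normalized( $string ) ) {
			$normalized = normalizer_normalize( $string );

			// normalizer_normalize() returns false on failure; keep the input then.
			if ( false !== $normalized ) {
				$string = $normalized;
			}
		}
	}

	return $string;
}

$nfd = "e\u{0301}"; // "e" followed by U+0301 COMBINING ACUTE ACCENT (decomposed form).
$nfc = "\u{00E9}";  // U+00E9, the precomposed "é" (composed form).

$result = example_normalize_to_nfc( $nfd );
```

With intl loaded, `$result` is byte-identical to `$nfc`, so a lookup table keyed on precomposed characters can match it. Without intl, the string passes through untouched.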

Location: trunk
Files: 2 edited

  • trunk/src/wp-includes/formatting.php

    r53455  r53754
    1585    1585     * @since 5.7.0 Added locale support for `de_AT`.
    1586    1586     * @since 6.0.0 Added the `$locale` parameter.
            1587
    1587    1588     *
    1588    1589     * @param string $string Text that might have accent characters.

    1598    1599
    1599    1600    if ( seems_utf8( $string ) ) {
            1601
            1602
            1603
            1604
            1605
            1606
            1607
            1608
            1609
    1600    1610        $chars = array(
    1601    1611            // Decompositions for Latin-1 Supplement.
  • trunk/tests/phpunit/tests/formatting/removeAccents.php

    r53562  r53754
    10      10      public function test_remove_accents_simple() {
    11      11          $this->assertSame( 'abcdefghijkl', remove_accents( 'abcdefghijkl' ) );
            12
            13
            14
            15
            16
            17
            18
            19
            20
            21
            22
            23
            24
            25
            26
            27
            28
            29
    12      30      }
    13      31