New in Symfony 6.1: HtmlSanitizer Component
Symfony 6.1 will be released at the end of May 2022 and it will require
PHP 8.1 or higher. This is the first article of the series that shows the most
important new features introduced by Symfony 6.1.
Contributed by Titouan Galopin in #44681.
Web applications often need to work with HTML content generated by users, and
it’s difficult to do so safely. Rendering that untrusted HTML in a Twig
template, or injecting it via JavaScript into the innerHTML property of an
element, can lead to unwanted and dangerous JavaScript code execution.
HTML sanitization is “the process of examining an HTML document and
producing a new HTML document that preserves only whatever tags or attributes
that are designated safe and desired”.
Most of the time, this sanitization process is used to protect against attacks
such as cross-site scripting (XSS). However, sanitization is also about fixing
invalid HTML in the best way possible:
<!-- an example of invalid HTML input provided by the user -->
Original: <div><em>foo</div>
<!-- the best solution to fix this HTML code is to add the missing tag -->
Sanitized: <div><em>foo</em></div>
<!-- however, if the HTML error appears in other elements, the fix could be different -->
Original: <textarea><em>foo</textarea>
<!-- the best solution in this case is to HTML-encode the wrong tag -->
Sanitized: <textarea>&lt;em&gt;foo</textarea>
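To make the second fix concrete, here is a minimal sketch of what HTML-encoding the unclosed tag produces, using PHP's built-in htmlspecialchars() function. It only illustrates the entity encoding itself and is not how the HtmlSanitizer component is implemented. The variable names are made up for this example.

```php
<?php

// Hypothetical user input: an <em> tag that is never closed.
$userInput = '<em>foo';

// Encode the HTML special characters so the markup is treated as plain text
// instead of being parsed as an element.
$encoded = htmlspecialchars($userInput, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

echo '<textarea>' . $encoded . '</textarea>' . PHP_EOL;
// Prints: <textarea>&lt;em&gt;foo</textarea>
```

Inside a textarea the browser displays the encoded text literally, so the user still sees what they typed, but no element is created from it.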
In Symfony 6.1 we’re adding a PHP-based HTML sanitizer so you can transform
user-generated HTML content into safe HTML content. This new component is similar
to the upcoming W3C HTML Sanitizer API, and we even use the same method names
whenever possible to ease the learning curve.
use Symfony\Component\HtmlSanitizer\HtmlSanitizerConfig;

// By default, any element not included in the allowed or blocked elements
// will be dropped, including its children
$config = (new HtmlSanitizerConfig())
    // Allow "safe" elements and attributes. All scripts will be removed
    // as well as other dangerous behaviors like CSS injection
    ->allowSafeElements()

    // Allow the "div" element and no attribute can be on it
    ->allowElement('div')

    // Allow the "a" element, and the "title" attribute to be on it
    ->allowElement('a', ['title'])

    // Allow the "span" element, and any attribute from the Sanitizer API is allowed
    // (see https://wicg.github.io/sanitizer-api/#default-configuration)
    ->allowElement('span', '*')

    // Drop the "div" element: this element will be removed, including its children
    ->dropElement('div')
;
In addition to adding and removing HTML elements and attributes, you can force
the value of some attributes to improve the resulting HTML:
$config = (new HtmlSanitizerConfig())
    // …

    // Forcefully set the value of all "rel" attributes on "a"
    // elements to "noopener noreferrer"
    ->forceAttribute('a', 'rel', 'noopener noreferrer')

    // Drop the "data-custom-attr" attribute from all elements:
    // this attribute will be removed
    ->dropAttribute('data-custom-attr', '*')

    // Transform all HTTP schemes to HTTPS
    ->forceHttpsUrls()

    // Configure which hosts are allowed in img/audio/video/iframe (by default all are allowed)
    ->allowMediaHosts(['youtube.com', 'example.com'])
;
Many other configuration options are available; check out the docs for the
HtmlSanitizer component. Once configured, use the sanitizer as follows:
use Symfony\Component\HtmlSanitizer\HtmlSanitizer;

$sanitizer = new HtmlSanitizer($config);

// this sanitizes contents in the <body> context, removing any tags that are
// only allowed inside the <head> element
$sanitizer->sanitize($userInput);

// this sanitizes contents to include them inside a <head> tag
$sanitizer->sanitizeFor('head', $userInput);

// this sanitizes contents in the best way possible for the HTML element
// provided as the first argument (sometimes it will add missing tags and
// other times it will HTML-encode the unclosed tags)
$sanitizer->sanitizeFor('textarea', $userInput); // it will encode as HTML entities
$sanitizer->sanitizeFor('div', $userInput); // it will sanitize the same way as for <body>