From 09c606181084b78dba9c279dee91a633eae7dce4 Mon Sep 17 00:00:00 2001
From: Random Penguin <205060075+randompenguin1@users.noreply.github.com>
Date: Sun, 20 Apr 2025 12:05:26 -0500
Subject: [PATCH] Strip HTML tags from content sent as Markdown
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The "toMarkdown" function prepares content to be sent, primarily, to
Diaspora.

By default, the HTML to Markdown converter "preserves HTML tags without
Markdown equivalents like `<span>` and `<div>`." That is according to
the README in _/friendica/vendor/league/html-to-markdown/_, which also
says "To strip HTML tags that don’t have a Markdown equivalent while
preserving the content inside them, set strip_tags..."

Diaspora, however, does not appear to know what to do with the HTML
sent to it. It appears to _encode_ the HTML and display the *code* in
the post body rather than rendering it as HTML. Given that, it makes
more sense to strip out all tags that have no Markdown equivalent.
---
 src/Content/Text/HTML.php | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/Content/Text/HTML.php b/src/Content/Text/HTML.php
index f5cf2c6eca..f29f4a148e 100644
--- a/src/Content/Text/HTML.php
+++ b/src/Content/Text/HTML.php
@@ -689,7 +689,7 @@ class HTML
 	public static function toMarkdown(string $html): string
 	{
 		DI::profiler()->startRecording('rendering');
-		$converter = new HtmlConverter(['hard_break' => true]);
+		$converter = new HtmlConverter(['hard_break' => true, 'strip_tags' => true]);
 		$markdown = $converter->convert($html);
 		DI::profiler()->stopRecording();
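
For reviewers, a minimal sketch of the behavior change, assuming the league/html-to-markdown package is installed through Composer and autoloadable from a `vendor/autoload.php` next to the script (that path and the sample HTML are assumptions for illustration, not part of this patch):

```php
<?php
// Assumed location of the Composer autoloader; adjust for your checkout.
require __DIR__ . '/vendor/autoload.php';

use League\HTMLToMarkdown\HtmlConverter;

// Illustrative input: <span> has no Markdown equivalent.
$html = '<span>Turnips!</span>';

// Options before this patch: per the library README, tags without a
// Markdown equivalent are preserved in the output.
$before = (new HtmlConverter(['hard_break' => true]))->convert($html);

// Options after this patch: such tags are stripped, their content is kept.
$after = (new HtmlConverter(['hard_break' => true, 'strip_tags' => true]))->convert($html);

echo "before: ", $before, PHP_EOL;
echo "after:  ", $after, PHP_EOL;
```

Running this against a real Diaspora-bound post body, rather than the toy string above, should give a better picture of which tags actually get dropped.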