[dreamwidth/dreamwidth] 64b109: Remove dead utf8convert links, handle invalid UTF-...
Branch: refs/heads/main
Home: https://github.com/dreamwidth/dreamwidth
Commit: 64b109f6fdd36a9130ef4a90057e71e07be5ec86
https://github.com/dreamwidth/dreamwidth/commit/64b109f6fdd36a9130ef4a90057e71e07be5ec86
Author: Mark Smith <mark@dreamwidth.org>
Date: 2026-03-12 (Thu, 12 Mar 2026)
Changed paths:
M bin/upgrading/deadphrases.dat
M cgi-bin/DW/Controller/Create.pm
M cgi-bin/DW/Controller/Manage/Profile.pm
M cgi-bin/LJ/TextUtil.pm
M t/plack-request.t
M t/textutil.t
M views/create/setup.tt
M views/manage/profile.tt
M views/manage/profile.tt.text
Log Message:
Remove dead utf8convert links, handle invalid UTF-8 in profiles (#3535)
- Remove dead utf8convert links and handle invalid UTF-8 in profiles
The utf8convert page was removed years ago, but the profile editing and account creation pages still linked to it when a user's name or bio contained invalid UTF-8. This left users unable to edit those fields at all.
Instead of hiding fields behind a dead link, clean invalid UTF-8 byte sequences on load using a new LJ::clean_utf8() utility function. This strips broken sequences while preserving valid multi-byte characters, so the edit fields are always shown.
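The stripping behavior described above can be illustrated with a minimal sketch. This is not the Perl code added to LJ::TextUtil. It is a Python stand-in written under the assumption that the function takes raw bytes and returns bytes, and the name clean_utf8_sketch is hypothetical.

```python
from typing import Optional


def clean_utf8_sketch(raw: Optional[bytes]) -> bytes:
    """Hypothetical stand-in for the idea behind LJ::clean_utf8()."""
    if raw is None:
        return b""
    # errors="ignore" drops byte sequences that are not valid UTF-8
    # and leaves every well-formed character (including multi-byte
    # ones) untouched.
    text = raw.decode("utf-8", errors="ignore")
    # Re-encode so the caller gets clean UTF-8 bytes back.
    return text.encode("utf-8")


if __name__ == "__main__":
    # A bio with two stray bytes (0xFF 0xFE) in the middle.
    bio = "café ".encode("utf-8") + b"\xff\xfe" + b"bar"
    print(clean_utf8_sketch(bio).decode("utf-8"))  # prints "café bar"
```

With the value cleaned on load, a template can always render the edit field instead of branching on whether the stored text is valid.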
- Add LJ::clean_utf8() to LJ::TextUtil
- Clean name/bio on load in profile and create controllers
- Remove text_in/is_utf8 conditionals from profile.tt and setup.tt
- Remove name_absent/bio_absent hidden input fallback logic
- Mark dead translation strings in deadphrases.dat
- Add 16 regression tests for text_in, text_trim, and clean_utf8
Fixes #1894
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add tests for undef input and 4-byte UTF-8 (emoji) in clean_utf8
Cover edge cases: undef returns empty string, emoji (4-byte sequences) are preserved, and truncated 4-byte sequences are properly stripped while preserving valid preceding characters.
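The three edge cases listed above could be checked along these lines. This is a sketch in Python rather than the Perl tests in t/textutil.t. The helper strip_invalid is hypothetical, and it is an assumption that the real function behaves the same way on these inputs.

```python
from typing import Optional


def strip_invalid(raw: Optional[bytes]) -> bytes:
    # Hypothetical helper, not the Perl function: undefined input
    # becomes an empty string, invalid sequences are dropped.
    if raw is None:
        return b""
    return raw.decode("utf-8", errors="ignore").encode("utf-8")


# U+1F600 encodes to four bytes in UTF-8: f0 9f 98 80.
EMOJI = "\N{GRINNING FACE}".encode("utf-8")

CASES = [
    (None, b""),                 # undefined input -> empty string
    (EMOJI, EMOJI),              # complete 4-byte sequence is kept
    (b"ok" + EMOJI[:3], b"ok"),  # truncated sequence is stripped,
                                 # preceding valid text survives
]

for raw, expected in CASES:
    assert strip_invalid(raw) == expected, (raw, expected)
```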
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
