fu ([personal profile] fu) wrote in [site community profile] changelog, 2010-06-14 04:05 pm

[dw-free] tags with non-standard characters keep capitalization

[commit: http://hg.dwscoalition.org/dw-free/rev/de76dd13ff5c]

http://bugs.dwscoalition.org/show_bug.cgi?id=1598

Lowercase the tags, even when they contain non-unicode characters. Note:
Will not affect the display of older tags, until the entry they're on is
edited.

Patch by [personal profile] kareila.

Files modified:
  • cgi-bin/LJ/Tags.pm
  • cgi-bin/ljtextutil.pl
--------------------------------------------------------------------------------
diff -r 7bf1aafc3576 -r de76dd13ff5c cgi-bin/LJ/Tags.pm
--- a/cgi-bin/LJ/Tags.pm	Mon Jun 14 23:58:29 2010 +0800
+++ b/cgi-bin/LJ/Tags.pm	Tue Jun 15 00:11:38 2010 +0800
@@ -640,8 +640,7 @@ sub is_valid_tagstring {
         $tag = LJ::trim($tag);
         $tag =~ s/\s+/ /g; # condense multiple spaces to a single space
         $tag = LJ::text_trim($tag, LJ::BMAX_KEYWORD, LJ::CMAX_KEYWORD);
-        $tag = lc $tag
-            if $tag !~ /[\x7f-\xff]/;
+        $tag = LJ::utf8_lc( $tag );
         return $tag;
     };
 
diff -r 7bf1aafc3576 -r de76dd13ff5c cgi-bin/ljtextutil.pl
--- a/cgi-bin/ljtextutil.pl	Mon Jun 14 23:58:29 2010 +0800
+++ b/cgi-bin/ljtextutil.pl	Tue Jun 15 00:11:38 2010 +0800
@@ -297,6 +297,20 @@ sub is_utf8_wrapper {
     }
 }
 
+
+# alternate version of "lc" that handles UTF-8
+# args: text string for lowercasing
+# returns: lowercase string
+sub utf8_lc {
+    use Encode;  # Perl 5.8 or higher
+
+    # get the encoded text to work with
+    my $text = decode( "UTF-8", $_[0] );
+    # return the lowercased text
+    return encode( "UTF-8", lc $text );
+}
+
+
 # <LJFUNC>
 # name: LJ::text_out
 # des: force outgoing text into valid UTF-8.
--------------------------------------------------------------------------------
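As a rough illustration of what the patch changes, here is a minimal standalone sketch. It assumes the tag arrives as a UTF-8 encoded byte string, which is what the decode/encode pair in the patch implies. The name old_lc and the sample tag are made up for this example, and utf8_lc is copied outside the LJ namespace so the script runs on its own.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Standalone copy of the patched helper (hypothetical: lives outside LJ:: here).
sub utf8_lc {
    my $bytes = shift;
    my $text  = decode( "UTF-8", $bytes );   # UTF-8 bytes -> Perl characters
    return encode( "UTF-8", lc $text );      # lowercase, then back to bytes
}

# Behaviour before the patch (hypothetical wrapper name): lowercasing is
# skipped entirely when the tag contains any byte in the range \x7f-\xff.
sub old_lc {
    my $tag = shift;
    $tag = lc $tag if $tag !~ /[\x7f-\xff]/;
    return $tag;
}

# Show a byte string as space-separated hex, to avoid terminal encoding issues.
sub hex_bytes {
    return join " ", map { sprintf "%02x", ord } split //, shift;
}

# Sample tag "CAFE" with an acute accent on the final letter:
# U+00C9 is encoded in UTF-8 as the two bytes 0xC3 0x89.
my $tag = "CAF\xc3\x89";

my $old = old_lc($tag);    # left untouched, capitals and all
my $new = utf8_lc($tag);   # fully lowercased; U+00E9 is 0xC3 0xA9

print "old: ", hex_bytes($old), "\n";
print "new: ", hex_bytes($new), "\n";
```

The old check left the whole tag alone as soon as one high byte appeared, so even the plain ASCII letters kept their case. Decoding first lets lc work on characters rather than bytes, which is why the accented letter is lowercased as well.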

[personal profile] sophie 2010-06-16 12:14 am (UTC)
I just realised that you said "non-unicode characters" in this log - obviously you meant non-ASCII, not non-Unicode. :)