fu: Close-up of Fu, bringing a scoop of water to her mouth (Default)
fu ([personal profile] fu) wrote in [site community profile] changelog, 2010-11-30 03:09 pm

[dw-free] unknown8bit entries cannot be edited, and break /editjournal if they exist in the range.

[commit: http://hg.dwscoalition.org/dw-free/rev/25d3da5f57a0]

http://bugs.dwscoalition.org/show_bug.cgi?id=3311

Stop putting the unknown8bit on entries; instead treat entries from all
clients as if they were in utf8. Allow people to edit existing unknown8bit
entries.

Patch by [personal profile] exor674.
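
As a rough illustration of the idea in the commit message (treat incoming text as UTF-8, and fall back to a lossy form for old entries that are not), here is a minimal standalone sketch. It is not the dw-free code: `is_valid_utf8` and `to_utf8_with_loss` are hypothetical helpers written for this example, built only on Perl's core Encode module, and the "?" fallback is an assumption modelled on the wording of the new .invalid_encoding string rather than on what LJ::text_convert actually does.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper (not part of dw-free): true if $bytes is a
# well-formed UTF-8 byte string. Undefined or empty input counts as valid.
sub is_valid_utf8 {
    my ($bytes) = @_;
    return 1 unless defined $bytes && length $bytes;
    my $copy = $bytes;    # decode() may modify its input when a CHECK flag is set
    return eval { Encode::decode('UTF-8', $copy, Encode::FB_CROAK); 1 } ? 1 : 0;
}

# Hypothetical helper: return the text unchanged if it is valid UTF-8.
# Otherwise replace every byte outside 7-bit ASCII with "?" and report
# that the conversion lost information (assumed fallback, see above).
sub to_utf8_with_loss {
    my ($bytes) = @_;
    return ($bytes, 0) if is_valid_utf8($bytes);
    (my $clean = $bytes) =~ s/[^\x00-\x7F]/?/g;
    return ($clean, 1);
}

# Usage: a Latin-1 encoded "cafe" with an accented e is not valid UTF-8.
my ($text, $lossy) = to_utf8_with_loss("caf\xE9");
print "$text (lossy=$lossy)\n";
```

The second return value plays the same role as the converted_with_loss flag in the patch: the caller can use it to decide whether to warn the user before they save the entry.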

Files modified:
  • cgi-bin/LJ/Entry.pm
  • cgi-bin/ljprotocol.pl
  • htdocs/editjournal.bml
  • htdocs/editjournal.bml.text
--------------------------------------------------------------------------------
diff -r 98cbf5191113 -r 25d3da5f57a0 cgi-bin/LJ/Entry.pm
--- a/cgi-bin/LJ/Entry.pm	Tue Nov 30 22:43:58 2010 +0800
+++ b/cgi-bin/LJ/Entry.pm	Tue Nov 30 23:09:07 2010 +0800
@@ -2196,6 +2196,8 @@ sub item_toutf8
     my $convert = sub {
         my $rtext = shift;
         my $error = 0;
+        return unless defined $$rtext;
+
         my $res = LJ::text_convert($$rtext, $u, \$error);
         if ($error) {
             LJ::text_out($rtext);
@@ -2208,8 +2210,9 @@ sub item_toutf8
     $convert->($subject);
     $convert->($text);
 
-    # FIXME: really convert all the props?  what if we binary-pack some in the future?
+    # FIXME: Have some logprop flag for what props are binary
     foreach(keys %$props) {
+        next if $_ eq 'xpost' || $_ eq 'xpostdetail';
         $convert->(\$props->{$_});
     }
     return;
diff -r 98cbf5191113 -r 25d3da5f57a0 cgi-bin/ljprotocol.pl
--- a/cgi-bin/ljprotocol.pl	Tue Nov 30 22:43:58 2010 +0800
+++ b/cgi-bin/ljprotocol.pl	Tue Nov 30 23:09:07 2010 +0800
@@ -963,26 +963,6 @@ sub common_event_validation
         return fail($err,203,"Invalid minute value.");
     }
 
-    # column width
-    # we only trim Unicode data
-
-    if ($req->{'ver'} >=1 ) {
-        $req->{'subject'} = LJ::text_trim($req->{'subject'}, LJ::BMAX_SUBJECT, LJ::CMAX_SUBJECT);
-        $req->{'event'} = LJ::text_trim($req->{'event'}, LJ::BMAX_EVENT, LJ::CMAX_EVENT);
-        foreach (keys %{$req->{'props'}}) {
-            # do not trim this property, as it's magical and handled later
-            next if $_ eq 'taglist';
-
-            # Allow syn_links and syn_ids the full width of the prop, to avoid truncating long URLS
-            if ($_ eq 'syn_link' || $_ eq 'syn_id') {
-                $req->{'props'}->{$_} = LJ::text_trim($req->{'props'}->{$_}, LJ::BMAX_PROP);
-            } else {
-                $req->{'props'}->{$_} = LJ::text_trim($req->{'props'}->{$_}, LJ::BMAX_PROP, LJ::CMAX_PROP);
-            }
-
-        }
-    }
-
     # setup non-user meta-data.  it's important we define this here to
     # 0.  if it's not defined at all, then an editevent where a user
     # removes random 8bit data won't remove the metadata.  not that
@@ -1004,25 +984,41 @@ sub common_event_validation
         LJ::is_ascii(join(' ', values %{$req->{'props'}})) ))
     {
 
-        if ($req->{'ver'} < 1) { # client doesn't support Unicode
+        if ($req->{ver} < 1) { # client doesn't support Unicode
             # only people should have unknown8bit entries.
             my $uowner = $flags->{u_owner} || $flags->{u};
             return fail($err,207,'Posting in a community with international or special characters require a Unicode-capable LiveJournal client.  Download one at http://www.livejournal.com/download/.')
                 if ! $uowner->is_person;
-
-            # so rest of site can change chars to ? marks until
-            # default user's encoding is set.  (legacy support)
-            $req->{'props'}->{'unknown8bit'} = 1;
         } else {
             return fail($err,207, "This installation does not support Unicode clients") unless $LJ::UNICODE;
-            # validate that the text is valid UTF-8
-            if (!LJ::text_in($req->{'subject'}) ||
-                !LJ::text_in($req->{'event'}) ||
-                grep { !LJ::text_in($_) } values %{$req->{'props'}}) {
+        }
+
+        # validate that the text is valid UTF-8
+        if (!LJ::text_in($req->{subject}) ||
+            !LJ::text_in($req->{event}) ||
+            grep { !LJ::text_in($_) } values %{$req->{props}}) {
                 return fail($err, 208, "The text entered is not a valid UTF-8 stream");
-            }
-        }
-    }
+        }
+    }
+    
+
+    # column width
+
+    $req->{'subject'} = LJ::text_trim($req->{'subject'}, LJ::BMAX_SUBJECT, LJ::CMAX_SUBJECT);
+    $req->{'event'} = LJ::text_trim($req->{'event'}, LJ::BMAX_EVENT, LJ::CMAX_EVENT);
+    foreach (keys %{$req->{'props'}}) {
+        # do not trim this property, as it's magical and handled later
+        next if $_ eq 'taglist';
+
+        # Allow syn_links and syn_ids the full width of the prop, to avoid truncating long URLS
+        if ($_ eq 'syn_link' || $_ eq 'syn_id') {
+            $req->{'props'}->{$_} = LJ::text_trim($req->{'props'}->{$_}, LJ::BMAX_PROP);
+        } else {
+            $req->{'props'}->{$_} = LJ::text_trim($req->{'props'}->{$_}, LJ::BMAX_PROP, LJ::CMAX_PROP);
+        }
+
+    }
+
 
     ## handle meta-data (properties)
     LJ::load_props("log");
@@ -2435,16 +2431,9 @@ sub getevents
 
         # now that we have the subject, the event and the props,
         # auto-translate them to UTF-8 if they're not in UTF-8.
-        if ($LJ::UNICODE && $req->{'ver'} >= 1 &&
-                $evt->{'props'}->{'unknown8bit'}) {
-            my $error = 0;
-            $t->[0] = LJ::text_convert($t->[0], $uowner, \$error);
-            $t->[1] = LJ::text_convert($t->[1], $uowner, \$error);
-            foreach (keys %{$evt->{'props'}}) {
-                $evt->{'props'}->{$_} = LJ::text_convert($evt->{'props'}->{$_}, $uowner, \$error);
-            }
-            return fail($err,208,"Cannot display this post.")
-                if $error;
+        if ($LJ::UNICODE && $req->{ver} >= 1 && $evt->{props}->{unknown8bit}) {
+            LJ::item_toutf8($uowner, \$t->[0], \$t->[1], $evt->{props});
+            $evt->{converted_with_loss} = 1;
         }
 
         if ($LJ::UNICODE && $req->{'ver'} < 1 && !$evt->{'props'}->{'unknown8bit'}) {
@@ -3937,7 +3926,7 @@ sub getevents
     my $pct = 0;
     foreach my $evt (@{$rs->{events}}) {
         $ect++;
-        foreach my $f (qw(itemid eventtime logtime security allowmask subject anum url poster)) {
+        foreach my $f (qw(itemid eventtime logtime security allowmask subject anum url poster converted_with_loss)) {
             if (defined $evt->{$f}) {
                 $res->{"events_${ect}_$f"} = $evt->{$f};
             }
diff -r 98cbf5191113 -r 25d3da5f57a0 htdocs/editjournal.bml
--- a/htdocs/editjournal.bml	Tue Nov 30 22:43:58 2010 +0800
+++ b/htdocs/editjournal.bml	Tue Nov 30 23:09:07 2010 +0800
@@ -434,6 +434,11 @@ body<=
             $entry->{'richtext_default'} = $entry->{"prop_used_rte"} ? 1 : 0,
 
             my $onload;
+
+            if ( $res{events_1_converted_with_loss} ) {
+                $ret .= "<?standout $ML{'.invalid_encoding'} standout?><br/>";
+            }
+
             $ret .= LJ::entry_form($entry, \$$head, \$onload);
             $ret .= "</form></div>";
             $ret .= "</td>";
diff -r 98cbf5191113 -r 25d3da5f57a0 htdocs/editjournal.bml.text
--- a/htdocs/editjournal.bml.text	Tue Nov 30 22:43:58 2010 +0800
+++ b/htdocs/editjournal.bml.text	Tue Nov 30 23:09:07 2010 +0800
@@ -53,3 +53,4 @@
 
 .viewwhat=View What Entries:
 
+.invalid_encoding=This entry may contain unknown characters, which have been replaced with a "?". Please ensure the entry is exactly as intended before posting.
--------------------------------------------------------------------------------
matgb: Artwork of 19th century upper class anarchist, text: MatGB (Default)

[personal profile] matgb 2010-11-30 08:57 pm (UTC)
If this does what I think it does (ie make Delicious import posts editable) then this gets a massive woo hoo from me.

Now I need to find a PHP server to put the files back on, easy...

[personal profile] matgb 2010-12-01 08:59 am (UTC)
Ah, well, I only tried one import as a test ages back, and Support told me it was a Unicode issue and I should get Delicious to improve their kit. Given they've been promising an improvement for over a year and have done nothing, this is much better. Danke.
cheyinka: A grayscale Metroid on the head of a Dreamwidth usericon (metroiduser)

[personal profile] cheyinka 2010-11-30 10:58 pm (UTC)
As the submitter (as my writing journal) of one of the support requests that this fixes, I am delighted to see this :D

[personal profile] matgb 2010-12-03 03:07 am (UTC)
This actually fixes both of my extant unsolved support tickets (#7151 & #1303): Delicious posts are now editable, and the weird encoding issues with several of my feeds have cleared up. This is very good indeed.