Problems with umlauts
-
Hello,
I have installed BuddyPress and the bbPress forum. Both are the latest builds. If I write in bbPress directly, I can use öäüß and the forum shows them. They also look OK in BuddyPress.
If I write öäüß into a BuddyPress group forum, I get something like this: ß Ö„Ü¼/p>. In bbPress it looks like this: aus /amp/auml;/amp/ouml;/amp/uuml;/amp/szlig; /amp/Ouml;/amp/Auml;/amp/Uuml;
In both configs, WPMU and bbPress, UTF-8 is defined.
Any hint how to solve the problem? It seems to be a problem with the filtering function in the BuddyPress forum plugin.
Best,
Karl-Heinz
-
No one can help?
Can you confirm that they are making it back to the bbPress tables correctly? I.e., if you open the post in bbPress, are they already corrupt?
Or is it only when they get displayed back to Buddypress?
Also if you make a post in bbpress, does it show up in buddypress correctly or is it corrupt there also?
Answering these questions may help to narrow down the bug, or may show it to be a widespread issue.
Sorry I couldn’t give you a solution, but first we must understand the problem at hand.
Brad
If I write in the BuddyPress group forum in the BuddyPress frontend, all the umlauts are gone and look like in the first post above.
If I view this in the bbPress frontend, the umlauts are also broken.
When I write into the bbPress frontend directly, the umlauts are OK. They are also OK when I list them in the BuddyPress group forum. So the transfer from bbPress to BuddyPress is OK; the other way around does not work.
Lemme go look at this. I was just in that area about a week ago.
(later) Works for me. Lots of u-umlaut chars going to and from bp group forums, to bbpress and back again. Showing up as proper u-umlaut chars. Andy just installed some fixes lately in this area. Try upgrading to the latest bp trunk.
The fixes include an upgrade to the buddypress-enable.php plugin that runs in bbpress. Don’t forget that.
Hello,
I have installed the latest version. Now in BuddyPress the umlauts in the title of the forum message are OK, but the umlauts in the text field are filtered out completely. That means if I put umlauts like öäüÖÄÜß there, then send it and list it, I just see an empty text field.
Best,
Karl
I can’t reproduce the problem you are describing.
In bp the title of a post going from bp group forums to bbpress goes through all the same filtering as the content except the following:
wpautop
make_clickable
bp_forums_filter_encode / bp_forums_filter_decode filters. The content gets that treatment.
Are your umlaut chars showing ok in bbpress if you enter them from bp forums?
Just to narrow things down, can you temporarily comment out line 4 in bp-forums-filters.php, which reads:
add_filter( 'bp_forums_new_post_text', 'bp_forums_filter_encode' );
Then try adding a new post with umlaut chars. This filter needs to be there but let’s see if it is the problem.
If you enter umlaut chars in the content of a forum post from bp and they are displayed correctly in bbpress then the problem isn’t the ‘encoding’ filter. It’s the ‘decoding’ filter which has an extra step in it.
On the decoding side in bp, when the content comes back from bbPress, we also call another filter inside the decode filter. This one is called wp_filter_kses() and it gets called from the bp filter bp_forums_filter_decode() in bp-forums-filters.php.
If commenting out the line in my previous message doesn’t solve the problem, then uncomment that line and try commenting out line 53, which reads:
$content = stripslashes( wp_filter_kses( $content ) );
Hello,
if I comment the filter out, everything is OK.
Best,
Karl
In bp_forums_filter_encode() line 42 should read:
$content = htmlentities( $content, ENT_COMPAT, "UTF-8" );
What does yours say?
In line 42 I have exactly the same:
$content = htmlentities( $content, ENT_COMPAT, "UTF-8" );
Strange.
Best,
Karl-Heinz
I wish I could reproduce your problem. It’s tough to track down a problem if it isn’t reproducible somehow.
Hello fishbowl81,
is there a way to use another filter than the one above? What can happen if I leave the filter out?
I also use the latest WPMU beta. Can this be the reason?
Best,
Karl
Hello.
maybe a hint.
After the update of the files, the umlauts in the title of the message are OK. So if I write öäüßÖÄÜ in the title, they appear.
If I write the same umlauts öäüßÖÄÜ? in the text box, they now disappear completely. So if I write only this short string of umlauts, the text box is empty. This also happens with the latest fixes I got from the trunk.
Best,
Karl
ps. btw. If i edit an entry in this forum here and send it, i get the message “Topic not found.”, but the edited text appears.
Hello,
no idea?
Best,
Karl
Hello,
I still have not found the problem with the umlauts.
The strange thing is that in the forum title (BuddyPress frontend) the umlauts are there, but in the text they are deleted.
So if the title is like öäüÖÄÜß, after sending it, it looks like öäüÖÄÜß
If I write the same umlauts in the text box, then after sending it the text is empty.
Is the filter for the title and the text box not the same in BuddyPress?
If I write it in the bbPress frontend, everything is OK.
Best,
Karl
I put in a ticket in trac on this. It’s become an ‘official’ problem.
I actually stumbled across this last night. Maybe your encoding isn’t UTF-8 and it is being forced? How about changing the code to get the current site’s encoding instead of it being hard-coded in that call?
For example:
$content = html_entity_decode($content, ENT_COMPAT, get_option('blog_charset'));
Trent
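A minimal sketch of how Trent's guess could be checked, assuming the post text arrives as ISO-8859-1 bytes rather than UTF-8 (that is an assumption, not something confirmed in this thread). PHP documents that htmlentities() returns an empty string when the input is not valid in the given charset, which would look much like the empty content Karl describes. The helper name sketch_is_utf8 is made up for illustration.

```php
<?php
// Hypothetical helper, not part of BuddyPress or WordPress.
// The /u modifier makes PCRE reject subjects that are not valid UTF-8.
function sketch_is_utf8( $text ) {
    return preg_match( '//u', $text ) === 1;
}

$utf8   = "\xC3\xB6 test"; // "ö test" encoded as UTF-8
$latin1 = "\xF6 test";     // "ö test" encoded as ISO-8859-1 (assumed input)

var_dump( sketch_is_utf8( $utf8 ) );   // valid UTF-8
var_dump( sketch_is_utf8( $latin1 ) ); // not valid UTF-8

// Same call as in the encode filter quoted in this thread.
$ok  = htmlentities( $utf8, ENT_COMPAT, 'UTF-8' );
$bad = htmlentities( $latin1, ENT_COMPAT, 'UTF-8' );

var_dump( $ok );  // entity-encoded text
var_dump( $bad ); // empty string: the whole input is discarded
```

If the second check fails on real post data, the charset of the form or the database connection would be the next thing to look at; if it passes, the cause is probably elsewhere.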
I’ve been looking at this problem and getting frustrated. I realized finally that the filter function actually does two things:
$content = htmlentities( $content, ENT_COMPAT, "UTF-8" );
$content = str_replace( '&', '/amp/', $content );
From http://loadaveragezero.com/app/drx/Data_Formats/Character_Encoding
[…] But there are a number of other issues to deal with. In particular, because UTF-8 is a multibyte encoding, meaning one character can be represented by one or more bytes. This causes trouble for PHP, because the language parses and processes strings based on bytes, not characters, and makes mincemeat of multibyte strings – for example, by splitting characters ‘in half’, bodging up regular expressions, and rendering email unreadable.
Karl, can you just comment out the following lines please:
Line 46 in bp-forums-filters.php:
$content = str_replace( '&', '/amp/', $content );
and line 52 in buddypress-enable.php on the bbPress side:
$post_text = str_replace( '/amp/', '&', $post_text );
I’d like to narrow this down to the htmlentities fn.
I’m gonna help solve this or just move to a planet where only ASCII is spoken.
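For reference, here is a rough sketch of the two-step encode described above and its inverse, run outside WordPress. The function names sketch_encode and sketch_decode are made up for this example and are not copied from BuddyPress; only the htmlentities and str_replace calls come from the lines quoted in this thread.

```php
<?php
// Hypothetical stand-ins for the encode/decode filters discussed here.
function sketch_encode( $content ) {
    // Step 1: turn non-ASCII characters into named entities.
    $content = htmlentities( $content, ENT_COMPAT, 'UTF-8' );
    // Step 2: replace the ampersand with a placeholder.
    return str_replace( '&', '/amp/', $content );
}

function sketch_decode( $content ) {
    // Reverse step 2, then reverse step 1.
    $content = str_replace( '/amp/', '&', $content );
    return html_entity_decode( $content, ENT_COMPAT, 'UTF-8' );
}

// "öäüß" written as UTF-8 byte escapes so the file encoding does not matter.
$original = "\xC3\xB6\xC3\xA4\xC3\xBC\xC3\x9F";
$encoded  = sketch_encode( $original );
$decoded  = sketch_decode( $encoded );

echo $encoded, "\n";
echo $decoded === $original ? "round trip ok\n" : "round trip broken\n";
```

The encoded form has the same /amp/…; shape Karl reported seeing in bbPress, which suggests the text is stored encoded and the decoding step is not being applied (or is failing) on that path.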
Hello,
I commented out these lines, but I get the same result.
I did some more tests; maybe they help.
If I put this in the title: öäüÖÄÜß
and these 3 lines in the content:
öäüÖÄÜß
1. fdgfd gdfgd sfgsdgs dfggf dfgd
2. fdgdsgd fdgdfgd gdfsadfsdafsd dfg
I get this:
Title: öäüÖÄÜß
Content: all lines deleted
If I put this in the title: öäüÖÄÜß
and this in the content:
1. fdgfd gdfgd sfgsdgs dfggf dfgd
2. fdgdsgd fdgdfgd gdfsadfsdafsd dfg
öäüÖÄÜß
I get this:
Title: öäüÖÄÜß
Content:
1. fdgfd gdfgd sfgsdgs dfggf dfgd
2. fdgdsgd fdgdfgd gdfsadfsdafsd dfg
Last test. If I put this line in the content:
öäüÖÄÜß this is a test
after sending it the whole line is empty (it gets deleted).
All this happens only in the BuddyPress frontend.
So in the title I can use any umlauts, but in the content the filter routine deletes not only the umlauts but, depending on where the umlauts are, also the rest of the text.
Best,
Karl
Hi Trent,
I tried your suggestion, but it doesn't change anything.
Best,
Karl
Got anything to add to the trac discussion Karl? https://trac.buddypress.org/ticket/436
Just disabling the filters is not a good idea. The filtering of content is bound up with stripping sensitive data. Maybe Andy can decouple those two and enable the content filters for those users that get bitten by the libxml2 problem.
Hi,
everything works on the bbPress side with umlauts. Would it be possible to use the same filters for the BuddyPress side of the forum?
Best,
Karl-Heinz
I had issues with UTF-8 encoding, although not the ones Karl mentions, due to my host running my sites on a PHP 4 server. After a move to a PHP 5 server the problems went away.
Hi,
I still have the same problem. Any good news?
Best,
Karl
No offense but the “fix” for this bug is not acceptable. “a few international characters not working” /is/ a big deal to the rest of the world.
The characters show up fine in Buddypress no matter where I post the text; BB or BP.
The characters show up fine in BP and wrong in BB if I post the text in BP.
There has to be another way around this.