Date: Sat, 29 May 2004 11:00:51 +1000
From: Lachlan Andrew
To: Gabriele Bartolini, "ht://Dig - Dev"
Subject: Re: [htdig-dev] Internationalisation tests and ... euro sign

Greetings Gabriele,

Someone just reported this as a bug...  I suggested replacing the
"&not;" entity by a "&euro;" entity (see attached patch).  That is a
real hack, since the "&#xxx;" representation of "&not;" will be
incorrectly displayed as a euro sign instead of a not sign, but it
should do until we get Unicode support.

Should we commit this hack, or just leave it as an optional patch?

Lachlan

On Tue, 24 Feb 2004 11:22 pm, Gabriele Bartolini wrote:
> Ciao guys,
>
> I have tested the attributes that are part of the
> 'internationalisation' task and they are all successful according
> to me and my locale settings (it_IT@euro).
>
> The only problem regards the correct translation of the euro
> character, which is an HTML entity, &euro;, part of the LATIN 9
> charset.
>
> Indeed, this character, widely used not only in the European
> Community countries, is not currently imploded/exploded in the
> digging and searching phase, returning a terrifying "&euro;" string
> in the results.
>
> Any ideas and workarounds? I found an interesting discussion of
> this topic at this URL: http://www.cs.tut.fi/~jkorpela/latin9.html
>
> Ciao and thanks,
> -Gabriele

--- htcommon/HtSGMLCodec.cc.orig	2004-05-29 10:18:37.000000000 +1000
+++ htcommon/HtSGMLCodec.cc	2004-05-29 10:49:13.000000000 +1000
@@ -40,7 +40,7 @@
     else
     {
       myTextFromString = "&nbsp;|&iexcl;|&cent;|&pound;|&curren;|&yen;|&brvbar;|&sect;|";
-      myTextFromString << "&uml;|&copy;|&ordf;|&laquo;|&not;|&shy;|&reg;|&macr;|&deg;|";
+      myTextFromString << "&uml;|&copy;|&ordf;|&laquo;|&euro;|&shy;|&reg;|&macr;|&deg;|";
       myTextFromString << "&plusmn;|&sup2;|&sup3;|&acute;|&micro;|&para;|&middot;|&cedil;|";
       myTextFromString << "&sup1;|&ordm;|&raquo;|&frac14;|&frac12;|&frac34;|&iquest;|&Agrave;|";
       myTextFromString << "&Aacute;|&Acirc;|&Atilde;|&Auml;|&Aring;|&AElig;|&Ccedil;|&Egrave;|";
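
To make the side effect of the hack concrete, here is a minimal sketch of a positional entity table. It is not the ht://Dig implementation: the helper names (splitEntities, decodeEntity, encodeByte, patchedTable) are hypothetical, and it assumes that the i-th name in the '|'-separated list stands for byte value 160 + i, which is what the ordering of the list suggests. Only the first two rows of the patched list are used.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not from ht://Dig): split a '|'-separated entity list.
std::vector<std::string> splitEntities(const std::string& list) {
    std::vector<std::string> out;
    std::istringstream in(list);
    std::string name;
    while (std::getline(in, name, '|')) {
        if (!name.empty()) out.push_back(name);
    }
    return out;
}

// Assumption: the i-th entity in the list stands for byte value 160 + i.
const int kFirstCode = 160;

// Return the byte value for an entity name, or -1 if it is not in the table.
int decodeEntity(const std::vector<std::string>& table, const std::string& name) {
    for (std::size_t i = 0; i < table.size(); ++i) {
        if (table[i] == name) return kFirstCode + static_cast<int>(i);
    }
    return -1;
}

// Return the entity name for a byte value covered by the table.
std::string encodeByte(const std::vector<std::string>& table, int code) {
    assert(code >= kFirstCode &&
           code < kFirstCode + static_cast<int>(table.size()));
    return table[code - kFirstCode];
}

// First two rows of the list as it looks after the patch is applied.
std::vector<std::string> patchedTable() {
    return splitEntities(
        "&nbsp;|&iexcl;|&cent;|&pound;|&curren;|&yen;|&brvbar;|&sect;|"
        "&uml;|&copy;|&ordf;|&laquo;|&euro;|&shy;|&reg;|&macr;|&deg;|");
}
```

Under that assumption, decodeEntity(patchedTable(), "&euro;") lands on byte 172, the slot that used to belong to "&not;", while "&not;" itself is no longer recognised. Any byte 172 that reaches encodeByte, including one that came from a numeric "&#172;" reference, comes back out as "&euro;", which is the mis-display described in the message.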