Skip to main content

Posts

Showing posts with the label unicode

Why does Java allow control characters in its identifiers?

The Mystery In exploring precisely which characters were permitted in Java identifiers, I have stumbled upon something so extremely curious that it seems nearly certain to be a bug. I’d expected to find that Java identifiers conformed to the requirement that they start with characters that have the Unicode property ID_Start and are followed by those with the property ID_Continue , with an exception granted for leading underscores and for dollar signs. That did not prove to be the case, and what I found is at extreme variance with that or any other idea of a normal identifier that I have heard of. Short Demo Consider the following demonstration proving that an ASCII ESC character (octal 033) is permitted in Java identifiers: $ perl -le 'print qq(public class escape { public static void main(String argv[]) { String var_\033 = "i am escape: \033"; System.out.println(var_\033); }})' > escape.java $ javac escape.java $ java escape | cat -v i am escape: ^[

Reading Unicode characters from MySQL with PHP

I've inherited a MySQL database which contains a field named Description of type text and collation of latin1_swedish_ci . The problem with this field is it contains utf-8 data with some Unicode characters, e.g. character 733, etc. Sometimes this character also exists in the field represented as HTML encoded "&#733" as well. I'm trying to read the table and export the data to a CSV file and I need to represent this character as a double quote. Reading the HTML encoded character is easy enough. However, it appears that the actual Unicode character is converted to utf-8 before I can do anything with it resulting in a "?". How do I read in the Unicode character 733 (U+02DD), recognize it and convert it? Here's a simplified (not tested) version of the code. <? $testconn=odbc_connect ("TESTLIB", "......", "......"); $query="SELECT Description FROM TestTable"; $rsWeb=mysql_query($query)); $WebRow=mysq