
Can't read Arabic text file in Java



I am trying to read Arabic text using Java, but the Scanner does not see any elements, so reading fails, even though LineNumberReader recognizes the lines in the text file.

I have tried the same code on English text and it works fine.

I am using NetBeans 7.0.1.

Here is my code:







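import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.io.LineNumberReader;
import java.util.Arrays;
import java.util.Scanner;
import java.util.StringTokenizer;

public class ReadFile {
    private int number_of_words;
    private File f1;
    private String array[][], lines[];
    private Scanner scan1;

    public ReadFile(String sf1) throws FileNotFoundException
    {
        f1 = new File(sf1);
        // No charset is passed here, so the platform default encoding is used
        scan1 = new Scanner(f1);
    }

    public String[][] getA()
    {
        return array;
    }

    public void read() throws IOException
    {
        int counter = 0, i = 0;

        // Count the lines by skipping to the end of the file
        LineNumberReader lnr = new LineNumberReader(new FileReader(f1));
        lnr.skip(Long.MAX_VALUE);
        number_of_words = lnr.getLineNumber();
        lnr.close();

        array = new String[2][number_of_words];
        lines = new String[number_of_words];

        // Read every line and echo it together with the expected line count
        while (scan1.hasNextLine())
        {
            String temp;
            temp = scan1.nextLine();
            lines[counter++] = temp;
            System.out.println(lines[counter - 1] + "\t" + lines.length);
        }

        Arrays.sort(lines);
        counter = 0;

        // Split each tab-separated line into its two columns
        while (i < lines.length)
        {
            String temp = lines[i++];
            StringTokenizer tk = new StringTokenizer(temp, "\t");

            array[0][counter] = tk.nextToken();
            array[1][counter++] = tk.nextToken();
        }
    }
}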




Comments

  1. By default, Scanner uses the system's default encoding. You need to specify the correct character encoding when reading data that contains non-ASCII characters.

    scan1=new Scanner(f1, "UTF-8");


    If UTF-8 doesn't work, try an Arabic-specific encoding such as windows-1256 or ISO-8859-6.

    Here are a couple of links that may be useful: File reading practices and Java supported encodings.

  2. Try reading the file with this:

    FileInputStream fis = new FileInputStream(f1);
    LineNumberReader lnr = new LineNumberReader(new InputStreamReader(fis, "UTF-8"));


    You need to use the right Charset when reading the file.

  3. Scanner(System.in, "UTF-8")


    is most probably what you are looking for.

    Cheers, Eugene.

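
Putting the suggestions above together, here is a minimal sketch of reading the file with an explicit charset instead of the platform default. The class name CharsetAwareReader, the method readColumns and the file name words.txt are made up for illustration and are not from the original post; it also assumes the file really is UTF-8 encoded and that each line holds two tab-separated fields, as in the question.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Scanner;

// Hypothetical helper for illustration; not part of the original post.
class CharsetAwareReader {

    /**
     * Reads tab-separated lines from a file using an explicit charset and
     * returns them sorted, as a 2 x N array like the one in the question.
     */
    static String[][] readColumns(String fileName, String charsetName)
            throws FileNotFoundException {
        List<String> lines = new ArrayList<>();
        // Passing the charset name avoids relying on the platform default.
        try (Scanner scanner = new Scanner(new File(fileName), charsetName)) {
            while (scanner.hasNextLine()) {
                String line = scanner.nextLine();
                if (!line.isEmpty()) {
                    lines.add(line);
                }
            }
        }
        Collections.sort(lines);

        String[][] columns = new String[2][lines.size()];
        for (int i = 0; i < lines.size(); i++) {
            // Limit 2: everything after the first tab stays in column 1.
            String[] parts = lines.get(i).split("\t", 2);
            columns[0][i] = parts[0];
            columns[1][i] = parts.length > 1 ? parts[1] : "";
        }
        return columns;
    }

    public static void main(String[] args) throws FileNotFoundException {
        // "words.txt" is a placeholder path for a UTF-8 encoded file.
        String path = args.length > 0 ? args[0] : "words.txt";
        String[][] columns = readColumns(path, "UTF-8");
        for (int i = 0; i < columns[0].length; i++) {
            System.out.println(columns[0][i] + "\t" + columns[1][i]);
        }
    }
}
```

Collecting the lines in a list means the line count no longer has to be computed up front with LineNumberReader, so the two readers cannot disagree about how many lines there are. If the file turns out not to be UTF-8, the same method can be called with a different charset name.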
