
How to split Tamil characters in a string in PHP


How do I split Tamil characters in a string?



When I use preg_match_all('/./u', $str, $results), I get the characters "த", "ம", "ி", "ழ" and "்".



How do I get the combined characters "த", "மி" and "ழ்"?


Source: Tips4all

Comments

  1. I think you should be able to use the grapheme_extract function to iterate over the combined characters (which are technically called "grapheme clusters").

    Alternatively, if you prefer the regex approach, I think you can use this:

    preg_match_all('/\pL\pM*|./u', $str, $results)


    where \pL means a Unicode "letter", and \pM means a Unicode "mark".

    (Disclaimer: I have not tested either of these approaches.)
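
For anyone who wants to try the two suggestions in the comment above, here is a minimal sketch of both. It assumes PHP 7 or later. The grapheme_extract part also assumes the intl extension is loaded. The helper name split_graphemes and the sample word are illustrative and not from the original post.

```php
<?php
// "Tamil" written in Tamil script: 5 code points, 3 user-perceived characters.
// Escapes are used so the example does not depend on the file's encoding.
$str = "\u{0BA4}\u{0BAE}\u{0BBF}\u{0BB4}\u{0BCD}";

// Regex approach: a letter plus any trailing combining marks,
// otherwise any single character.
preg_match_all('/\pL\pM*|./u', $str, $results);
$byRegex = $results[0];
echo count($byRegex), " pieces via regex\n"; // prints "3 pieces via regex"

// grapheme_extract approach. split_graphemes is a hypothetical helper name.
function split_graphemes(string $s): array
{
    $clusters = [];
    $offset = 0;
    $length = strlen($s); // byte length, because $next is a byte offset
    while ($offset < $length) {
        // Pull exactly one grapheme cluster starting at byte $offset.
        $cluster = grapheme_extract($s, 1, GRAPHEME_EXTR_COUNT, $offset, $next);
        if ($cluster === false || $cluster === '') {
            break; // stop rather than loop forever on unexpected input
        }
        $clusters[] = $cluster;
        $offset = $next;
    }
    return $clusters;
}

// Only call the helper when the intl extension is actually available.
if (extension_loaded('intl')) {
    $byGrapheme = split_graphemes($str);
    echo count($byGrapheme), " pieces via grapheme_extract\n";
}
```

The regex only attaches marks that follow a letter, so a mark after a digit or symbol would still be split off on its own. PCRE's \X escape, which matches one extended grapheme cluster, may be worth trying if that matters for your input.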

