Escaped Characters Outside the Basic Multilingual Plane (BMP) in Prolog


For reference, I'm using SWI-Prolog v7.4.2 on Windows 10, 64-bit.

Entering the following code in the REPL:

write("\U0001D7F6"). % MATHEMATICAL MONOSPACE DIGIT ZERO

gives me this error in the output:

ERROR: Syntax error: Illegal character code
ERROR: write("
ERROR: ** here **
ERROR: \U0001D7F6") .

I know for a fact that U+1D7F6 is a valid Unicode character, so what's up?

SWI-Prolog internally uses the C type wchar_t to represent Unicode characters. On Windows these are 16 bits wide and are intended to hold UTF-16 encoded strings. SWI-Prolog, however, treats wchar_t arrays as plain arrays of code points, and therefore supports only UCS-2 on Windows (code points U+0000..U+FFFF).

On non-Windows systems, wchar_t is 32 bits wide and the complete Unicode range is supported.
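To make the limit concrete, here is a minimal sketch in C of the standard UTF-16 surrogate-pair encoding. The function name utf16_encode is hypothetical and is not taken from the SWI-Prolog sources; it uses uint16_t instead of wchar_t so that it behaves the same on every platform.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not SWI-Prolog code): encode one Unicode code point
 * as UTF-16. Writes 1 or 2 units to out and returns how many were written,
 * or 0 if cp is not a valid Unicode scalar value. */
static size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;                                /* out of range or a surrogate */
    if (cp < 0x10000) {                          /* BMP: one 16-bit unit suffices */
        out[0] = (uint16_t)cp;
        return 1;
    }
    cp -= 0x10000;                               /* 20 bits remain */
    out[0] = (uint16_t)(0xD800 + (cp >> 10));    /* high surrogate */
    out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF));  /* low surrogate */
    return 2;
}
```

Calling utf16_encode(0x1D7F6, units) returns 2 and fills units with 0xD835 and 0xDFF6, so the character from the question cannot be stored in a single 16-bit array element.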

This is not trivial to fix. Handling wchar_t as UTF-16 loses the nice property that each element of the array is exactly one code point, while using our own 32-bit type means we cannot use the C library's wide character functions and have to reimplement them in SWI-Prolog. That is not only work: replacing them with pure C versions also loses the optimization typically present in modern C runtime libraries.
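The lost property can be illustrated with a small sketch in C. Once a 16-bit array holds UTF-16, the number of elements no longer equals the number of code points, so even computing a length needs a scan. The name utf16_count_code_points is hypothetical and serves only as an illustration; it is not how SWI-Prolog is implemented.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: count the code points in a UTF-16 buffer of n units.
 * A high surrogate followed by a low surrogate counts as one code point;
 * every other unit (including an unpaired surrogate) counts as one. */
static size_t utf16_count_code_points(const uint16_t *s, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (s[i] >= 0xD800 && s[i] <= 0xDBFF &&
            i + 1 < n && s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF)
            i++;                        /* skip the low half of the pair */
        count++;
    }
    return count;
}
```

For the four units 0x0041, 0xD835, 0xDFF6, 0x0042 this returns 3. With a 32-bit element type the answer would simply be the array length, which is the property the code-point-array design relies on.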

