BurgerLib
Conversion routines to the UTF32 format.
#include <ststring.h>
Public Types | |
| enum | { BAD = -1, ENDIANMARK = 0xFEFF, BE = 0xFFFE0000, LE = 0xFEFF } |
Static Public Member Functions | |
| static Word BURGER_API | IsValid (Word32 Input) |
| Validate a UTF32 value. | |
| static Word BURGER_API | IsValid (const Word32 *pInput) |
| Check a UTF32 "C" string for validity. | |
| static Word BURGER_API | IsValid (const Word32 *pInput, WordPtr uInputSize) |
| Check a UTF32 Word32 array for validity. | |
| static Word32 BURGER_API | FromUTF8 (const char *pInput) |
| Return a UTF32 code from a UTF8 stream. | |
| static Word BURGER_API | FromUTF8 (Word32 *pOutput, WordPtr uOutputSize, const char *pInput) |
| Convert a UTF8 "C" string into a UTF32 stream. | |
| static Word BURGER_API | FromUTF8 (Word32 *pOutput, WordPtr uOutputSize, const char *pInput, WordPtr uInputSize) |
| Convert a UTF8 stream into a UTF32 Word32 array. | |
Conversion routines to the UTF32 format.
UTF32 is the simplest data format for storing Unicode data in a 32 bit wide "C" string. It can easily contain all of the characters for the world's languages. These functions allow conversion from UTF8, which Burgerlib is based on, to UTF32, which some foreign APIs require for internationalization. Please note that these functions operate on strings that are native endian.
| anonymous enum |
| BAD |
Value returned if a routine failed. If a function doesn't return true or false for failure, it will return this value instead. Please see the documentation for each function to know which ones use true/false pairs or this value. |
| ENDIANMARK |
Byte stream token for native endian. When writing a text file using UTF32, you may need to write this value as the first character to mark the endianness in which the data was saved. This value is the correct value for the native endian of the machine. Use Burger::UTF32::BE or Burger::UTF32::LE to test incoming data when its endianness is unknown. |
| BE |
32 bit token for Big Endian UTF32 data. If a token that was read in matches this constant, then you must assume that all of the following data is Big Endian. |
| LE |
32 bit token for Little Endian UTF32 data. If a token that was read in matches this constant, then you must assume that all of the following data is Little Endian. |
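The byte order test described for BE and LE can be sketched without Burgerlib. The example below is only an illustration: `Utf32ByteOrder`, `DetectUtf32ByteOrder` and `SwapEndian32` are hypothetical names that are not part of the library. It inspects the raw bytes of the mark (U+FEFF) rather than comparing a native 32 bit word against the enum values, so the result does not depend on the endian of the host.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical result type for this sketch; not part of Burgerlib.
enum class Utf32ByteOrder { Unknown, Little, Big };

// Inspect the first four bytes of a buffer for a UTF32 byte order mark.
// U+FEFF is stored as FF FE 00 00 in little endian data and as
// 00 00 FE FF in big endian data.
Utf32ByteOrder DetectUtf32ByteOrder(const unsigned char *pData, std::size_t uSize)
{
    assert((pData != nullptr) || (uSize == 0));
    if (uSize < 4) {
        return Utf32ByteOrder::Unknown;     // Too short to hold a mark
    }
    if ((pData[0] == 0xFF) && (pData[1] == 0xFE) && (pData[2] == 0x00) && (pData[3] == 0x00)) {
        return Utf32ByteOrder::Little;
    }
    if ((pData[0] == 0x00) && (pData[1] == 0x00) && (pData[2] == 0xFE) && (pData[3] == 0xFF)) {
        return Utf32ByteOrder::Big;
    }
    return Utf32ByteOrder::Unknown;         // No mark present
}

// Reverse the byte order of one 32 bit value. Data whose mark does not
// match the host would need this applied to every element before use.
std::uint32_t SwapEndian32(std::uint32_t uInput)
{
    return (uInput >> 24) | ((uInput >> 8) & 0x0000FF00u) |
        ((uInput << 8) & 0x00FF0000u) | (uInput << 24);
}
```

A mark that reads as 0xFFFE0000 when loaded as a native word becomes 0x0000FEFF after swapping, which is why the two constants are byte-reversed images of each other.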
| Word32 BURGER_API Burger::UTF32::FromUTF8 | ( | const char * | pInput | ) | [static] |
Return a UTF32 code from a UTF8 stream.
Convert from a UTF8 stream into a 32 bit Unicode value (0x00 to 0x10FFFF). This function will perform validation on the incoming stream and will flag any data that's invalid.
| pInput | Pointer to a valid UTF8 "C" string. NULL will page fault. |
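The kind of validation this function performs can be illustrated with a standalone decoder. This is a minimal sketch and not Burgerlib's implementation: `DecodeFirstCodePoint` and `kBad` are hypothetical names, and returning the BAD value (-1) for invalid input is an assumption drawn from the enum description above. The sketch rejects truncated sequences, stray continuation bytes, overlong encodings, surrogates and values above 0x10FFFF.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the BAD enum value (-1); an assumption for this sketch.
constexpr std::uint32_t kBad = 0xFFFFFFFFu;

// Decode the first code point of a zero terminated UTF8 string.
// Returns kBad if the leading sequence is not valid UTF8.
std::uint32_t DecodeFirstCodePoint(const char *pInput)
{
    assert(pInput != nullptr);
    const unsigned char *pWork = reinterpret_cast<const unsigned char *>(pInput);
    std::uint32_t uResult = pWork[0];
    if (uResult < 0x80u) {
        return uResult;                     // ASCII, including the terminator
    }
    int iExtra;                             // Continuation bytes expected
    std::uint32_t uMinimum;                 // Smallest value for this length
    if ((uResult & 0xE0u) == 0xC0u) {
        uResult &= 0x1Fu; iExtra = 1; uMinimum = 0x80u;
    } else if ((uResult & 0xF0u) == 0xE0u) {
        uResult &= 0x0Fu; iExtra = 2; uMinimum = 0x800u;
    } else if ((uResult & 0xF8u) == 0xF0u) {
        uResult &= 0x07u; iExtra = 3; uMinimum = 0x10000u;
    } else {
        return kBad;                        // Stray continuation or bad lead byte
    }
    for (int i = 1; i <= iExtra; ++i) {
        std::uint32_t uNext = pWork[i];
        if ((uNext & 0xC0u) != 0x80u) {
            return kBad;                    // Truncated (also stops at the terminator)
        }
        uResult = (uResult << 6) | (uNext & 0x3Fu);
    }
    if (uResult < uMinimum) {
        return kBad;                        // Overlong encoding
    }
    if ((uResult >= 0xD800u) && (uResult <= 0xDFFFu)) {
        return kBad;                        // Surrogates are not valid code points
    }
    if (uResult > 0x10FFFFu) {
        return kBad;                        // Beyond the Unicode range
    }
    return uResult;
}
```

Because the loop stops at the first byte that is not a continuation byte, the sketch never reads past the zero terminator of a truncated string.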
| Word BURGER_API Burger::UTF32::FromUTF8 | ( | Word32 * | pOutput, |
| WordPtr | uOutputSize, | ||
| const char * | pInput | ||
| ) | [static] |
Convert a UTF8 "C" string into a UTF32 stream.
Take a "C" string that is using UTF8 encoding and convert it to a UTF32 encoded "C" string. The function will return the size of the string after encoding. This size is valid even if it exceeds the output buffer size. The output pointer can be NULL and the output size zero to have this routine calculate the size of the possible output so the application can allocate a buffer large enough to hold it.
| pOutput | Pointer to UTF32 buffer to receive the converted string. NULL is okay if uOutputSize is zero, otherwise it will page fault. |
| uOutputSize | Size of the output buffer in bytes. |
| pInput | UTF8 encoded "C" string. NULL will page fault. |
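The measure-then-allocate pattern described above can be sketched as follows. `SketchFromUTF8` is a hypothetical stand-in with the same parameter shape as the documented function, written only so the example is self-contained. It assumes well formed UTF8 input, and it assumes the returned size is in bytes and excludes the terminating zero; check the actual library behavior before relying on either assumption. `ToUTF32` is likewise a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the documented FromUTF8(pOutput, uOutputSize, pInput).
// Assumes well formed UTF8 and returns the byte size of the complete
// result (terminator excluded), even when the buffer is too small.
std::size_t SketchFromUTF8(std::uint32_t *pOutput, std::size_t uOutputSize, const char *pInput)
{
    assert(pInput != nullptr);
    const unsigned char *pWork = reinterpret_cast<const unsigned char *>(pInput);
    std::size_t uCapacity = uOutputSize / sizeof(std::uint32_t);   // Whole elements
    std::size_t uCount = 0;                                        // Code points found
    while (*pWork) {
        std::uint32_t uCode = *pWork++;
        int iExtra = (uCode >= 0xF0u) ? 3 : (uCode >= 0xE0u) ? 2 : (uCode >= 0xC0u) ? 1 : 0;
        if (iExtra) {
            uCode &= (0x3Fu >> iExtra);     // Keep the payload bits of the lead byte
        }
        while (iExtra-- && ((*pWork & 0xC0u) == 0x80u)) {
            uCode = (uCode << 6) | (*pWork++ & 0x3Fu);
        }
        if ((uCount + 1) < uCapacity) {     // Always leave room for the terminator
            pOutput[uCount] = uCode;
        }
        ++uCount;
    }
    if (uCapacity) {
        pOutput[(uCount < uCapacity) ? uCount : (uCapacity - 1)] = 0;
    }
    return uCount * sizeof(std::uint32_t);
}

// Two pass conversion: measure with a NULL buffer, allocate, then convert.
std::vector<std::uint32_t> ToUTF32(const char *pInput)
{
    std::size_t uBytes = SketchFromUTF8(nullptr, 0, pInput);       // Pass 1: size only
    std::vector<std::uint32_t> Buffer((uBytes / sizeof(std::uint32_t)) + 1);
    SketchFromUTF8(Buffer.data(), Buffer.size() * sizeof(std::uint32_t), pInput);
    return Buffer;
}
```

Sizing the buffer from the first pass avoids guessing a worst case length, at the cost of scanning the input twice.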
| Word BURGER_API Burger::UTF32::FromUTF8 | ( | Word32 * | pOutput, |
| WordPtr | uOutputSize, | ||
| const char * | pInput, | ||
| WordPtr | uInputSize | ||
| ) | [static] |
Convert a UTF8 stream into a UTF32 Word32 array.
Take a byte array that is using UTF8 encoding and convert it to a UTF32 Word32 encoded "C" string. The function will return the size of the string after encoding. This size is valid even if it exceeds the output buffer size. The output pointer can be NULL and the output size zero to have this routine calculate the size of the possible output so the application can allocate a buffer large enough to hold it.
| pOutput | Pointer to a byte buffer to receive the UTF32 string. NULL is okay if uOutputSize is zero, otherwise a page fault will occur. |
| uOutputSize | Size of the output buffer in bytes. |
| pInput | UTF8 encoded byte array. NULL is okay if uInputSize is zero. |
| uInputSize | Size of the input byte array. |
| Word BURGER_API Burger::UTF32::IsValid | ( | Word32 | Input | ) | [static] |
Validate a UTF32 value.
| Word BURGER_API Burger::UTF32::IsValid | ( | const Word32 * | pInput | ) | [static] |
Check a UTF32 "C" string for validity.
Check whether a "C" string is a valid UTF32 stream. Returns false if there was an error, or true if the bytes represent a valid UTF32 pattern.
| pInput | Pointer to a zero terminated string. NULL will page fault. |
| Word BURGER_API Burger::UTF32::IsValid | ( | const Word32 * | pInput, |
| WordPtr | uInputSize | ||
| ) | [static] |
Check a UTF32 Word32 array for validity.
Check whether a Word32 array is a valid UTF32 stream. Returns false if there was an error, or true if the bytes represent a valid UTF32 pattern.
| pInput | Pointer to UTF32 data. Can be NULL if uInputSize is zero, otherwise it will page fault. |
| uInputSize | Length of the data in bytes. If zero, the function will return true. If the length is odd, the low bit will be masked off to force it even. |
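The three validity checks can be sketched in standalone form. The names below are hypothetical and the rules are assumptions, not Burgerlib's confirmed behavior: a value is treated as valid when it is a Unicode scalar value (at most 0x10FFFF and outside the surrogate range 0xD800 to 0xDFFF), and the array form rounds the byte length down to whole 32 bit elements.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed rule: valid means a Unicode scalar value.
bool IsValidCodePoint(std::uint32_t uInput)
{
    return (uInput < 0xD800u) || ((uInput >= 0xE000u) && (uInput <= 0x10FFFFu));
}

// Check every element of a zero terminated UTF32 string.
bool IsValidString(const std::uint32_t *pInput)
{
    assert(pInput != nullptr);
    while (std::uint32_t uCode = *pInput++) {
        if (!IsValidCodePoint(uCode)) {
            return false;
        }
    }
    return true;
}

// Check an array whose length is given in bytes. A trailing partial
// element is ignored in this sketch, and a zero length is valid.
bool IsValidArray(const std::uint32_t *pInput, std::size_t uInputSize)
{
    std::size_t uCount = uInputSize / sizeof(std::uint32_t);
    assert((pInput != nullptr) || (uCount == 0));
    for (std::size_t i = 0; i < uCount; ++i) {
        if (!IsValidCodePoint(pInput[i])) {
            return false;
        }
    }
    return true;
}
```

Unlike the string form, the array form does not stop at a zero element, so embedded zeros are checked like any other value.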