<cahute/text.h> – Text encoding related utilities for Cahute
Macro definitions
The CAHUTE_TEXT_ENCODING_* constants represent how text data is encoded.
CAHUTE_TEXT_ENCODING_LEGACY_8
- Constant representing the variable-width encoding with the legacy character table.
CAHUTE_TEXT_ENCODING_LEGACY_16_HOST
- Constant representing the fixed-width encoding with the legacy character table, with host endianness.
CAHUTE_TEXT_ENCODING_LEGACY_16_BE
- Constant representing the fixed-width encoding with the legacy character table, in big endian.
CAHUTE_TEXT_ENCODING_LEGACY_16_LE
- Constant representing the fixed-width encoding with the legacy character table, in little endian.
CAHUTE_TEXT_ENCODING_9860_8
- Constant representing the variable-width encoding with the fx-9860G character table.
CAHUTE_TEXT_ENCODING_9860_16_HOST
- Constant representing the fixed-width encoding with the fx-9860G character table, with host endianness.
CAHUTE_TEXT_ENCODING_9860_16_BE
- Constant representing the fixed-width encoding with the fx-9860G character table, in big endian.
CAHUTE_TEXT_ENCODING_9860_16_LE
- Constant representing the fixed-width encoding with the fx-9860G character table, in little endian.
CAHUTE_TEXT_ENCODING_CAT
- Constant representing the CAT data encoding.
CAHUTE_TEXT_ENCODING_CTF
- Constant representing the CTF data encoding.
CAHUTE_TEXT_ENCODING_UTF32_HOST
- Constant representing the UTF-32 encoding, with host endianness.
CAHUTE_TEXT_ENCODING_UTF32_BE
- Constant representing the UTF-32 encoding, in big endian.
CAHUTE_TEXT_ENCODING_UTF32_LE
- Constant representing the UTF-32 encoding, in little endian.
CAHUTE_TEXT_ENCODING_UTF8
- Constant representing the UTF-8 encoding.
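The host, big endian and little endian variants of a fixed-width encoding differ only in the byte order of each code unit. As a minimal sketch of what that means for one 16-bit unit: this is general byte-order arithmetic, not Cahute's implementation, and the helper names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers, not part of Cahute: read one 16-bit code unit
 * from two bytes stored in big-endian or little-endian order. */
uint16_t read_u16_be(unsigned char const *p) {
    return (uint16_t)(((unsigned)p[0] << 8) | p[1]);
}

uint16_t read_u16_le(unsigned char const *p) {
    return (uint16_t)(((unsigned)p[1] << 8) | p[0]);
}

/* Host order: reinterpret the two bytes using the machine's own layout,
 * so the result matches one of the two helpers above. */
uint16_t read_u16_host(unsigned char const *p) {
    uint16_t value;

    memcpy(&value, p, sizeof value);
    return value;
}
```

Data read from a file or a link usually has a fixed byte order, so the explicit big endian or little endian variant is likely the safer choice there; the host variant fits buffers built in memory as arrays of 16-bit integers.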
Function declarations
int cahute_convert_text(cahute_context *context, void **bufp, size_t *buf_sizep, void const **datap, size_t *data_sizep, int dest_encoding, int source_encoding)
- Convert text from one encoding to another.
- Note: When CAHUTE_TEXT_ENCODING_UTF32_HOST, CAHUTE_TEXT_ENCODING_UTF32_BE, CAHUTE_TEXT_ENCODING_UTF32_LE or CAHUTE_TEXT_ENCODING_UTF8 is used as the destination encoding, Normalization Form C (NFC) is employed; see Unicode Normalization Forms for more information.
- Errors you can expect from this function are the following:
- CAHUTE_OK
- The conversion has finished successfully, and there are no more bytes to read in the input buffer.
- CAHUTE_ERROR_TERMINATED
- A sentinel has been found, and the conversion has been interrupted.
- Note: If this error is raised, *datap is set to just after the sentinel, and *data_sizep is set accordingly. This is useful in case you have multiple text blobs placed back-to-back.
- CAHUTE_ERROR_SIZE
- The destination buffer had insufficient space, and the procedure was interrupted.
- CAHUTE_ERROR_TRUNC
- The source data had an incomplete sequence, and the procedure was interrupted.
- CAHUTE_ERROR_INVALID
- The source data contained an unknown or invalid sequence, and the procedure was interrupted.
- CAHUTE_ERROR_INCOMPAT
- The source data contained a sequence that could not be translated to the destination encoding.
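The back-to-back case mentioned for CAHUTE_ERROR_TERMINATED can be sketched as follows. To stay self-contained, the sketch does not call Cahute: sketch_convert_blob() is a hypothetical stand-in that copies bytes unchanged, assumes a 0x00 byte is the sentinel, and uses local SKETCH_* codes in place of the real error constants. Only the cursor handling is meant to mirror the documented behaviour.

```c
#include <stddef.h>

/* Local stand-ins for the Cahute error codes; the real values live in
 * the Cahute headers and are NOT reproduced here. */
enum { SKETCH_OK = 0, SKETCH_TERMINATED, SKETCH_SIZE };

/* Hypothetical stand-in for cahute_convert_text(): copies bytes as-is,
 * treats a 0x00 byte as the sentinel (an assumption of this sketch), and
 * updates all four cursors before returning. */
int sketch_convert_blob(
    char **bufp, size_t *buf_sizep,
    unsigned char const **datap, size_t *data_sizep
) {
    while (*data_sizep > 0) {
        unsigned char byte = **datap;

        if (byte == 0x00) {
            /* Step over the sentinel so the next call starts after it. */
            (*datap)++;
            (*data_sizep)--;
            return SKETCH_TERMINATED;
        }
        if (*buf_sizep == 0)
            return SKETCH_SIZE; /* Source cursor stays on the unread byte. */

        *(*bufp)++ = (char)byte;
        (*buf_sizep)--;
        (*datap)++;
        (*data_sizep)--;
    }
    return SKETCH_OK;
}

/* Split back-to-back blobs into NUL-terminated strings, one per row of
 * `out`. Returns the number of blobs read, or -1 on any other error. */
int split_blobs(
    unsigned char const *data, size_t data_size,
    char out[][16], int max_blobs
) {
    int count = 0;

    while (data_size > 0 && count < max_blobs) {
        char *buf = out[count];
        size_t buf_size = sizeof(out[count]) - 1; /* Keep room for NUL. */
        int err = sketch_convert_blob(&buf, &buf_size, &data, &data_size);

        if (err != SKETCH_OK && err != SKETCH_TERMINATED)
            return -1;
        *buf = '\0'; /* `buf` now points just past the written bytes. */
        count++;
    }
    return count;
}
```

Because the source cursor is left just after the sentinel, the loop needs no bookkeeping of its own: each iteration resumes where the previous one stopped.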
- At the end of its process, this function updates *bufp, *buf_sizep, *datap and *data_sizep to the final state of the function, even in case of error, so that:
- You can determine how much of the destination buffer was filled, by subtracting the final buffer size from the original buffer size.
- In case of CAHUTE_ERROR_SIZE, you can get the place at which to get the leftover bytes in the source data.
- In case of CAHUTE_ERROR_TRUNC, you can get the place at which the leftover bytes start in the source data, to complete them with additional data for the next conversion.
- In case of CAHUTE_ERROR_INVALID or CAHUTE_ERROR_INCOMPAT, you can get the place of the problematic input sequence.
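The cursor updates described above make it possible to convert through a small fixed buffer. The following is a minimal sketch under stated assumptions: sketch_copy() is a hypothetical stand-in that copies bytes unchanged instead of calling Cahute, and the SKETCH_* codes are local placeholders. It shows the size subtraction and the retry after an insufficient-space error.

```c
#include <stddef.h>
#include <string.h>

/* Local placeholders, not the real Cahute error values. */
enum { SKETCH_OK = 0, SKETCH_SIZE };

/* Hypothetical stand-in for cahute_convert_text(): copies bytes as-is
 * and always leaves the four cursors in their final state. */
int sketch_copy(
    char **bufp, size_t *buf_sizep,
    char const **datap, size_t *data_sizep
) {
    while (*data_sizep > 0) {
        if (*buf_sizep == 0)
            return SKETCH_SIZE;

        *(*bufp)++ = *(*datap)++;
        (*buf_sizep)--;
        (*data_sizep)--;
    }
    return SKETCH_OK;
}

/* Convert `data` through a 4-byte scratch buffer, appending every chunk
 * to `out`. Returns the number of conversion calls made, or -1 if `out`
 * is too small or an unexpected code is returned. */
int convert_in_chunks(
    char const *data, size_t data_size,
    char *out, size_t out_size, size_t *total
) {
    char chunk[4];
    int calls = 0;

    *total = 0;
    for (;;) {
        char *buf = chunk;
        size_t buf_size = sizeof(chunk);
        int err = sketch_copy(&buf, &buf_size, &data, &data_size);
        /* Bytes written = original buffer size - final buffer size. */
        size_t written = sizeof(chunk) - buf_size;

        calls++;
        if (written > out_size - *total)
            return -1;
        memcpy(out + *total, chunk, written);
        *total += written;

        if (err == SKETCH_OK)
            return calls;
        if (err != SKETCH_SIZE)
            return -1;
        /* On SKETCH_SIZE, `data` and `data_size` already describe the
         * leftover input, so the loop simply calls again. */
    }
}
```

With the real function, the same loop shape would apply, with the scratch buffer sized for the longest sequence the destination encoding can produce.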
- Currently supported conversions are the following:
- Src. ⯈▼ Dst.: LEGACY_*, 9860_*, CAT, CTF, UTF*
- LEGACY_*: x, x
- 9860_*: x, x
- CAT:
- CTF:
- UTF*: x, x, x
- For specific guides on how to use this function, see Converting text from an encoding to another.
- Parameters:
- context – Context in which to run the function. 
- bufp – Pointer to the destination buffer pointer. 
- buf_sizep – Pointer to the destination buffer size. 
- datap – Pointer to the source data pointer. 
- data_sizep – Pointer to the source data size. 
- dest_encoding – Destination encoding. 
- source_encoding – Source encoding. 
 
- Returns:
- Error, or 0 if the operation was successful. 
 
int cahute_convert_to_utf8(cahute_context *context, char *buf, size_t buf_size, void const *data, size_t data_size, int encoding)
- Convert the provided data to UTF-8, and place a terminating NUL character.
- This is a utility that calls cahute_convert_text(), for simple scripts using the Cahute library.
- Parameters:
- context – Context in which to run the function. 
- buf – Destination buffer. 
- buf_size – Destination buffer size. 
- data – Source data. 
- data_size – Size of the source data. 
- encoding – Encoding of the source data. 
 
- Returns:
- Error, or 0 if the operation was successful.