Lingua-Identify-CLD2
src/cld2/public/compact_lang_det.h
typedef struct {
  int offset;      // Starting byte offset in original buffer
  int32 bytes;     // Number of bytes in chunk
  uint16 lang1;    // Top lang, as full Language. Apply
                   // static_cast<Language>() to this short value.
  uint16 pad;      // Make multiple of 4 bytes
} ResultChunk;
typedef std::vector<ResultChunk> ResultChunkVector;
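The offset and bytes fields describe a byte range in the original buffer, so a caller can slice the input text per chunk. A minimal sketch of that idea follows. It assumes a local stand-in struct (`ChunkSketch`) that mirrors the fields above so the example compiles without the CLD2 headers; the helper name `SliceChunks` is hypothetical and not part of this header.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Local stand-in mirroring the ResultChunk fields shown above (assumption:
// used only so this sketch compiles without the CLD2 headers).
struct ChunkSketch {
  int offset;       // Starting byte offset in original buffer
  int32_t bytes;    // Number of bytes in chunk
  uint16_t lang1;   // Top language, stored as a short value
  uint16_t pad;     // Padding to a multiple of 4 bytes
};

// Hypothetical helper: returns (language value, chunk text) pairs.
// Chunks whose byte range does not fit inside the buffer are skipped.
std::vector<std::pair<uint16_t, std::string>> SliceChunks(
    const std::string& buffer, const std::vector<ChunkSketch>& chunks) {
  std::vector<std::pair<uint16_t, std::string>> out;
  for (const ChunkSketch& c : chunks) {
    if (c.offset < 0 || c.bytes < 0) continue;
    const std::size_t start = static_cast<std::size_t>(c.offset);
    const std::size_t len = static_cast<std::size_t>(c.bytes);
    if (start > buffer.size() || len > buffer.size() - start) continue;
    out.emplace_back(c.lang1, buffer.substr(start, len));
  }
  return out;
}
```

With the real types, the same loop would run over a ResultChunkVector, and lang1 would be converted with static_cast<Language>() as the field comment describes.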
// These initial simple versions all cascade through the full-blown last
// version. It is better to call that last version directly, because
// passing in any available hints gives better results.
// Scan interchange-valid UTF-8 bytes and detect the most likely language.
// If the input is in fact not valid UTF-8, this returns immediately with
// the result value UNKNOWN_LANGUAGE and is_reliable set to false.
//
// In all cases, valid_prefix_bytes will be set to the number of leading
// bytes that are valid UTF-8. If this is < buffer_length, there is invalid
// input starting at the following byte.
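The valid_prefix_bytes contract above can be checked by the caller after detection returns. The sketch below only illustrates that comparison; the helper names `FirstInvalidOffset` and `ValidPrefix` are hypothetical and are not declared in this header, and no CLD2 function is called.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the 0-based byte offset where invalid input
// starts, or -1 when the whole buffer was valid UTF-8.
int FirstInvalidOffset(int buffer_length, int valid_prefix_bytes) {
  return valid_prefix_bytes < buffer_length ? valid_prefix_bytes : -1;
}

// Hypothetical helper: keeps only the leading bytes reported as valid.
// Out-of-range values are clamped to the buffer size.
std::string ValidPrefix(const std::string& buffer, int valid_prefix_bytes) {
  if (valid_prefix_bytes <= 0) return std::string();
  const std::size_t n = static_cast<std::size_t>(valid_prefix_bytes);
  return buffer.substr(0, n < buffer.size() ? n : buffer.size());
}
```

A caller might log the offset from FirstInvalidOffset and retry detection on ValidPrefix, since the original call returns UNKNOWN_LANGUAGE for input that is not valid UTF-8.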