Vulnerability Title: Out-of-Bounds Read in CAPECharacterHelper::GetUTF16FromUTF8 Function in Monkey's Audio
Discovered by: tzh00203
Contact Information: [email protected]
An out-of-bounds read vulnerability has been discovered in Monkey's Audio, specifically in the CAPECharacterHelper::GetUTF16FromUTF8 function. The function does not properly validate the length of the input UTF-8 string, so it can read past the end of the input buffer. This may result in a crash or in the disclosure of adjacent memory contents.
The CAPECharacterHelper::GetUTF16FromUTF8 function is responsible for converting UTF-8 strings to UTF-16. The function parses the input UTF-8 string byte by byte but does not perform adequate boundary checks. Specifically, it does not account for multi-byte UTF-8 characters, such as four-byte characters (e.g., 😀, 🧑), which can appear in certain strings, including file paths on Windows systems. When processing these characters, if the input string is not correctly terminated or its length is miscalculated, the function may attempt to read beyond the valid memory space, leading to an out-of-bounds read.
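To make the failure mode concrete, the following is a minimal sketch of how a lead byte announces the length of its sequence, and how a truncated four-byte character asks for bytes that lie beyond the buffer. The helper names Utf8SequenceLength and SequenceOverrunsBuffer are hypothetical and are not taken from the Monkey's Audio source; the sketch also assumes the caller knows the buffer length, which the vulnerable function does not verify.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of bytes a UTF-8 sequence claims to occupy,
// judged only from its lead byte. Returns 0 for a byte that cannot start
// a sequence (a continuation byte or an invalid lead byte).
static std::size_t Utf8SequenceLength(unsigned char lead)
{
    if (lead < 0x80) return 1;            // 0xxxxxxx: ASCII
    if ((lead & 0xE0) == 0xC0) return 2;  // 110xxxxx
    if ((lead & 0xF0) == 0xE0) return 3;  // 1110xxxx
    if ((lead & 0xF8) == 0xF0) return 4;  // 11110xxx: e.g. emoji such as U+1F600
    return 0;
}

// Hypothetical helper: true when the sequence starting at `index` is malformed
// or would extend beyond the `length` bytes that actually belong to `data`.
static bool SequenceOverrunsBuffer(const unsigned char* data, std::size_t length, std::size_t index)
{
    assert(index < length);  // precondition: the lead byte itself is in bounds
    const std::size_t need = Utf8SequenceLength(data[index]);
    return need == 0 || need > length - index;
}
```

For example, U+1F600 is encoded as F0 9F 98 80. If only the first three of those bytes are present at the end of a buffer, the lead byte still announces four bytes, and a decoder that trusts it without a check like the one above reads one byte past the end.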

In the function's conversion loop, nIndex is incremented continuously, but it is never adequately checked against the bounds of pUTF8. If the UTF-8 string is not terminated with a null character (\0), or if its actual length differs from what the function assumes, nIndex advances beyond the valid memory range.
Furthermore, the function does not handle four-byte UTF-8 characters such as emojis (e.g., 😀), which are allowed in Windows file paths and can cause further errors when pUTF8 is parsed.
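One way to address both issues is to decode against an explicit length and to emit surrogate pairs for four-byte sequences. The sketch below illustrates that approach; it is not the project's actual fix. The function name Utf8ToUtf16Checked, the explicit length parameter, and the choice to substitute U+FFFD for malformed input are all assumptions made for illustration. It also does not reject overlong encodings, which a production decoder should.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical bounds-checked converter (not part of Monkey's Audio).
// It never reads data[i] for any i >= length.
std::u16string Utf8ToUtf16Checked(const unsigned char* data, std::size_t length)
{
    assert(data != nullptr || length == 0);
    const char16_t kReplacement = 0xFFFD;  // assumption: replace malformed input
    std::u16string out;
    std::size_t i = 0;
    while (i < length)
    {
        const unsigned char lead = data[i];
        std::size_t extra = 0;   // continuation bytes expected after the lead byte
        std::uint32_t cp = 0;    // code point being assembled

        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else { out.push_back(kReplacement); ++i; continue; }  // invalid lead byte

        // Bounds check before touching any continuation byte.
        if (extra > length - i - 1) { out.push_back(kReplacement); break; }

        bool valid = true;
        for (std::size_t k = 1; k <= extra; ++k)
        {
            const unsigned char c = data[i + k];
            if ((c & 0xC0) != 0x80) { valid = false; break; }  // not 10xxxxxx
            cp = (cp << 6) | (c & 0x3F);
        }
        if (!valid) { out.push_back(kReplacement); ++i; continue; }
        i += extra + 1;

        if (cp >= 0x10000 && cp <= 0x10FFFF)
        {
            // Four-byte sequences map to a UTF-16 surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
        else if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        {
            out.push_back(kReplacement);  // out of range or a lone surrogate
        }
        else
        {
            out.push_back(static_cast<char16_t>(cp));
        }
    }
    return out;
}
```

Passing the length explicitly means the loop does not depend on a terminating null character, and the check before the continuation loop ensures that a truncated multi-byte sequence at the end of the buffer stops decoding instead of advancing the index out of bounds.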
This vulnerability affects Monkey's Audio version 11.31 and earlier.