Most Computer Code Compilers Vulnerable to Novel Attacks
Most computer code compilers are at risk of ‘Trojan source’ attacks in which adversaries can introduce targeted vulnerabilities into any software without being detected, according to researchers from the University of Cambridge.
The paper, Trojan Source: Invisible Vulnerabilities, detailed how weaknesses in text encoding standards such as Unicode can be exploited “to produce source code whose tokens are logically encoded in a different order from the one they are displayed.” This produces vulnerabilities that are very difficult for human code reviewers to detect, as the rendered source code looks perfectly acceptable.
Specifically, the weakness was observed in Unicode’s bi-directional (Bidi) algorithm, which handles the display of text that mixes scripts with different display orders, such as Arabic (read right to left) and English (read left to right). Unicode currently defines more than 143,000 characters across 154 different language scripts.
The researchers noted that in some cases, Bidi override control characters enable switching the display ordering of groups of characters.
Most programming languages allow these Bidi overrides to be put in comments and strings, which developers largely ignore. This enables targeted vulnerabilities to be inserted into source code without detection.
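As an illustration of how such characters might be flagged during code review, the following is a minimal Python sketch that scans source text for Bidi control characters. The function name `find_bidi_controls` and the choice to flag every embedding, override and isolate control are assumptions made for this example; they are not taken from the researchers' paper or from any particular tool.

```python
import unicodedata

# Unicode Bidi embedding, override and isolate controls, written as escapes
# so that this file itself contains no invisible characters.
BIDI_CONTROLS = {
    "\u202A",  # LEFT-TO-RIGHT EMBEDDING
    "\u202B",  # RIGHT-TO-LEFT EMBEDDING
    "\u202C",  # POP DIRECTIONAL FORMATTING
    "\u202D",  # LEFT-TO-RIGHT OVERRIDE
    "\u202E",  # RIGHT-TO-LEFT OVERRIDE
    "\u2066",  # LEFT-TO-RIGHT ISOLATE
    "\u2067",  # RIGHT-TO-LEFT ISOLATE
    "\u2068",  # FIRST STRONG ISOLATE
    "\u2069",  # POP DIRECTIONAL ISOLATE
}


def find_bidi_controls(source: str):
    """Return a (line, column, character name) tuple for each Bidi control.

    Hypothetical helper for illustration; line and column are 1-based.
    """
    hits = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in BIDI_CONTROLS:
                hits.append((line_no, col, unicodedata.name(ch)))
    return hits


if __name__ == "__main__":
    # A made-up snippet with an override and an isolate hidden in a string.
    sample = 'x = 1\nif level != "user\u202e \u2066# check": pass\n'
    for line_no, col, name in find_bidi_controls(sample):
        print(f"line {line_no}, column {col}: {name}")
```

A check of this kind treats any Bidi control in source code as suspicious. That could produce false positives in projects that legitimately include right-to-left text in strings or comments, so in practice the results would likely need human review.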