Invisible Threats and the Blind Spots of Security
How GlassWorm Exploited Unicode Shadows in VS Code Supply Chains

Introduction
In October 2025, researchers at KOI Security uncovered a malware campaign that targeted the Open VSX Registry. Nicknamed “GlassWorm” by the KOI Security team, the malware campaign published multiple malicious VS Code extensions that together were downloaded more than 35,000 times. The compromised packages shared several notable traits for stealthy payload delivery, including an unconventional Unicode-based obfuscation technique that remained invisible within IDEs.
This research was motivated by the 2021 Trojan Source research, which showed how carefully chosen Unicode control characters can change how code is displayed or interpreted. That work forced platforms and IDEs, including GitHub and many editors, to surface warnings for suspicious Unicode sequences. The campaign we analyzed, however, uses a different and under-observed class of characters (variation selectors) that remain largely invisible to common tooling.
While the earlier reports surfaced the campaign and its broad implications, our follow-up research dives deeper into the technical underpinnings of GlassWorm’s obfuscation and propagation mechanisms. We reverse engineered the malicious VSIX extensions and deconstructed the encoded payloads. This post dissects GlassWorm’s design and capabilities, explains why conventional tooling missed it, and outlines steps that defenders and developers should adopt.
What makes GlassWorm different
GlassWorm is notable for how it blends multiple evasion and resilience techniques into a single campaign:
- Invisible Unicode obfuscation. The payload is embedded as long runs of invisible Unicode characters (Variation Selectors / Private Use Area code points). Those characters do not display in most editors and are therefore difficult for casual human review to notice. The malware maps these code points to bytes and reconstructs a Base64 blob that decodes to executable JavaScript.
- Decentralized, resilient C2 channels. Rather than relying on a single server, GlassWorm uses multiple fallback channels:
  - Transaction memos on the Solana blockchain as an immutable, censorship‑resistant command-and-control (C2) channel.
  - Direct HTTP(S) endpoints hosted at attacker IPs.
  - Encoded Google Calendar event titles as an additional fallback communication vector.
- Credential and token harvesting. Once active, the malware searches the host for developer credentials (GitHub, NPM, OpenVSX tokens, crypto wallets), enabling further compromise of repositories and package uploads.
- Repurposing developer machines. Infected hosts are converted into covert infrastructure: SOCKS proxies, hidden VNC (HVNC) servers, and remote execution nodes (via WebRTC or spawned Node.js processes). That gives attackers anonymized network access into corporate and personal networks and a platform to propagate further.
- Bundled, multi‑platform payloads. The final payload is a single large bundled artifact containing many standard modules and native decoders for different OSes. This reduces dependency visibility and eases execution across platforms.
High-level execution flow
Stage 1: Payload Decoding
1. Extension activation
The extension’s package.json registers an activation event that immediately loads ./extension.js. extension.js enforces rate-limited activation: a helper routine runs on first activation and then only after a configured cooldown (~2 days). Timestamps are persisted to context.globalState, so the cooldown survives restarts. An in‑session flag prevents multiple invocations per process lifetime.

Observed behavior:
- First run: helper() executes and the activation timestamp is recorded.
- Subsequent runs (within 2 days): no action is taken and helper() is not called.
- After 2 days: helper() runs once more and the timestamp is updated.
- Within one process lifetime: isActivated ensures helper() cannot be invoked more than once.
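
The gating behavior listed above can be modeled in a few lines. This is a minimal Python sketch rather than the extension's actual JavaScript: the class name `ActivationGate`, the `last_run` key, and the exact two-day constant are assumptions made for illustration, and a plain dictionary stands in for context.globalState.

```python
import time

# Assumption: the "~2 days" cooldown is modeled as exactly 48 hours.
COOLDOWN_SECONDS = 2 * 24 * 60 * 60


class ActivationGate:
    """Hypothetical model of the rate-limited activation check."""

    def __init__(self, global_state):
        # global_state stands in for context.globalState and outlives the process.
        self.global_state = global_state
        # In-session flag: starts cleared whenever a new process creates a gate.
        self.is_activated = False

    def should_run_helper(self, now=None):
        now = time.time() if now is None else now
        if self.is_activated:
            return False  # helper() already ran in this process
        last_run = self.global_state.get("last_run")
        if last_run is not None and now - last_run < COOLDOWN_SECONDS:
            return False  # still inside the cooldown window
        self.global_state["last_run"] = now
        self.is_activated = True
        return True
```

Creating a new `ActivationGate` over the same dictionary models an editor restart: the in-session flag resets, but the persisted timestamp still suppresses the helper until the cooldown has elapsed.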

2. Payload entry and decoding
The helper() function launches index.js (or an OS‑specific entrypoint). index.js contains the invisible Unicode string payload. Based on the host OS it selects a decoder:
- Windows x86: index_win32_ia32.node (native)
- Windows x64: index_win32_x64.node (native)
- Linux: index_linux.node (native)
- macOS: decode.js (JavaScript)
These native binaries were first uploaded to VirusTotal on 2025-03-23 and remained completely undetected with 0 detections until the attack on 2025-10-22. This suggests that the binaries were likely off-the-shelf tools reused from previous campaigns or public sources, rather than newly developed components.

3. The invisible‑Unicode technique
The decoder maps each invisible Unicode code point to a byte value (using a simple offset arithmetic) to reconstruct a Base64-encoded blob. That Base64 decodes into JavaScript which is then executed.
The comparison below shows the payload file opened in both the IDE and a hex editor. While the payload appears invisible in the IDE, its underlying byte values are clearly visible in the hex editor.

Before we dive into the technique, let's understand:
1. Code Points
Code Points, the foundation of Unicode: A code point is a number that uniquely identifies a character in Unicode. Think of it as an address or ID for a character:
- The letter “A” = code point U+0041 (decimal 65)
- The symbol “€” = code point U+20AC (decimal 8364)
Every character you can type, whether a letter, symbol, emoji, or even an invisible formatting character, has a unique code point number. The malware exploits this by using invisible characters whose code points can be mathematically converted into byte values (0-255).
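
As a quick illustration of the character-to-number relationship, the following Python sketch uses the built-in `ord()` and `chr()` functions. The helper name `describe` is hypothetical and exists only for this example.

```python
def describe(ch):
    """Format a single character as 'U+XXXX (decimal)'."""
    return f"U+{ord(ch):04X} ({ord(ch)})"


# Visible characters from the examples above.
print(describe("A"))   # U+0041 (65)
print(describe("€"))   # U+20AC (8364)

# An invisible character is still an ordinary code point with a numeric value.
print(describe(chr(0xE0154)))   # U+E0154 (917844)
```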
2. Unicode Variation Selectors (VS)
Unicode Variation Selectors are special invisible characters used to select different visual representations of the same character. There are two ranges:
- VS1-VS16: U+FE00 to U+FE0F (code points 65024-65039, 16 selectors)
- VS17-VS256: U+E0100 to U+E01EF (code points 917760-917999, 240 selectors)
These characters do not render in most text editors, which makes them invisible, and the two ranges together provide 256 possible values, one for each byte value (0-255).
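
The two ranges can be checked directly with Python's standard `unicodedata` module. This is a small sketch: the constant and function names are chosen for illustration only.

```python
import unicodedata

VS_BASIC = range(0xFE00, 0xFE10)          # VS1-VS16
VS_SUPPLEMENT = range(0xE0100, 0xE01F0)   # VS17-VS256


def is_variation_selector(ch):
    """True if the character falls in either variation selector range."""
    cp = ord(ch)
    return cp in VS_BASIC or cp in VS_SUPPLEMENT


# 16 + 240 selectors give exactly one selector per possible byte value.
total_selectors = len(VS_BASIC) + len(VS_SUPPLEMENT)

# A string that renders as "ok" but carries two extra code points.
sample = "ok" + chr(0xFE0F) + chr(0xE0100)
visible = "".join(ch for ch in sample if not is_variation_selector(ch))

print(total_selectors, len(sample), len(visible))
print(unicodedata.name(chr(0xE0100)))
```

Comparing `len(sample)` with `len(visible)` is the simplest way to see that a string carries more code points than it displays.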
The real decoding of invisible characters from decode.js

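Here is the exact conversion of the first 10 invisible characters from `index.js`, which yields the Base64 prefix "dmFyIF9fY3". The math is simple arithmetic: byte_value = codepoint - base_offset + 16, where base_offset is 0xE0100. The +16 is consistent with the 16 selectors of the first range (VS1-VS16) occupying byte values 0-15.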
Character-by-character decoding:
- U+E0154 → 0xE0154 - 0xE0100 + 16 = 100 (0x64) = 'd'
- U+E015D → 0xE015D - 0xE0100 + 16 = 109 (0x6D) = 'm'
- U+E0136 → 0xE0136 - 0xE0100 + 16 = 70 (0x46) = 'F'
- U+E0169 → 0xE0169 - 0xE0100 + 16 = 121 (0x79) = 'y'
- U+E0139 → 0xE0139 - 0xE0100 + 16 = 73 (0x49) = 'I'
- U+E0136 → 0xE0136 - 0xE0100 + 16 = 70 (0x46) = 'F'
- U+E0129 → 0xE0129 - 0xE0100 + 16 = 57 (0x39) = '9'
- U+E0156 → 0xE0156 - 0xE0100 + 16 = 102 (0x66) = 'f'
- U+E0149 → 0xE0149 - 0xE0100 + 16 = 89 (0x59) = 'Y'
- U+E0123 → 0xE0123 - 0xE0100 + 16 = 51 (0x33) = '3'
The invisible string consists of 6,492 Unicode characters. It is decoded to Base64, then Base64-decoded into a JavaScript source file, which is executed via eval().
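
For analysts who want to reproduce the arithmetic, here is a minimal Python sketch of the mapping. It is not the code from decode.js: the function names are hypothetical, and the handling of the first range (VS1-VS16 mapping to bytes 0-15) is an assumption inferred from the +16 offset. The sketch stops at decoding and never executes the result.

```python
import base64


def selector_to_byte(ch):
    """Map one variation selector to a byte value, or None for other characters."""
    cp = ord(ch)
    if 0xFE00 <= cp <= 0xFE0F:       # assumed: VS1-VS16 -> 0..15
        return cp - 0xFE00
    if 0xE0100 <= cp <= 0xE01EF:     # VS17-VS256 -> 16..255
        return cp - 0xE0100 + 16
    return None


def decode_invisible(text):
    """Collect the bytes hidden in a string, skipping ordinary characters."""
    return bytes(b for b in map(selector_to_byte, text) if b is not None)


# The ten code points from the walkthrough above.
first_ten = "".join(chr(cp) for cp in (
    0xE0154, 0xE015D, 0xE0136, 0xE0169, 0xE0139,
    0xE0136, 0xE0129, 0xE0156, 0xE0149, 0xE0123,
))

b64_prefix = decode_invisible(first_ten).decode("ascii")
# Base64 decodes in groups of four characters, so use the first eight.
source_prefix = base64.b64decode(b64_prefix[:8])
print(b64_prefix, source_prefix)
```

The first eight Base64 characters decode to the start of a JavaScript variable declaration, which matches the description of the payload as JavaScript source.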

Stage 2: C2 discovery and payload retrieval
The decoded invisible string is converted into JavaScript, which contains the attackers’ command-and-control (C2) URLs.



Stage 3: Stealer: Credential Harvesting and Exfiltration
This decoded payload hunts for credentials. It searches for:
- Cryptocurrency wallets: more than 70 hardcoded cryptocurrency wallets are queried.
- GitHub tokens: used to compromise other repositories maintained by the developer.
- NPM tokens: used to carry out supply-chain attacks against downstream packages.
- OpenVSX credentials: used to inject malicious code into additional extensions hosted on OpenVSX.


Interestingly, for the next-stage payload, the malware queries a Google Calendar entry at `https://calendar.app.google/M2ZCvM8ULL56PD1d6`. The calendar item's title is a Base64 string (aHR0cDovLzIxNy42OS4zLjIxOC9nZXRfem9tYmlfcGF5bG9hZC9xUUQlMkZKb2kzV0NXU2s4Z2dHSGlUdg==), which decodes to the URL of the next-stage payload: http://217.69.3.218/get_zombi_payload/qQD%2FJoi3WCWSk8ggGHiTdg%3D%3D.
This functions as a secondary C2 channel: if communications via Solana or the direct IP are blocked, the Google Calendar lookup provides a backup path to obtain the payload.
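
Decoding a calendar title of this kind needs nothing beyond the standard library. The Python sketch below decodes the title string as printed in this post and inspects the resulting URL without contacting it. The function name `decode_title` is hypothetical, and the checks cover only the scheme, host, and path prefix, since the trailing token depends on the exact title text.

```python
import base64
from urllib.parse import urlsplit

# Title string copied from the text above.
TITLE = (
    "aHR0cDovLzIxNy42OS4zLjIxOC9nZXRfem9tYmlfcGF5bG9hZC9x"
    "UUQlMkZKb2kzV0NXU2s4Z2dHSGlUdg=="
)


def decode_title(title):
    """Base64-decode an event title into text; raises on malformed input."""
    return base64.b64decode(title, validate=True).decode("utf-8")


url = decode_title(TITLE)
parts = urlsplit(url)   # parse only; no network request is made
print(parts.scheme, parts.hostname, parts.path.split("/")[1])
```

For defenders, the same decoding step is a useful triage habit: any Base64-looking calendar title, commit message, or memo field that decodes to a URL deserves a closer look.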

Stage 4 (Final Stage): ZOMBI: End-to-End Remote Access & Control
Similar to the previous step, querying the Google Calendar URL returns a Base64-encoded payload that is AES-encrypted, with the IV and secret key supplied in the HTTP response headers.

The payload is a bundled application: official npm modules (e.g., `adm-zip`, `socket.io-client`, `bittorrent-dht`) are compiled into a single standalone file so it can run without external dependencies. The bundle contains over 147 packages, which increases its size significantly, eases deployment, and complicates analysis. Below is a screenshot of the code and grep results showing several of the bundled modules.

The bundle includes multiple packages that implement attacker functionality and persistence mechanisms. A few important packages to note:
- socket.io-client - Real-time C2 communication
- engine.io-client - WebSocket transport
- xmlhttprequest-ssl - HTTP requests
- bittorrent-dht - P2P network for decentralized C2
- k-bucket, k-rpc - DHT routing
- bencode - BitTorrent encoding
- sodium-javascript - Complete NaCl crypto
- blake2b, sha256-wasm, sha512-wasm - Hashing
- chacha20-universal, xsalsa20 - Encryption
- adm-zip, yauzl - ZIP handling for payloads
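
A grep-style triage of a suspicious bundle can be approximated with the Python sketch below. It only reports which of the module names listed above appear as whole tokens in a text blob. The `WATCHLIST` tuple and the function name are assumptions for illustration, and the presence of any one of these legitimate packages is not an indicator of compromise on its own.

```python
import re

# Module names taken from the list above.
WATCHLIST = (
    "socket.io-client", "engine.io-client", "xmlhttprequest-ssl",
    "bittorrent-dht", "k-bucket", "k-rpc", "bencode",
    "sodium-javascript", "blake2b", "sha256-wasm", "sha512-wasm",
    "chacha20-universal", "xsalsa20", "adm-zip", "yauzl",
)


def find_bundled_modules(bundle_text):
    """Return watchlist names that occur as whole tokens, in watchlist order."""
    found = []
    for name in WATCHLIST:
        # Reject matches glued to other name characters, e.g. "k-rpc-socket".
        pattern = r"(?<![\w.-])" + re.escape(name) + r"(?![\w.-])"
        if re.search(pattern, bundle_text):
            found.append(name)
    return found


sample = 'require("node_modules/bittorrent-dht/index.js"); require("adm-zip");'
print(find_bundled_modules(sample))
```

The combination is what matters here: DHT routing, NaCl-style cryptography, and real-time sockets bundled into a VS Code extension is an unusual mix worth investigating.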
ZOMBI Behaviour and Capabilities
Let's unpack the capabilities of the ZOMBI payload and examine how it earned its name.
1. BitTorrent DHT - Decentralized C2 Communication
- A Distributed Hash Table (DHT) is a decentralized system that lets computers in a network share and find information without needing a central server.
- Instead of connecting to a fixed domain, GlassWorm queries the BitTorrent DHT network with a public key to retrieve its C2 server configuration, so no domains are hardcoded. Other malware has used this technique in the past to hide command-and-control servers.

- DHT retrieval function: uses the public key `858d53e806734c539b50f15ca72580437ce47ba9` to query the DHT, retries every 5 minutes on failure, retrieves JSON containing a Base64-encoded C2 IP address, and verifies a cryptographic signature.

2. WebSocket C2 Communication
- C2 communication uses a persistent, bidirectional WebSocket connection between an infected machine and the attacker’s server.
- Once established, the channel lets the attacker send commands and receive data in real time without repeated HTTP requests. The commands are explained in the next section.
- Because WebSockets can run over TLS (wss://), traffic often looks like normal encrypted web traffic, making it harder to spot.
- Malware using WebSockets commonly implements reconnection and keepalive logic so the link stays up reliably.

3. Command handler
- A command handler is the routine that receives instructions from the C2 server and turns them into actions on the infected machine.
- It typically parses the incoming command, validates parameters, and then executes the corresponding action.
- Commands supported:
  - start_hvnc - Start Hidden VNC
  - stop_hvnc - Stop Hidden VNC
  - start_socks - Start SOCKS proxy
  - stop_socks - Stop SOCKS proxy
  - command - Execute arbitrary JavaScript code
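
The parse, validate, and execute pattern described above is usually built as a dispatch table. The Python sketch below is a harmless model of that structure, not the malware's code. The JSON message shape (`type`, `params`), the `hvnc` and `received` keys, and all function names are assumptions, and every action is a stub that only records state.

```python
import json


def make_command_handler(context):
    """Build a dispatcher; `context` is a plain dict used for state tracking."""

    def set_state(key, value):
        def action(_params):
            context[key] = value
            return f"{key} -> {value}"
        return action

    def record_only(params):
        # Stub: the real action runs attacker code; this one only logs the request.
        context.setdefault("received", []).append(params.get("code", ""))
        return "recorded"

    actions = {
        "start_hvnc": set_state("hvnc", True),
        "stop_hvnc": set_state("hvnc", False),
        "start_socks": set_state("socks_proxy", True),
        "stop_socks": set_state("socks_proxy", False),
        "command": record_only,
    }

    def handle(raw_message):
        message = json.loads(raw_message)            # 1. parse
        action = actions.get(message.get("type"))    # 2. validate the command name
        if action is None:
            return "unknown command"
        return action(message.get("params", {}))     # 3. dispatch to the stub

    return handle
```

Recognizing this shape during reverse engineering is useful because the keys of the table enumerate the implant's full capability set in one place.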

4. SOCKS Proxy
- A SOCKS proxy is a network relay that forwards a device’s traffic through another host, allowing that host to act as an intermediary for connections.
- This technique has been used by attackers to hide activity and move laterally within compromised networks.
- Once running, the proxy lets the attacker route traffic through the victim's machine, either for anonymity or to reach resources on the victim's internal network.
- Fetches proxy script from C2 server
- Spawns child Node.js process to run proxy
- State tracking with context["socks_proxy"]
5. Hidden VNC (HVNC)
- A Hidden VNC (HVNC) is a remote desktop setup that creates a virtual display on the infected machine that the attacker can see and control, but the desktop is hidden from the local user.
- The attacker connects to the hidden virtual desktop using VNC or similar remote-display protocols, giving full GUI access without interrupting or alerting the user.
- Because the session is invisible locally, HVNC is ideal for stealthy credential harvesting, interactive control, or manual post-exploitation activities.

- Downloads encrypted ASAR archive from C2. An encrypted ASAR archive is a package containing code or native modules that the malware will run.
- The malware verifies the archive’s hash first to ensure it wasn’t tampered with before proceeding.
- For compatibility with older or 32-bit native modules, it spawns a 32-bit (x86) Node.js process to run the decrypted code.
- On Windows the loader often uses start /B to run the Node process in the background without opening a visible window.
6. Encrypted Communications (AES-128-CBC)
- All communications and payloads are encrypted using AES-128-CBC so the malware’s traffic and files are hard to inspect.
- Encryption keys (and the per-payload random IV) are sent dynamically from the C2, often carried in HTTP headers like “iv” and “secret” as seen in the above examples.
- Native modules and payloads are stored encrypted on disk, and only decrypted in memory after the keys are retrieved.
- This design helps attackers evade network/endpoint detection and makes static analysis of on-disk artifacts much harder.

Final Thoughts
The recent GlassWorm incident underscores how threat actors continue to find creative, if not entirely novel, ways to hide malicious code. Using invisible Unicode characters to embed payloads within open-source software is clever in its subtlety, but not unprecedented. What GlassWorm illustrates is less about breakthrough tactics and more about the steady refinement of evasion techniques that can bypass traditional code reviews and automated scanning.
Looking forward, we can expect such tactics to evolve further, potentially combining invisible code with obfuscated behavior, decentralized command-and-control, and legitimate service misuse to complicate detection. This incremental sophistication calls for a defensive mindset that prioritizes understanding context and behavior over relying solely on signature or syntactic detection.
At a practical level, developers can mitigate risk by adopting minimal but effective practices like enforcing consistent code formatting rules that reveal invisible or unexpected characters, integrating automated checks for Unicode anomalies, maintaining stricter dependency vetting processes, and employing runtime monitoring tools that flag unusual activities even from trusted packages. Cultivating a culture of vigilance, transparency, and collaboration within the software community remains key to staying ahead of subtle supply chain threats like GlassWorm.
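
As one concrete form of an automated Unicode check, the Python sketch below flags long runs of variation selectors in source text. The function name and the `min_length` threshold of 8 are assumptions chosen for illustration. A single selector after an emoji is normal, so only chained runs are reported.

```python
import re

# Runs of characters from either variation selector range.
VS_RUN = re.compile("[\uFE00-\uFE0F\U000E0100-\U000E01EF]+")


def find_invisible_runs(text, min_length=8):
    """Return (line_number, run_length) for each suspiciously long run."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in VS_RUN.finditer(line):
            run_length = len(match.group())
            if run_length >= min_length:
                hits.append((line_number, run_length))
    return hits


# A benign emoji selector on line 1, a hidden 20-character run on line 2.
sample = 'const ok = "\u2764\uFE0F";\neval(decode("' + chr(0xE0154) * 20 + '"));\n'
print(find_invisible_runs(sample))
```

A check like this fits naturally into a pre-commit hook or CI step that reads each changed file as UTF-8 and fails the build when the returned list is not empty.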
