
Invisible Threats and the Blind Spots of Security

How GlassWorm Exploited Unicode Shadows in VS Code Supply Chains

Magnifying glass highlighting a red warning triangle with an exclamation mark on a dark red and black background.


Introduction

In October 2025, researchers at KOI Security uncovered a malware campaign that targeted the Open VSX Registry. Nicknamed “GlassWorm” by the KOI Security team, the malware campaign published multiple malicious VS Code extensions that together were downloaded more than 35,000 times. The compromised packages shared several notable traits for stealthy payload delivery, including an unconventional Unicode-based obfuscation technique that remained invisible within IDEs.

This research was motivated by the 2021 Trojan Source research, which showed how carefully chosen Unicode control characters can change how code is displayed or interpreted. That work forced platforms and IDEs, including GitHub and many editors, to surface warnings for suspicious Unicode sequences. The campaign we analyzed, however, uses a different and under-observed class of characters (variation selectors) that remain largely invisible to common tooling.

While the earlier reports surfaced the campaign and its broad implications, our follow-up research dives deeper into the technical underpinnings of GlassWorm’s obfuscation and propagation mechanisms. We reverse engineered the malicious VSIX extensions and deconstructed the encoded payloads. This post dissects GlassWorm’s design and capabilities, explains why conventional tooling missed it, and outlines the steps defenders and developers should adopt.

This report has three sections

1

What makes GlassWorm different

GlassWorm is notable for how it blends multiple evasion and resilience techniques into a single campaign:

  • Invisible Unicode obfuscation. The payload is embedded as long runs of invisible Unicode characters (Variation Selectors / Private Use Area code points). Those characters do not display in most editors and are therefore difficult for casual human review to notice. The malware maps these code points to bytes and reconstructs a Base64 blob that decodes to executable JavaScript.
  • Decentralized, resilient C2 channels. Rather than relying on a single server, GlassWorm uses multiple fallback channels:
    • Transaction memos on the Solana blockchain as an immutable, censorship‑resistant command-and-control (C2) channel.
    • Direct HTTP(S) endpoints hosted at attacker IPs.
    • Encoded Google Calendar event titles as an additional fallback communication vector.
  • Credential and token harvesting. Once active, the malware searches the host for developer credentials (GitHub, NPM, OpenVSX tokens, crypto wallets), enabling further compromise of repositories and package uploads.
  • Repurposing developer machines. Infected hosts are converted into covert infrastructure: SOCKS proxies, hidden VNC (HVNC) servers, and remote execution nodes (via WebRTC or spawned Node.js processes). That gives attackers anonymized network access into corporate and personal networks and a platform to propagate further.
  • Bundled, multi‑platform payloads. The final payload is a single large bundled artifact containing many standard modules and native decoders for different OSes. This reduces dependency visibility and eases execution across platforms.
2

High-level execution flow

Stage 1: Payload Decoding

1. Extension activation

The extension’s package.json registers an activation event that immediately loads ./extension.js. extension.js enforces rate-limited activation: a helper routine runs on first activation and then only after a configured cooldown (~2 days). Timestamps are persisted to context.globalState so that the cooldown survives restarts, and an in‑session flag prevents multiple invocations per process lifetime.

JSON code defining a Recoil extension with version 0.7.4, a dark theme for Rust and Zig languages, published by ellacrity.
Image of initial execution hook in package.json

Observed behavior:

  • First run: helper() executes and the activation timestamp is recorded.
  • Subsequent runs (within 2 days): no action is taken; helper() is not called.
  • After 2 days: helper() runs once more and the timestamp is updated.
  • Within one process lifetime: isActivated ensures helper() cannot be invoked more than once.
JavaScript async function named activate managing activation state with global context and executing helper function marked as malware payload.
Image of time based execution
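
The gating behavior listed above can be modeled as a small pure function. This is a minimal sketch for illustration, not the extension's code: `shouldRunHelper`, `COOLDOWN_MS`, and the shape of the `state` object are hypothetical, and the two-day cooldown is taken from the observed behavior.

```javascript
// Hypothetical reconstruction of the activation gate; all names are illustrative.
const COOLDOWN_MS = 2 * 24 * 60 * 60 * 1000; // ~2 days, per the observed behavior

// state.lastRun: timestamp that survives restarts (globalState in the extension)
// state.isActivated: in-memory flag, reset whenever the process restarts
function shouldRunHelper(state, now) {
  if (state.isActivated) return false; // already ran in this process
  const last = state.lastRun;
  if (last !== undefined && now - last < COOLDOWN_MS) return false; // still cooling down
  state.isActivated = true;
  state.lastRun = now;
  return true;
}

const state = {};
console.log(shouldRunHelper(state, 0));    // true  (first run)
console.log(shouldRunHelper(state, 1000)); // false (same process)
```

For sandbox analysis, the practical consequence is that a second detonation inside the cooldown window shows no malicious behavior unless the persisted timestamp is cleared.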

2. Payload entry and decoding

The helper() function launches index.js (or an OS‑specific entrypoint). index.js contains the invisible Unicode string payload. Based on the host OS it selects a decoder:

  • Windows x86: index_win32_ia32.node (native)
  • Windows x64: index_win32_x64.node (native)
  • Linux: index_linux.node (native)
  • macOS: decode.js (JavaScript)

These native binaries were first uploaded to VirusTotal on 2025-03-23 and remained at 0 detections until the attack on 2025-10-22. This suggests that the binaries were likely off-the-shelf tools reused from previous campaigns or public sources, rather than newly developed components.

Links to Virustotal

JavaScript function 'getPath' that returns platform-specific file paths based on OS platform detection.
Image of decoder selection based on OS
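
The OS-based selection can be sketched as a lookup on the platform and architecture values that Node.js exposes. The file names come from the list above; the function name `selectDecoder` and the error path for unlisted platforms are assumptions made for this example.

```javascript
// Illustrative mapping of platform/arch to the decoder files listed above.
// The function name and the error handling are assumptions, not the sample's code.
function selectDecoder(platform, arch) {
  if (platform === "win32" && arch === "ia32") return "index_win32_ia32.node";
  if (platform === "win32" && arch === "x64") return "index_win32_x64.node";
  if (platform === "linux") return "index_linux.node";
  if (platform === "darwin") return "decode.js";
  throw new Error(`no decoder listed for ${platform}/${arch}`);
}

console.log(selectDecoder("win32", "x64")); // index_win32_x64.node
console.log(selectDecoder("darwin", "arm64")); // decode.js
```

In a running Node.js process the two inputs would come from `process.platform` and `process.arch`.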

3. The invisible‑Unicode technique

The decoder maps each invisible Unicode code point to a byte value (using simple offset arithmetic) to reconstruct a Base64-encoded blob. That Base64 blob decodes into JavaScript, which is then executed.

The comparison below shows the payload file opened in both the IDE and a hex editor. While the payload appears invisible in the IDE, its underlying byte values are clearly visible in the hex editor.

Split screen showing hex dump on the left and JavaScript code for OS info decoding and path selection on the right.
Image of comparison between HexEditor and VS Code

Before we dive into the technique, let's understand: 

1. Code Points

Code Points, the foundation of Unicode: A code point is a number that uniquely identifies a character in Unicode. Think of it as an address or ID for a character:

  • The letter “A” = code point U+0041 (decimal 65)
  • The symbol “€” = code point U+20AC (decimal 8364)

Every character you can type, whether a letter, symbol, emoji, or invisible formatting character, has a unique code point number. The malware exploits this by using invisible characters whose code points can be mathematically converted into byte values (0-255).
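
A short JavaScript illustration of code points follows. It uses only standard string methods; the variable name `vs17` is ours. It also shows a detail that matters when analyzing this payload: code points above U+FFFF take two UTF-16 code units in a JavaScript string, so they should be iterated by code point rather than by index.

```javascript
// codePointAt returns the numeric code point of the character at an index.
console.log("A".codePointAt(0)); // 65   (U+0041)
console.log("€".codePointAt(0)); // 8364 (U+20AC)

// U+E0100 is outside the Basic Multilingual Plane, so it is stored as a
// surrogate pair: length counts UTF-16 units, Array.from counts code points.
const vs17 = String.fromCodePoint(0xE0100);
console.log(vs17.length);             // 2
console.log(Array.from(vs17).length); // 1
console.log(vs17.codePointAt(0).toString(16)); // e0100
```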

2. Unicode Variation Selectors (VS)

Unicode Variation Selectors are special invisible characters used to select different visual representations of the same character. There are two ranges:

  • VS1-VS16: U+FE00 to U+FE0F (code points 65024-65039, 16 selectors)
  • VS17-VS256: U+E0100 to U+E01EF (code points 917760-917999, 240 selectors)

These characters do not render in most text editors, which makes them invisible. Together the two ranges provide 16 + 240 = 256 distinct values, enough to represent every byte value (0-255).

The real decoding of invisible characters from decode.js

JavaScript code defining functions variationSelectorToByte and decode to convert variation selectors to bytes and exporting decode function.
Image of decoder functions from decode.js

Here is the exact conversion of the first 10 invisible characters from `index.js`, which yields the Base64 string "dmFyIF9fY3".
The math is simple arithmetic: for code points in the U+E0100 range, byte_value = code_point - 0xE0100 + 16.

Character-by-character decoding:

  1. U+E0154 → 0xE0154 - 0xE0100 + 16 = 100 (0x64) = 'd'
  2. U+E015D → 0xE015D - 0xE0100 + 16 = 109 (0x6D) = 'm'
  3. U+E0136 → 0xE0136 - 0xE0100 + 16 =  70 (0x46) = 'F'
  4. U+E0169 → 0xE0169 - 0xE0100 + 16 = 121 (0x79) = 'y'
  5. U+E0139 → 0xE0139 - 0xE0100 + 16 =  73 (0x49) = 'I'
  6. U+E0136 → 0xE0136 - 0xE0100 + 16 =  70 (0x46) = 'F'
  7. U+E0129 → 0xE0129 - 0xE0100 + 16 =  57 (0x39) = '9'
  8. U+E0156 → 0xE0156 - 0xE0100 + 16 = 102 (0x66) = 'f'
  9. U+E0149 → 0xE0149 - 0xE0100 + 16 =  89 (0x59) = 'Y'
  10. U+E0123 → 0xE0123 - 0xE0100 + 16 =  51 (0x33) = '3'

The invisible string consists of 6,492 Unicode characters. It is first decoded to Base64 text, which is then Base64-decoded into a JavaScript source file and executed via eval().
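
The mapping can be reproduced with a few lines of JavaScript, which is useful for analysts who want to recover the hidden text without running the sample. This is our reconstruction under stated assumptions: the function names mirror those visible in the decode.js screenshot, but the bodies are a sketch, and the mapping of U+FE00-U+FE0F to bytes 0-15 is inferred from the "+16" offset rather than confirmed from the sample.

```javascript
// Sketch of the mapping described above; bodies are a reconstruction.
function variationSelectorToByte(codePoint) {
  if (codePoint >= 0xFE00 && codePoint <= 0xFE0F) {
    return codePoint - 0xFE00; // VS1-VS16 -> bytes 0-15 (assumed)
  }
  if (codePoint >= 0xE0100 && codePoint <= 0xE01EF) {
    return codePoint - 0xE0100 + 16; // VS17-VS256 -> bytes 16-255
  }
  return null; // not a variation selector
}

function decode(text) {
  const bytes = [];
  for (const ch of text) { // for...of iterates by code point
    const b = variationSelectorToByte(ch.codePointAt(0));
    if (b !== null) bytes.push(b);
  }
  return Buffer.from(bytes).toString("latin1"); // bytes -> Base64 text
}

// First ten code points from the walkthrough above.
const sample = String.fromCodePoint(
  0xE0154, 0xE015D, 0xE0136, 0xE0169, 0xE0139,
  0xE0136, 0xE0129, 0xE0156, 0xE0149, 0xE0123
);
console.log(decode(sample)); // dmFyIF9fY3
```

When analyzing a real sample, write the decoded output to a file for inspection rather than passing it to eval().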

Code snippet showing a decoded buffer and decoded bytes as arrays, followed by a decoded string variable with base64 encoded text.
Image of decoded payload

Stage 2: C2 discovery and payload retrieval

Once decoded, the invisible string yields JavaScript that contains the attackers’ command-and-control (C2) URLs.

JavaScript function getUrl asynchronously fetches Solana wallet transaction signatures, filters for a memo field, waits with timeouts, and parses the memo as JSON.
Image of decoded payload from invisible chars
The reconstructed script continuously polls the Solana blockchain (every ~10s), looking for transactions sent to a specific wallet, “28PKnu7RzizxBzFPoLp69HLXp9bJL3JFtT2s5QzHsEA2”. It extracts the transaction memo field, interprets it as Base64/JSON, and obtains C2 IP addresses (217.69.3[.]21).
Screenshot of Memo Program v2 instruction showing program logs, a memo link, compute units consumed, and a success message.
Image of Solana transactions’ memo
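
The memo-parsing step can be approximated offline as follows. This is an analyst-side sketch, not the sample's code: the record shape (`signature`, `memo`), the `link` field name, the optional "[length] " prefix handling, and the example address (from a documentation-only IP range) are all assumptions for illustration.

```javascript
// Analyst-side sketch: pull a C2 address out of a list of signature records.
// The record shape and the "link" field name are assumptions.
function extractC2FromMemos(records) {
  for (const rec of records) {
    if (!rec.memo) continue; // skip transactions without a memo
    const jsonText = rec.memo.replace(/^\[\d+\]\s+/, ""); // drop an optional "[len] " prefix
    try {
      const parsed = JSON.parse(jsonText);
      if (typeof parsed.link !== "string") continue;
      return Buffer.from(parsed.link, "base64").toString("utf8");
    } catch {
      continue; // malformed memo: try the next record
    }
  }
  return null;
}

const records = [
  { signature: "sig-1", memo: null },
  {
    signature: "sig-2",
    memo: JSON.stringify({ link: Buffer.from("http://198.51.100.7/").toString("base64") }),
  },
];
console.log(extractC2FromMemos(records)); // http://198.51.100.7/
```
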
The malware contacts the C2 URL to retrieve an encrypted Stage 3 payload. The HTTP response headers carry an IV and an encryption key, and the response body contains Base64 data, which is decrypted (AES‑based) to obtain the next‑stage code and executed via eval().
Code snippet showing an async JavaScript function fetching data with headers including OS info, handling response text and headers, and extracting secret keys, alongside HTTP response headers and an eval base64 decoded string.
Image of HTTP response from 217.69.3[.]218

Stage 3: Stealer: Credential Harvesting and Exfiltration

The purpose of this decoded payload is to hunt for credentials. It searches for:

  1. Cryptocurrency wallets: more than 70 hardcoded cryptocurrency wallets are queried.
  2. GitHub tokens: used to compromise other repositories maintained by the developer.
  3. NPM tokens: used to carry out supply-chain attacks against downstream packages.
  4. OpenVSX credentials: used to inject malicious code into additional extensions hosted on OpenVSX.
Screen displaying code with a list of wallet names assigned to corresponding string values, with Eth_and_Polk_Web3_Wallet line highlighted.
Image of crypto wallets targeted
Code snippet showing a JavaScript function using obfuscated variable names and Git command execution to retrieve remote origin URL and parse password.
Image of GitHub token retrieval

Interestingly, for the next-stage payload, the malware queries a Google Calendar entry at `https://calendar.app.google/M2ZCvM8ULL56PD1d6`. The calendar item's title is a Base64 string (aHR0cDovLzIxNy42OS4zLjIxOC9nZXRfem9tYmlfcGF5bG9hZC9xUUQlMkZKb2kzV0NXU2s4Z2dHSGlUdg==), which decodes to the next-stage payload URL http://217.69.3.218/get_zombi_payload/qQD%2FJoi3WCWSk8ggGHiTdg%3D%3D.

This functions as a secondary C2 channel: if communications via Solana or the direct IP are blocked, the Google Calendar lookup provides a backup path to obtain the payload.

Google Calendar event confirmation showing date Tuesday, 17 March 2026, time 04:00–05:00, with 1 guest and organizer email displayed.
Image of Google Calendar invite
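
The fallback idea can be modeled as an ordered list of resolvers that are tried until one yields a value. The sketch below is deliberately simplified and synchronous, whereas real lookups are network calls; the resolver names, the example title, and the example address (from a documentation-only IP range) are hypothetical.

```javascript
// Simplified, synchronous model of an ordered fallback across channels.
// All resolver names and values here are hypothetical.
function resolveWithFallback(resolvers) {
  for (const { name, resolve } of resolvers) {
    try {
      const value = resolve();
      if (value) return { channel: name, value };
    } catch {
      // channel blocked or unreachable: fall through to the next one
    }
  }
  return null;
}

// A calendar-style title: the URL is carried as Base64 text.
const calendarTitle = Buffer.from("http://198.51.100.7/next").toString("base64");

const result = resolveWithFallback([
  { name: "solana-memo", resolve: () => { throw new Error("RPC blocked"); } },
  { name: "direct-ip", resolve: () => null },
  { name: "calendar-title", resolve: () => Buffer.from(calendarTitle, "base64").toString("utf8") },
]);
console.log(result.channel); // calendar-title
console.log(result.value);   // http://198.51.100.7/next
```

For defenders, the point of the model is that blocking any single channel is not sufficient; each channel in the list needs its own control.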

Stage 4 (Final Stage): ZOMBI: End-to-End Remote Access & Control

As in the previous step, requesting the URL obtained from the Google Calendar entry returns a Base64-encoded, AES-encrypted payload, with the IV and secret key supplied in the HTTP response headers.

Screenshot of JavaScript code importing crypto, fs, and path modules, creating a buffer from a base64 string, and initializing an AES-256-CBC decipher.
Image of response from Google Calendar mentioned IP

The payload is a bundled application: official npm modules (e.g., `adm-zip`, `socket.io-client`, `bittorrent-dht`) are compiled into a single standalone file so that it can run without external dependencies. The bundle contains over 147 packages, which increases its size significantly; this aids deployment and complicates analysis. Below is a screenshot of the code and grep results showing several of the bundled modules.

Image of grep showing the bundled modules

The bundle includes multiple packages that implement attacker functionality and persistence mechanisms. A few important packages to note:

Network & Communication
  • socket.io-client - Real-time C2 communication
  • engine.io-client - WebSocket transport
  • xmlhttprequest-ssl - HTTP requests
BitTorrent DHT Stack
  • bittorrent-dht - P2P network for decentralized C2
  • k-bucket, k-rpc - DHT routing
  • bencode - BitTorrent encoding
Cryptography
  • sodium-javascript - Complete NaCl crypto
  • blake2b, sha256-wasm, sha512-wasm - Hashing
  • chacha20-universal, xsalsa20 - Encryption
File Operations
  • adm-zip, yauzl - ZIP handling for payloads

ZOMBI Behaviour and Capabilities

Let's unpack the capabilities of the ZOMBI payload and examine how it earned its name.

1. BitTorrent DHT - Decentralized C2 Communication

  • A Distributed Hash Table (DHT) is a decentralized system that lets computers in a network share and find information without needing a central server.
  • Instead of connecting to a fixed domain, GlassWorm queries the decentralized BitTorrent DHT network using a public key to retrieve its C2 server configuration, so no domains need to be hardcoded. Other malware has used this technique in the past to hide command-and-control servers.
JavaScript code snippet defining a constant PUBLIC_KEY as a hex buffer, initializing a client_default with a crypto signature verification method, and a listener setting dht_is_ready to true and calling _x86_downloadAndRunFile function.
  1. DHT Retrieval Function:
    • Uses public key `858d53e806734c539b50f15ca72580437ce47ba9` to query the DHT
    • Retries every 5 minutes on failure
    • Retrieves JSON with a Base64-encoded C2 IP address
    • Uses cryptographic signature verification
JavaScript code snippet defining an async function reGet2 with conditional retries, DHT get and put operations, JSON parsing, and a call to download and run a file.
Image of function that queries DHT 

2. WebSocket C2 Communication

  • C2 communication uses a persistent, bidirectional connection between an infected machine and the attacker’s server.
  • Once established, the channel lets the attacker send commands and receive data in real time without repeated HTTP requests. The commands are explained in the next section.
  • Because WebSockets can run over TLS (wss://), traffic often looks like normal encrypted web traffic, making it harder to spot.
  • Malware using WebSockets commonly implements reconnection and keepalive logic so the link stays up reliably.
Highlighted JavaScript code defining a connectionWS function to manage WebSocket connection events with retries.
Image of function that connects attacker's server

3. Command handler

  • A command handler is the routine that receives instructions from the C2 server and turns them into actions on the infected machine.
  • It typically parses the incoming command, validates parameters, and then executes the corresponding action.
  • Commands supported:
    • start_hvnc - Start Hidden VNC
    • stop_hvnc - Stop Hidden VNC
    • start_socks - Start SOCKS proxy
    • stop_socks - Stop SOCKS proxy
    • command - Execute arbitrary JavaScript code
JavaScript code snippet handling tasks to start or stop hvnc and socks proxy with conditional event emitting and status checks.
Image of the command handler function
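
The parse, validate, and dispatch pattern can be illustrated with an inert model in which the handlers only flip flags. The command names come from the list above and `context["socks_proxy"]` mirrors the state tracking noted in this report; `handleCommand`, the `message.type` field, and the return values are hypothetical. The `command` type is intentionally left unimplemented in this model.

```javascript
// Inert model of the dispatch pattern: handlers only flip flags in `context`.
// Command names come from the list above; everything else is illustrative.
const context = { hvnc: false, socks_proxy: false };

const handlers = {
  start_hvnc: () => { context.hvnc = true; },
  stop_hvnc: () => { context.hvnc = false; },
  start_socks: () => { context.socks_proxy = true; },
  stop_socks: () => { context.socks_proxy = false; },
};

function handleCommand(message) {
  if (!message || typeof message.type !== "string") return "invalid";
  // hasOwnProperty guards against inherited keys such as "toString"
  if (!Object.prototype.hasOwnProperty.call(handlers, message.type)) return "ignored";
  handlers[message.type]();
  return "ok";
}

console.log(handleCommand({ type: "start_socks" }), context.socks_proxy); // ok true
console.log(handleCommand({ type: "command" })); // ignored
```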

4. SOCKS Proxy

  • A SOCKS proxy is a network relay that forwards a device’s traffic through another host, allowing that host to act as an intermediary for connections.
  • This technique has been used by attackers to hide activity and move laterally within compromised networks.
  • Once running, the proxy lets the attacker route traffic through the victim’s machine, either for anonymity or to reach the victim’s internal network resources.
  • Implementation observed in the payload:
    • Fetches proxy script from C2 server
    • Spawns child Node.js process to run proxy
    • State tracking with context["socks_proxy"]

Image of SOCKS proxy connection

5. Hidden VNC (HVNC)

  • A Hidden VNC (HVNC) is a remote desktop setup that creates a virtual display on the infected machine that the attacker can see and control, but the desktop is hidden from the local user.
  • The attacker connects to the hidden virtual desktop using VNC or similar remote-display protocols, giving full GUI access without interrupting or alerting the user.
  • Because the session is invisible locally, HVNC is ideal for stealthy credential harvesting, interactive control, or manual post-exploitation activities.
JavaScript code snippet showing decryption using AES-128-CBC, reading and writing files synchronously, and emitting tasks with status handling.
Image of HVNC setup
  • Downloads encrypted ASAR archive from C2. An encrypted ASAR archive is a package containing code or native modules that the malware will run.
  • The malware verifies the archive’s hash first to ensure it wasn’t tampered with before proceeding.
  • For compatibility with older or 32-bit native modules, it spawns a 32-bit (x86) Node.js process to run the decrypted code.
  • On Windows the loader often uses start /B to run the Node process in the background without opening a visible window.

6. Encrypted Communications (AES-128-CBC)

  • All communications and payloads are encrypted using AES-128-CBC so the malware’s traffic and files are hard to inspect.
  • Encryption keys (and the per-payload random IV) are sent dynamically from the C2, often carried in HTTP headers like “iv” and “secret” as seen in the above examples.
  • Native modules and payloads are stored encrypted on disk, and only decrypted in memory after the keys are retrieved.
  • This design helps attackers evade network/endpoint detection and makes static analysis of on-disk artifacts much harder.
JavaScript code snippet fetching encrypted data, extracting encryption key and IV from headers, decrypting AES-128-CBC data, and running a function on the decrypted result.
Image of Encrypted communication routine
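
For analysts who have captured a response, the decryption step can be reproduced with Node's built-in crypto module. This is a minimal sketch under assumptions: the header names follow the text ("secret" and "iv"), but how the sample encodes those header values is not stated here, so the encodings below (a 16-character ASCII key and a Base64 IV) are placeholders. The round trip uses locally generated test values, not campaign data.

```javascript
const nodeCrypto = require("crypto");

// Analyst-side sketch: decrypt a Base64 body with a key and IV taken from
// response headers. The encodings of the header values are assumptions.
function decryptPayload(bodyBase64, headers) {
  const key = Buffer.from(headers.secret, "utf8"); // assumed: 16 ASCII bytes
  const iv = Buffer.from(headers.iv, "base64");    // assumed: Base64, 16 bytes
  const decipher = nodeCrypto.createDecipheriv("aes-128-cbc", key, iv);
  return Buffer.concat([
    decipher.update(Buffer.from(bodyBase64, "base64")),
    decipher.final(),
  ]).toString("utf8");
}

// Round trip with locally generated test values.
const key = "0123456789abcdef";
const iv = nodeCrypto.randomBytes(16);
const cipher = nodeCrypto.createCipheriv("aes-128-cbc", Buffer.from(key, "utf8"), iv);
const body = Buffer.concat([
  cipher.update("console.log('stage')", "utf8"),
  cipher.final(),
]).toString("base64");

console.log(decryptPayload(body, { secret: key, iv: iv.toString("base64") }));
// console.log('stage')
```

As with the Unicode decoder, the decrypted text should be saved and inspected, never evaluated.
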
3

Final Thoughts

The recent GlassWorm incident underscores how threat actors continue to find creative, if not entirely novel, ways to hide malicious code. Using invisible Unicode characters to embed payloads within open-source software is clever in its subtlety, but not unprecedented. What GlassWorm illustrates is less about breakthrough tactics and more about the steady refinement of evasion techniques that can bypass traditional code reviews and automated scanning.

Looking forward, we can expect such tactics to evolve further, potentially combining invisible code with obfuscated behavior, decentralized command-and-control, and legitimate service misuse to complicate detection. This incremental sophistication calls for a defensive mindset that prioritizes understanding context and behavior over relying solely on signature or syntactic detection.

At a practical level, developers can mitigate risk by adopting minimal but effective practices like enforcing consistent code formatting rules that reveal invisible or unexpected characters, integrating automated checks for Unicode anomalies, maintaining stricter dependency vetting processes, and employing runtime monitoring tools that flag unusual activities even from trusted packages. Cultivating a culture of vigilance, transparency, and collaboration within the software community remains key to staying ahead of subtle supply chain threats like GlassWorm.
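
As one example of an automated check for Unicode anomalies, the sketch below flags long runs of variation selectors in source text. It is a starting point rather than a vetted detector: `findInvisibleRuns` is our name, and the default threshold of 8 is an arbitrary value chosen so that ordinary emoji sequences, which use a single selector, are not reported.

```javascript
// Flags runs of variation selectors (U+FE00-U+FE0F, U+E0100-U+E01EF).
// The threshold of 8 is an arbitrary starting point, not a vetted value.
const VS_RUN = /[\uFE00-\uFE0F\u{E0100}-\u{E01EF}]+/gu;

function findInvisibleRuns(source, minRun = 8) {
  const findings = [];
  for (const match of source.matchAll(VS_RUN)) {
    const count = Array.from(match[0]).length; // code points, not UTF-16 units
    if (count >= minRun) {
      const line = source.slice(0, match.index).split("\n").length;
      findings.push({ line, count });
    }
  }
  return findings;
}

const benign = 'const heart = "\u2764\uFE0F";'; // emoji with one selector
const suspicious = "// header\nconst s = `" + String.fromCodePoint(0xE0154).repeat(20) + "`;";

console.log(findInvisibleRuns(benign).length); // 0
console.log(findInvisibleRuns(suspicious));    // one finding: line 2, count 20
```

A check like this can run in CI or as a pre-install step over the unpacked contents of an extension, with findings routed to manual review.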
