Research

The science behind vibeware detection

Distribution Shift in AI-Generated Malware

Traditional classifiers trained on human-authored malware lose substantial accuracy when tested against LLM-generated samples. We quantify this distribution shift across five feature spaces (byte n-grams, import tables, section entropy, API call sequences, and TLSH fingerprints) and propose a retraining strategy that uses augmented corpora generated by open-weight code models. Ensemble models with drift-aware retraining pipelines maintain an F1 score above 0.91 on held-out AI-generated samples.
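To make the idea of measuring shift in one feature space concrete, here is a minimal sketch that compares pooled byte n-gram distributions of two corpora using Jensen-Shannon divergence. The choice of divergence, the function names, and the toy byte strings are all assumptions for illustration; the abstract does not state which drift statistic the study uses.

```python
import math
from collections import Counter


def ngram_distribution(samples, n=2):
    """Pool byte n-gram counts over a corpus and normalise to probabilities."""
    counts = Counter()
    for blob in samples:
        counts.update(blob[i:i + n] for i in range(len(blob) - n + 1))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("corpus contains no n-grams of the requested length")
    return {gram: c / total for gram, c in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: 0 for identical distributions,
    1 for distributions with disjoint support."""
    keys = set(p) | set(q)
    # Mixture distribution m = (p + q) / 2 over the union of n-grams.
    m = {g: 0.5 * (p.get(g, 0.0) + q.get(g, 0.0)) for g in keys}

    def kl(a):
        return sum(pa * math.log2(pa / m[g]) for g, pa in a.items() if pa > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)


if __name__ == "__main__":
    # Toy byte strings standing in for two corpora (hypothetical data).
    human_corpus = [b"\x55\x8b\xec\x83\xec\x10\x53\x56", b"\x55\x8b\xec\x6a\xff"]
    generated_corpus = [b"\x48\x89\x5c\x24\x08\x57", b"\x48\x83\xec\x20\x48\x8b"]
    drift = js_divergence(ngram_distribution(human_corpus),
                          ngram_distribution(generated_corpus))
    print(f"byte bigram JS divergence: {drift:.3f}")
```

A score near 0 suggests the two corpora look alike in this feature space, while a score near 1 suggests little overlap. In a drift-aware pipeline, a statistic of this kind could serve as one trigger for retraining, though the threshold would have to be chosen empirically.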

ML, LLM, Distribution Shift, PE Analysis

TLSH Clustering for Polymorphic Variant Detection

Locality-sensitive hashing via TLSH groups malware variants that share structural similarity despite byte-level divergence. We present a clustering pipeline that builds variant graphs at ingest time and updates them incrementally as new samples arrive. Experiments on a dataset of 40,000 PE binaries show that TLSH-based clustering reduces analyst review burden: 73% of novel samples are grouped into existing known-family clusters within seconds of submission.
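The incremental assignment step can be sketched as follows. This is a simplified illustration, not the pipeline described above: it attaches each new digest to the nearest existing cluster when the distance falls under a threshold, and it never merges clusters. The class name, the threshold, and the `hex_hamming` stand-in distance are hypothetical. In practice a real TLSH comparison (for example the `diff` function of the py-tlsh bindings, if installed) would be passed in as `distance`.

```python
from typing import Callable, Dict, List


class IncrementalClusters:
    """Assign each new digest to the nearest cluster, or open a new one."""

    def __init__(self, distance: Callable[[str, str], int], threshold: int):
        self.distance = distance
        self.threshold = threshold
        self.members: Dict[int, List[str]] = {}

    def add(self, digest: str) -> int:
        """Return the cluster id the digest was assigned to."""
        best_id, best_dist = None, None
        for cid, digests in self.members.items():
            # Single-linkage: distance to the closest member of the cluster.
            d = min(self.distance(digest, other) for other in digests)
            if best_dist is None or d < best_dist:
                best_id, best_dist = cid, d
        if best_id is not None and best_dist <= self.threshold:
            self.members[best_id].append(digest)
            return best_id
        new_id = len(self.members)  # ids are dense because clusters never vanish
        self.members[new_id] = [digest]
        return new_id


def hex_hamming(a: str, b: str) -> int:
    """Stand-in distance: number of differing characters.
    This is NOT the TLSH scoring function; it only keeps the sketch runnable."""
    if len(a) != len(b):
        raise ValueError("digests must have equal length")
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    clusters = IncrementalClusters(hex_hamming, threshold=2)
    for digest in ["AAAAAAAA", "AAAAAAAB", "FFFFFFFF", "FFFFFF00"]:
        print(digest, "-> cluster", clusters.add(digest))
```

The linear scan over all members is fine for a sketch but would not scale to tens of thousands of binaries; a production system would likely need an index or a bounded set of cluster representatives.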

TLSH, Clustering, Fuzzy Hashing, Polymorphism

CatBoost Feature Importance in PE Analysis

We perform a systematic feature importance study using CatBoost's built-in SHAP attribution on a dataset of 25,000 labeled PE files. Section entropy variance, import hash, and the ratio of virtual to raw section size emerge as the three most discriminative features. The analysis also reveals that categorical features, such as subsystem type and machine architecture, contribute disproportionately more when CatBoost handles them natively than when they are one-hot encoded for competing gradient boosting models.
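Two of the features named above can be computed with a few lines of standard-library Python. The sketch below is illustrative only: the `Section` container is hypothetical, entropy variance is taken as the population variance of per-section Shannon entropy, and the virtual-to-raw ratio is aggregated over all sections. The abstract does not specify either definition, so both are assumptions.

```python
import math
from collections import Counter
from dataclasses import dataclass
from statistics import pvariance
from typing import Dict, List


@dataclass
class Section:
    """Hypothetical minimal view of a PE section."""
    name: str
    virtual_size: int  # size in memory, as declared in the section header
    raw_data: bytes    # bytes stored on disk


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte, from 0.0 (constant) to 8.0 (uniform)."""
    if not data:
        return 0.0
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())


def section_features(sections: List[Section]) -> Dict[str, float]:
    """Compute entropy variance and the aggregate virtual/raw size ratio."""
    entropies = [shannon_entropy(s.raw_data) for s in sections]
    raw_total = sum(len(s.raw_data) for s in sections)
    virtual_total = sum(s.virtual_size for s in sections)
    return {
        "entropy_variance": pvariance(entropies) if entropies else 0.0,
        # Assumption: no raw bytes at all is reported as an infinite ratio.
        "virtual_to_raw_ratio": (virtual_total / raw_total
                                 if raw_total else float("inf")),
    }


if __name__ == "__main__":
    sections = [
        Section(".text", 512, bytes(range(256))),  # high-entropy toy data
        Section(".data", 512, b"\x00" * 256),      # zero-entropy toy data
    ]
    print(section_features(sections))
```

A large virtual-to-raw ratio combined with high entropy variance is a common pattern in packed binaries, where a small on-disk stub unpacks into a much larger in-memory image, which is consistent with these features ranking highly.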

CatBoost, SHAP, Feature Importance, PE Headers