Paper
7 May 2012 Fractals, malware, and data models
Holger M. Jaenisch, Andrew N. Potter, Deborah Williams, James W. Handley
Abstract
We examine the hypothesis that the decision boundary between malware and non-malware is fractal. We introduce a novel encoding method, derived from text mining, that first converts disassembled programs into opstrings and then filters these into a reduced opcode alphabet. These opcodes are enumerated and encoded as real floating-point numbers, which are used to characterize the frequency of occurrence and distribution properties of malware functions for comparison with non-malware functions. We use the concept of invariant moments to characterize the highly non-Gaussian structure of the opcode distributions. We then derive Data Model based classifiers from the identified features, and interpolate and extrapolate the parameter sample space for the derived Data Models. This is done to examine the nature of the classification boundary in parameter space between families of malware and the general non-malware category. Preliminary results strongly support the fractal boundary hypothesis, and a summary of our methods and results is presented here.
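The encoding pipeline described in the abstract (disassembly to opstring, opstring to reduced alphabet, alphabet to floating-point values, values to moments) can be illustrated with a minimal sketch. The abstract does not give the actual reduced alphabet, the numeric encoding, or the exact definition of the invariant moments, so everything below is an assumption made for illustration: the alphabet `REDUCED_ALPHABET`, the index-to-[0, 1] scaling, the helper names, and the use of standardized moments (skewness and kurtosis) as a stand-in for the paper's invariant moments.

```python
import math
from collections import Counter

# Hypothetical reduced opcode alphabet; the paper's alphabet is not stated in the abstract.
REDUCED_ALPHABET = ["mov", "push", "pop", "call", "jmp", "cmp", "add", "xor"]
OPCODE_TO_INDEX = {op: i for i, op in enumerate(REDUCED_ALPHABET)}


def to_opstring(disassembly_lines):
    """Keep only the mnemonic (first token) of each disassembled instruction."""
    return [line.split()[0].lower() for line in disassembly_lines if line.strip()]


def encode(opstring):
    """Drop opcodes outside the reduced alphabet and map the rest to floats in [0, 1].

    Assumed encoding: alphabet index divided by the largest index.
    """
    scale = len(REDUCED_ALPHABET) - 1
    return [OPCODE_TO_INDEX[op] / scale for op in opstring if op in OPCODE_TO_INDEX]


def opcode_frequencies(opstring):
    """Relative frequency of each reduced-alphabet opcode in one function."""
    kept = [op for op in opstring if op in OPCODE_TO_INDEX]
    if not kept:
        return {}
    counts = Counter(kept)
    return {op: counts[op] / len(kept) for op in REDUCED_ALPHABET}


def moments(values):
    """Return (mean, standard deviation, skewness, kurtosis) using population formulas."""
    n = len(values)
    mean = sum(values) / n

    def central(k):
        return sum((v - mean) ** k for v in values) / n

    variance = central(2)
    std = math.sqrt(variance)
    if std == 0:
        return mean, 0.0, 0.0, 0.0
    return mean, std, central(3) / std ** 3, central(4) / variance ** 2


# Example: one made-up disassembled function.
function = [
    "push ebp",
    "mov ebp, esp",
    "xor eax, eax",
    "nop",
    "call 0x401000",
    "pop ebp",
    "ret",
]
opstring = to_opstring(function)
encoded = encode(opstring)
features = moments(encoded)
```

Here `features` is a four-element tuple that could serve as one row of a feature table, with one row per function and a malware or non-malware label attached. A strongly non-Gaussian opcode distribution would show up as skewness far from 0 or kurtosis far from 3.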
© (2012) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Holger M. Jaenisch, Andrew N. Potter, Deborah Williams, and James W. Handley "Fractals, malware, and data models", Proc. SPIE 8408, Cyber Sensing 2012, 84080X (7 May 2012); https://doi.org/10.1117/12.941769
KEYWORDS
Data modeling

Fractal analysis

Prototyping

Binary data

Statistical analysis

Platinum

Brain
