|
| 1 | +This directory is provided as a courtesy. It includes the MalConv model against which we compared in https://arxiv.org/abs/1804.04637. |
| 2 | + |
| 3 | +For more details about MalConv, please see (and cite) the [original paper](https://arxiv.org/abs/1710.09435). |
| 4 | + |
| 5 | +``` |
| 6 | +Raff, Edward, et al. "Malware detection by eating a whole exe." arXiv preprint arXiv:1710.09435 (2017). |
| 7 | +``` |
| 8 | + |
| 9 | +If you use the pre-trained weights or code in your work, we also ask that you please cite [our paper](https://arxiv.org/pdf/1804.04637.pdf) for this implementation of MalConv, as it differs in a few subtle ways from the original. |
| 10 | + |
| 11 | +``` |
| 12 | +H. Anderson and P. Roth, "EMBER: An Open Dataset for Training Static PE Malware Machine Learning Models", in ArXiv e-prints. Apr. 2018. |
| 13 | + |
| 14 | +@ARTICLE{2018arXiv180404637A, |
| 15 | + author = {{Anderson}, H.~S. and {Roth}, P.}, |
| 16 | + title = "{EMBER: An Open Dataset for Training Static PE Malware Machine Learning Models}", |
| 17 | + journal = {ArXiv e-prints}, |
| 18 | + archivePrefix = "arXiv", |
| 19 | + eprint = {1804.04637}, |
| 20 | + primaryClass = "cs.CR", |
| 21 | + keywords = {Computer Science - Cryptography and Security}, |
| 22 | + year = 2018, |
| 23 | + month = apr, |
| 24 | + adsurl = {http://adsabs.harvard.edu/abs/2018arXiv180404637A}, |
| 25 | +} |
| 26 | +``` |
| 27 | + |
| 28 | +## Can I use this code to train MalConv on my own dataset? |
| 29 | +Not as is. The code provided is instructional and will not run unmodified, but a few minor changes make it functional. In particular, you must provide a URL from which file contents can be fetched by sha256 hash. |
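
As a rough illustration of the missing piece, the sketch below fetches a file by its sha256 hash and checks that the downloaded bytes match. It is not the code from this directory: `BASE_URL`, `sample_url`, and `fetch_by_sha256` are hypothetical names, and the URL layout (base URL followed by the hash) is an assumption you would replace with your own file store.

```python
import hashlib
import re
import urllib.request

# Hypothetical endpoint (assumption): replace with your own file store.
BASE_URL = "https://example.com/samples/"

_SHA256_HEX = re.compile(r"[0-9a-f]{64}")


def sample_url(sha256, base_url=BASE_URL):
    """Build the download URL, assuming the layout base_url + "/" + hash."""
    sha256 = sha256.lower()
    if not _SHA256_HEX.fullmatch(sha256):
        raise ValueError("not a sha256 hex digest: %r" % sha256)
    return base_url.rstrip("/") + "/" + sha256


def fetch_by_sha256(sha256, base_url=BASE_URL, opener=urllib.request.urlopen):
    """Download a file and verify that its contents hash to the requested value."""
    with opener(sample_url(sha256, base_url)) as response:
        data = response.read()
    if hashlib.sha256(data).hexdigest() != sha256.lower():
        raise ValueError("downloaded contents do not match " + sha256)
    return data
```

The `opener` argument exists only so the download step can be swapped out, for example with a local cache or a stub in tests.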
| 30 | + |
| 31 | +## How does this MalConv model differ from that of Raff et al.? |
| 32 | + * The original paper used `batch_size = 256` and `SGD(lr=0.01, momentum=0.9, decay=UNDISCLOSED, nesterov=True)`. We used |
| 33 | + `decay=1e-3` and `batch_size=100`. |
| 34 | + * It is unknown whether the original paper used a special symbol for padding. |
| 35 | + * The original paper allowed file sizes of up to 2MB; we use 1MB because of memory limits on a commonly used Titan X. |
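
The differences listed above can be summarized in a small preprocessing sketch. This is an illustration under stated assumptions, not the code from this directory: the padding value `256` (one past the byte range), right-padding, and reading "1MB" as 2**20 bytes are all assumptions, and `bytes_to_sequence` is a hypothetical helper name. The optimizer settings are assumed to match the original paper's apart from `decay`.

```python
MAX_LEN = 2 ** 20   # assumption: "1MB" read as 1 MiB
PAD_SYMBOL = 256    # assumption: one value past the byte range 0..255

# decay and batch size are the values stated for this implementation;
# lr, momentum and nesterov are assumed to carry over from the original paper.
SGD_SETTINGS = {"lr": 0.01, "momentum": 0.9, "decay": 1e-3, "nesterov": True}
BATCH_SIZE = 100


def bytes_to_sequence(data, max_len=MAX_LEN, pad_symbol=PAD_SYMBOL):
    """Truncate raw file bytes to max_len and right-pad with pad_symbol."""
    seq = list(data[:max_len])  # each byte becomes an int in 0..255
    seq.extend([pad_symbol] * (max_len - len(seq)))
    return seq
```

Using a dedicated padding symbol keeps padding distinguishable from genuine `0x00` bytes, at the cost of a 257-entry input vocabulary instead of 256.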