RJ 🐢
@rjjoyce8
Helping analysts with their reverse engineering work and finding similar functions is a problem. A lot of existing literature doesn't really work on real data. We can get better results by being dumber! @NeurIPSConf #NeurIPS2024 @BoozAllen w/ @krismicinski @rjzak
Led by @rjjoyce8, #EMBER24 has arrived @kdd_news #KDD25, the best, most open, and most versatile malware detection benchmark ever! w/ @rjzak @mrphilroth @drhyrum & others, let's try to barely summarize all the new things you can do now! @BoozAllen @CrowdStrike @Cisco 🧵👇
EMBER2024 -- A Benchmark Dataset for Holistic Evaluation of Malware Classifiers. arxiv.org/abs/2506.05074
tinyurl.com/27l4fqdp Researchers unveil EMBER2024, a groundbreaking dataset advancing malware analysis. With diverse file formats, tasks, and undetected evasive files, it empowers robust classifier evaluation and fosters new research opportunities in malware detection.
A new EMBER malware benchmark has been released, with new datasets and 14 baseline LightGBM models. It’s one of the best open-source resources for malware classification and adversarial-testing experiments.
To label your own malware collection, download ClarAVy here! github.com/FutureComputin…
Would you like to have the world's most accurate malware label predictor? RJ @BoozAllen has you covered w/ ClarAVy: A Tool for Scalable and Accurate Malware Family Labeling, an extension of our previous @CamlisOrg work 🧵👇
I love this for bitcoin believers
Too stupid. It can’t be. It cannot be real. It’s too dumb man.
15 UMBC CSEE students successfully defended their PhD dissertations this year so far, bringing the total number of PhDs the department has produced to 426. See the list, mentors, and dissertation titles via the link below. Congrats to the new doctors! csee.umbc.edu/phd-graduates/
𝑨𝒔𝒔𝒆𝒎𝒃𝒍𝒂𝒈𝒆: 𝑨𝒖𝒕𝒐𝒎𝒂𝒕𝒊𝒄 𝑩𝒊𝒏𝒂𝒓𝒚 𝑫𝒂𝒕𝒂𝒔𝒆𝒕 𝑪𝒐𝒏𝒔𝒕𝒓𝒖𝒄𝒕𝒊𝒐𝒏 𝒇𝒐𝒓 𝑴𝒂𝒄𝒉𝒊𝒏𝒆 𝑳𝒆𝒂𝒓𝒏𝒊𝒏𝒈 Compiling projects @github . Randomize the compiler version & settings. We are building 𝑡ℎ𝑒 tool for making datasets for research! 2 years work!
Preprint of some work with @EdwardRaffML and friends arxiv.org/abs/2405.03991. Let us know if you would like huge binary corpora (for LLMs, etc.), we have >1mil.
Malware Bytes, an article going into some of the big-picture factors behind the malware work I've done over the past ~8 years! govinfo.gov/content/pkg/GP… A lot of my different research activities tie together in a fun, weird goal of tackling unmet needs in government!
Inspired by @_albertgu's recent work on state space models, can we merge them with #VSA and #HRR for our long sequence classification needs in #malware? Our HGConv says yes! With some interesting results on pros/cons. Led by @rea1mma w/ @BlancheMinerva Tim Oates & Jim Holt!
Holographic Global Convolutional Networks for Long-Range Prediction Tasks in Malware Detection ift.tt/wtZR31f
I've recommended this to lots of people who are in security and want to dive deeper into the details of ML, and I always hear the same basic comment: it's super readable and eases you into the math with approachable code that explains *why* you need the math. Fantastic book.
I'm now officially a published book author @ManningBooks! Inside Deep Learning mng.bz/8M2g ! Filling the need for a combination of practical "get something running" and understanding why things work and how the math relates to the code. Thanks to @KirkDBorne for the foreword!
Very excited to be in Copenhagen for @acm_ccs! I'll be presenting our research on embedding VirusTotal scans (in collaboration w/ @EdwardRaffML and @cknicholas) at the AISec workshop tomorrow
Trying to query for related malware across a massive dataset? @EdwardRaffML, @cknicholas, and I are very excited to announce AVScan2Vec: Vector embeddings for antivirus scan results! arxiv.org/abs/2306.06228
It took me almost 3 years, but I finally restructured the Practical Malware Analysis section of my site. This is now broken up into more digestible sections, and I've also revamped the MITRE ATT&CK tests to come with appropriate categories and tagging. jaiminton.com/Tutorials/Prac…
When I started teaching, I was surprised at how little research there is on what impacts students' grades. Thanks to @UMBC @GoogleColab @ManningBooks, I now have some answers for how starting HW earlier improves student grades in deep learning! arxiv.org/abs/2311.09228 📜🧵👇