Data Mining for CyberSecurity (CS 7301.003)

Time and Location :  F 10:00am-12:[email protected] 2.112

Instructor                           :   Murat Kantarcioglu
Office Hours & Location :   Friday 8:00-10:[email protected] 3.225

Teaching Assistant           :  N/A       
Office Hours & Location : N/A

         :  Machine Learning


  •   Project                      50%
  •   Paper presentation   30% (Please use the provided ppt template (ppt))
  •   Class Discussion      10%


Course Topics: (tentative)

Increasingly, detecting and preventing cyber attacks require sophisticated use of data mining
and machine learning tools. This seminar class will cover the theory and practice of using data mining
tools in the context of cybersecurity where we need to deal with intelligent adversaries that try
to avoid being detected.

Textbook: We will cover selected theoretical and practical papers on the topic.


Course Outline:



  • Overview of the Class.
  • Measuring Classifier performance.
    • Chapter 2 of Devroye et al. book (pdf)


  • M. Kantarcioglu, B. Xi, and C. Clifton, "Classifier evaluation and attribute selection against active adversaries," Data Min. Knowl. Discov., vol. 22, pp. 291–335, January 2011. (pdf)
  • Y. Zhou, M. Kantarcioglu, B. Thuraisingham, and B. Xi, "Adversarial support vector machine learning" SIGKDD '12 (pdf)
  • Yan Zhou, Murat Kantarcioglu, Bhavani M. Thuraisingham: Sparse Bayesian Adversarial Learning Using Relevance Vector Machine Ensembles. ICDM 2012:1206-1211 (pdf)
  • M. Kearns and M. Li. Learning in the presence of malicious errors. SIAM Journal on Computing, 22:807-837, 1993. (pdf)


  • N. Dalvi, P. Domingos, Mausam, S. Sanghai, and D. Verma. Adversarial classification. KDD '04 (pdf)
  • M. Bruckner and T. Scheffer. Nash equilibria of static prediction games. In Advances in Neural Information Processing Systems. MIT Press, 2009. (pdf)
  • Dekel, O., O. Shamir: 2008, "Learning to classify with missing and corrupted features".  ICML. (pdf)


  • M. Bruckner and T. Scheffer. Stackelberg games for adversarial prediction problems.  SIGKDD, 2011 (pdf)
  • D. Lowd and C. Meek. Adversarial learning. In KDD'05, pages 641-647, 2005. (pdf)
  • Globerson, A. and S. Roweis: 2006, "Nightmare at Test Time: Robust Learning by Feature Deletion". ICML. pp. 353-360. (pdf)


  • B. I. Rubinstein et al., "Antidote: understanding and defending against poisoning of anomaly detectors," in Proceedings of the 9th ACM SIGCOMM Conference on Internet Measurement, 2009, pp. 1-14. (pdf)
  • M. Barreno et al., "Can machine learning be secure?" AsiaCCS '06. (pdf)
  • Song, Y. et al.  "On the infeasibility of modeling polymorphic shellcode". CCS '07. (pdf)


  • J. Newsome, B. Karp, and D. Song. Paragraph: Thwarting signature learning by training maliciously. In RAID, volume 4219 of LNCS, pages 81-105, 2006. (pdf) (Presenter: Huseyin)
  • Brumley, D., J. Newsome, D. Song, H. Wang, and S. Jha: 2006, "Towards automatic generation of vulnerability-based signatures". In: IEEE Symposium on Security and Privacy. pp. 2-16. (pdf) (Presenter: Huseyin)
  • Perdisci, R., D. Dagon, W. Lee, P. Fogla, and M. Sharif: 2006, "Misleading Worm Signature Generators Using Deliberate Noise Injection". In: IEEE Symposium on Security and Privacy. pp. 17-31. (pdf) (Presenter: Huseyin)


  • Wang, K. and S. Stolfo: 2004, "Anomalous Payload-Based Network Intrusion Detection". In: Recent Advances in Intrusion Detection (RAID). pp. 203-222. (pdf) (Presenter: Harichandan)
  • Wang, K., J. J. Parekh, and S. J. Stolfo: 2006, "ANAGRAM: A Content Anomaly Detector Resistant To Mimicry Attack". In: Recent Advances in Intrusion Detection (RAID). pp. 226-248. (pdf) (Presenter: Khaled)
  • Fogla, P. and W. Lee: 2006, "Evading network anomaly detection systems: formal reasoning and practical techniques". In: ACM Conference on Computer and Communications Security (CCS). pp. 59-68. (pdf) (Presenter:  Fahad)


  • G. L. Wittel and S. F. Wu. On attacking statistical spam filters. In CEAS'04, 2004. (pdf) (Presenter: Fahad)
  • D. Lowd and C. Meek. Good word attacks on statistical spam filters. In CEAS'05, 2005. (pdf)  (Presenter: Ahmad)
  • Nelson et al. "Exploiting Machine Learning to Subvert Your Spam Filter" Usenix Workshop, 2008 (pdf) (Presenter: Harichandan)


  • Li, Z., M. Sanghi, Y. Chen, M.-Y. Kao, and B. Chavez: 2006, "Hamsa: fast signature generation for zero-day polymorphic worms with provable attack resilience". In: IEEE Symposium on Security and Privacy. pp. 32-47. (pdf) (Presenter: Imrul)
  • C. Smutz and A. Stavrou, "Malicious PDF detection using metadata and structural features," in Annual Computer Security Applications Conference (ACSAC), 2012, pp. 239-248 (pdf) (Presenter: Yasmeen) 
  • Nedim Srndic and Pavel Laskov, "Practical Evasion of a Learning-Based Classifier: A Case Study," in Proceedings of the 2014 IEEE Symposium on Security and Privacy (SP '14), pp. 197-211 (pdf) (Presenter: Ahmad)


  • Spring Break.
  • M. Cova, C. Kruegel, and G. Vigna, "Detection and analysis of drive-by-download attacks and malicious JavaScript code," in International Conference on World Wide Web (WWW), 2010, pp. 281-290. (pdf) (Presenter: Imrul)
  • D. Canali, M. Cova, G. Vigna, and C. Kruegel, "Prophiler: a fast filter for the large-scale detection of malicious web pages," in International Conference on World Wide Web (WWW), 2011, pp. 197-206. (pdf) (Presenter: Fahad)
  • G. Stringhini, C. Kruegel, and G. Vigna, "Shady paths: leveraging surfing crowds to detect malicious web pages," in ACM Conference on Computer and Communications Security (CCS), 2013, pp. 133-144. (pdf) (Presenter: Imrul)


  • M. Egele, G. Stringhini, C. Kruegel, and G. Vigna, "COMPA: Detecting compromised accounts on social networks," in Network and Distributed System Security Symposium (NDSS), 2013. (pdf) (Presenter: Yifan)
  • Ding et al., "Intrusion as (anti)social communication: characterization and detection"  SIGKDD '12 (pdf) (Presenter: Khaled)
  • Wang et al. "Man vs. Machine: Practical Adversarial Detection of Malicious Crowdsourcing Workers", Usenix Security 2014 (pdf) (Presenter: Ahmad)


  • Chu et al. "Who is tweeting on twitter: human, bot, or cyborg?" In Proc. of ACSAC (2010) (pdf) (Presenter: Yifan)
  • Stringhini et al. "Detecting spammers on social networks". In Proc. of ACSAC (2010). (pdf) (Presenter: Harichandan)
  • Yang et al. "Die free or live hard? empirical evaluation and new design for fighting evolving twitter spammers". In Proc. of RAID (2011) (pdf) (Presenter: Yasmeen)


  • Zhang et al. "Online modeling of proactive moderation system for auction fraud detection". In Proc. of WWW (2012). (pdf) (Presenter: Yasmeen)
  • Venkataraman et al., "Limits of learning-based signature generation with adversaries". In Proc. of NDSS (2008) (pdf) (Presenter: Yifan)
  • Sommer et al. "Outside the closed world: On using machine learning for network intrusion detection" IEEE S&P (2010) (pdf) (Presenter: Khaled)


  • T.B.D.
  • Project Presentations.


  • Project Presentations.