A comparison of the classification of disparate malware collected in different time periods
Islam, Rafiqul, Tian, Ronghua, Moonsamy, Veelasha and Batten, Lynn 2012, A comparison of the classification of disparate malware collected in different time periods, Journal of networks, vol. 7, no. 6, pp. 946-955.
It has been argued that an anti-virus strategy based on malware collected at a certain date, will not work at a later date because malware evolves rapidly and an anti-virus engine is then faced with a completely new type of executable not as amenable to detection as the first was.
In this paper, we test this idea by collecting two sets of malware, the first from 2002 to 2007, the second from 2009 to 2010 to determine how well the anti-virus strategy we developed based on the earlier set  will do on the later set. This anti-virus strategy integrates dynamic and static features extracted from the executables to classify malware by distinguishing between families. We also perform another test, to investigate the same idea whereby we accumulate all the malware executables in the old and new dataset, separately, and apply a malware versus cleanware classification.
The resulting classification accuracies are very close for both datasets, with a difference of approximately 5.4% for both experiments, the older malware being more accurately classified than the newer malware. This leads us to conjecture that current anti-virus strategies can indeed be modified to deal effectively with new malware.