Binary aggregation functions in software plagiarism detection

Version 2 2024-06-12, 15:08
Version 1 2019-10-09, 08:14
conference contribution
posted on 2024-06-12, 15:08 authored by M Bartoszuk, M Gagolewski
© 2017 IEEE. Supervised learning is of key interest in data science. Even though there exist many approaches to solving, among others, classification as well as ordinal and standard regression tasks, most of them output models that do not possess useful formal properties, such as nondecreasingness in each independent variable, idempotence, or symmetry. This makes them difficult to interpret and analyze. For instance, it might be impossible to determine the importance of individual features or to assess how increasing the values of predictors affects a chosen response variable. Such properties are especially important in software plagiarism detection, where we must combine the degree to which a code chunk A is similar to (or contained in) B with the degree to which B is similar to A. Therefore, in this paper we consider a new method for fitting B-spline tensor product-based aggregation functions to empirical data. An empirical study indicates that the resulting models perform highly competitively. Additionally, they possess an intuitive interpretation, which is highly desirable for end-users.
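
The formal properties named in the abstract can be made concrete with a small numerical check. The following Python sketch is illustrative only and is not the fitting method of the paper: it tests nondecreasingness, idempotence, and symmetry on a grid over [0, 1] for three simple binary functions that stand in for a model combining the similarity of A to B with the similarity of B to A. All function names, the grid, and the tolerance are assumptions made for this example.

```python
import itertools

TOL = 1e-12  # tolerance for floating-point comparisons (assumed value)

def is_nondecreasing(f, grid):
    """True if f(x1, y1) <= f(x2, y2) whenever x1 <= x2 and y1 <= y2 on the grid."""
    pts = list(itertools.product(grid, repeat=2))
    return all(
        f(x1, y1) <= f(x2, y2) + TOL
        for (x1, y1) in pts
        for (x2, y2) in pts
        if x1 <= x2 and y1 <= y2
    )

def is_idempotent(f, grid):
    """True if f(x, x) == x for every grid value x."""
    return all(abs(f(x, x) - x) <= TOL for x in grid)

def is_symmetric(f, grid):
    """True if f(x, y) == f(y, x) for every pair of grid values."""
    return all(
        abs(f(x, y) - f(y, x)) <= TOL
        for x, y in itertools.product(grid, repeat=2)
    )

# Hypothetical candidates; x plays the role of sim(A, B), y of sim(B, A).
def mean(x, y):
    return (x + y) / 2

def weighted(x, y):
    return 0.7 * x + 0.3 * y

def product(x, y):
    return x * y

grid = [i / 10 for i in range(11)]  # 0.0, 0.1, ..., 1.0

for name, f in [("mean", mean), ("weighted", weighted), ("product", product)]:
    print(
        name,
        "nondecreasing:", is_nondecreasing(f, grid),
        "idempotent:", is_idempotent(f, grid),
        "symmetric:", is_symmetric(f, grid),
    )
```

A grid check of this kind can only refute a property, not prove it. A symmetric function treats the two similarity directions interchangeably, while an asymmetric one such as the weighted mean lets one direction count more than the other.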

History

Pagination

1-6

Location

Naples, Italy

Start date

2017-07-09

End date

2017-07-12

ISSN

1098-7584

ISBN-13

9781509060344

Language

eng

Publication classification

E1.1 Full written paper - refereed

Title of proceedings

FUZZ-IEEE 2017 : IEEE International Conference on Fuzzy Systems

Event

IEEE International Conference on Fuzzy Systems (2017 : Naples, Italy)

Publisher

IEEE

Place of publication

Piscataway, N.J.
