File(s) under permanent embargo

A Fuzzy R Code similarity detection algorithm

conference contribution
posted on 2014-01-01, 00:00 authored by M Bartoszuk, Marek Gagolewski
R is a programming language and software environment for statistical computing and data analysis that is steadily gaining popularity among practitioners and scientists. In this paper we present a preliminary version of a system that detects pairs of similar R code blocks within a given set of routines. The system is based on a suitable aggregation of the outputs of three different [0,1]-valued (fuzzy) proximity degree estimation algorithms. An analysis on empirical data indicates that the system may in the future be applied in practice, e.g., to detect plagiarism among students' homework submissions or to analyze code recycling and code cloning in R's open-source package repositories. © Springer International Publishing Switzerland 2014.
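The general idea of aggregating several [0,1]-valued proximity degrees into one score can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not name the three estimation algorithms or the aggregation function, so the three measures below (`jaccard`, `sequence_ratio`, `length_ratio`), the arithmetic mean in `aggregate`, the crude tokenizer, and the `threshold` parameter are hypothetical stand-ins rather than the authors' method.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations


def tokens(code):
    # Crude, hypothetical tokenizer: identifiers, integers, single symbols.
    return re.findall(r"[A-Za-z_.][A-Za-z0-9_.]*|\d+|\S", code)


def jaccard(a, b):
    # Stand-in measure 1: overlap of the two sets of distinct tokens.
    sa, sb = set(tokens(a)), set(tokens(b))
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def sequence_ratio(a, b):
    # Stand-in measure 2: order-aware similarity of the token sequences.
    return SequenceMatcher(None, tokens(a), tokens(b)).ratio()


def length_ratio(a, b):
    # Stand-in measure 3: ratio of the shorter to the longer token count.
    la, lb = len(tokens(a)), len(tokens(b))
    if max(la, lb) == 0:
        return 1.0
    return min(la, lb) / max(la, lb)


def aggregate(scores):
    # Assumed aggregation: arithmetic mean, which keeps the result in [0, 1].
    return sum(scores) / len(scores)


def similarity(a, b):
    # Fuse the three proximity degrees into a single [0, 1]-valued score.
    return aggregate([jaccard(a, b), sequence_ratio(a, b), length_ratio(a, b)])


def similar_pairs(routines, threshold=0.8):
    # Report every pair of named routines whose fused score reaches the threshold.
    pairs = []
    for (name_a, code_a), (name_b, code_b) in combinations(routines.items(), 2):
        score = similarity(code_a, code_b)
        if score >= threshold:
            pairs.append((name_a, name_b, score))
    return pairs
```

Calling `similar_pairs` on a dictionary that maps routine names to R source strings returns the candidate pairs with their scores. A lower `threshold` flags more pairs for manual inspection; a higher one flags fewer.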

History

Event

Information Processing and Management of Uncertainty in Knowledge-Based Systems. Conference (15th : 2014 : Montpellier, France)

Volume

444

Issue

Part 3

Series

Communications in Computer and Information Science

Pagination

21 - 30

Publisher

Springer

Location

Montpellier, France

Place of publication

Berlin, Germany

Start date

2014-07-15

End date

2014-07-19

ISSN

1865-0929

ISBN-13

9783319088518

Language

eng

Publication classification

E1.1 Full written paper - refereed

Editor/Contributor(s)

A Laurent, O Strauss, B Bouchon-Meunier, R Yager

Title of proceedings

IPMU 2014 : Information processing and management of uncertainty in knowledge-based systems : 15th international conference, IPMU 2014, Montpellier France, July 15-19, 2014 : Proceedings