A Proposed Model for Source Code Reuse Detection in Computer Programs, by Seyed Mohammad Bidoki

Research Details

Title	A Proposed Model for Source Code Reuse Detection in Computer Programs
Research type	Journal article
Keywords	Plagiarism detection, Source code reuse, SOCO, Structure-based approach
Journal	Iranian Journal of Science and Technology-Transactions of Electrical Engineering
DOI	https://doi.org/10.1007/s40998-020-00403-8
Researchers	Zahra Sotoudeh (first author), Seyed Mohammad Reza Mousavi (second author), Seyed Mostafa Fakhrahmad (third author), Seyed Mohammad Bidoki (fourth author)

Abstract

Source code reuse detection is of growing significance as a common plagiarism prevention practice in academic research. For a large collection of source code files, manual detection of code reuse is impractical, so there is a vital need for automatic and highly accurate tools. This paper introduces a structure-based approach for recognizing source code (SOCO) reuse in reference programs. The proposed model consists of three main phases: preprocessing, sequence generation, and decision-making based on estimated similarities. First, the important instructions in each code file are identified, and the source code is converted to a string of specific tokens. A sequence alignment process is then carried out, and the tree representation of the source code is constructed. In the third phase, the similarity values among the code files are estimated using three different innovative strategies based on both lexical and structural comparison of the source code. Finally, the system makes a decision for each pair of files. The SOCO-2014 corpus is used to evaluate the method. A comparison of our model's experimental results with those of the contest participants indicates that the proposed method's performance is acceptable and promising.
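
The pipeline outlined in the abstract (tokenize, compare token sequences, decide per pair) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the token classes, the `KEYWORDS` set, the function names, and the 0.8 threshold are all hypothetical, and the standard library's `difflib.SequenceMatcher` stands in for the paper's sequence alignment step. The tree representation and the three similarity strategies are not reproduced here, since the abstract does not specify them.

```python
import re
from difflib import SequenceMatcher

# Hypothetical keyword set; the abstract does not list the paper's "important instructions".
KEYWORDS = {"if", "else", "for", "while", "return"}

# Identifiers, integer literals, or single punctuation characters.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|[^\s\w]")


def tokenize(source: str) -> list[str]:
    """Convert source text to abstract tokens so that renaming has no effect."""
    tokens = []
    for lexeme in TOKEN_RE.findall(source):
        if lexeme in KEYWORDS:
            tokens.append(lexeme.upper())   # keep control-flow keywords distinct
        elif lexeme[0].isalpha() or lexeme[0] == "_":
            tokens.append("ID")             # any other name collapses to ID
        elif lexeme[0].isdigit():
            tokens.append("NUM")            # any integer literal collapses to NUM
        else:
            tokens.append(lexeme)           # punctuation and operators kept as-is
    return tokens


def similarity(code_a: str, code_b: str) -> float:
    """Similarity in [0, 1] from difflib's matching blocks (a stand-in for alignment)."""
    matcher = SequenceMatcher(None, tokenize(code_a), tokenize(code_b), autojunk=False)
    return matcher.ratio()


def is_reuse(code_a: str, code_b: str, threshold: float = 0.8) -> bool:
    """Decide on a pair of files; the threshold is an assumed value, not from the paper."""
    return similarity(code_a, code_b) >= threshold


original = "int sum(int a, int b) { return a + b; }"
renamed = "int add(int x, int y) { return x + y; }"
unrelated = "while (n > 0) { n = n - 1; }"

print(similarity(original, renamed))
print(similarity(original, unrelated))
print(is_reuse(original, renamed))
```

Because identifiers and literals are collapsed into generic tokens, the renamed copy yields the same token sequence as the original, which is the kind of disguise a purely lexical comparison would miss.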