程序代码相似度检测技术的研究与实现

信令风暴程序代码相似度检测技术的研究与实现
水样女人作者:卫军超 耿楠
清教主义
来源:《电脑知识与技术》2017年第05期
        摘要:针对传统相似度算法应用在程序设计课程作业检测中精度较低这一问题,通过研究最长公共子序列等算法,发现其优缺点,在分析的基础上,结合结构度量技术和属性技术两种技术,提出一种性能较好的程序相似度计算方法。方法首先对源程序进行初步处理,将程序中的注释语句和空格删除,再次确定常用元素及常用结构,然后利用Lex统计、抽取程序元素;利用开源代码ucc生成语法树,之后抽取相应的语法结构;最后生成特征向量,并计算代码相似度。实验结果表明该方法比最长公共子序列算法精度提高了10.6%。
        关键词:属性计数法;结构度量技术 ;相似度度量
人的异化
        中图分类号: TP311 文献标志码: A 文章编号:1009-3044(2017)05-0039-02
        Abstract: To solve the problem of the low precision of testing for similarity of program
ad637codes in traditional ways, this thesis proposes an improved technique to make such a test on the combination of technology of attribute counting and that of structure calculation through studying and comparing several different methods of calculating the Longest Common Subsequence. Firstly, source program is processed primarily, annotation statements and spaces are deleted, and common elements and structures get confirmation; next, statistics are made by means of Lex, program elements are extracted, and abstract syntax trees get to be generated using UCC; then, grammar structures are extracted; lastly, eigenvector is produced and the similarity can get calculated. The experimental result shows that the new method is 10.6 percent more precise than those of calculating the Longest Common Subsequence.瓶颈效应

本文发布于:2024-09-23 22:23:52,感谢您对本站的认可!

本文链接:https://www.17tex.com/xueshu/129609.html

版权声明:本站内容均来自互联网,仅供演示用,请勿用于商业和其他非法用途。如果侵犯了您的权益请与我们联系,我们将在24小时内删除。

标签:技术   相似   算法
留言与评论(共有 0 条评论)
   
验证码:
Copyright ©2019-2024 Comsenz Inc.Powered by © 易纺专利技术学习网 豫ICP备2022007602号 豫公网安备41160202000603 站长QQ:729038198 关于我们 投诉建议