National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

Detailed Record

Student: 蔡易賢
Thesis Title: 一個基於大型語言模型的智能合約弱點偵測方法
Thesis Title (English): A Smart Contract Vulnerability Detection Manner Based on Large Language Model
Advisor: 鍾毓驥
Committee Members: 鄭鴻君, 林永裕, 鍾毓驥, 吳忠信, 楊富強
Oral Defense Date: 2023-12-03
Degree: Master's
Institution: 國立高雄科技大學 (National Kaohsiung University of Science and Technology)
Department: Industrial Engineering and Management
Discipline: Engineering
Field: Industrial Engineering
Thesis Type: Academic thesis
Publication Year: 2023
Graduation Academic Year: 112
Language: Chinese
Pages: 48
Keywords (Chinese): 去中心化金融, 區塊鏈, 智能合約, 弱點偵測, 大型語言模型, GPT, 提示工程
Keywords (English): DeFi, Blockchain, Smart Contract, Vulnerability Detection, Large Language Model, GPT, Prompt Engineering
Statistics:
  • Cited: 0
  • Views: 201
  • Downloads: 58
  • Bookmarked: 0
Abstract:
In this study, we propose a smart contract vulnerability detection method based on large language models (LLMs). Smart contracts are essential components of decentralized finance (DeFi). A smart contract is a self-executing program that can process data and transfer funds, and many blockchain applications are built on top of smart contracts. However, vulnerabilities in smart contract code are easily exploited by attackers and can cause financial losses; in 2016, for example, a single flaw in The DAO smart contract led to a loss of 55 million USD. To prevent such problems, many smart contract vulnerability detection mechanisms have been developed, based on traditional static analysis, fuzz testing, or machine learning. With the rapid recent progress of large language models such as GPT, many organizations, companies, and individuals now delegate everyday tasks to these models. Since LLMs already have a basic ability to understand code, the goal of this study is to explore their potential for detecting smart contract vulnerabilities. We adopt prompt engineering techniques such as Chain of Thought (CoT), Plan-and-Solve, and few-shot learning to further improve the LLM's ability to detect vulnerabilities, and we design a series of experiments to demonstrate how the proposed prompt engineering methods perform on different smart contract vulnerabilities.
Abstract (English):
In the current research, we introduce an advanced approach for the detection of smart contract vulnerabilities leveraging Large Language Models (LLMs). Smart contracts are pivotal in the ecosystem of decentralized finance (DeFi), functioning as automated protocols for data management and transaction execution. The foundation of numerous blockchain-based applications lies in smart contract technology. Nevertheless, code vulnerabilities in these contracts can become targets for malicious exploitation, leading to substantial financial damage, as exemplified by the 2016 attack on The DAO smart contract, which incurred a loss of 55 million USD. In response to such challenges, a plethora of detection mechanisms for smart contract vulnerabilities have been devised, drawing upon conventional static analysis, fuzz testing, and machine learning methodologies. Owing to the swift progression of LLMs such as GPT, a broad spectrum of entities has adopted these models for routine operational management. Recognizing LLMs' inherent capability to comprehend programming code, our research investigates their aptitude for identifying smart contract vulnerabilities. We integrate prompt engineering techniques, including Chain of Thought (CoT), Plan-and-Solve, and few-shot learning, to augment the LLMs' vulnerability detection efficacy. Furthermore, a sequence of empirical studies validates the effectiveness of our proposed prompt engineering strategies against diverse smart contract vulnerabilities.
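To make the approach concrete, the sketch below shows how a zero-shot CoT prompt might be sent to an LLM to classify a reentrancy-prone Solidity snippet. This is a minimal illustration assuming the OpenAI Python chat completions API; the model name, prompt wording, and embedded contract are placeholders, not the thesis's actual experimental setup.

```python
# Minimal sketch of zero-shot CoT prompting for smart contract
# vulnerability detection. Model name, prompt wording, and the
# Solidity snippet are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# A classic reentrancy-prone pattern: the external call happens
# before the balance is zeroed, so a malicious fallback function
# can re-enter withdraw() and drain funds.
CONTRACT = """
contract Wallet {
    mapping(address => uint) public balances;

    function withdraw() public {
        uint amount = balances[msg.sender];
        (bool ok, ) = msg.sender.call{value: amount}("");
        require(ok);
        balances[msg.sender] = 0;  // state update after the call
    }
}
"""

# Zero-shot CoT: append a "think step by step" cue so the model
# reasons about control flow before committing to a verdict.
prompt = (
    "You are a smart contract auditor. Decide whether the following "
    "Solidity contract contains an infinite loop, an integer "
    "overflow/underflow, or a reentrancy vulnerability.\n"
    f"{CONTRACT}\n"
    "Let's think step by step, then answer with the vulnerability type."
)

response = client.chat.completions.create(
    model="gpt-4",  # placeholder model name
    messages=[{"role": "user", "content": prompt}],
    temperature=0,  # deterministic output eases scoring across runs
)
print(response.choices[0].message.content)
```

A few-shot or Plan-and-Solve variant would replace the final cue with worked example analyses or an explicit "first devise a plan, then carry it out" instruction, respectively.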
Table of Contents
Abstract (Chinese) i
Abstract (English) ii
Acknowledgments iii
Table of Contents iv
List of Figures vi
List of Tables vii
Chapter 1: Introduction 1
1.1 Research Background and Motivation 1
1.2 Research Motivation 5
1.3 Thesis Organization 9
Chapter 2: Literature Review 11
2.1 History of Decentralized Finance (DeFi) 11
2.2 Research on Smart Contract Vulnerability Detection 13
2.2.1 Static Analysis 14
2.2.2 Fuzz Testing 14
2.2.3 Machine Learning Methods 14
2.3 Development of Large Language Models 15
2.4 Prompt Engineering Techniques for Large Language Models 18
Chapter 3: Research Method 22
3.1 Smart Contract Vulnerability Data 22
3.1.1 Infinite Loop 22
3.1.2 Integer Overflow/Underflow 23
3.1.3 Reentrancy 24
3.2 LLM-Based Vulnerability Detection Method 24
Chapter 4: Experimental Results 30
Chapter 5: Conclusions and Future Directions 34
List of Figures
Figure 1: Example of a smart contract 4
Figure 2: Emergent abilities of large language models 7
Figure 3: The text-to-text framework of T5 16
Figure 4: OpenAI's demonstration of in-context learning 18
Figure 5: The GPT model's response to the task in Figure 4 19
Figure 6: Examples of zero-shot and few-shot CoT 20
Figure 7: Illustration of Plan-and-Solve 21
Figure 8: Example of an infinite loop 22
Figure 9: An integer overflow vulnerability 23
Figure 10: Distribution of vulnerabilities in the test set 30
List of Tables
Table 1: Parameter counts of well-known language models 17
Table 2: Example prompts generated by different prompt engineering methods 25
Table 3: A few-shot CoT example 28
Table 4: Prediction scores for each vulnerability under different prompt engineering methods in the zero-shot setting 31
Table 5: Prediction scores for each vulnerability under different prompt engineering methods in the few-shot setting 32

