National Digital Library of Theses and Dissertations in Taiwan (臺灣博碩士論文加值系統)

Detailed Record

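Author: Cheng, Yu-Cheng (鄭育丞)
Title: Achieving High-Concurrency and Low-Latency in Network Servers via In-Kernel I/O Offloading
Title (Chinese): 藉由作業系統核心的 I/O 負載移轉以落實高度並行且低延遲的網路服務
Advisors: Tu, Chia-Heng (涂嘉恒); Huang, Ching-Chun (黃敬群)
Committee members: Tu, Chia-Heng (涂嘉恒); Huang, Ching-Chun (黃敬群); Lo, Shi-Wu (羅習五)
Oral defense date: 2023-05-30
Degree: Master's
Institution: National Cheng Kung University (國立成功大學)
Department: Department of Computer Science and Information Engineering (資訊工程學系)
Discipline: Engineering
Field: Electrical Engineering and Computer Science
Thesis type: Academic thesis
Publication year: 2023
Graduation academic year: 111 (ROC calendar)
Language: English
Pages: 90
Keywords: network server; asynchronous I/O; kernel extension; I/O offloading; performance analysis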
Usage statistics:
  • Times cited: 0
  • Views: 127
  • Downloads: 3
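Abstract (translated from the Chinese original): Event-driven architecture has become a common design paradigm for building high-performance network servers; for system-call-intensive applications in particular, it makes effective use of I/O multiplexing. However, the mitigations for processor microarchitectural vulnerabilities such as Spectre and Meltdown increase the overhead in system-call-intensive scenarios. In addition, the synchronous, blocking nature of traditional system calls makes it difficult to fully exploit state-of-the-art multicore processors. To address these limitations and identify performance bottlenecks, this study examines the evolution of the Linux kernel I/O model and analyzes applications at run time, such as web servers and key-value servers. We propose a specialized kernel module that improves the performance of event-driven network servers and, unlike kernel bypass strategies, does not increase power consumption or sacrifice security in exchange for high throughput and low latency. Through extensive experiments, we compare our approach with the native system. The results show that our approach improves bandwidth, latency, and power consumption without requiring the entire application to be redesigned or reimplemented.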
Abstract (English original): Over the past decade, event-driven architectures have become a common solution for developing high-performance network servers, which are system-call-intensive applications, because they can handle concurrent requests using I/O multiplexing. However, mitigations for processor flaws such as Spectre and Meltdown have increased the overhead of system calls. Furthermore, the blocking, synchronous execution of legacy system calls makes it difficult to exploit state-of-the-art multicore processors. To address these limitations and identify performance bottlenecks, this research examines the evolution of the Linux I/O model and analyzes the runtime behavior of applications such as web servers and in-memory key-value servers. We propose and implement a kernel extension that improves the performance of event-driven network servers. Unlike kernel bypass techniques, our approach does not trade higher power consumption or weaker security for high throughput and low latency. We conducted several experiments to compare our approach against the baseline, and the results show that it improves bandwidth, latency, and power consumption without requiring the entire application to be redesigned or reimplemented.
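As a rough illustration of the event-driven, I/O-multiplexing pattern the abstract refers to, the following sketch uses Python's standard `selectors` module. It is not the thesis's kernel extension and makes no claim about its design; the function names `serve_ready_events` and `demo` are hypothetical, and local socket pairs stand in for real client connections. The point is only to show that, in a conventional event loop, each ready connection still costs separate system calls (one to wait for readiness, then at least one each to receive and send), which is the overhead the abstract is concerned with.

```python
import selectors
import socket


def serve_ready_events(sel, max_rounds=10, timeout=0.1):
    """Run a small event loop and return the number of messages echoed.

    Hypothetical helper for illustration only.
    """
    echoed = 0
    for _ in range(max_rounds):
        # One multiplexing call reports every socket that is ready to read.
        events = sel.select(timeout=timeout)
        if not events:
            break  # nothing became ready within the timeout
        for key, _mask in events:
            conn = key.fileobj
            data = conn.recv(4096)  # a separate receive call per ready socket
            if data:
                conn.sendall(data.upper())  # and at least one send call
                echoed += 1
            else:
                # Peer closed the connection: stop watching this socket.
                sel.unregister(conn)
                conn.close()
    return echoed


def demo():
    """Serve three in-process 'clients' from a single thread."""
    sel = selectors.DefaultSelector()
    pairs = [socket.socketpair() for _ in range(3)]
    for server_side, client_side in pairs:
        server_side.setblocking(False)
        sel.register(server_side, selectors.EVENT_READ)
        client_side.sendall(b"ping")  # each client sends one request

    echoed = serve_ready_events(sel)
    replies = [client_side.recv(4096) for _, client_side in pairs]

    for server_side, client_side in pairs:
        sel.unregister(server_side)
        server_side.close()
        client_side.close()
    sel.close()
    return echoed, replies


if __name__ == "__main__":
    print(demo())
```

A single thread serves all three connections here, and the number of user-to-kernel transitions grows with the number of ready events. Batching or offloading those calls, as the table of contents suggests the thesis explores, targets exactly that per-event cost.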
Abstract (Chinese)
Abstract
Table of Contents
List of Tables
List of Figures
Chapter 1. Introduction
1.1. Contributions
1.2. Paper Organization
Chapter 2. Fundamentals
2.1. Linux I/O model
2.2. Event-driven Architecture
2.2.1. EDA with network sockets in Linux
2.2.2. I/O strategies in EDA
2.2.3. Advantages of application adopting EDA
2.3. System Call in Linux
2.3.1. CPU privilege levels
2.3.2. Software flow of invoking a system call
2.3.3. Direct and indirect cost from a system call invocation
2.4. Network Server Software
2.4.1. Web server
2.4.2. In-memory key-value server
2.4.3. Transport Layer Security
2.4.4. Kernel TLS Offload
2.5. System Call Batching
2.6. System Call Bypassing
2.7. Asynchronous System Call
2.8. Data Plane Development Kit
Chapter 3. Design
3.1. Overview
3.2. Sharding Buffer
3.3. Asynchronous Operation Processor
3.4. Configurable Threading Model
3.5. User Adaptation Library
3.6. Proactor
3.7. Limitation
Chapter 4. Implementation
4.1. Supported Architectures
4.2. Hooking a New System Call
4.3. Establishing Communication Between User and Kernel Space
4.4. Threads on Asynchronous Operation Processor
4.4.1. Creating the worker from user-space
4.4.2. Posting an asynchronous system call to kernel
4.4.3. Offloading an asynchronous system call from userland
4.4.4. Suppression of instruction reordering during compilation
4.5. Application Patch
4.5.1. Soft and hard failed-independent application
4.5.2. Apply OFIO to Nginx
4.5.3. Apply OFIO to Redis
Chapter 5. Evaluation
5.1. System and configuration
5.2. Nginx
5.2.1. Workload
5.2.2. Enhancement for HTTP workload
5.2.3. Enhancement for HTTPS workload
5.2.4. Enhancement for kTLS
5.2.5. Conclusion
5.3. Redis
5.3.1. Workload
5.3.2. SET/GET operations
5.3.3. Request concurrency
5.3.4. Request size
5.3.5. Performance on HTTPS workload
5.3.6. Kernel bypass comparison
5.3.6.1. Power consumption comparison
5.3.6.2. Instance type comparison
5.3.6.3. Polling and sleep-and-wake comparison
5.3.7. Conclusion
Chapter 6. Conclusion
References
Appendix A. Software Patch
A.1. Critical code for Nginx kernel TLS support
A.2. Critical code for OFIO-Redis