[1] LEE, Edward A. The problem with threads. Computer, 2006, 39.5: 33-42.
[2] SUTTER, Herb; LARUS, James. Software and the concurrency revolution. Queue, 2005, 3.7: 54-62.
[3] TAN, M. A Minimal GDB Stub for Embedded Remote Debugging. Columbia University, 2002.
[4] STALLMAN, Richard M.; PESCH, Roland H. Using GDB: A guide to the GNU source-level debugger. Free Software Foundation, 1991.
[5] ZHU, Jianwen; GAJSKI, Daniel D. A retargetable, ultra-fast instruction set simulator. In: Design, Automation and Test in Europe Conference and Exhibition 1999. Proceedings. IEEE, 1999. p. 298-302.
[6] LIN, P.-CP; DU, Evason; TSAY, Ren-Song. A fast and accurate instruction-oriented processor simulation approach. In: VLSI Design, Automation, and Test (VLSI-DAT), 2013 International Symposium on. IEEE, 2013. p. 1-5.
[7] LIN, Kai-Li; LO, Chen-Kang; TSAY, Ren-Song. Source-level timing annotation for fast and accurate TLM computation model generation. In: Design Automation Conference (ASP-DAC), 2010 15th Asia and South Pacific. IEEE, 2010. p. 235-240.
[8] CHEN, Shu-Yung; CHEN, Chien-Hao; TSAY, Ren-Song. An activity-sensitive contention delay model for highly efficient deterministic full-system simulations. In: Design, Automation and Test in Europe Conference and Exhibition (DATE), 2014. IEEE, 2014. p. 1-6.
[9] WU, Meng-Huan, et al. A high-parallelism distributed scheduling mechanism for multi-core instruction-set simulation. In: Proceedings of the 48th Design Automation Conference. ACM, 2011. p. 339-344.
[10] WU, Meng-Huan, et al. An effective synchronization approach for fast and accurate multi-core instruction-set simulation. In: Proceedings of the seventh ACM international conference on Embedded software. ACM, 2009. p. 197-204.
[11] WU, Meng-Huan, et al. An effective synchronization approach for fast and accurate multi-core instruction-set simulation. In: Proceedings of the seventh ACM international conference on Embedded software. ACM, 2009. p. 197-204.
[12] YU, Fan-Wei, et al. A critical-section-level timing synchronization approach for deterministic multi-core instruction set simulations. In: Proceedings of the Conference on Design, Automation and Test in Europe. EDA Consortium, 2013. p. 643-648.
[13] ZENG, Bo-Han; TSAY, Ren-Song; WANG, Ting-Chi. An efficient hybrid synchronization technique for scalable multi-core instruction set simulations. In: Design Automation Conference (ASP-DAC), 2013 18th Asia and South Pacific. IEEE, 2013. p. 588-593.
[14] CARLSON, Trevor E.; HEIRMAN, Wim; EECKHOUT, Lieven. Sniper: exploring the level of abstraction for scalable and accurate parallel multi-core simulation. In: Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis. ACM, 2011. p. 52.
[15] MOLLOY, Michael Karl. On the integration of delay and throughput measures in distributed processing models. 1981.
[16] AJMONE MARSAN, M., et al. Modeling bus contention and memory interference in a multiprocessor system. Computers, IEEE Transactions on, 1983, 100.1: 60-72.
[17] LUCIA, Brandon; WOOD, Benjamin P.; CEZE, Luis. Isolating and understanding concurrency errors using reconstructed execution fragments. In: ACM SIGPLAN Notices. ACM, 2011. p. 378-388.
[18] ZHANG, Wei, et al. ConSeq: detecting concurrency bugs through sequential errors. In: ACM SIGPLAN Notices. ACM, 2011. p. 251-264.
[19] ERICKSON, John, et al. Effective Data-Race Detection for the Kernel. In: OSDI. 2010. p. 1-16.
[20] MUSUVATHI, Madanlal, et al. Finding and Reproducing Heisenbugs in Concurrent Programs. In: OSDI. 2008. p. 267-280.
[21] SAVAGE, Stefan, et al. Eraser: A dynamic data race detector for multithreaded programs. ACM Transactions on Computer Systems (TOCS), 1997, 15.4: 391-411.
[22] MOZILLA, rr-Project. [Website]. 2014. Available from: http://rr-project.org/.
[23] ECLIPSE: Parallel Tools Platform (PTP) User Guide. [Website]. 2014. Available from: http://help.eclipse.org/juno/index.jsp?topic=%2Forg.eclipse.ptp.doc.user%2Fhtml%2F06parDebugging.html
[24] ALTEKAR, Gautam; STOICA, Ion. ODR: output-deterministic replay for multicore debugging. In: Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles. ACM, 2009. p. 193-206.
[25] ZAMFIR, Cristian, et al. Debug determinism: the sweet spot for replay-based debugging. In: Workshop on Hot Topics in Operating Systems. 2011.
[26] NARAYANASAMY, Satish; POKAM, Gilles; CALDER, Brad. Bugnet: Continuously recording program execution for deterministic replay debugging. In: ACM SIGARCH Computer Architecture News. IEEE Computer Society, 2005. p. 284-295.
[27] WANG, Yan, et al. DrDebug: Deterministic Replay based Cyclic Debugging with Dynamic Slicing. In: Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization. ACM, 2014. p. 98.
[28] GOTTSCHLICH, Justin E., et al. Concurrent predicates: A debugging technique for every parallel programmer. In: Proceedings of the 22nd international conference on Parallel architectures and compilation techniques. IEEE Press, 2013. p. 331-340.
[29] BERGAN, Tom, et al. CoreDet: a compiler and runtime system for deterministic multithreaded execution. In: ACM SIGARCH Computer Architecture News. ACM, 2010. p. 53-64.
[30] DEVIETTI, Joseph, et al. DMP: deterministic shared memory multiprocessing. In: ACM SIGARCH Computer Architecture News. ACM, 2009. p. 85-96.
[31] LIU, Tongping; CURTSINGER, Charlie; BERGER, Emery D. Dthreads: efficient deterministic multithreading. In: Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles. ACM, 2011. p. 327-336.
[32] OLSZEWSKI, Marek; ANSEL, Jason; AMARASINGHE, Saman. Kendo: efficient deterministic multithreading in software. ACM Sigplan Notices, 2009, 44.3: 97-108.
[33] AVIRAM, Amittai, et al. Efficient system-enforced deterministic parallelism. Communications of the ACM, 2012, 55.5: 111-119.
[34] BELLARD, Fabrice. QEMU, a Fast and Portable Dynamic Translator. In: USENIX Annual Technical Conference, FREENIX Track. 2005. p. 41-46.
[35] WOO, Steven Cameron, et al. The SPLASH-2 programs: Characterization and methodological considerations. In: ACM SIGARCH Computer Architecture News. ACM, 1995. p. 24-36.
[36] HILL, Mark D.; XU, Min. Racey: A stress test for deterministic execution. Available from: http://www.cs.wisc.edu/~markhill/racey.html
[37] DOWNEY, Allen B. The Little Book of Semaphores. Version, 2005, 2.5: 11-15.
[38] SHAMS, Ramtin; KENNEDY, R. A. Efficient histogram algorithms for NVIDIA CUDA compatible devices. In: Proc. Int. Conf. on Signal Processing and Communications Systems (ICSPCS). 2007. p. 418-422.
[39] HUANG, Jeff; ZHANG, Charles; DOLBY, Julian. CLAP: recording local executions to reproduce concurrency failures. In: ACM SIGPLAN Notices. ACM, 2013. p. 141-152.