[1] U. Ramacher, “Software-defined radio prospects for multistandard mobile phones,” Computer, vol. 40, no. 10, 2007.
[2] S. Yang and L. Hanzo, “Fifty years of MIMO detection: The road to large-scale MIMOs,” IEEE Communications Surveys & Tutorials, vol. 17, no. 4, pp. 1941–1988, 2015.
[3] Q. Zheng, Y. Chen, R. Dreslinski, C. Chakrabarti, A. Anastasopoulos, S. Mahlke, and T. Mudge, “Architecting an LTE base station with graphics processing units,” in 2013 IEEE Workshop on Signal Processing Systems, SiPS 2013. IEEE, 2013, pp. 219–224.
[4] J. Berkmann, C. Carbonelli, F. Dietrich, C. Drewes, and W. Xu, “On 3G LTE terminal implementation-standard, algorithms, complexities and challenges,” in Wireless Communications and Mobile Computing Conference, 2008. IWCMC’08. International. IEEE, 2008, pp. 970–975.
[5] J. Ketonen, M. Juntti, and J. R. Cavallaro, “Performance–complexity comparison of receivers for a LTE MIMO–OFDM system,” IEEE Transactions on Signal Processing, vol. 58, no. 6, pp. 3360–3372, 2010.
[6] Q. Zheng, Y. Chen, R. Dreslinski, C. Chakrabarti, A. Anastasopoulos, S. Mahlke, and T. Mudge, “WiBench: An open source kernel suite for benchmarking wireless systems,” in Proceedings of the IEEE International Symposium on Workload Characterization (IISWC), 2013, pp. 123–132.
[7] S. Bang, C. Ahn, Y. Jin, S. Choi, J. Glossner, and S. Ahn, “Implementation of LTE system on an SDR platform using CUDA and UHD,” Analog Integrated Circuits and Signal Processing, vol. 78, no. 3, pp. 599–610, 2014.
[8] J. Kim, S. Hyeon, and S. Choi, “Implementation of an SDR system using graphics processing unit,” IEEE Communications Magazine, vol. 48, no. 3, 2010.
[9] T.-D. Chiueh, P.-Y. Tsai, and I.-W. Lai, Baseband receiver design for wireless MIMO-OFDM communications. John Wiley & Sons, 2012.
[10] B. Hassibi and H. Vikalo, “On the sphere-decoding algorithm I. Expected complexity,” IEEE Transactions on Signal Processing, vol. 53, no. 8, pp. 2806–2818, 2005.
[11] S. Cook, CUDA Programming: A Developer’s Guide to Parallel Computing with GPUs, 1st ed. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 2013.
[12] Q. Xu, H. Jeon, and M. Annavaram, “Graph processing on GPUs: Where are the bottlenecks?” in Proceedings of the IEEE International Symposium on Workload Characterization (IISWC), 2014, pp. 140–149.
[13] D. Sui, Y. Li, J. Wang, P. Wang, and B. Zhou, “High throughput MIMO-OFDM detection with graphics processing units,” in 2012 IEEE International Conference on Computer Science and Automation Engineering (CSAE), vol. 2, pp. 176–179.
[14] M. Wu, S. Gupta, Y. Sun, and J. R. Cavallaro, “A GPU implementation of a real-time MIMO detector,” in 2009 IEEE Workshop on Signal Processing Systems, SiPS, 2009, pp. 303–308.
[15] M. S. Khairy, C. Mehlführer, and M. Rupp, “Boosting sphere decoding speed through graphic processing units,” in 2010 European Wireless Conference (EW). IEEE, 2010, pp. 99–104.