|
|