|
[1]John Walker, S. (2014). Big data: A revolution that will transform how we live, work, and think. [2]Zaharia, M., Chowdhury, M., Franklin, M. J., Shenker, S., & Stoica, I. (2010). Spark: cluster computing with working sets. HotCloud, 10, 10-10. [3]Alsheikh, M. A., Niyato, D., Lin, S., Tan, H. P., & Han, Z. (2016). Mobile big data analytics using deep learning and apache spark. IEEE network, 30(3), 22-29. [4]Chang, Victor, Yen-Hung Kuo, and Muthu Ramachandran. "Cloud computing adoption framework: A security framework for business clouds." Future Generation Computer Systems 57 (2016): 24-41. [5]Goodfellow, I., Bengio, Y., Courville, A., & Bengio, Y. (2016). Deep learning (Vol. 1). Cambridge: MIT press. [6]Silver, D., Huang, A., Maddison, C. J., Guez, A., Sifre, L., Van Den Driessche, G., ... & Dieleman, S. (2016). Mastering the game of Go with deep neural networks and tree search. nature, 529(7587), 484. [7]Collobert, R., & Weston, J. (2008, July). A unified architecture for natural language processing: Deep neural networks with multitask learning. In Proceedings of the 25th international conference on Machine learning (pp. 160-167). ACM. [8]Glorot, X., Bordes, A., & Bengio, Y. (2011). Domain adaptation for large-scale sentiment classification: A deep learning approach. In Proceedings of the 28th international conference on machine learning (ICML-11) (pp. 513-520). [9]Merkel, D. (2014). Docker: lightweight linux containers for consistent development and deployment. Linux Journal, 2014(239), 2. [10]Burns, B., Grant, B., Oppenheimer, D., Brewer, E., & Wilkes, J. (2016). Borg, omega, and kubernetes. Queue, 14(1), 10. [11]Apache Hadoop. http://hadoop.apache.org/ [12]Ghazi, M. R., & Gangodkar, D. (2015). Hadoop, MapReduce and HDFS: A Developers Perspective.Procedia Computer Science,48, 45-50. [13]K. V. Shvachko, “HDFS Scalability: The limits to growth,” ;login:. April 2010, pp. 6–16 [14]Odersky, M., Spoon, L., & Venners, B. (2008). Programming in scala. Artima Inc. [15]P.F. Dubois, editor. Python: Batteries Included, volume 9 of Computing in Science & Engineering. IEEE/AIP, May 200 [16]Arnold, K., Gosling, J., & Holmes, D. (2005). The Java programming language. Addison Wesley Professional. [17]Zaharia, M., Chowdhury, M., Das, T., Dave, A., Ma, J., McCauley, M., ... & Stoica, I. (2012, April). Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing. In Proceedings of the 9th USENIX conference on Networked Systems Design and Implementation (pp. 2-2). USENIX Association. [18]Dean, J., & Ghemawat, S. (2008). MapReduce: simplified data processing on large clusters. Communications of the ACM, 51(1), 107-113. [19]Isard, M., Budiu, M., Yu, Y., Birrell, A., & Fetterly, D. (2007, March). Dryad: distributed data-parallel programs from sequential building blocks. In ACM SIGOPS operating systems review (Vol. 41, No. 3, pp. 59-72). ACM. [20]Nitzberg, B., & Lo, V. (1991). Distributed shared memory: A survey of issues and algorithms. Computer, 24(8), 52-60. [21]Armbrust, M., Xin, R. S., Lian, C., Huai, Y., Liu, D., Bradley, J. K., ... & Zaharia, M. (2015, May). Spark sql: Relational data processing in spark. In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data (pp. 1383-1394). ACM. [22]Thusoo, A., Sarma, J. S., Jain, N., Shao, Z., Chakka, P., Anthony, S., ... & Murthy, R. (2009). Hive: a warehousing solution over a map-reduce framework. Proceedings of the VLDB Endowment, 2(2), 1626-1629. [23]Crockford, D. (2006). The application/json media type for javascript object notation (json) (No. RFC 4627). [24]Hamilton, G., Cattell, R., & Fisher, M. (1997). JDBC Database Access with Java(Vol. 7). Addison Wesley. [25]Kreps, J., Narkhede, N., & Rao, J. (2011, June). Kafka: A distributed messaging system for log processing. In Proceedings of the NetDB (pp. 1-7). [26]Hoffman, S. (2013). Apache Flume: distributed log collection for Hadoop. Packt Publishing Ltd. [27]Amazon Web Services. http://s3.amazonaws.com [28]Kwak, H., Lee, C., Park, H., & Moon, S. (2010, April). What is Twitter, a social network or a news media?. In Proceedings of the 19th international conference on World wide web (pp. 591-600). AcM. [29]Amazon Kinesis http://aws.amazon.com/kinesis/ Retrieved: Jul, 2015 [30]Joachims, T. (1998). Making large-scale SVM learning practical(No. 1998, 28). Technical Report, SFB 475: Komplexitätsreduktion in Multivariaten Datenstrukturen, Universität Dortmund. [31]Ruczinski, I., Kooperberg, C., & LeBlanc, M. (2003). Logic regression. Journal of Computational and graphical Statistics, 12(3), 475-511. [32]Neter, J., Kutner, M. H., Nachtsheim, C. J., & Wasserman, W. (1996). Applied linear statistical models (Vol. 4, p. 318). Chicago: Irwin. [33]Murphy, K. P. (2006). Naive bayes classifiers. University of British Columbia, 18. [34]Freund, Y., & Mason, L. (1999, June). The alternating decision tree learning algorithm. In icml (Vol. 99, pp. 124-133). [35]Takane, Y., Young, F. W., & De Leeuw, J. (1977). Nonmetric individual differences multidimensional scaling: An alternating least squares method with optimal scaling features. Psychometrika, 42(1), 7-67. [36]Huang, Zhexue. "Extensions to the k-means algorithm for clustering large data sets with categorical values." Data mining and knowledge discovery 2.3 (1998): 283-304 [37]Reynolds, D. A., & Rose, R. C. (1995). Robust text-independent speaker identification using Gaussian mixture speaker models. IEEE transactions on speech and audio processing, 3(1), 72-83. [38]Lin, F., & Cohen, W. W. (2010, June). Power Iteration Clustering. In ICML (Vol. 10, pp. 655-662). [39]Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent dirichlet allocation. Journal of machine Learning research, 3(Jan), 993-1022. [40]Golub, G. H., & Reinsch, C. (1970). Singular value decomposition and least squares solutions. Numerische mathematik, 14(5), 403-420. [41]Jolliffe, I. (2011). Principal component analysis. In International encyclopedia of statistical science (pp. 1094-1096). Springer, Berlin, Heidelberg. [42]Han, J., Pei, J., & Yin, Y. (2000, May). Mining frequent patterns without candidate generation. In ACM sigmod record (Vol. 29, No. 2, pp. 1-12). ACM. [43]Docker: https://www.docker.com/what-docker [44]Sefraoui, O., Aissaoui, M., & Eleuldj, M. (2012). OpenStack: toward an open-source solution for cloud computing. International Journal of Computer Applications, 55(3), 38-42. [45]Linux Containers:https://linuxcontainers.org/ [46]R. J. Creasy, "The origin of the VM/370 time-sharing system", IBM Journal of Research & Development, Vol. 25, No. 5 (September 1981), pp. 483–9 [47]T. Werner:VirtualBox im auswärtigen Amt. (Case study, slides in German)Presented at Frühjahrsfachgespräch 2008 of the German Unix User Group, Munich, Germany, March 2008. [48]Adams, K., & Agesen, O. (2006). A comparison of software and hardware techniques for x86 virtualization. ACM SIGARCH Computer Architecture News, 34(5), 2-13. [49]Barham, P., Dragovic, B., Fraser, K., Hand, S., Harris, T., Ho, A., ... & Warfield, A. (2003, October). Xen and the art of virtualization. In ACM SIGOPS operating systems review (Vol. 37, No. 5, pp. 164-177). ACM. [50]Kivity, A., Kamay, Y., Laor, D., Lublin, U., & Liguori, A. (2007, July). kvm: the Linux virtual machine monitor. In Proceedings of the Linux symposium (Vol. 1, pp. 225-230). [51]Kubernetes. [Online].Avaliable: https://kubernetes.io. [Accessed: 11-Dec-2017]. [52]Verma, A., Pedrosa, L., Korupolu, M., Oppenheimer, D., Tune, E., & Wilkes, J. (2015, April). Large-scale cluster management at Google with Borg. In Proceedings of the Tenth European Conference on Computer Systems (p. 18). ACM. [53]Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., ... & Vanderplas, J. (2011). Scikit-learn: Machine learning in Python. Journal of machine learning research, 12(Oct), 2825-2830. [54]Chollet, F. (2018). Keras: The python deep learning library. Astrophysics Source Code Library. [55]Leonard Kaufman and J. Peter Rousseeuw, "Clustering by means of Medoids," in Statistical Data Analysis Based on the L_1–Norm and Related Methods, 1987, pp. 405–416. [56]Leonard Kaufman and Peter J. Rousseuw, Finding Groups in Data: An Introduction to Cluster Analysis.: Wiley, 1990. [57]Sinha, V., Doucet, F., Siska, C., Gupta, R., Liao, S., & Ghosh, A. (2000, September). YAML: a tool for hardware design visualization and capture. In Proceedings of the 13th international symposium on System synthesis (pp. 9-14). IEEE Computer Society. [58]Taieb, S. B., & Hyndman, R. J. (2014). A gradient boosting approach to the Kaggle load forecasting competition. International journal of forecasting, 30(2), 382-394. [59]Puurula, A., Read, J., & Bifet, A. (2014). Kaggle LSHTC4 winning solution. arXiv preprint arXiv:1405.0546.
|