[1] S. J. Pan and Q. Yang, “A survey on transfer learning,” IEEE Transactions on Knowledge and Data Engineering, vol. 22, no. 10, pp. 1345-1359, 2010.
[2] M. Sugiyama, M. Krauledat, and K.-R. Müller, “Covariate shift adaptation by importance weighted cross validation,” Journal of Machine Learning Research, vol. 8, pp. 985-1005, 2007.
[3] J. Blitzer, R. McDonald, and F. Pereira, “Domain adaptation with structural correspondence learning,” in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 120-128, 2006.
[4] H. Daumé III, “Frustratingly easy domain adaptation,” in Proceedings of the Conference of the Association for Computational Linguistics (ACL), 2007.
[5] S. J. Pan, J. T. Kwok, and Q. Yang, “Transfer learning via dimensionality reduction,” in Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), pp. 677-682, 2008.
[6] S. J. Pan, I. W. Tsang, J. T. Kwok, and Q. Yang, “Domain adaptation via transfer component analysis,” IEEE Transactions on Neural Networks, vol. 22, no. 2, pp. 199-210, 2011.
[7] M. Chen, Z. Xu, K. Weinberger, and F. Sha, “Marginalized denoising autoencoders for domain adaptation,” in Proceedings of the International Conference on Machine Learning (ICML), 2012.
[8] E. Tzeng, J. Hoffman, N. Zhang, K. Saenko, and T. Darrell, “Deep domain confusion: Maximizing for domain invariance,” arXiv preprint arXiv:1412.3474, 2014.
[9] Y. Ganin, E. Ustinova, H. Ajakan, P. Germain, H. Larochelle, F. Laviolette, M. Marchand, and V. Lempitsky, “Domain-adversarial training of neural networks,” Journal of Machine Learning Research, vol. 17, no. 59, pp. 1-35, 2016.
[10] C. Louizos, K. Swersky, Y. Li, M. Welling, and R. Zemel, “The variational fair autoencoder,” in International Conference on Learning Representations (ICLR), 2016.
[11] M. Long, H. Zhu, J. Wang, and M. I. Jordan, “Unsupervised domain adaptation with residual transfer networks,” in Advances in Neural Information Processing Systems (NIPS), pp. 136-144, 2016.
[12] K. Bousmalis, G. Trigeorgis, N. Silberman, D. Krishnan, and D. Erhan, “Domain separation networks,” in Advances in Neural Information Processing Systems (NIPS), pp. 343-351, 2016.
[13] R. Raina, A. Battle, H. Lee, B. Packer, and A. Y. Ng, “Self-taught learning: Transfer learning from unlabeled data,” in Proceedings of the International Conference on Machine Learning (ICML), pp. 759-766, 2007.
[14] J. Huang, A. Gretton, K. M. Borgwardt, B. Schölkopf, and A. J. Smola, “Correcting sample selection bias by unlabeled data,” in Advances in Neural Information Processing Systems (NIPS), pp. 601-608, 2006.
[15] M. Sugiyama, S. Nakajima, H. Kashima, P. V. Buenau, and M. Kawanabe, “Direct importance estimation with model selection and its application to covariate shift adaptation,” in Advances in Neural Information Processing Systems (NIPS), pp. 1433-1440, 2008.
[16] T. Kanamori, S. Hido, and M. Sugiyama, “A least-squares approach to direct importance estimation,” Journal of Machine Learning Research, vol. 10, pp. 1391-1445, 2009.
[17] S. Bickel, M. Brückner, and T. Scheffer, “Discriminative learning under covariate shift,” Journal of Machine Learning Research, vol. 10, pp. 2137-2155, 2009.
[18] F. Zhuang, X. Cheng, S. J. Pan, W. Yu, Q. He, and Z. Shi, “Transfer learning with multiple sources via consensus regularized autoencoders,” in Joint European Conference on Machine Learning and Knowledge Discovery in Databases (ECML PKDD), pp. 417-431, 2014.
[19] A. T. Selvan, S. Samanta, and S. Das, “Domain adaptation using weighted subspace sampling for object categorization,” in 2015 Eighth International Conference on Advances in Pattern Recognition (ICAPR), pp. 1-5, 2015.
[20] J. Jiang and C. Zhai, “Instance weighting for domain adaptation in NLP,” in Proceedings of the Conference of the Association for Computational Linguistics (ACL), pp. 264-271, 2007.
[21] B. Zadrozny, “Learning and evaluating classifiers under sample selection bias,” in Proceedings of the International Conference on Machine Learning (ICML), p. 114, 2004.
[22] A. Gretton, K. M. Borgwardt, M. Rasch, B. Schölkopf, and A. J. Smola, “A kernel method for the two-sample-problem,” in Advances in Neural Information Processing Systems (NIPS), pp. 513-520, 2007.
[23] K. M. Borgwardt, A. Gretton, M. J. Rasch, H.-P. Kriegel, B. Schölkopf, and A. J. Smola, “Integrating structured biological data by kernel maximum mean discrepancy,” Bioinformatics, vol. 22, no. 14, pp. e49-e57, 2006.
[24] I. Steinwart, “On the influence of the kernel on the consistency of support vector machines,” Journal of Machine Learning Research, vol. 2, pp. 67-93, 2002.
[25] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, “Learning internal representations by error propagation,” in Parallel Distributed Processing: Explorations in the Microstructure of Cognition (D. E. Rumelhart, J. L. McClelland, and the PDP Research Group, eds.), vol. 1, pp. 318-362, MIT Press, 1986.
[26] Q. V. Le, J. Ngiam, A. Coates, A. Lahiri, B. Prochnow, and A. Y. Ng, “On optimization methods for deep learning,” in Proceedings of the International Conference on Machine Learning (ICML), pp. 265-272, 2011.
[27] J. Martens, “Deep learning via Hessian-free optimization,” in Proceedings of the International Conference on Machine Learning (ICML), pp. 735-742, 2010.
[28] D. Kingma and J. Ba, “Adam: A method for stochastic optimization,” in International Conference on Learning Representations (ICLR), 2015.
[29] N. Qian, “On the momentum term in gradient descent learning algorithms,” Neural Networks, vol. 12, no. 1, pp. 145-151, 1999.
[30] M. Kan, S. Shan, and X. Chen, “Bi-shifting auto-encoder for unsupervised domain adaptation,” in Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp. 3846-3854, 2015.
[31] F. Zhuang, X. Cheng, P. Luo, S. J. Pan, and Q. He, “Supervised representation learning: Transfer learning with deep autoencoders,” in Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), 2015.
[32] W. Jiang, H. Gao, F.-L. Chung, and H. Huang, “The l2,1-norm stacked robust autoencoders for domain adaptation,” in Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (AAAI), 2016.
[33] Y. Ganin and V. Lempitsky, “Unsupervised domain adaptation by backpropagation,” arXiv preprint arXiv:1409.7495, 2014.
[34] R. Collobert and J. Weston, “A unified architecture for natural language processing: Deep neural networks with multitask learning,” in Proceedings of the 25th International Conference on Machine Learning (ICML), pp. 160-167, 2008.
[35] X. Glorot, A. Bordes, and Y. Bengio, “Domain adaptation for large-scale sentiment classification: A deep learning approach,” in Proceedings of the International Conference on Machine Learning (ICML), pp. 513-520, 2011.
[36] T. H. Nguyen and R. Grishman, “Event detection and domain adaptation with convolutional neural networks,” in Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (ACL-IJCNLP), vol. 2, pp. 365-371, 2015.
[37] R. Johnson and T. Zhang, “Semi-supervised convolutional neural networks for text categorization via region embedding,” in Advances in Neural Information Processing Systems (NIPS), pp. 919-927, 2015.
[38] J. Hu, J. Lu, and Y.-P. Tan, “Deep transfer metric learning,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 325-333, 2015.
[39] J. Yosinski, J. Clune, Y. Bengio, and H. Lipson, “How transferable are features in deep neural networks?,” in Advances in Neural Information Processing Systems (NIPS), pp. 3320-3328, 2014.
[40] H.-Y. Chen and J.-T. Chien, “Deep semi-supervised learning for domain adaptation,” in 2015 IEEE 25th International Workshop on Machine Learning for Signal Processing (MLSP), pp. 1-6, 2015.
[41] M. Long and J. Wang, “Learning transferable features with deep adaptation networks,” arXiv preprint arXiv:1502.02791, 2015.
[42] A. Rahimi and B. Recht, “Weighted sums of random kitchen sinks: Replacing minimization with randomization in learning,” in Advances in Neural Information Processing Systems (NIPS), pp. 1313-1320, 2009.
[43] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial nets,” in Advances in Neural Information Processing Systems (NIPS), pp. 2672-2680, 2014.
[44] A. Radford, L. Metz, and S. Chintala, “Unsupervised representation learning with deep convolutional generative adversarial networks,” arXiv preprint arXiv:1511.06434, 2015.
[45] S. Nowozin, B. Cseke, and R. Tomioka, “f-GAN: Training generative neural samplers using variational divergence minimization,” in Advances in Neural Information Processing Systems (NIPS), pp. 271-279, 2016.
[46] M. Arjovsky, S. Chintala, and L. Bottou, “Wasserstein GAN,” arXiv preprint arXiv:1701.07875, 2017.
[47] X. Mao, Q. Li, H. Xie, R. Y. Lau, Z. Wang, and S. P. Smolley, “Least squares generative adversarial networks,” arXiv preprint arXiv:1611.04076, 2016.
[48] M.-Y. Liu and O. Tuzel, “Coupled generative adversarial networks,” in Advances in Neural Information Processing Systems (NIPS), pp. 469-477, 2016.
[49] X. Chen, Y. Duan, R. Houthooft, J. Schulman, I. Sutskever, and P. Abbeel, “InfoGAN: Interpretable representation learning by information maximizing generative adversarial nets,” in Advances in Neural Information Processing Systems (NIPS), pp. 2172-2180, 2016.
[50] G. Salton and C. Buckley, “Term-weighting approaches in automatic text retrieval,” Information Processing & Management, vol. 24, no. 5, pp. 513-523, 1988.
[51] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay, “Scikit-learn: Machine learning in Python,” Journal of Machine Learning Research, vol. 12, pp. 2825-2830, 2011.
[52] P. Arbelaez, M. Maire, C. Fowlkes, and J. Malik, “Contour detection and hierarchical image segmentation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 5, pp. 898-916, 2011.
[53] M. Long, J. Wang, G. Ding, J. Sun, and P. S. Yu, “Transfer feature learning with joint distribution adaptation,” in Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp. 2200-2207, 2013.
[54] D. L. Davies and D. W. Bouldin, “A cluster separation measure,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PAMI-1, no. 2, pp. 224-227, 1979.
[55] J. Blitzer, M. Dredze, and F. Pereira, “Biographies, bollywood, boom-boxes and blenders: Domain adaptation for sentiment classification,” in Proceedings of the Conference of the Association for Computational Linguistics (ACL), pp. 440-447, 2007.
[56] I. Sutskever, J. Martens, G. Dahl, and G. Hinton, “On the importance of initialization and momentum in deep learning,” in Proceedings of the International Conference on Machine Learning (ICML), pp. 1139-1147, 2013.