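Author: Sung-Ying Fang (方松營)
Thesis title: Understanding the MD&A Section in Financial Reports through Sentence Classification
Thesis title (Chinese): 以文字探勘方法理解財報中的管理層討論與分析
Advisor: Hsin-Min Lu (盧信銘)
Committee members: Yi-Cheng Chen (陳以錚), Yu-Tai Chien (簡宇泰)
Oral defense date: 2019-07-02
Degree: Master's
Institution: National Taiwan University
Department: Graduate Institute of Information Management
Discipline: Computer Science
Subdiscipline: General Computer Science
Thesis type: Academic thesis
Publication year: 2019
Graduation academic year: 107 (ROC calendar)
Language: English
Pages: 80
Keywords: text mining, management discussion and analysis (MD&A), 10-K filings, manual labeling, BERT, sentence classifier
DOI: 10.6342/NTU201901359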
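In accounting and finance research on textual information, text mining and its content analysis methods have long been an essential foundation of empirical work. However, the content analysis methods commonly used in this literature are mostly sentiment dictionaries or basic machine learning methods, and few studies take into account recent breakthroughs in deep learning and transfer learning in natural language processing.
This study uses a pre-trained natural language processing model, Bidirectional Encoder Representations from Transformers (BERT), with the aim of improving the accuracy of text mining methods on financial text. To this end, we label sentences from the Management's Discussion and Analysis (MD&A) item of annual reports as training data, so that a sentence classification model can accurately identify whether a sentence is forward-looking, its textual sentiment, and which accounting items it covers. After building the sentence classification model, we analyze MD&A sections from 2007 to 2016 and revisit important research questions in financial text mining identified in the recent literature, in order to validate the labeled data and the classifier's accuracy. We find that our results are largely consistent with prior literature, and in particular reach conclusions close to those of Li (2010): the textual sentiment of the MD&A can explain part of the finance-related data and the environmental factors within each industry.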
Textual analysis has been an emerging area in accounting and finance research. With the growing realization that economic and statistical models may not adequately explain the market with conventional quantitative measures alone, an extensive empirical literature has attempted to incorporate verbal, non-quantitative measures. However, natural language processing has seen extensive progress in machine learning methods, such as new methods of language representation and the application of transfer learning, which have not been commonly used in academic papers featuring textual analysis in a financial context.
This paper aims to contribute to the field by preparing a training dataset suitable for a wider array of research questions and by applying a contemporary machine learning method, Bidirectional Encoder Representations from Transformers (BERT). Our sentence classifier aims to correctly classify sentences with respect to their tone, their accounting category or topic based on the context of the sentence, and whether or not the sentence is a forward-looking statement. By applying our sentence classifier to out-of-sample annual filings, we evaluate our dataset and classification method by revisiting a subset of research questions drawn from our literature review. Our dataset and preliminary descriptive analysis align with the results of many empirical models from other studies, most notably with the analysis by Li (2010). In conclusion, the tone of MD&A sections is mean-reverting, may proxy for economic determinants, and can be useful in inspecting the macro-environment of an industry.
CONTENT
ACKNOWLEDGEMENTS
Chinese Abstract
ABSTRACT
CONTENT
LIST OF FIGURES
LIST OF TABLES
Chapter 1 Introduction
Chapter 2 Literature Review
2.1 Corpora
2.1.1 Outsider-expressed Corpora
2.1.2 Insider-expressed Corpora
2.1.3 The Management Discussion & Analysis Section
2.2 Content Analysis Methods
2.2.1 Readability Measures
2.2.2 Dictionary-based Approaches
2.2.3 Machine Learning Methods
2.3 Applications in Financial Context
2.3.1 Textual Sentiment
2.3.2 Predictive Modeling
Chapter 3 Methodology
3.1 Data Preparation
3.1.1 Corpora Selection
3.1.2 Label Selection
3.2 Classification Method
3.2.1 BERT: Bidirectional Encoding Representations from Transformers
3.2.2 Fine-tuning BERT
Chapter 4 Experiment Results
4.1 Summary of Training Dataset
4.2 Classifier Evaluation
4.3 Preliminary Descriptive Analysis
4.3.1 Summary of Predicted Dataset
4.3.2 Determinants of MD&A Tone
4.3.3 Aggregate Tone of MD&As over Time
Chapter 5 Conclusions & Future Work
REFERENCES
REFERENCES
Ahern, K. R., & Sosyura, D. (2014). Who writes the news? Corporate press releases during merger negotiations. The Journal of Finance, 69(1), 241-291.
Antweiler, W., & Frank, M. Z. (2004). Is all that talk just noise? The information content of internet stock message boards. The Journal of Finance, 59(3), 1259-1294.
Barron, O. E., Kile, C. O., & O'Keefe, T. B. (1999). MD&A quality as measured by the SEC and analysts' earnings forecasts. Contemporary Accounting Research, 16(1), 75-109.
Bloomfield, R. (2008). Discussion of “annual report readability, current earnings, and earnings persistence”. Journal of Accounting and Economics, 45(2-3), 248-252.
Bonsall IV, S. B., Leone, A. J., Miller, B. P., & Rennekamp, K. (2017). A plain English measure of financial reporting readability. Journal of Accounting and Economics, 63(2-3), 329-357.
Brown, S. V., & Tucker, J. W. (2011). Large‐sample evidence on firms’ year‐over‐year MD&A modifications. Journal of Accounting Research, 49(2), 309-346.
Buehlmaier, M. M., & Whited, T. M. (2018). Are financial constraints priced? Evidence from textual analysis. The Review of Financial Studies, 31(7), 2693-2728.
Campbell, J. L., Chen, H., Dhaliwal, D. S., Lu, H.-m., & Steele, L. B. (2014). The information content of mandatory risk factor disclosures in corporate filings. Review of Accounting Studies, 19(1), 396-455.
Chen, H., De, P., Hu, Y. J., & Hwang, B.-H. (2014). Wisdom of crowds: The value of stock opinions transmitted through social media. The Review of Financial Studies, 27(5), 1367-1403.
Chen, Y.-H., & Lu, H.-M. (2018). Item Extraction for Annual Financial Report: Annotation and Evaluation.
Clarkson, P. M., Kao, J. L., & Richardson, G. D. (1999). Evidence that management discussion and analysis (MD&A) is a part of a firm's overall disclosure package. Contemporary Accounting Research, 16(1), 111-134.
Cole, C. J., & Jones, C. L. (2015). The quality of management forecasts of capital expenditures and store openings in MD&A. Journal of Accounting, Auditing & Finance, 30(2), 127-149.
Das, S. R., & Chen, M. Y. (2007). Yahoo! for Amazon: Sentiment extraction from small talk on the web. Management Science, 53(9), 1375-1388.
Davis, A. K., Ge, W., Matsumoto, D., & Zhang, J. L. (2015). The effect of manager-specific optimism on the tone of earnings conference calls. Review of Accounting Studies, 20(2), 639-673.
Davis, A. K., & Tama‐Sweet, I. (2012). Managers’ use of language across alternative disclosure outlets: earnings press releases versus MD&A. Contemporary Accounting Research, 29(3), 804-837.
Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
Dougal, C., Engelberg, J., Garcia, D., & Parsons, C. A. (2012). Journalists and the stock market. The Review of Financial Studies, 25(3), 639-679.
Ertugrul, M., Lei, J., Qiu, J., & Wan, C. (2017). Annual report readability, tone ambiguity, and the cost of borrowing. Journal of Financial and Quantitative Analysis, 52(2), 811-836.
Feldman, R., Govindaraj, S., Livnat, J., & Segal, B. (2008). The incremental information content of tone change in management discussion and analysis.
Fissette, M., & de Vries, T. (2017). Text mining to detect indications of fraud in annual reports worldwide. Paper presented at the Benelearn 2017: Proceedings of the Twenty-Sixth Benelux Conference on Machine Learning, Technische Universiteit Eindhoven, 9-10 June 2017.
Garcia, D. (2013). Sentiment during recessions. The Journal of Finance, 68(3), 1267-1300.
Gerdes Jr, J. (2003). EDGAR-Analyzer: automating the analysis of corporate data contained in the SEC's EDGAR database. Decision Support Systems, 35(1), 7-29.
Härdle, W. K., Hoffmann, L., & Moro, R. (2011). Learning machines supporting bankruptcy prediction. In Statistical Tools for Finance and Insurance (pp. 225-250): Springer.
Heaton, J., Polson, N., & Witte, J. H. (2017). Deep learning for finance: deep portfolios. Applied Stochastic Models in Business and Industry, 33(1), 3-12.
Heaton, J., Polson, N. G., & Witte, J. H. (2016). Deep learning in finance. arXiv preprint arXiv:1602.06561.
Hendershott, T., Livdan, D., & Schürhoff, N. (2015). Are institutions informed about news? Journal of Financial Economics, 117(2), 249-287.
Henry, E. (2008). Are investors influenced by how earnings press releases are written? The Journal of Business Communication (1973), 45(4), 363-407.
Heston, S. L., & Sinha, N. R. (2014). News versus sentiment: Comparing textual processing approaches for predicting stock returns. Robert H. Smith School Research Paper.
Howard, J., & Ruder, S. (2018). Universal language model fine-tuning for text classification. arXiv preprint arXiv:1801.06146.
Huang, A. H., Zang, A. Y., & Zheng, R. (2014). Evidence on the information content of text in analyst reports. The Accounting Review, 89(6), 2151-2180.
Huang, X., Teoh, S. H., & Zhang, Y. (2013). Tone management. The Accounting Review, 89(3), 1083-1113.
Hüfner, B. (2007). The SEC's MD&A: Does it meet the informational demands of investors? Schmalenbach Business Review, 59(1), 58-84.
Jegadeesh, N., & Wu, D. (2013). Word power: A new approach for content analysis. Journal of Financial Economics, 110(3), 712-729.
Jiang, F., Lee, J., Martin, X., & Zhou, G. (2019). Manager sentiment and stock returns. Journal of Financial Economics, 132(1), 126-149.
Kearney, C., & Liu, S. (2014). Textual sentiment in finance: A survey of methods and models. International Review of Financial Analysis, 33, 171-185.
Kothari, S. P., Li, X., & Short, J. E. (2009). The effect of disclosures by management, analysts, and business press on cost of capital, return volatility, and analyst forecasts: A study using content analysis. The Accounting Review, 84(5), 1639-1670.
Kraus, M., & Feuerriegel, S. (2017). Decision support from financial disclosures with deep neural networks and transfer learning. Decision Support Systems, 104, 38-48.
Li, F. (2006). Do stock market investors understand the risk sentiment of corporate annual reports? Available at SSRN 898181.
Li, F. (2008). Annual report readability, current earnings, and earnings persistence. Journal of Accounting and Economics, 45(2-3), 221-247.
Li, F. (2010). The information content of forward‐looking statements in corporate filings—A naïve Bayesian machine learning approach. Journal of Accounting Research, 48(5), 1049-1102.
Loughran, T., & McDonald, B. (2011). When is a liability not a liability? Textual analysis, dictionaries, and 10‐Ks. The Journal of Finance, 66(1), 35-65.
Loughran, T., & McDonald, B. (2014). Measuring readability in financial disclosures. The Journal of Finance, 69(4), 1643-1671.
Loughran, T., & McDonald, B. (2016). Textual analysis in accounting and finance: A survey. Journal of Accounting Research, 54(4), 1187-1230.
Manela, A., & Moreira, A. (2017). News implied volatility and disaster concerns. Journal of Financial Economics, 123(1), 137-162.
Mayew, W. J., Sethuraman, M., & Venkatachalam, M. (2014). MD&A Disclosure and the Firm's Ability to Continue as a Going Concern. The Accounting Review, 90(4), 1621-1651.
Muslu, V., Radhakrishnan, S., Subramanyam, K., & Lim, D. (2014). Forward-looking MD&A disclosures and the information environment. Management Science, 61(5), 931-948.
Nelson, D. M., Pereira, A. C., & de Oliveira, R. A. (2017). Stock market's price movement prediction with LSTM neural networks. Paper presented at the 2017 International Joint Conference on Neural Networks (IJCNN).
Peters, M. E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., & Zettlemoyer, L. (2018). Deep contextualized word representations. arXiv preprint arXiv:1802.05365.
Petersen, M. A. (2004). Information: Hard and soft.
Price, S. M., Doran, J. S., Peterson, D. R., & Bliss, B. A. (2012). Earnings conference calls and stock returns: The incremental informativeness of textual tone. Journal of Banking & Finance, 36(4), 992-1011.
Purda, L., & Skillicorn, D. (2015). Accounting variables, deception, and a bag of words: Assessing the tools of fraud detection. Contemporary Accounting Research, 32(3), 1193-1223.
Ryans, J. (2018). Textual classification of SEC comment letters. Available at SSRN 2474666.
SEC, U. (2003). Interpretation: Commission Guidance Regarding Management's Discussion and Analysis of Financial Condition and Results of Operations. In: SEC Washington, DC.
Tavcar, L. R. (1998). Make the MD&A more readable. The CPA Journal, 68(1), 10.
Tetlock, P. C. (2007). Giving content to investor sentiment: The role of media in the stock market. The Journal of Finance, 62(3), 1139-1168.
Tetlock, P. C., Saar‐Tsechansky, M., & Macskassy, S. (2008). More than words: Quantifying language to measure firms' fundamentals. The Journal of Finance, 63(3), 1437-1467.
Tsantekidis, A., Passalis, N., Tefas, A., Kanniainen, J., Gabbouj, M., & Iosifidis, A. (2017). Forecasting stock prices from the limit order book using convolutional neural networks. Paper presented at the 2017 IEEE 19th Conference on Business Informatics (CBI).
Wu, Y., Schuster, M., Chen, Z., Le, Q. V., Norouzi, M., Macherey, W., . . . Macherey, K. (2016). Google's neural machine translation system: Bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144.