Seo+02

2013-02-21 (Thu) 18:38:47

Text analysis

http://www.ri.cmu.edu/pubs/pub_3976.html

Y. Seo, J.A. Giampapa, and K. Sycara. Text Classification for Intelligent Portfolio Management. Tech. report CMU-RI-TR-02-14, Robotics Institute, Carnegie Mellon University, May 2002.

Abstract

In the application domain of stock portfolio management, software agents that evaluate the risks associated with the individual companies of a portfolio should be able to read electronic news articles that are written to give investors an indication of the financial outlook of a company. There is a positive correlation between news reports on a company's financial outlook and the company's attractiveness as an investment. However, because of the volume of such reports, it is impossible for financial analysts or investors to track and read each one. Therefore, it would be very helpful to have a system that automatically classifies news reports that reflect positively or negatively on a company's financial outlook. To accomplish this task, we treat the understanding of news articles as a text classification problem. In this paper, we propose a text classification method that we call "Domain Experts" and "Self-Confident" sampling, and compare it with naive Bayes with expectation maximization (EM). We evaluate these learning techniques in terms of how well they improve with unlabeled data after being initially trained on a small number of human-labeled articles and how well they classify the latest financial news articles. The significance of this work lies in the new classification method that we propose and in the sampling technique we used for improving classification accuracy.
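The abstract does not spell out how "Self-Confident" sampling or the EM baseline work, so the following is only a rough illustration of the general setting it describes: a classifier trained on a few labeled articles that then improves with unlabeled ones. It is a minimal sketch of generic confidence-thresholded self-training on top of multinomial naive Bayes, not the authors' method and not the EM variant. The function names (`train_nb`, `predict_proba`, `self_train`), the `threshold` and `max_rounds` parameters, and the toy sentences are all assumptions made for illustration.

```python
import math
from collections import Counter


def train_nb(docs, labels):
    """Fit a multinomial naive Bayes model (counts only; smoothing is applied at prediction)."""
    classes = sorted(set(labels))
    vocab = {w for d in docs for w in d.split()}
    doc_counts = Counter(labels)
    word_counts = {c: Counter() for c in classes}
    for d, y in zip(docs, labels):
        word_counts[y].update(d.split())
    return classes, vocab, doc_counts, word_counts


def predict_proba(model, doc):
    """Return {class: posterior probability} using add-one smoothing."""
    classes, vocab, doc_counts, word_counts = model
    n_docs = sum(doc_counts.values())
    log_scores = {}
    for c in classes:
        total = sum(word_counts[c].values())
        score = math.log(doc_counts[c] / n_docs)
        for w in doc.split():
            if w in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[c][w] + 1) / (total + len(vocab)))
        log_scores[c] = score
    # Normalise in log space to avoid underflow.
    m = max(log_scores.values())
    z = sum(math.exp(s - m) for s in log_scores.values())
    return {c: math.exp(s - m) / z for c, s in log_scores.items()}


def self_train(labeled, unlabeled, threshold=0.9, max_rounds=5):
    """Hypothetical self-training loop: adopt only predictions the model is confident about."""
    docs = [d for d, _ in labeled]
    labels = [y for _, y in labeled]
    pool = list(unlabeled)
    model = train_nb(docs, labels)
    for _ in range(max_rounds):
        remaining, added = [], 0
        for d in pool:
            probs = predict_proba(model, d)
            best = max(probs, key=probs.get)
            if probs[best] >= threshold:
                docs.append(d)
                labels.append(best)
                added += 1
            else:
                remaining.append(d)
        if added == 0:  # nothing confident enough; stop early
            break
        pool = remaining
        model = train_nb(docs, labels)
    return model


# Toy data, invented for illustration only.
labeled = [("profit rises strong growth", "pos"), ("loss falls weak demand", "neg")]
unlabeled = ["strong profit growth record", "weak demand loss warning"]
model = self_train(labeled, unlabeled, threshold=0.8)
probs = predict_proba(model, "record profit")
print(max(probs, key=probs.get))
```

In this toy run the word "record" appears only in an unlabeled sentence, so the model can use it only after that sentence has been pseudo-labeled. The threshold trades coverage against the risk of reinforcing early mistakes; the value used here is arbitrary.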
