Seminar 10/6/10: Text Mining and Its Use In Textual Research, Andrew Prather

October 6, 2010

Textual research, meaning research on literary texts, is a painstaking and daunting task. Anyone with firsthand experience of serious textual research can testify to how complex and time-consuming the process is. Without the proper tools, the task can turn into a nightmare, and the researcher can become lost in a vast sea of words. The area of computer science that can help solve this problem is called Text Mining. Text Mining can be defined as the development and use of algorithms and software to analyze textual sources in order to find information that leads to new knowledge. A program that uses Text Mining techniques to organize and analyze literary sources would be an invaluable tool for the researcher. The purpose of this project is to explore how Text Mining can support research in fields outside computer science. Attention is focused on developing software that can create indexes, word frequency lists, and concordances so that 1) themes and word meanings can be discovered, and 2) the researcher knows which words are important to the text.
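To make the three outputs named above concrete, here is a minimal sketch in Python of an index, a word frequency list, and a keyword-in-context concordance. It is an illustration only, not the software developed for the project: the function names, the simple tokenizing rule, and the sample sentence are all assumptions made for this example.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and extract words (letters, optional inner apostrophe)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def word_frequencies(tokens):
    """Word frequency list: (word, count) pairs, most frequent first."""
    return Counter(tokens).most_common()

def build_index(tokens):
    """Index: map each word to the list of token positions where it occurs."""
    index = {}
    for pos, tok in enumerate(tokens):
        index.setdefault(tok, []).append(pos)
    return index

def concordance(tokens, keyword, width=2):
    """Concordance: show each occurrence of keyword with `width` words of context."""
    lines = []
    for pos in build_index(tokens).get(keyword.lower(), []):
        left = " ".join(tokens[max(0, pos - width):pos])
        right = " ".join(tokens[pos + 1:pos + 1 + width])
        lines.append(f"{left} [{tokens[pos]}] {right}".strip())
    return lines

# Hypothetical sample text, invented for this illustration.
sample = "The sea was calm. The sea was vast, and the words were lost in the sea."
tokens = tokenize(sample)

for word, count in word_frequencies(tokens)[:3]:
    print(word, count)

for line in concordance(tokens, "sea"):
    print(line)
```

On a real literary source, a researcher would probably also want to filter out very common function words such as "the" before judging which words are important, and to record line or page numbers in the index rather than token positions.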

Sources:

Bilisoly, R. (2008). Practical text mining with Perl. Hoboken, New Jersey: John Wiley & Sons.

Hearst, M. (1999). “Untangling text data mining.” Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics on Computational Linguistics, 3-10.

Kroeze, J., Matthee, M., & Bothma, T. (2003). “Differentiating data- and text-mining terminology.” Proceedings of SAICSIT, 93-101.

Stavrianou, A., Andritsos, P., & Nicoloyannis, N. (2007). “Overview and semantic issues of text mining.” SIGMOD Record, 36(3), 23-34.

