Saturday, February 2, 2019

Text Analysis


Text Analysis: http://libguides.uta.edu/textanalysis


Acknowledgement

This guide is adapted by permission from Angela Zoss, Data Visualization Coordinator, Duke University.



TEXT AND DATA MINING (TDM) is the computational analysis of vast quantities of digital information, whether free-form natural language text or structured data. Using specialized software, researchers can extract data, identify trends, look for patterns and better understand the relationships of terms within and between documents. 

Analysis might focus on word frequency, words that frequently appear near each other, contextual information for key words, common phrases, and other patterns. Materials to be analyzed range from websites (such as publicly available Facebook posts) and 16th-century manuscripts to DNA sequences and old newspapers.
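As a rough illustration of the kinds of analysis listed above, the following Python sketch counts word frequencies, counts pairs of adjacent words, and lists a keyword with its surrounding context. It is a minimal sketch using only the standard library. The function names, the sample sentence, and the very simple tokenizer are assumptions made for this example, not part of the original guide or of any particular tool.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def word_frequencies(tokens):
    """Count how often each word occurs."""
    return Counter(tokens)

def bigram_counts(tokens):
    """Count pairs of words that appear directly next to each other."""
    return Counter(zip(tokens, tokens[1:]))

def keyword_in_context(tokens, keyword, window=2):
    """Return (left context, keyword, right context) for each occurrence."""
    hits = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            hits.append((" ".join(left), tok, " ".join(right)))
    return hits

# Hypothetical sample text, invented for this example.
sample = "The old newspaper praised the old mill. The mill owner read the newspaper."
tokens = tokenize(sample)

print(word_frequencies(tokens).most_common(3))
print(bigram_counts(tokens).most_common(1))
for left, word, right in keyword_in_context(tokens, "mill"):
    print(f"{left} [{word}] {right}")
```

Real projects usually rely on a more careful tokenizer and a much larger corpus, but the same three measures (frequency, adjacency, and context) carry over.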

Introduction to Text Analysis

"Text analysis" is a broad term covering various processes by which text and natural language documents can be modified so that they can be organized and described.
This guide collects resources for several phases of the text analysis process, including text collection, text parsing and cleaning, text summary and analysis methods, and text visualization.
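The cleaning phase mentioned above can be sketched in a few lines of Python. This is only one possible approach, assuming that the goal is to normalize case, drop punctuation, collapse whitespace, and filter out very common words. The function names and the tiny stop-word list are hypothetical and chosen for illustration.

```python
import re
import unicodedata

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is"}

def clean_text(raw):
    """Normalize Unicode, lowercase, drop punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKC", raw)
    text = text.lower()
    text = re.sub(r"[^\w\s]", " ", text)      # replace punctuation with spaces
    text = re.sub(r"\s+", " ", text).strip()  # collapse runs of whitespace
    return text

def content_words(raw):
    """Return the cleaned words that are not in the stop-word list."""
    return [w for w in clean_text(raw).split() if w not in STOP_WORDS]

# Hypothetical messy input, invented for this example.
raw = "  The  Analysis of TEXT,\tand the cleaning of text!  "
print(clean_text(raw))
print(content_words(raw))
```

Which steps are appropriate depends on the research question: removing punctuation or stop words helps with word counts but discards information that other methods may need.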


Web Scraping

Web scraping (also called web harvesting or web data extraction) is a software technique for extracting information from websites. Source: https://en.wikipedia.org/wiki/Web_scraping
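To make the idea concrete, here is a minimal Python sketch of the extraction step, using only the standard library's html.parser module. To keep the example self-contained, it parses a small HTML string that was invented for this purpose. In practice the HTML would first be downloaded, for example with urllib.request.urlopen. The class name, function name, and sample page are assumptions for illustration.

```python
from html.parser import HTMLParser

class LinkAndTextExtractor(HTMLParser):
    """Collect link targets and visible text, skipping script and style content."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script and style elements.
        if self._skip_depth == 0 and data.strip():
            self.text_parts.append(data.strip())

def extract(html):
    """Return (list of link URLs, visible text) for an HTML string."""
    parser = LinkAndTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.links, " ".join(parser.text_parts)

# Hypothetical page content, invented for this example.
SAMPLE_HTML = """
<html><head><title>Demo</title><style>p {color: red}</style></head>
<body><h1>Text Analysis</h1>
<p>See the <a href="https://example.org/guide">guide</a> for details.</p>
<script>var x = 1;</script></body></html>
"""

links, text = extract(SAMPLE_HTML)
print(links)
print(text)
```

The extracted text can then be passed on to the cleaning and analysis steps described elsewhere in this guide.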

·         Last Updated: Jan 2, 2018 12:58 PM
·         URL: https://libguides.uta.edu/textanalysis


Except where otherwise noted, this work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License. For details and exceptions, see the Library Copyright Statement
© 2016 The University of Texas at Arlington.

