Data Mining
Text Mining
University of Mannheim – Prof. Bizer: Data Mining Slide 1
Outline
1. What is Text Mining?
2. Text Preprocessing
3. Feature Creation
4. Feature Selection
5. Pattern Discovery
Motivation for Text Mining
Approximately 90% of the world’s data is held in unstructured formats.
Source: Oracle Corporation
Structured data: 10%
Unstructured or semi-structured data: approximately 90%
Examples:
web pages
emails
customer complaint letters
corporate documents
scientific papers
books in digital libraries
Text Mining
The extraction of implicit, previously unknown
and potentially useful information from large
amounts of textual resources.
[Figure: Text Mining at the intersection of Data Mining, Information Retrieval, Statistics, Machine Learning, and Computational Linguistics & NLP]