Instructor: Tomasz Imielinski
Analyzing and mining massive data sets is one of the key skills needed in the marketplace today and in the future. Data is everywhere, from mobile devices to the World Wide Web, from HTML and XML to databases. The recent Netflix Prize competition (http://www.netflixprize.com/) showed the high profile of successful data mining applied to a real, practical problem. We will introduce basic practical tools for massive data mining and data analysis, using the Weka libraries (http://www.cs.waikato.ac.nz/ml/weka/) of tools for classification, clustering, and rule mining, for both unstructured and structured data. ATTENTION: The class will be "hands on": a large part (if not all) of the final grade will be based on projects and highly interactive "in progress" meetings. You will be asked to participate, present, and critique. An "American Idol"-style finale will feature a competition for the best live demo. Programming proficiency: C/C++, Java, Perl (desired), plus a passion to build practical and cool data-driven solutions to real problems.
Textbook: http://www.cs.waikato.ac.nz/~ml/weka/book.html
Prerequisite: 198:336, or special permission of the instructor. An AI class is desired.