Some thoughts about how to evaluate an NMF representation
=========================================================

1. Compare to manual decomposition by parts.
2. Compare PCA/NMF on photos of people aging. Which is more stable?
3. Which produces a better ranking when retrieving related images?
4. When combined with a classifier, which is more accurate for predicting, say, sex or face?
5. Which gives better compression?
6. Which gives better human interpretability? (Measured how?)
7. When a single feature of the image is changed (say, a person puts on glasses), which produces a smaller or more localized change in the representation?

A useful way to think about latent variable models is to imagine a set of patients (documents) who present with symptoms (words). Given enough experience, we want to learn to recognize syndromes (topics, latent variables). If we can do so, we'd like to use our knowledge of these syndromes to diagnose and cure disease (recognize or perhaps categorize documents).
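
The patient/symptom picture can be made concrete with a small sketch. This is an illustration under stated assumptions, not a reference implementation. The data is synthetic (Poisson symptom counts generated from two made-up syndrome profiles). The helper names `nmf_multiplicative`, `truncated_svd_reconstruction`, and `relative_error` are hypothetical. NMF is fitted with the standard Lee-Seung multiplicative updates for the Frobenius objective, and the comparison baseline is a truncated SVD, i.e. PCA without mean-centering. The sketch touches only the compression question (item 5).

```python
import numpy as np


def nmf_multiplicative(V, k, n_iter=500, eps=1e-9, seed=0):
    """Approximate V (n x m, nonnegative) as W @ H with W, H >= 0.

    Uses Lee-Seung multiplicative updates for the squared Frobenius error.
    eps keeps the denominators away from zero.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def truncated_svd_reconstruction(V, k):
    """Best rank-k approximation of V in Frobenius norm (no mean-centering)."""
    U, s, Vt = np.linalg.svd(V, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k]


def relative_error(V, V_hat):
    """Frobenius norm of the residual, relative to the norm of V."""
    return np.linalg.norm(V - V_hat) / np.linalg.norm(V)


# Synthetic example (assumption): 60 patients, 8 symptoms, 2 syndromes.
rng = np.random.default_rng(42)
n_patients, n_syndromes = 60, 2

# Each row is a syndrome: expected intensity of each symptom.
syndromes = np.array(
    [
        [5, 4, 3, 0, 0, 0, 0, 1],
        [0, 0, 1, 4, 5, 3, 2, 0],
    ],
    dtype=float,
)

# How strongly each patient is affected by each syndrome.
severity = rng.random((n_patients, n_syndromes))

# Observed symptom counts: rows are patients, columns are symptoms.
V = rng.poisson(severity @ syndromes).astype(float)

W, H = nmf_multiplicative(V, k=n_syndromes)
nmf_err = relative_error(V, W @ H)
svd_err = relative_error(V, truncated_svd_reconstruction(V, n_syndromes))

print("learned syndromes (rows of H):")
print(np.round(H, 2))
print(f"relative reconstruction error, NMF: {nmf_err:.3f}")
print(f"relative reconstruction error, SVD: {svd_err:.3f}")
```

In this sketch the rows of `H` play the role of syndromes and the rows of `W` say how strongly each patient shows each one. At equal rank the truncated SVD can never have a larger Frobenius reconstruction error than NMF (Eckart-Young), so raw reconstruction error alone cannot favor NMF; the other criteria in the list need their own measurements.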