Learning reliable and interpretable representations is the foundation for the recent success of neural network models. However, learning a direct mapping from the source to the target would require tremendous annotations, and sometimes result in limited performance when extensive variations exist. In this study, we investigate features learned by deep neural networks and leverage domain knowledge to disentangle latent factors in a hyperplane under either supervised or unsupervised manner. We show that the disentangled representation is efficient to learn and robust to variations. State-of-the-art performance is achieved in multiple face analysis tasks such as head pose estimation, landmark tracking, and face recognition.