Predict test data using training set
Anonymous user
The data set is sparse and has many uninformative features. The data need to be imputed and scaled, the non-informative features eliminated (LASSO), and then a classifier trained. Statistics on training also need to be reported, so cross-validation is needed.
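One possible way to wire these steps together is a scikit-learn Pipeline, so that imputation, scaling and LASSO selection are refit inside every cross-validation fold and the reported statistics are not inflated by leakage. This is only a sketch under assumptions: the data here are synthetic stand-ins, "sparse" is taken to mean many missing values (NaN), the labels are binary 0/1 so that Lasso can be fit on them directly, and the alpha value, the mean imputation strategy and the logistic regression classifier are illustrative choices, not something stated in the post.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectFromModel
from sklearn.impute import SimpleImputer
from sklearn.linear_model import Lasso, LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real data (assumption): 50 features, only 5 informative.
X, y = make_classification(n_samples=300, n_features=50, n_informative=5,
                           n_redundant=0, random_state=0)
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.1] = np.nan  # knock out ~10% of entries as missing

# Hypothetical train/test split; the test labels are treated as unknown.
X_train, y_train = X[:240], y[:240]
X_test = X[240:]

pipe = Pipeline([
    ("impute", SimpleImputer(strategy="mean")),       # fill missing values
    ("scale", StandardScaler()),                      # zero mean, unit variance
    ("select", SelectFromModel(Lasso(alpha=0.02))),   # keep features with non-zero LASSO weight
    ("clf", LogisticRegression(max_iter=1000)),       # illustrative classifier choice
])

# Cross-validated statistics on the training set only.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_validate(pipe, X_train, y_train, cv=cv,
                        scoring=["accuracy", "roc_auc"])
print("accuracy: %.3f +/- %.3f" % (scores["test_accuracy"].mean(),
                                   scores["test_accuracy"].std()))
print("ROC AUC:  %.3f +/- %.3f" % (scores["test_roc_auc"].mean(),
                                   scores["test_roc_auc"].std()))

# Refit on the full training set, then predict the test data.
pipe.fit(X_train, y_train)
n_selected = int(pipe.named_steps["select"].get_support().sum())
y_pred = pipe.predict(X_test)
print("features kept by LASSO:", n_selected)
```

If alpha is set too high, LASSO can zero out every coefficient and leave the classifier with no features, so in practice alpha would probably need tuning (for example with a grid search wrapped around the same pipeline). For data stored as a sparse matrix rather than with NaN entries, StandardScaler would need with_mean=False.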