scikit-learn - Python sklearn - how to calculate p-values


This is a simple question: I am trying to calculate p-values for my features, either using classifiers for a classification problem or regressors for regression. Could someone suggest the best method for each case and provide sample code? I want to see the p-value of each feature, rather than just keeping the k best features, a percentile of features, etc., as explained in the documentation.

Thank you.

Just run the significance test on X, y directly. Here is an example using the 20 newsgroups dataset and chi2:

>>> from sklearn.datasets import fetch_20newsgroups_vectorized
>>> from sklearn.feature_selection import chi2
>>> data = fetch_20newsgroups_vectorized()
>>> X, y = data.data, data.target
>>> scores, pvalues = chi2(X, y)
>>> pvalues
array([  4.10171798e-17,   4.34003018e-01,   9.99999996e-01, ...,
         9.99999995e-01,   9.99999869e-01,   9.99981414e-01])
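The chi2 test expects non-negative features (counts or frequencies, for example), so it may not suit every dataset. As a rough sketch of the same idea for the two cases in the question, the univariate F-tests `f_classif` (classification) and `f_regression` (regression) from `sklearn.feature_selection` also return one p-value per feature. The synthetic datasets and variable names below are assumptions made only for illustration; they are not part of the original answer.

```python
from sklearn.datasets import make_classification, make_regression
from sklearn.feature_selection import f_classif, f_regression

# Hypothetical classification data: 5 features, 2 of them informative.
X_clf, y_clf = make_classification(
    n_samples=200, n_features=5, n_informative=2,
    n_redundant=0, random_state=0,
)
# ANOVA F-test between each feature and the class labels.
f_scores_clf, pvalues_clf = f_classif(X_clf, y_clf)

# Hypothetical regression data: 5 features, 2 of them informative.
X_reg, y_reg = make_regression(
    n_samples=200, n_features=5, n_informative=2,
    noise=10.0, random_state=0,
)
# Univariate linear-regression F-test between each feature and the target.
f_scores_reg, pvalues_reg = f_regression(X_reg, y_reg)

# One p-value per feature, in column order.
for i, (p_c, p_r) in enumerate(zip(pvalues_clf, pvalues_reg)):
    print(f"feature {i}: classification p={p_c:.3g}, regression p={p_r:.3g}")
```

These tests look at each feature on its own, so the p-values say nothing about how features behave jointly inside a fitted model.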
