python - Getting proportion of each of one variable that is True for another in 'pandas'
I have a dataframe in pandas that includes a column 'a' and a boolean-valued column 'b', and I want to find the values of 'a' for which at least some number, n, of rows have a True 'b'.
The closest thing I can come up with is
    df.query('b == True')['a'].value_counts()
and then looking at the numbers to see which ones are greater than n.
Is there a more pythonic (or more ailuropodian) way of doing this (perhaps an approach that returns only the counts greater than n, or the proportions that are True)?
This sounds similar to a filter:

    In [11]: df = pd.DataFrame([[1, True], [1, True], [2, False], [2, True]], columns=['a', 'b'])

    In [12]: g = df.groupby('a')

    In [13]: g.filter(lambda x: x['b'].sum() > 1)
    Out[13]:
       a     b
    0  1  True
    1  1  True

To find the values of 'a' for which this is True, use the sum aggregation method:

    In [21]: res = g.b.sum() > 1

    In [22]: res[res]
    Out[22]:
    a
    1    True
    Name: b, dtype: bool

    In [23]: res[res].index
    Out[23]: Int64Index([1], dtype='int64')
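For the counts and proportions the question asks about, the same groupby idea can probably be pushed a little further. The following is only a sketch under a few assumptions: the threshold `n = 2` and the names `true_counts`, `true_props` and `at_least_n` are made up for illustration, and 'b' is cast to int first so that the sum and mean behave the same across pandas versions.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, True], [1, True], [2, False], [2, True]],
    columns=["a", "b"],
)
n = 2  # hypothetical threshold, not from the original post

# Cast the booleans to 0/1 and group them by the values of 'a'.
grouped = df["b"].astype(int).groupby(df["a"])

true_counts = grouped.sum()   # number of True rows per value of 'a'
true_props = grouped.mean()   # proportion of True rows per value of 'a'

# Values of 'a' with at least n True rows.
at_least_n = true_counts[true_counts >= n].index.tolist()

print(true_counts)
print(true_props)
print(at_least_n)
```

Because the mean of a 0/1 column is the fraction of ones, `true_props` gives the proportion of True rows directly, and the comparison against `n` replaces reading the numbers off `value_counts()` by eye.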