Pandas text matching like SQL's LIKE?
Is there a way to do something similar to SQL's LIKE syntax on a pandas text DataFrame column, such that it returns a list of indices, or a list of booleans that can be used for indexing the DataFrame? For example, I would like to be able to match rows where the column starts with 'prefix_', similar to WHERE <col> LIKE 'prefix_%' in SQL.
You can use the Series method str.startswith (which takes a literal string prefix, not a regex):
In [11]: s = pd.Series(['aa', 'ab', 'ca', np.nan])

In [12]: s.str.startswith('a', na=False)
Out[12]:
0     True
1     True
2    False
3    False
dtype: bool
You can do the same with str.contains (using a regex):
In [13]: s.str.contains('^a', na=False)
Out[13]:
0     True
1     True
2    False
3    False
dtype: bool
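If you need more of LIKE than a simple prefix, one option is to translate the LIKE pattern into a regex and pass it to str.contains. The following is only a rough sketch: like_to_regex is a hypothetical helper written for this illustration (it is not part of pandas), and it assumes no ESCAPE clause and no newlines in the data.

```python
import re

import numpy as np
import pandas as pd


def like_to_regex(pattern):
    """Hypothetical helper: turn a SQL LIKE pattern into an anchored regex."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")  # % matches any run of characters
        elif ch == "_":
            parts.append(".")  # _ matches exactly one character
        else:
            parts.append(re.escape(ch))  # everything else is literal
    return "^" + "".join(parts) + "$"


s = pd.Series(["aa", "ab", "ca", np.nan])

# Equivalent in spirit to: WHERE col LIKE 'a%'
mask = s.str.contains(like_to_regex("a%"), na=False)
```

Escaping the literal characters with re.escape keeps regex metacharacters in the pattern (such as a dot or a parenthesis) from being interpreted as regex syntax.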
So you can do df[col].str.startswith...
See also the SQL comparison section of the docs.
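To tie this back to the question, here is a minimal sketch of getting both a boolean mask and a list of indices from a DataFrame column. The frame and the column names "name" and "value" are made up for illustration. Note that str.startswith treats the underscore literally, whereas in SQL LIKE the underscore is itself a single-character wildcard.

```python
import numpy as np
import pandas as pd

# Hypothetical data; the column names are assumptions for this example.
df = pd.DataFrame({
    "name": ["prefix_alpha", "other", "prefix_beta", np.nan],
    "value": [1, 2, 3, 4],
})

# Boolean mask: True where the column starts with the literal "prefix_".
# na=False maps missing values to False so the mask is safe for indexing.
mask = df["name"].str.startswith("prefix_", na=False)

# Roughly the rows SQL would return for: WHERE name LIKE 'prefix_%'
matching_rows = df[mask]

# Index labels of the matching rows as a plain list.
matching_index = df.index[mask].tolist()
```

The mask answers the "list of booleans" part of the question, and df.index[mask] answers the "list of indices" part.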
Note: (as pointed out by the OP) by default NaNs will propagate (and hence cause an indexing error if you want to use the result as a boolean mask), so use the na=False flag to say that NaN should map to False.
In [14]: s.str.startswith('a')  # can't use as a boolean mask
Out[14]:
0     True
1     True
2    False
3      NaN
dtype: object
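As a small sketch of two ways around the missing-value problem (the variable names are just for illustration): either map NaN to False at match time with na=False, or drop the missing values before matching so there is nothing to propagate.

```python
import numpy as np
import pandas as pd

s = pd.Series(["aa", "ab", "ca", np.nan])

# Option 1: treat missing values as "no match" when building the mask.
mask_flag = s.str.startswith("a", na=False)
selected = s[mask_flag]

# Option 2: drop missing values first, then match on what remains.
clean = s.dropna()
selected_clean = clean[clean.str.startswith("a")]
```

Option 1 keeps the mask aligned with the original Series, which matters if you want to index a larger DataFrame with it. Option 2 only makes sense when you are happy to discard the missing rows anyway.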