java - Very slow look-behind
I'm trying to recover two positions using Java regexes.
The first one is given by the regex:
val r = """(?=(?<=[ ]|^)[^ ]{1,21474836}(?=[ ]|$)(?<=[^a-z]|^)[a-z]{1,21474836}(?=[^a-z]|$))""".r
The second one is given by the regex:
val p = """(?<=(?<=[ ]|^)[^ ]{1,21474836}(?=[ ]|$)(?<=[^a-z]|^)[a-z]{1,21474836}(?=[^a-z]|$))""".r
Note that the two expressions are identical, except that the first "=" is replaced by "<=" in the second expression. I am not using nested quantifiers here.
My commands to test them are the following:
r.findAllMatchIn("a <b/>" * 100) // .... long string of size 600 ...
p.findAllMatchIn("a <b/>" * 100) // .... long string of size 600 ...
The first example is instant during execution, whereas the second takes dozens of seconds. If I launch the same examples in the REPL, both are fast.
Where does this come from? How can I make the second expression faster?
Update: why it matters
Note that in general, I can have expressions of the type:
[^ ]+[^.]+
and I want to know when the regular expression can be found to the left of a given position, or where it can end. If I have the following data, with the positions below it:
abc145a
0123456
I want the end of the previous expression to match at positions 1, 2, 3, 4, 5 and 6. If I use non-greedy repeating jokers, it matches at 1, 3 and 5. If I use greedy operators, it matches at 6. That is why I need look-behind assertions. Or find me a way to define operators that find the positions I'm looking for.
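To make the greedy versus non-greedy numbers above concrete, here is a minimal sketch in Scala. The object name EndPositions and the helper lastIndices are made up for illustration; it only reports the index of the last character of each non-overlapping match, so it reproduces the 1, 3, 5 and 6 cases but not the full list 1 to 6.

```scala
import scala.util.matching.Regex

// Hypothetical helper, not part of the original question.
object EndPositions {
  // Index of the last character of every non-overlapping match.
  // Match.end is exclusive, hence the "- 1".
  def lastIndices(regex: Regex, input: String): List[Int] =
    regex.findAllMatchIn(input).map(_.end - 1).toList

  def main(args: Array[String]): Unit = {
    val data = "abc145a"

    // Non-greedy (reluctant) quantifiers: each match is as short as possible.
    val reluctant = """[^ ]+?[^.]+?""".r
    println(lastIndices(reluctant, data)) // prints List(1, 3, 5)

    // Greedy quantifiers: a single match swallows the whole string.
    val greedy = """[^ ]+[^.]+""".r
    println(lastIndices(greedy, data)) // prints List(6)
  }
}
```

Neither variant yields every position from 1 to 6, which is the gap the look-behind version is meant to fill.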
You aren't using nested quantifiers, but I suspect the nested lookbehinds cause a similar problem. I also suspect you don't need the outer lookahead/lookbehind at all - how about performing a single regex search using the inner part of the regexes (common to both), and retrieving both the start position and the end position of each result?
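A rough sketch of that suggestion in Scala, assuming a single search and reading both offsets from each match. The object name MatchPositions, the helper positions and the stand-in pattern [a-z]+ are hypothetical; the stand-in would be replaced by the inner part of the regexes from the question.

```scala
import scala.util.matching.Regex

// Hypothetical helper illustrating "one search, two positions per result".
object MatchPositions {
  // (start, end) of every non-overlapping match; end is exclusive.
  def positions(regex: Regex, input: String): List[(Int, Int)] =
    regex.findAllMatchIn(input).map(m => (m.start, m.end)).toList

  def main(args: Array[String]): Unit = {
    // Stand-in pattern (assumption): substitute the shared inner regex here.
    val inner = """[a-z]+""".r
    println(positions(inner, "a <b/>" * 2))
    // prints List((0,1), (3,4), (6,7), (9,10))
  }
}
```

One caveat: findAllMatchIn resumes after the end of the previous match, so overlapping matches are skipped. If every possible start or end position is needed, as in the update to the question, this alone may not be enough.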