regex - How to match one word or another in Elisp regexp
I have a string that contains HTML code, like below:

    ... <a href="../link.png">image link</a> ... <img src="../image.png" /> ... <pre class="should_not_match">...</pre> ...

I want to extract the resource paths: ../link.png from the href in a, and ../image.png from the src in img. I have the following code:
    (with-temp-buffer
      (insert html-content) ;; html-content is the content mentioned above
      (beginning-of-buffer)
      (while (re-search-forward "<[a-zA-Z]+[^/>]+[src|href]=\"\\([^\"]+\\)\"[^>]*>" nil t)
        (message (match-string 1))
        ;; more code here
        ))

The output includes ../link.png and ../image.png, but also the unwanted should_not_match. I know this is because of the incorrect [src|href] in the regexp (I want to match either src or href). So I used the following regexp:
    "<[a-zA-Z]+[^/>]+(src|href)=\"\\([^\"]+\\)\"[^>]*>"

but it returns nil now. I tried the following, without luck:

    "<[a-zA-Z]+[^/>]+\\(src|href\\)=\"\\([^\"]+\\)\"[^>]*>"
    "<[a-zA-Z]+[^/>]+((src)|(href))=\"\\([^\"]+\\)\"[^>]*>"
    "<[a-zA-Z]+[^/>]+(\\(src\\)|\\(href\\))=\"\\([^\"]+\\)\"[^>]*>"
    "<[a-zA-Z]+[^/>]+\\((src)|(href)\\)=\"\\([^\"]+\\)\"[^>]*>"
    "<[a-zA-Z]+[^/>]+\\(\\(src\\)|\\(href\\)\\)=\"\\([^\"]+\\)\"[^>]*>"

So, what is the correct regexp that can work?
Thanks in advance,
Kelvin
Edit
Inspired by @lawlist, I found that it is because I need to escape | as \\|; \\(src\\|href\\) works well.
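For illustration, the fix above might be applied to the original loop roughly as follows. This is only a sketch: the function name `my-extract-resource-paths` is hypothetical, and it assumes a shy group `\\(?:...\\)` for the alternation so that the path stays in group 1. With the plain `\\(src\\|href\\)` group, the path would move to group 2 and `(match-string 1)` would return src or href instead.

```elisp
(defun my-extract-resource-paths (html)
  "Return the src/href attribute values found in the string HTML.
Hypothetical helper, not part of the original question."
  (let ((paths '()))
    (with-temp-buffer
      (insert html)
      (goto-char (point-min))
      ;; \\| is the alternation operator in Emacs regexps; a bare | is
      ;; an ordinary character.  \\(?: ... \\) groups without capturing,
      ;; so the attribute value remains in group 1.
      (while (re-search-forward
              "<[a-zA-Z]+[^/>]+\\(?:src\\|href\\)=\"\\([^\"]+\\)\"[^>]*>"
              nil t)
        (push (match-string 1) paths)))
    ;; `push' builds the list in reverse, so restore document order.
    (nreverse paths)))

(my-extract-resource-paths
 "... <a href=\"../link.png\">image link</a> ... <img src=\"../image.png\" /> ... <pre class=\"should_not_match\">...</pre> ...")
;; => ("../link.png" "../image.png")
```

Collecting the matches into a list, rather than calling `message` on each one, is a choice made here only to keep the result easy to inspect.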
This particular regexp covers the first 2 items in the example of the original poster, i.e., <a href="../link.png">image link</a> and <img src="../image.png" />. I saw no need to exclude the third item in the example of the original poster because it is not matched by the following regexp:

    \\(<a href=\"\\|<img src=\"\\)\\(.*\\)\\(\">image link</a>\\|\" />\\)

The regexp of the original poster does not cover a portion of the first example, i.e., image link</a> is not contemplated by that regexp even if \\(src\\|href\\) is fixed. Thus, my recommendation is to devise a regexp that includes the entire HTML link.
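As a rough sketch of how the regexp above could be used, the path ends up in group 2 (group 1 is the opening part of the tag, group 3 the closing part). The function name `my-whole-link-paths` is hypothetical, and the example assumes each tag sits on its own line, which differs from the single-line sample in the question.

```elisp
(defun my-whole-link-paths (html)
  "Return the group-2 matches of the whole-link regexp in the string HTML.
Hypothetical helper; assumes one tag per line."
  (let ((paths '()))
    (with-temp-buffer
      (insert html)
      (goto-char (point-min))
      (while (re-search-forward
              "\\(<a href=\"\\|<img src=\"\\)\\(.*\\)\\(\">image link</a>\\|\" />\\)"
              nil t)
        ;; Group 2 is the text between the opening and closing parts.
        (push (match-string 2) paths)))
    (nreverse paths)))

(my-whole-link-paths
 "<a href=\"../link.png\">image link</a>\n<img src=\"../image.png\" />\n<pre class=\"should_not_match\">...</pre>\n")
;; => ("../link.png" "../image.png")
```

Note that `.*` is greedy and does not match newlines. If several tags share one line, group 2 would stretch from the first opening part to the last closing part on that line, so the non-greedy `.*?` may be a better fit in that case.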