regex - How to match one word or another in Elisp regexp


I have a string that contains HTML code, shown below:

... <a href="../link.png">image link</a> ... <img src="../image.png" /> ... <pre class="should_not_match">...</pre> ... 

I want to extract the resource paths: ../link.png from the href of the a tag, and ../image.png from the src of the img tag. I have the following code:

(with-temp-buffer
  (insert html-content) ;; html-content holds the HTML shown above
  (goto-char (point-min))
  (while (re-search-forward "<[a-zA-Z]+[^/>]+[src|href]=\"\\([^\"]+\\)\"[^>]*>" nil t)
    (message "%s" (match-string 1))
    ;; more code here
    ))

The output includes ../link.png and ../image.png, but also the unwanted should_not_match. I know this is because of the incorrect [src|href] in the regexp (I want to match either src or href). So I used the following regexp instead:

"<[a-zA-Z]+[^/>]+(src|href)=\"\\([^\"]+\\)\"[^>]*>"

But it returns nil now. I also tried the following, without luck:

"<[a-zA-Z]+[^/>]+\\(src|href\\)=\"\\([^\"]+\\)\"[^>]*>"
"<[a-zA-Z]+[^/>]+((src)|(href))=\"\\([^\"]+\\)\"[^>]*>"
"<[a-zA-Z]+[^/>]+(\\(src\\)|\\(href\\))=\"\\([^\"]+\\)\"[^>]*>"
"<[a-zA-Z]+[^/>]+\\((src)|(href)\\)=\"\\([^\"]+\\)\"[^>]*>"
"<[a-zA-Z]+[^/>]+\\(\\(src\\)|\\(href\\)\\)=\"\\([^\"]+\\)\"[^>]*>"

So, what is the correct regexp?

Thanks in advance,
Kelvin


Edit

Inspired by @lawlist, I found the cause: the | needs to be escaped as \\|, so \\(src\\|href\\) works well.
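As a minimal sketch of that fix, assuming the sample HTML from the question is passed in as a string (the function name my-extract-resource-paths is hypothetical, not from the original post): in a Lisp string, grouping is written \\( ... \\) and alternation \\|, whereas [src|href] is a character set that matches any single one of the characters s, r, c, |, h, e, f. Note that the new group takes number 1, so the attribute value moves to group 2.

```elisp
(defun my-extract-resource-paths (html-content)
  "Return a list of href/src attribute values found in HTML-CONTENT."
  (let ((paths '()))
    (with-temp-buffer
      (insert html-content)
      (goto-char (point-min))
      ;; Group 1 is the attribute name (src or href); group 2 is its value.
      (while (re-search-forward
              "<[a-zA-Z]+[^/>]+\\(src\\|href\\)=\"\\([^\"]+\\)\"[^>]*>"
              nil t)
        (push (match-string 2) paths)))
    ;; Matches were pushed in reverse order, so restore document order.
    (nreverse paths)))

(my-extract-resource-paths
 "... <a href=\"../link.png\">image link</a> ... <img src=\"../image.png\" /> ... <pre class=\"should_not_match\">...</pre> ...")
```

If the existing (match-string 1) call should stay unchanged, a non-capturing (shy) group, \\(?:src\\|href\\), keeps the attribute value in group 1.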

This particular regexp covers the first two items in the original poster's example, i.e., <a href="../link.png">image link</a> and <img src="../image.png" />. I saw no need to exclude the third item in the original poster's example because it is not matched by the following regexp:

\\(<a href=\"\\|<img src=\"\\)\\(.*\\)\\(\">image link</a>\\|\" />\\) 

The original poster's regexp does not cover the trailing portion of the first example, i.e., image link</a> is not contemplated by that regexp even after fixing it with \\(src\\|href\\). Thus, my recommendation is to devise a regexp that includes the entire HTML link.
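A rough sketch of that whole-link approach follows, under two assumptions that are not in the original answer: the function name my-extract-whole-link-paths is hypothetical, and the greedy .* is swapped for the non-greedy .*? so that the match stops at the first closing part when both elements sit on the same line.

```elisp
(defun my-extract-whole-link-paths (html-content)
  "Return paths from complete a/img elements found in HTML-CONTENT."
  (let ((paths '()))
    (with-temp-buffer
      (insert html-content)
      (goto-char (point-min))
      ;; Group 1: opening part; group 2: the path; group 3: closing part.
      ;; The non-greedy .*? is an assumption added here; the regexp above
      ;; uses the greedy .* instead.
      (while (re-search-forward
              "\\(<a href=\"\\|<img src=\"\\)\\(.*?\\)\\(\">image link</a>\\|\" />\\)"
              nil t)
        (push (match-string 2) paths)))
    ;; Matches were pushed in reverse order, so restore document order.
    (nreverse paths)))

(my-extract-whole-link-paths
 "... <a href=\"../link.png\">image link</a> ... <img src=\"../image.png\" /> ... <pre class=\"should_not_match\">...</pre> ...")
```

Because the closing part hard-codes the link text image link, this only fits input shaped like the example; other link text would need a more general closing pattern.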

