parsing - Using XPath & PHP to loop through an HTML file and extract required attributes & text?
I am trying to use PHP and XPath to extract various text and attributes from an HTML file.
The desired output is this:
item1 | level1 = aaaa | level 2 = aaa.com | text
I can construct the XPath, but I am having trouble creating the necessary loops to cycle through the file. What is the best method for this?
Sample HTML, with sections and subsections (item 1 to item 999):
<div class=container1>
    <div class=item1>
        <div class=level1>
            <h1>aaaa</h1>
        </div>
        <div class=level2>
            <a href=aaa.com>text</a>
            <p>text</p>
        </div>
    </div>
    ..
    <div class=item2>
        ..
    </div>
I've embedded the XML in the script and used loadXML() instead of load().
Please note that the question is a bit ambiguous: the "text" after the href could mean either the link text in <a href=aaa.com>text</a> or the paragraph text in <p>text</p>. This solution uses the link text.
Output
item1 | level1 = aaaa | level2 = aaa.com | atext1
item2 | level1 = bbbb | level2 = bbb.com | btext1
Solution
<?php
// Sample markup embedded as a string (well-formed, so loadXML() accepts it)
$xml = '
<div class="container1">
    <div class="item1">
        <div class="level1">
            <h1>aaaa</h1>
        </div>
        <div class="level2">
            <a href="aaa.com">atext1</a>
            <p>atext2</p>
        </div>
    </div>
    <div class="item2">
        <div class="level1">
            <h1>bbbb</h1>
        </div>
        <div class="level2">
            <a href="bbb.com">btext1</a>
            <p>btext2</p>
        </div>
    </div>
</div>
';

$xmldoc = new DOMDocument();
//$xmldoc->load('yourfile.html');
$xmldoc->loadXML($xml);
$xpath = new DOMXPath($xmldoc);

// Visit every div whose class contains "item" (item1, item2, ...)
foreach ($xpath->query("//div[contains(@class,'item')]") as $node) {
    echo $node->getAttribute('class') . ' | '; // item1
    $div = $node->getElementsByTagName('div');
    foreach ($div as $i) {
        if ($i->getAttribute('class') === 'level1') {
            // trim() drops the whitespace surrounding the <h1>
            echo $i->getAttribute('class') . ' = ' . trim($i->nodeValue) . ' | ';
        }
        if ($i->getAttribute('class') === 'level2') {
            echo $i->getAttribute('class') . ' = ';
            // Only the child carrying an href (the <a>) is printed
            foreach ($i->childNodes as $child) {
                if ($child instanceof DOMElement && $child->hasAttribute('href')) {
                    echo $child->getAttribute('href') . ' | ' . $child->nodeValue;
                }
            }
        }
    }
    echo '<br>';
}
// item1 | level1 = aaaa | level2 = aaa.com | atext1
?>
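The nested loops in the solution can also be replaced by XPath expressions evaluated relative to each item node. The following is only a sketch of that variant, not part of the original answer: the helper name extractItems() is hypothetical, and it assumes every item has the level1/h1 and level2/a structure shown in the sample. It uses loadHTML() rather than loadXML(), on the assumption that a real HTML file may not be well-formed XML (the sample in the question, for instance, has unquoted attribute values).

```php
<?php
// Hypothetical helper (not from the original answer): returns one formatted
// line per "itemN" div, using XPath queries relative to each item node.
function extractItems(string $html): array
{
    $doc = new DOMDocument();
    // loadHTML() reports malformed markup as warnings; collect them silently
    // and restore the previous error-handling mode afterwards.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $rows = [];
    foreach ($xpath->query("//div[starts-with(@class, 'item')]") as $item) {
        // Passing $item as the second argument makes each path relative to it.
        $level1 = $xpath->evaluate("string(div[@class='level1']/h1)", $item);
        $href   = $xpath->evaluate("string(div[@class='level2']/a/@href)", $item);
        $text   = $xpath->evaluate("string(div[@class='level2']/a)", $item);
        $rows[] = sprintf(
            '%s | level1 = %s | level2 = %s | %s',
            $item->getAttribute('class'),
            trim($level1),
            trim($href),
            trim($text)
        );
    }
    return $rows;
}

// Same sample data as in the solution above.
$html = '
<div class="container1">
    <div class="item1">
        <div class="level1"><h1>aaaa</h1></div>
        <div class="level2"><a href="aaa.com">atext1</a><p>atext2</p></div>
    </div>
    <div class="item2">
        <div class="level1"><h1>bbbb</h1></div>
        <div class="level2"><a href="bbb.com">btext1</a><p>btext2</p></div>
    </div>
</div>';

$rows = extractItems($html);
echo implode("\n", $rows), "\n";
```

To take the paragraph text instead of the link text, the last expression could be changed to string(div[@class='level2']/p).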