The crawler

Time:10-21

I have the page source code. How do I use a regular expression to match what I need? Urgent!

CodePudding user response:

What do you want to match? Nobody knows what you are trying to match, so how can anyone tell you?

CodePudding user response:

Quoting the 1st-floor response from JMZL:
What do you want to match? Nobody knows what you are trying to match, so how can anyone tell you?

I am writing a crawler and want to match the title of a news article.

CodePudding user response:

If the title is taken from the h1:
re.compile(u'<h1[^>]*>(?P<title>.*?)</h1>').search(html).group("title")

CodePudding user response:

Quoting the 3rd-floor response from dabingsou:
If the title is taken from the h1:
re.compile(u'<h1[^>]*>(?P<title>.*?)</h1>').search(html).group("title")

I still don't quite understand. What does "from the h1" mean?

CodePudding user response:

Aren't you trying to get the title?

CodePudding user response:

Use a different library for this; doing it with regular expressions is a lot of trouble.
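
For anyone landing here later, the two approaches mentioned in this thread can be sketched roughly as follows. This is only an illustration under assumptions: it presumes the news title sits in the page's first `<h1>`, the function and class names (`title_by_regex`, `title_by_parser`, `H1Parser`) are made up for the example, and the "other library" here is just Python's standard `html.parser`, which may not be the one the last reply had in mind. The `re.S` and `re.I` flags are also additions that the original one-liner did not use.

```python
import re
from html.parser import HTMLParser

# Pattern from the h1 reply in this thread. re.S lets '.' cross newlines
# and re.I accepts <H1>; both flags are assumptions added for robustness.
H1_RE = re.compile(r'<h1[^>]*>(?P<title>.*?)</h1>', re.S | re.I)


def title_by_regex(html):
    """Return the text of the first <h1>, or None if nothing matches."""
    match = H1_RE.search(html)
    if match is None:
        return None
    # Drop nested tags such as <em> that may sit inside the heading.
    return re.sub(r'<[^>]+>', '', match.group('title')).strip()


class H1Parser(HTMLParser):
    """Collects the text found inside the first <h1> element."""

    def __init__(self):
        super().__init__()
        self.in_h1 = False   # True while we are inside the first <h1>
        self.done = False    # True once the first <h1> has been closed
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == 'h1' and not self.done:
            self.in_h1 = True

    def handle_endtag(self, tag):
        if tag == 'h1' and self.in_h1:
            self.in_h1 = False
            self.done = True

    def handle_data(self, data):
        if self.in_h1:
            self.parts.append(data)


def title_by_parser(html):
    """Return the text of the first <h1> using html.parser, or None."""
    parser = H1Parser()
    parser.feed(html)
    parser.close()
    text = ''.join(parser.parts).strip()
    return text or None


if __name__ == '__main__':
    # Hypothetical page content, only for demonstration.
    sample = '<html><body><h1 class="headline">Big <em>news</em> today</h1></body></html>'
    print(title_by_regex(sample))
    print(title_by_parser(sample))
```

The regex version is shorter but depends on the markup looking exactly as expected, while the parser version copes better with nested tags and odd attribute formatting, which is probably what the last reply was getting at.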