After Baiduspider fetches a feed file, it requests the wrong address for the corresponding page and gets a 404. How can this be fixed?
For example, after the spider fetches:
123.125.71.16 - - [25/Aug/2019:01:42:22 +0800] 'GET /2224.html/feed HTTP/1.1' 200 979 '-' 'Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)'
it then goes on to request:
220.181.108.158 - - [25/Aug/2019:02:18:41 +0800] 'GET /www.whlihun.com/2224.html HTTP/1.1' 404 479 '-' 'Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)'
But the requested path has an extra www.whlihun.com at the front, so the request returns 404. Is the feed misconfigured?
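To see how widespread the problem is, it may help to scan the access log for every Baiduspider 404 whose path begins with the site's own host name. The following is only a minimal sketch: the regular expression assumes the single-quoted log format shown in the two lines above, and the function name `find_host_in_path` is made up for illustration.

```python
import re

# Matches the log format shown above, where the request line, referer and
# user agent are wrapped in single quotes (an assumption based on the sample).
LOG_LINE = re.compile(
    r"^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "
    r"'(?P<method>\S+) (?P<path>\S+) [^']*' "
    r"(?P<status>\d{3}) (?P<size>\S+) "
    r"'(?P<referer>[^']*)' '(?P<agent>[^']*)'$"
)


def find_host_in_path(lines, host):
    """Return (time, path) for Baiduspider 404s whose path starts with /host/."""
    prefix = "/" + host + "/"
    hits = []
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if m is None:
            continue  # skip lines that do not follow the assumed format
        if (
            m["status"] == "404"
            and "Baiduspider" in m["agent"]
            and m["path"].startswith(prefix)
        ):
            hits.append((m["time"], m["path"]))
    return hits
```

Calling `find_host_in_path(open("access.log", encoding="utf-8"), "www.whlihun.com")` (the log file name is a placeholder) would list each affected request, which makes it easier to check whether they all follow a `/feed` fetch.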
The feed content is as follows:
<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
xmlns:content="http://purl.org/rss/1.0/modules/content/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:atom="http://www.w3.org/2005/Atom"
xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
>
<channel>
<title>
《离婚协议是否受合同法约束?》的评论 </title>
<atom:link href="https://www.whlihun.com/2224.html/feed" rel="self" type="application/rss+xml" />
<link>https://www.whlihun.com/2224.html</link>
<description></description>
<lastBuildDate>Wed, 26 Dec 2018 10:06:23 +0000</lastBuildDate>
<sy:updatePeriod>
hourly </sy:updatePeriod>
<sy:updateFrequency>
1 </sy:updateFrequency>
<generator>https://wordpress.org/?v=5.2.2</generator>
</channel>
</rss>
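One way to rule the feed in or out is to check that every link it exposes is an absolute URL on the expected host. The sketch below does that with Python's standard library; the helper names `feed_links` and `is_absolute_for_host` are hypothetical, and it only inspects the channel-level `<link>` and `<atom:link>` elements that appear in the feed above. It does not explain how Baiduspider itself resolves URLs.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

# Namespace of the atom:link element declared in the feed above.
ATOM_NS = "{http://www.w3.org/2005/Atom}"


def feed_links(feed_xml):
    """Collect the channel <link> text and every <atom:link href> value."""
    root = ET.fromstring(feed_xml)
    channel = root.find("channel")
    if channel is None:
        return []
    urls = []
    link = channel.find("link")
    if link is not None and link.text:
        urls.append(link.text.strip())
    for atom_link in channel.findall(ATOM_NS + "link"):
        href = atom_link.get("href")
        if href:
            urls.append(href.strip())
    return urls


def is_absolute_for_host(url, host):
    """True if url has an http(s) scheme and its host part equals host."""
    parts = urlsplit(url)
    return parts.scheme in ("http", "https") and parts.netloc == host
```

In the feed as posted, both the `<link>` and the `<atom:link href>` values begin with `https://www.whlihun.com/`, so a check like this would pass for them; a scheme-less value such as `www.whlihun.com/2224.html` would fail it.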