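I need to scrape data from a website that requires login. Its login mechanism is very similar to the one described at http://www.cnblogs.com/wenjiang/p/3203403.html, and by following that article I can already log in successfully. For the first fetch after logging in I wanted to use GET, but the site does not allow it (although it works in a browser), so I had to use POST, configured as follows: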
item = new HttpItem()
{
URL = uri,
Method = "post",
KeepAlive = true,
Postdata = poststr, // poststr contains the username and password, plus hidden fields such as __VIEWSTATE and __EVENTVALIDATION
Cookie = cookie,
ContentType = "application/x-www-form-urlencoded",
Allowautoredirect = false,
};
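This request retrieves the first page of data, but when I use the same settings to fetch the next page, the connection is forcibly dropped with the error "The underlying connection was closed: A connection that was expected to be kept alive was closed by the server." Why does this happen? Do I need to update the cookie or poststr, or is some other setting wrong? And if I wanted to fetch the data with GET, as the browser does, how should I configure it?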
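In case poststr does need to be refreshed for each page, here is a minimal sketch of how the hidden fields could be re-read from the previous response and re-encoded. It uses only standard .NET classes. `HiddenFieldHelper` and its method names are hypothetical and not part of HttpItem, and the regular expression assumes the usual WebForms markup in which `name` appears before `value`. I have not confirmed that this resolves the keep-alive error.

```csharp
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of HttpItem): pulls hidden fields such as
// __VIEWSTATE and __EVENTVALIDATION out of a page and rebuilds a form body.
public static class HiddenFieldHelper
{
    // Assumes markup like: <input type="hidden" name="__VIEWSTATE" ... value="..." />
    private static readonly Regex HiddenInput = new Regex(
        "<input[^>]*\\bname=\"(?<name>__[A-Z]+)\"[^>]*\\bvalue=\"(?<value>[^\"]*)\"",
        RegexOptions.IgnoreCase);

    // Returns the hidden field names and their HTML-decoded values.
    public static Dictionary<string, string> Extract(string html)
    {
        var fields = new Dictionary<string, string>();
        foreach (Match m in HiddenInput.Matches(html))
        {
            fields[m.Groups["name"].Value] =
                WebUtility.HtmlDecode(m.Groups["value"].Value);
        }
        return fields;
    }

    // Encodes the fields as application/x-www-form-urlencoded text.
    public static string BuildPostData(IDictionary<string, string> fields)
    {
        var parts = new List<string>();
        foreach (var pair in fields)
        {
            parts.Add(WebUtility.UrlEncode(pair.Key) + "=" +
                      WebUtility.UrlEncode(pair.Value));
        }
        return string.Join("&", parts);
    }
}
```

The idea would be to call `Extract` on the HTML of the page just received, add whatever paging fields the site expects, and pass the result of `BuildPostData` as Postdata for the next request, so that each POST carries the values issued with the most recent page.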