C# - Collecting cookies that are not set by HttpWebResponse
I need to scrape a table of information from a site for which I have valid credentials, because the owners of the site do not provide an API.
I performed a login, saved the traffic with Fiddler, and am now trying to replicate the key steps.
I'm going to show the steps I've done so far, and where I am stuck.
Log on to the base URL
    CookieContainer jar = new CookieContainer();
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(urlBase);
    request.CookieContainer = jar;   // cookies set by the server are collected here
    HttpWebResponse response = (HttpWebResponse)request.GetResponse();
    string newUrl = response.ResponseUri.ToString();
Along with the return, a cookie is set. When I look at the CookieContainer, it has a Count of 1 after the call.
Interestingly, the response object does not contain the cookie. I think that is okay because I can use jar.
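To check what the container actually holds after a call, something like the following minimal sketch can be used. JarInspector and Describe are hypothetical names introduced only for illustration. A cookie set on an intermediate redirect is stored in the container even though the final response does not list it, which may be what is happening here.

```csharp
using System;
using System.Net;

public static class JarInspector
{
    // Returns one "name=value" string per cookie that the jar would send to the given URL.
    public static string[] Describe(CookieContainer jar, string url)
    {
        CookieCollection cookies = jar.GetCookies(new Uri(url));
        var lines = new string[cookies.Count];
        int i = 0;
        foreach (Cookie c in cookies)
        {
            lines[i++] = c.Name + "=" + c.Value;
        }
        return lines;
    }
}
```

Calling JarInspector.Describe(jar, urlBase) after the first call should list the single cookie that the Count of 1 refers to.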
Second call
I'm not yet at the page where the name and password are presented; that doesn't happen until the 4th call.
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(urlBase + secondCallFolderAddition);
    CookieCollection bakery = new CookieCollection();   // declared but not used yet
    request.KeepAlive = true;
    request.Headers.Add("Upgrade-Insecure-Requests", @"1");
    //request.UserAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36 OPR/46.0.2597.57";
    request.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8";
    request.Headers.Set(HttpRequestHeader.AcceptEncoding, "gzip, deflate, br");
    request.Headers.Set(HttpRequestHeader.AcceptLanguage, "en-US,en;q=0.8");
    HttpWebResponse response = (HttpWebResponse)request.GetResponse();
    string newUrl = response.ResponseUri.ToString();
I get an OK status, and the response looks right compared to the original Fiddler traffic capture. In the original, the 2nd call does not set a cookie, and no cookie is set here.
Third call
But here's where I'm lost: the browser sent cookie data with 3 values (I've obfuscated them):
    __utma=1.123456789.123456789.123456789.123456789.1
    olfsk=olfsk12345678901234567890123456789
    hblid=abcdl11abcabxabc1aabv1flfx1re1os
I don't know where these values are set. They seem to relate to Google Analytics (from articles I've found), but I don't know how to collect them so that I can attach them to the call I make.
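If the values have to be supplied by hand, one option is to seed the container with them before the call instead of setting a raw Cookie header. The following is only a sketch under that assumption: CookieSeeder and AddFromHeader are hypothetical names, and whether the site accepts values copied from a browser session is something I have not verified.

```csharp
using System;
using System.Net;

public static class CookieSeeder
{
    // Parses a "name=value; name2=value2" string (the format of a Cookie request header)
    // and adds each pair to the jar, scoped to the host of the given site.
    public static void AddFromHeader(CookieContainer jar, Uri site, string cookieHeader)
    {
        foreach (string pair in cookieHeader.Split(';'))
        {
            string trimmed = pair.Trim();
            int eq = trimmed.IndexOf('=');
            if (eq <= 0)
            {
                continue; // skip empty or malformed fragments
            }
            string name = trimmed.Substring(0, eq);
            string value = trimmed.Substring(eq + 1);
            jar.Add(site, new Cookie(name, value));
        }
    }
}
```

With the jar seeded this way and assigned to request.CookieContainer, the request sends these cookies together with any that the server set earlier.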
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(newUrl);
    request.KeepAlive = true;
    request.Headers.Add("Upgrade-Insecure-Requests", "1");
    request.UserAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36 OPR/46.0.2597.57";
    request.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8";
    request.Headers.Set(HttpRequestHeader.AcceptEncoding, "gzip, deflate, br");
    request.Headers.Set(HttpRequestHeader.AcceptLanguage, "en-US,en;q=0.8");
    //request.Headers.Set(HttpRequestHeader.Cookie, @"__utma=1.123456789.123456789.123456789.123456789.1; olfsk=olfsk12345678901234567890123456789; hblid=abcdl11abcabxabc1aabv1flfx1re1os");
    HttpWebResponse response = (HttpWebResponse)request.GetResponse();
    newUrl = response.ResponseUri.ToString();   // newUrl was declared in the previous call
Please note the commented-out line with the cookie data; I've tried with that line un-commented also.
What happens is that I never get a response to this call.
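One possible cause of a third request that never returns, which I have not confirmed for this site: on .NET Framework the default limit is two concurrent connections per host, and responses that are never closed keep their connections occupied. The sketch below shares one jar across calls, sets a timeout so a stall surfaces as an exception, and disposes each response. Fetcher, Build, ReadBody, and the 15 second timeout are assumptions made for illustration. It also leaves out "br", since as far as I know HttpWebRequest on .NET Framework cannot decode Brotli.

```csharp
using System;
using System.IO;
using System.Net;

public static class Fetcher
{
    // Builds a GET request that shares the given jar. No network traffic happens here.
    public static HttpWebRequest Build(string url, CookieContainer jar)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.CookieContainer = jar;
        request.Timeout = 15000; // milliseconds; an arbitrary value chosen for this sketch
        // Lets the framework advertise and decode gzip/deflate itself.
        request.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;
        return request;
    }

    // Sends the request, reads the whole body, and disposes the response
    // so that its connection is released for the next call.
    public static string ReadBody(HttpWebRequest request, out Uri finalUri)
    {
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            finalUri = response.ResponseUri;
            return reader.ReadToEnd();
        }
    }
}
```

Each call in the sequence would then go through Fetcher.Build and Fetcher.ReadBody with the same jar, using the returned finalUri as the address for the next step.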
I would be appreciative of any insights.
- I'm guessing that the cookie data in the third call is needed, and that it is set by a client-side script that runs between the 2nd and 3rd call, but I'm new to this and unsure.
- If it is set on the client side, how can I get valid cookies to get me past this roadblock? (The same roadblock is coming in the next call, where more cookies are used that I do not see set in any server response, but I'm not there yet.)
I know I can solve this using a WebBrowser object, but that seems like a clumsy solution. Is there a less clumsy way to go? Are there other objects or libraries I should try? (RestSharp? Postman? A WebRequest object instead of HttpWebRequest?)