javascript - NodeJS - Request a page with later loaded info


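I'm making a web crawler in Node.js and it's working: it requests a page, I use cheerio to get a jQuery-style API over the HTML, and then I query the tags.

Now I'm trying to get the comments of a page. The problem is that the tag I want is loaded a few seconds later by an AJAX request, so the request I make with request-promise can't find that specific tag, because it loads later.

Is there a way I can find the tag after it has loaded?

My code: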

    /* requires */
    var rp = require('request-promise');
    var cheerio = require('cheerio');

    // page to crawl
    var pageToVisit = "http://pagetovisit.com/page-with-comments.html";
    console.log("Visiting " + pageToVisit);

    var options = {
        uri: pageToVisit,
        // parse the response body with cheerio; scripts on the page are not executed
        transform: function (body) {
            return cheerio.load(body);
        },
        resolveWithFullResponse: true,
        simple: false
    };

    rp(options)
        .then(function ($) {
            console.log($('.commentstag').text());
        })
        .catch(function (err) {
            console.log(err);
            // crawling failed...
        });

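I do not believe you will be able to do this using cheerio. It parses HTML into a DOM but, for all intents and purposes, it is not a web browser and does not execute the scripts on the page. You would need to use something like CasperJS (powered by PhantomJS) to render the page, which lets you wait for content that is loaded via scripts.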

CasperJS waitForSelector
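
As a rough sketch of what that could look like, the script below opens the page from the question in CasperJS and waits for the '.commentstag' element before reading its text. The helper name scrapeWhenLoaded, the 10 second timeout and the log messages are my own assumptions, not something from the question. It is meant to be run with the casperjs command rather than with node, which is why the part that creates the CasperJS instance is guarded by a check for the phantom global.

```javascript
// Sketch only: run with `casperjs script.js`, not with `node`.
// scrapeWhenLoaded is a hypothetical helper name; the URL and selector
// below come from the question.

// Registers the steps on a CasperJS instance: open the page, wait until the
// selector exists in the rendered DOM, then pass its text to onText.
// onText receives null if the selector never shows up before the timeout.
function scrapeWhenLoaded(casper, url, selector, onText) {
    casper.start(url);
    casper.waitForSelector(
        selector,
        function found() {
            onText(casper.fetchText(selector));
        },
        function timedOut() {
            onText(null);
        },
        10000 // assumed timeout in milliseconds
    );
}

// The `phantom` global only exists inside PhantomJS/CasperJS, so this part
// is skipped when the file is loaded anywhere else.
if (typeof phantom !== 'undefined') {
    var casper = require('casper').create();
    scrapeWhenLoaded(
        casper,
        'http://pagetovisit.com/page-with-comments.html',
        '.commentstag',
        function (text) {
            console.log(text === null ? 'Comments did not load in time' : text);
        }
    );
    casper.run();
}
```

If the comments take longer than the timeout to appear, the second callback runs instead of the first, so you can tell "no comments loaded" apart from "comments loaded but empty".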

Edit: This is stated in the cheerio documentation.

Cheerio is not a web browser

Cheerio parses markup and provides an API for traversing/manipulating the resulting data structure. It does not interpret the result as a web browser does. Specifically, it does not produce a visual rendering, apply CSS, load external resources, or execute JavaScript. If your use case requires any of this functionality, you should consider projects like PhantomJS or JSDom.

