web scraping - Loop through payload Python

May 15, 2011

There is a site I connect to, but I need to log in 4 times with different usernames and passwords.

Is there any way I can do this by looping through the usernames and passwords in a payload?

This is the first time I'm doing this and I'm not sure how to go about it. The code works fine if I post just 1 username and password.

I'm using Python 2.7, BeautifulSoup and Requests.

Here is my code.

import requests
import zipfile, StringIO
from bs4 import BeautifulSoup

# Here we add the login details to be submitted to the login form.
payload = [
    {'username': 'xxxxxx','password': 'xxxxxx','option': 'login'},
    {'username': 'xxxxxx','password': 'xxxxxxx','option': 'login'},
    {'username': 'xxxxx','password': 'xxxxx','option': 'login'},
    {'username': 'xxxxxx','password': 'xxxxxx','option': 'login'},
]
# Possibly need headers later.
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36'}
base_url = "https://service.rl360.com/scripts/customer.cgi/sc/servicing/"

with requests.Session() as s:
    p = s.post('https://service.rl360.com/scripts/customer.cgi?option=login', data=payload)

    # Download page to scrape.
    r = s.get('https://service.rl360.com/scripts/customer.cgi/sc/servicing/downloads.php?folder=datadownloads&sortfield=expirydays&sortorder=ascending', stream=True)
    content = r.text
    soup = BeautifulSoup(content, 'lxml')
    # Now get the most recent download URL.
    download_url = soup.find_all("a", {'class':'tabletd'})[-1]['href']
    # Now join the base URL with the download URL.
    download_docs = s.get(base_url + download_url, stream=True)
    print "Checking Content"
    content_type = download_docs.headers['content-type']
    print content_type
    print "Checking Filename"
    content_name = download_docs.headers['content-disposition']
    print content_name
    print "Checking Download Size"
    content_size = download_docs.headers['content-length']
    print content_size

    # This will extract and download the specified XML files.
    z = zipfile.ZipFile(StringIO.StringIO(download_docs.content))
    print "---------------------------------"
    print "Downloading........."
    # Now save the files to the specified location.
    # Raw string, so that \t is not read as a tab character.
    z.extractall(r'c:\temp')
    print "Download Complete"

Just use a for loop. You may need to adjust the download directory if the files from one login would otherwise be overwritten by the next.

payloads = [
    {'username': 'xxxxxx1','password': 'xxxxxx','option': 'login'},
    {'username': 'xxxxxx2','password': 'xxxxxxx','option': 'login'},
    {'username': 'xxxxx3','password': 'xxxxx','option': 'login'},
    {'username': 'xxxxxx4','password': 'xxxxxx','option': 'login'},
]

....

for payload in payloads:
    with requests.Session() as s:
        p = s.post('https://service.rl360.com/scripts/customer.cgi?option=login', data=payload)
        ...
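If the archives for different logins contain files with the same names, one possible way to keep them apart is to extract each one into a sub-directory named after the username. The following is only a rough sketch under assumptions: the helper names `extract_for_user` and `make_sample_zip`, the usernames, and the temporary base directory are all made up for illustration, and a small in-memory zip stands in for `download_docs.content` so that it runs without touching the site. It uses `io.BytesIO` rather than `StringIO` so the same code should run on Python 2.7 and Python 3.

```python
import io
import os
import tempfile
import zipfile


def extract_for_user(zip_bytes, base_dir, username):
    """Extract a zip archive (given as bytes) into base_dir/<username>."""
    target = os.path.join(base_dir, username)
    if not os.path.isdir(target):
        os.makedirs(target)
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as z:
        z.extractall(target)
    return target


def make_sample_zip(text):
    """Build a tiny in-memory zip (hypothetical stand-in for a real download)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, 'w') as z:
        z.writestr('data.xml', text)
    return buf.getvalue()


base_dir = tempfile.mkdtemp()      # assumption: any writable directory works
usernames = ['user1', 'user2']     # hypothetical usernames
extracted = {}

for name in usernames:
    # In the real loop this would be download_docs.content from the session.
    zip_bytes = make_sample_zip('<owner>%s</owner>' % name)
    extracted[name] = extract_for_user(zip_bytes, base_dir, name)
```

In the loop from the answer, the call would sit where `z.extractall(...)` is now, with `payload['username']` passed as the username, so each account's `data.xml` ends up in its own folder instead of replacing the previous one.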
