regex - Scrape div contents using PHP and cURL
I'm new to cURL. I have been trying to scrape the contents of this Amazon link (i.e., the image, book title, author, and price of 20 books) into an HTML page. So far, all I've got is printing the page using the code below:
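<?php
// Fetch a URL with cURL and return the response body as a string.
function curl($url) {
    $options = array(
        CURLOPT_RETURNTRANSFER => true,  // return the page instead of printing it
        CURLOPT_FOLLOWLOCATION => true,  // follow HTTP redirects
        CURLOPT_AUTOREFERER    => true,  // set the Referer header on redirects
        CURLOPT_CONNECTTIMEOUT => 120,   // seconds to wait while connecting
        CURLOPT_TIMEOUT        => 120,   // seconds to wait for the whole request
        CURLOPT_MAXREDIRS      => 10,    // stop after 10 redirects
        CURLOPT_URL            => $url,
    );
    $ch = curl_init();
    curl_setopt_array($ch, $options);
    $data = curl_exec($ch);
    curl_close($ch);
    return $data;
}

$url = "http://www.amazon.in/gp/bestsellers/books/1318209031/ref=zg_bs_nav_b_2_1318203031";
$results_page = curl($url);
echo $results_page;
?>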
I have tried using regex and failed. I have tried everything possible for 6 hours straight and got tired, so I'm hoping to find a solution here. A thank-you isn't enough for a solution, but thanks in advance. :)
Update: I found a helpful site (click here) for beginners like me (without using cURL, though).
You should be using the AWSECommerce API, but here's a way to leverage Yahoo's YQL service:
<?php
// Build the YQL query: fetch the page and keep only the product <div>s.
$query = sprintf(
    'http://query.yahooapis.com/v1/public/yql?q=%s',
    urlencode('select * from html where url = "http://www.amazon.in/gp/bestsellers/books/1318209031/ref=zg_bs_nav_b_2_1318203031" and xpath=\'//div[@class="zg_itemImmersion"]\'')
);

// The third argument (true) tells SimpleXMLElement that $query is a URL to load.
$xml = new SimpleXMLElement($query, 0, true);

// Print the title link of each product.
foreach ($xml->results->div as $product) {
    vprintf("%s\n", array(
        $product->div[1]->div[1]->a,
    ));
}

/*
engineering thermodynamics
textbook of fluids mechanics
design of everyday things
forest history of india
computer networking
story of microsoft
private empire: exxonmobil and americ...
project management metrics, kpis, and...
design and analysis of experiments: i...
ies - 2013: general english
foundation of software testing: istqb...
faster: 100 ways improve digi...
textbook of fluid mechanics and hyd...
software engineering embedded sys...
communication skills engineers
making things move diy mechanisms for...
virtual instrumentation using labview
geometric dimensioning and tolerancin...
power system protection & switchgear...
computer networks
*/
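If you would rather not depend on an external service, the same kind of XPath lookup can probably be run locally on the HTML that cURL already returns, using PHP's built-in DOMDocument and DOMXPath. This is only a rough sketch: the function name extractTitles and the sample markup are made up for illustration, and the class name is assumed to match the one used in the query above, so the real page structure may well differ.

```php
<?php
// Hypothetical helper: collect the non-empty link texts found inside every
// <div> that carries the given class. The default class name is an assumption.
function extractTitles(string $html, string $class = 'zg_itemImmersion'): array
{
    $doc = new DOMDocument();
    // Real-world pages are rarely valid HTML, so keep parser warnings quiet.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Match the class as a whole token, so class="foo zg_itemImmersion" also matches.
    $query = sprintf(
        '//div[contains(concat(" ", normalize-space(@class), " "), " %s ")]//a',
        $class
    );

    $titles = array();
    foreach ($xpath->query($query) as $link) {
        $text = trim($link->textContent);
        if ($text !== '') {   // skip links that only wrap an image
            $titles[] = $text;
        }
    }
    return $titles;
}

// Made-up sample markup standing in for the fetched page.
$sample = '<html><body>'
    . '<div class="zg_itemImmersion"><div><a href="#">Engineering Thermodynamics</a></div></div>'
    . '<div class="zg_itemImmersion"><div><a href="#">Computer Networks</a></div></div>'
    . '<div class="other"><a href="#">Not a product</a></div>'
    . '</body></html>';

print_r(extractTitles($sample));
```

With the curl() function from the question, you would call something like extractTitles(curl($url)). Note that this grabs every text link inside the matched divs, so on a real page you would likely need a narrower XPath to separate title, author, and price.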