PHP parsing external site

June 15, 2013

I don't have experience parsing external URLs to grab data from them, but today I tried some experiments:

$str1 = file_get_contents('http://www.indiegogo.com/projects/ubuntu-edge');
$test1 = strstr($str1, "amount medium clearfix");
$parts = explode(">", $test1);
$parts2 = vsprintf("%s", $parts[1]);

$str2 = file_get_contents('http://www.indiegogo.com/projects/ubuntu-edge');
$test2 = strstr($str2, "money-raised goal");
$test3 = str_ireplace("money-raised goal", "", "$test2");
$test4 = str_ireplace("\"", "", "$test3");
$test5 = str_ireplace(">", "", "$test4");
$test6 = substr($test5, 0, 29);
$test7 = explode("raised of", $test6);
$test8 = vsprintf("%s", $test7[1]);

I tested the code with:

print_r($parts2); print_r($test8);

and with:

echo "$parts2 - $test8";

Because the Ubuntu Edge campaign is popular these days, I tried to grab two fields from that site (only as an experiment), without success. Well, it grabs the two fields, but I cannot put both in the same variable. The output is either $parts2, or $parts2 containing the value of $test8, or $test8.

What am I doing wrong, and why? Also, is there a simpler method to do what I want, with less code?

"Well, it grabs the two fields, but I cannot put both in the same variable."

I'm not sure what you mean there.

"Also, is there a simpler method to do what I want, with less code?"

With less code? No. More flexible and (possibly) more efficient? Yes.

Try this and tailor it to your liking:

<?php
$page = file_get_contents('http://www.indiegogo.com/projects/ubuntu-edge');

$doc = new DOMDocument;
libxml_use_internal_errors(true); // keep warnings about malformed HTML out of the output
$doc->loadHTML($page);

$finder = new DOMXPath($doc);

// find class="money-raised"
$nodes = $finder->query("//*[contains(@class, 'money-raised')]");

// children of first match (class="money-raised")
$raised_children = $nodes->item(0)->childNodes;

// children of second match (class="money-raised goal")
$goal_children = $nodes->item(1)->childNodes;

// amount raised
$money_earned = $raised_children->item(1)->nodeValue;

// goal amount
preg_match('/\$[\d,]+/', $goal_children->item(0)->nodeValue, $m);
$money_earned_goal = $m[0];

echo "money earned: $money_earned\n";
echo "goal: $money_earned_goal\n";
?>

This has eleven lines of code without the echos (compared to your 12 lines), and it calls the other site only once. Scraping websites is an involved task. This code gets the values you wanted from that exact page.
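The preg_match call is the step that isolates the dollar amount from the surrounding text of the goal element. Here is a minimal sketch of that step on its own, assuming PHP 7.1 or later. The helper name extract_amount and the sample string are made up for illustration, and the real text on the page may differ:

```php
<?php
// Hypothetical helper: returns the first dollar amount found in $text,
// or null when the text contains none.
function extract_amount(string $text): ?string
{
    // \$ matches a literal dollar sign; [\d,]+ matches a run of digits and commas.
    if (preg_match('/\$[\d,]+/', $text, $m) === 1) {
        return $m[0];
    }
    return null;
}

// Made-up sample text. Single quotes keep PHP from interpreting the $ sign.
$sample = 'Raised of $32,000,000 Goal';
$goal = extract_amount($sample);

echo "goal: $goal\n";
```

Checking the return value of preg_match before reading $m[0] avoids an undefined-index notice if the page layout changes and the pattern no longer matches.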

If you want to scrape sites, I recommend learning to use DOMDocument and DOMXPath. There is a lot to learn, but it's worth the effort.
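As a starting point for experimenting, here is a small sketch that runs DOMDocument and DOMXPath against an inline HTML string instead of a live page, so it works offline. The markup, the amounts, and the helper name text_by_class are all made up for illustration, and the code assumes PHP 7 or later with the DOM extension enabled:

```php
<?php
// Hypothetical helper: returns the trimmed text of every element whose
// class attribute contains $class as a whole, space-separated token.
function text_by_class(string $html, string $class): array
{
    $doc = new DOMDocument();

    // Collect parser warnings internally instead of printing them,
    // then restore whatever setting was active before.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    // Padding the attribute with spaces makes the test match whole class
    // names only. $class is assumed to be a plain name without quotes.
    $query = sprintf(
        "//*[contains(concat(' ', normalize-space(@class), ' '), ' %s ')]",
        $class
    );

    $texts = [];
    foreach ($xpath->query($query) as $node) {
        $texts[] = trim($node->textContent);
    }
    return $texts;
}

// Made-up markup that only imitates the class names used above.
$html = <<<'HTML'
<div class="campaign">
  <span class="money-raised">$1,234,567</span>
  <span class="money-raised goal">Raised of $32,000,000 Goal</span>
  <span class="money-raised-note">not matched</span>
</div>
HTML;

print_r(text_by_class($html, 'money-raised'));
```

The simpler contains(@class, 'money-raised') test would also match the third element here, because it is a plain substring check. The padded form is a common way to avoid that when class names share a prefix.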
