PHP parsing external site


I have no experience parsing external URLs to grab data from them. Today I tried some experiments:

$str1 = file_get_contents('http://www.indiegogo.com/projects/ubuntu-edge');
$test1 = strstr($str1, "amount medium clearfix");
$parts = explode(">",$test1);
$parts2 = vsprintf("%s", $parts[1]);

$str2 = file_get_contents('http://www.indiegogo.com/projects/ubuntu-edge');
$test2 = strstr($str2, "money-raised goal");
$test3 = str_ireplace("money-raised goal", "", "$test2");
$test4 = str_ireplace("\"", "", "$test3");
$test5 = str_ireplace(">", "", "$test4");
$test6 = substr($test5, 0, 29);
$test7 = explode("raised of", $test6);
$test8 = vsprintf("%s", $test7[1]);

I tested the code with:

print_r($parts2); print_r($test8); and echo "$parts2 - $test8";

Because the Ubuntu Edge campaign is popular these days, I tried to grab 2 fields from that site (only as an experiment), without success. Well, it grabs the 2 fields, but I cannot put both in the same variable. The output is either $parts2, or $parts2 containing the value of $test8, or $test8.

What am I doing wrong, and why? Also, is there a simpler method to do what I want, without so much code?

Well, it grabs the 2 fields, but I cannot put both in the same variable.

I'm not sure what you mean there.

Also, is there a simpler method to do what I want, without so much code?

Without so much code? No. More flexible and (possibly) more efficient? Yes.

Try this and tailor it to your liking:

<?php
$page = file_get_contents('http://www.indiegogo.com/projects/ubuntu-edge');

$doc = new DOMDocument;
libxml_use_internal_errors(true); // suppress warnings from imperfect HTML
$doc->loadHTML($page);

$finder = new DOMXPath($doc);

// find every element whose class attribute contains "money-raised"
$nodes = $finder->query("//*[contains(@class, 'money-raised')]");

// children of the first match (class="money-raised")
$raised_children = $nodes->item(0)->childNodes;

// children of the second match (class="money-raised goal")
$goal_children = $nodes->item(1)->childNodes;

// the amount raised so far
$money_earned = $raised_children->item(1)->nodeValue;

// the goal amount, pulled out of the surrounding text
preg_match('/\$[\d,]+/', $goal_children->item(0)->nodeValue, $m);
$money_earned_goal = $m[0];

echo "Money earned: $money_earned\n";
echo "Goal: $money_earned_goal\n";
?>

This has eleven lines of code without the echos (compared to your 12 lines), and it only calls the other site once. Scraping websites is an involved task. This code gets the values you wanted from this exact page.
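To address the "same variable" part of the question, here is a minimal sketch that collects both figures into one array and checks that the expected nodes exist before using them. It runs against an inline HTML string rather than the live page. That markup, the sample figures, and the function name extractAmounts() are assumptions made up for illustration; the real page structure may differ.

```php
<?php
// Hypothetical markup standing in for the real page. The class names mirror
// the ones queried above, but the structure and the figures are invented.
$html = <<<'HTML'
<html><body>
<div class="money-raised"><span class="label">Raised</span><span class="amount">$1,234</span></div>
<p class="money-raised goal">Raised of $5,000 Goal</p>
</body></html>
HTML;

// Returns ['raised' => ..., 'goal' => ...] or null if the layout is unexpected.
function extractAmounts(string $html): ?array
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // silence warnings from sloppy HTML
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query("//*[contains(@class, 'money-raised')]");
    if ($nodes === false || $nodes->length < 2) {
        return null; // fewer matches than expected
    }

    // Take the first dollar figure found in the text of each matched node.
    $amounts = [];
    foreach (['raised' => 0, 'goal' => 1] as $key => $index) {
        if (!preg_match('/\$[\d,]+/', $nodes->item($index)->textContent, $m)) {
            return null;
        }
        $amounts[$key] = $m[0];
    }
    return $amounts;
}

$amounts = extractAmounts($html);
if ($amounts !== null) {
    echo "{$amounts['raised']} - {$amounts['goal']}\n"; // prints: $1,234 - $5,000
}
```

Returning null instead of indexing blindly means a layout change on the remote site shows up as a clear "nothing found" case rather than a fatal error on a missing node.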

If you want to scrape sites, I highly recommend learning to use DOMDocument and DOMXPath. There is a lot to learn, but it's worth the effort.
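One XPath detail worth learning early: contains(@class, '...') is a plain substring test, so it can also match unrelated classes that merely share a prefix. A common XPath 1.0 idiom pads the class attribute with spaces and searches for the padded token instead. The sketch below assumes simple class names without quotes; classQuery() is a hypothetical helper name and the markup is invented for the demonstration.

```php
<?php
// Build an XPath 1.0 expression matching elements that carry every given
// class as a whole token. Assumes class names contain no quote characters.
function classQuery(string ...$classes): string
{
    $tests = [];
    foreach ($classes as $class) {
        $tests[] = "contains(concat(' ', normalize-space(@class), ' '), ' $class ')";
    }
    return '//*[' . implode(' and ', $tests) . ']';
}

// Invented sample markup: the third class only shares a prefix.
$html = <<<'HTML'
<div class="money-raised">a</div>
<div class="money-raised goal">b</div>
<div class="money-raised-note">c</div>
HTML;

$doc = new DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Substring test: matches all three divs.
echo $xpath->query("//*[contains(@class, 'money-raised')]")->length, "\n"; // 3

// Token test: skips "money-raised-note".
echo $xpath->query(classQuery('money-raised'))->length, "\n"; // 2

// Requiring both tokens selects only the goal element.
echo $xpath->query(classQuery('money-raised', 'goal'))->item(0)->textContent, "\n"; // b
```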

