Thread: Update Pagerank for all links

  1. #1
    chunguens is offline Registered User
    Join Date
    Nov 2008
    Posts
    89

    Update Pagerank for all links

    Copy the code below, save it as "getpagerank.php", and run it in your web browser to populate or update a range of PageRank entries in your directory (the idx_pagerank table).

    Code:
    <?php
     
    include "application.php";
    function StrToNum($Str, $Check, $Magic)
    {
        $Int32Unit = 4294967296;  // 2^32
    
        $length = strlen($Str);
        for ($i = 0; $i < $length; $i++) {
            $Check *= $Magic; 	
            //If the float is beyond the boundaries of integer (usually +/- 2.15e+9 = 2^31), 
            //  the result of converting to integer is undefined
            //  refer to http://www.php.net/manual/en/language.types.integer.php
            if ($Check >= $Int32Unit) {
                $Check = ($Check - $Int32Unit * (int) ($Check / $Int32Unit));
                //if the check less than -2^31
                $Check = ($Check < -2147483648) ? ($Check + $Int32Unit) : $Check;
            }
            $Check += ord($Str[$i]);  // square brackets: the {} offset syntax was removed in PHP 8
        }
        return $Check;
    }
    
    /* 
     * Generate a hash for a URL
     */
    function HashURL($String)
    {
        $Check1 = StrToNum($String, 0x1505, 0x21);
        $Check2 = StrToNum($String, 0, 0x1003F);
    
        $Check1 >>= 2; 	
        $Check1 = (($Check1 >> 4) & 0x3FFFFC0 ) | ($Check1 & 0x3F);
        $Check1 = (($Check1 >> 4) & 0x3FFC00 ) | ($Check1 & 0x3FF);
        $Check1 = (($Check1 >> 4) & 0x3C000 ) | ($Check1 & 0x3FFF);	
    	
        $T1 = (((($Check1 & 0x3C0) << 4) | ($Check1 & 0x3C)) <<2 ) | ($Check2 & 0xF0F );
        $T2 = (((($Check1 & 0xFFFFC000) << 4) | ($Check1 & 0x3C00)) << 0xA) | ($Check2 & 0xF0F0000 );
    	
        return ($T1 | $T2);
    }
    
    /* 
     * Generate a checksum for the hash string
     */
    function CheckHash($Hashnum)
    {
        $CheckByte = 0;
        $Flag = 0;
    
        $HashStr = sprintf('%u', $Hashnum) ;
        $length = strlen($HashStr);
    	
        for ($i = $length - 1;  $i >= 0;  $i --) {
            $Re = (int) $HashStr[$i];  // digit at position $i, as an integer
            if (1 === ($Flag % 2)) {              
                $Re += $Re;     
                $Re = (int)($Re / 10) + ($Re % 10);
            }
            $CheckByte += $Re;
            $Flag ++;	
        }
    
        $CheckByte %= 10;
        if (0 !== $CheckByte) {
            $CheckByte = 10 - $CheckByte;
            if (1 === ($Flag % 2) ) {
                if (1 === ($CheckByte % 2)) {
                    $CheckByte += 9;
                }
                $CheckByte >>= 1;
            }
        }
    
        return '7'.$CheckByte.$HashStr;
    }
    
    function getpagerank($url) {
        $fp = fsockopen("toolbarqueries.google.com", 80, $errno, $errstr, 30);
        if (!$fp) {
            echo "$errstr ($errno)<br />\n";
            return false;
        }

        $out = "GET /search?client=navclient-auto&ch=".CheckHash(HashURL($url))."&features=Rank&q=info:".$url."&num=100&filter=0 HTTP/1.1\r\n";
        $out .= "Host: toolbarqueries.google.com\r\n";
        $out .= "User-Agent: Mozilla/4.0 (compatible; GoogleToolbar 2.0.114-big; Windows XP 5.1)\r\n";
        $out .= "Connection: Close\r\n\r\n";
        fwrite($fp, $out);

        // Scan the response for the line containing "Rank_";
        // the value starts 9 characters after the beginning of that marker
        $pagerank = false;
        while (!feof($fp)) {
            $data = fgets($fp, 128);
            if ($data === false) {
                break;
            }
            $pos = strpos($data, "Rank_");
            if ($pos !== false) {
                $pagerank = trim(substr($data, $pos + 9));  // trim the trailing line break
                break;
            }
        }
        fclose($fp);  // always close the socket, whether or not a rank was found

        return $pagerank;
    }
    
    $sql = "select link_id, url from idx_link where link_id > '0' and link_id < '1000'";
    $rs = $dbConn->Execute($sql);
    
    while ($row = $rs->FetchRow()) {
        echo "$row[url]<br />";
        flush();
        // Quote the array key, and cast so that only a number reaches the SQL string
        $pr = (int) getpagerank($row['url']);
        echo $pr;
        $sql = "insert into idx_pagerank (link_id, engine, rank) values ('$row[link_id]','g','$pr')";
        echo $sql . "<hr />";
        $dbConn->Execute($sql);
        flush();
    }
    
    
     ?>
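
    As a side note, the line-parsing step inside getpagerank() can be tried without a network connection. The sketch below pulls it into a helper; the name parseRankLine and the sample response line are assumptions made for illustration, not something confirmed by the script or by Google.

```php
<?php
// Hypothetical helper (not part of the original script): extracts the rank
// from one response line, using the same "Rank_" marker and 9-character
// offset as getpagerank().
function parseRankLine($line)
{
    $pos = strpos($line, "Rank_");
    if ($pos === false) {
        return false;  // this line does not carry a rank
    }
    return trim(substr($line, $pos + 9));
}

// Assumed sample line, for illustration only
var_dump(parseRankLine("Rank_1:1:4\n"));
```

    Lines without the marker give false, so the caller can simply keep reading.
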
    You may change the values "0" and "1000" at this line:
    $sql = "select link_id, url from idx_link where link_id > '0' and link_id < '1000'";
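
    Instead of editing the query by hand for each range, the bounds could come from the query string. This is only a sketch: buildRangeQuery and the from/to parameter names are made up for the example, and the integer casts are there so that nothing but numbers reaches the SQL.

```php
<?php
// Hypothetical helper: builds the SELECT for an exclusive link_id range.
function buildRangeQuery($from, $to)
{
    $from = (int) $from;
    $to   = (int) $to;
    return "select link_id, url from idx_link where link_id > $from and link_id < $to";
}

// e.g. getpagerank.php?from=1000&to=2000 (parameter names are assumptions)
$from = isset($_GET['from']) ? $_GET['from'] : 0;
$to   = isset($_GET['to'])   ? $_GET['to']   : 1000;
$sql  = buildRangeQuery($from, $to);
```

    With no parameters given, this falls back to the same 0 to 1000 range as the original line.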

    The PHP code can be improved.
    I hope it helps.
    Best regards

  2. #2
    sambog is offline Registered User
    Join Date
    Oct 2009
    Posts
    6

    Thank you !!!

  3. #3
    misingshot is offline Registered User
    Join Date
    Jan 2007
    Posts
    16

    Thanks chunguens,

    I tried the code above, and the browser shows the correct PR for each link.
    But I found that those values do not make it into the idx_pagerank table in the SQL database.

    I changed the SQL query lines as shown below, and now it works:

    Code:
    <?php
     
    include "application.php";
    function StrToNum($Str, $Check, $Magic)
    {
        $Int32Unit = 4294967296;  // 2^32
    
        $length = strlen($Str);
        for ($i = 0; $i < $length; $i++) {
            $Check *= $Magic; 	
            //If the float is beyond the boundaries of integer (usually +/- 2.15e+9 = 2^31), 
            //  the result of converting to integer is undefined
            //  refer to http://www.php.net/manual/en/language.types.integer.php
            if ($Check >= $Int32Unit) {
                $Check = ($Check - $Int32Unit * (int) ($Check / $Int32Unit));
                //if the check less than -2^31
                $Check = ($Check < -2147483648) ? ($Check + $Int32Unit) : $Check;
            }
            $Check += ord($Str[$i]);  // square brackets: the {} offset syntax was removed in PHP 8
        }
        return $Check;
    }
    
    /* 
     * Generate a hash for a URL
     */
    function HashURL($String)
    {
        $Check1 = StrToNum($String, 0x1505, 0x21);
        $Check2 = StrToNum($String, 0, 0x1003F);
    
        $Check1 >>= 2; 	
        $Check1 = (($Check1 >> 4) & 0x3FFFFC0 ) | ($Check1 & 0x3F);
        $Check1 = (($Check1 >> 4) & 0x3FFC00 ) | ($Check1 & 0x3FF);
        $Check1 = (($Check1 >> 4) & 0x3C000 ) | ($Check1 & 0x3FFF);	
    	
        $T1 = (((($Check1 & 0x3C0) << 4) | ($Check1 & 0x3C)) <<2 ) | ($Check2 & 0xF0F );
        $T2 = (((($Check1 & 0xFFFFC000) << 4) | ($Check1 & 0x3C00)) << 0xA) | ($Check2 & 0xF0F0000 );
    	
        return ($T1 | $T2);
    }
    
    /* 
     * Generate a checksum for the hash string
     */
    function CheckHash($Hashnum)
    {
        $CheckByte = 0;
        $Flag = 0;
    
        $HashStr = sprintf('%u', $Hashnum) ;
        $length = strlen($HashStr);
    	
        for ($i = $length - 1;  $i >= 0;  $i --) {
            $Re = (int) $HashStr[$i];  // digit at position $i, as an integer
            if (1 === ($Flag % 2)) {              
                $Re += $Re;     
                $Re = (int)($Re / 10) + ($Re % 10);
            }
            $CheckByte += $Re;
            $Flag ++;	
        }
    
        $CheckByte %= 10;
        if (0 !== $CheckByte) {
            $CheckByte = 10 - $CheckByte;
            if (1 === ($Flag % 2) ) {
                if (1 === ($CheckByte % 2)) {
                    $CheckByte += 9;
                }
                $CheckByte >>= 1;
            }
        }
    
        return '7'.$CheckByte.$HashStr;
    }
    
    function getpagerank($url) {
        $fp = fsockopen("toolbarqueries.google.com", 80, $errno, $errstr, 30);
        if (!$fp) {
            echo "$errstr ($errno)<br />\n";
            return false;
        }

        $out = "GET /search?client=navclient-auto&ch=".CheckHash(HashURL($url))."&features=Rank&q=info:".$url."&num=100&filter=0 HTTP/1.1\r\n";
        $out .= "Host: toolbarqueries.google.com\r\n";
        $out .= "User-Agent: Mozilla/4.0 (compatible; GoogleToolbar 2.0.114-big; Windows XP 5.1)\r\n";
        $out .= "Connection: Close\r\n\r\n";
        fwrite($fp, $out);

        // Scan the response for the line containing "Rank_";
        // the value starts 9 characters after the beginning of that marker
        $pagerank = false;
        while (!feof($fp)) {
            $data = fgets($fp, 128);
            if ($data === false) {
                break;
            }
            $pos = strpos($data, "Rank_");
            if ($pos !== false) {
                $pagerank = trim(substr($data, $pos + 9));  // trim the trailing line break
                break;
            }
        }
        fclose($fp);  // always close the socket, whether or not a rank was found

        return $pagerank;
    }
    
    $sql = "select link_id, url from idx_link where link_id > '0'";
    $rs = $dbConn->Execute($sql);
    
    while ($row = $rs->FetchRow()) {
        echo "$row[url]<br />";
        flush();
        // Quote the array key, and cast so that only a number reaches the SQL string
        $pr = (int) getpagerank($row['url']);
        echo $pr;
        $sql = "UPDATE `idx_pagerank` SET `idx_pagerank`.`rank` = '$pr' where `idx_pagerank`.`link_id` = '$row[link_id]';";
        echo $sql . "<hr />";
        $dbConn->Execute($sql);
        flush();
    }
    
    
     ?>
    If you have the same problem with the first script above, you can try this version of mine ^^.

  4. #4
    chunguens is offline Registered User
    Join Date
    Nov 2008
    Posts
    89

    Google ignores your IP after ~1000 requests

    For me it works well, but my idx_pagerank table was completely empty.

    Be careful: Google will start ignoring your IP after approximately 1200 PageRank requests.
    To get around that, I installed Indexu and this script on localhost instead of on my hosting server. I renew my IP every 1000 PageRank requests (to renew it, I simply turn my internet connection off and on again).
    After that, I export the localhost database and import it through phpMyAdmin on my hosting server.
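
    The 1000-request batches described above could be generated rather than typed in. A minimal sketch, assuming link_id values are positive integers; linkIdBatches is a hypothetical helper and the 2500 in the usage lines is an arbitrary example value.

```php
<?php
// Hypothetical helper: splits link_ids 1..$maxId into inclusive
// [from, to] ranges holding at most $batchSize ids each.
function linkIdBatches($maxId, $batchSize = 1000)
{
    $batches = array();
    for ($from = 1; $from <= $maxId; $from += $batchSize) {
        $batches[] = array($from, min($from + $batchSize - 1, $maxId));
    }
    return $batches;
}

// Print one SELECT per batch; each batch would be run separately,
// renewing the IP between runs.
foreach (linkIdBatches(2500) as $batch) {
    list($from, $to) = $batch;
    echo "select link_id, url from idx_link where link_id >= $from and link_id <= $to\n";
}
```

    The ranges are inclusive on both ends, so the queries use >= and <= rather than the > and < of the original script.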

  5. #5
    sambog is offline Registered User
    Join Date
    Oct 2009
    Posts
    6

    The second one works great if your idx_pagerank table is already filled.
    Thanks for your work.
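
    Putting the two reports together: the INSERT version suits an empty idx_pagerank table, while the UPDATE version only changes rows that already exist. A sketch of one way to cover both cases follows. buildPagerankSql is a hypothetical helper, and filtering on engine = 'g' in the UPDATE is an assumption based on the columns used in the first script.

```php
<?php
// Hypothetical helper: returns an UPDATE when a row for this link already
// exists in idx_pagerank, and an INSERT otherwise.
function buildPagerankSql($linkId, $pr, $rowExists)
{
    // Cast both values so that only integers end up in the SQL string
    $linkId = (int) $linkId;
    $pr     = (int) $pr;

    if ($rowExists) {
        return "UPDATE `idx_pagerank` SET `rank` = '$pr' WHERE `link_id` = '$linkId' AND `engine` = 'g'";
    }
    return "INSERT INTO `idx_pagerank` (`link_id`, `engine`, `rank`) VALUES ('$linkId', 'g', '$pr')";
}

// Inside the while loop, $rowExists could come from a lookup such as
//   select link_id from idx_pagerank where link_id = ... and engine = 'g'
// run through the same $dbConn->Execute() / FetchRow() calls as above.
echo buildPagerankSql(42, 5, false), "\n";
echo buildPagerankSql(42, 5, true), "\n";
```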
