Jim Westergren

RK as Internal Live PR; Evidence and Findings

March 27 update

In Matt Cutts' latest post, http://www.mattcutts.com/blog/q-a-thread-march-27-2006/:

Q: Is the RK parameter turned off, or should we expect to see it again?
A: I wouldn't expect to see the RK parameter have a non-zero value again.

Q: What's an RK parameter?
A: It's a parameter that you could see in a Google toolbar query. Some people outside of Google had speculated that it was live PageRank, that PageRank differed between Bigdaddy and the older infrastructure, etc.

So my assumption that the values were set to zero because they were not supposed to be public seems to be valid.

Well, it was fun while it lasted.

March 25 update

Matt Cutts of Google read this article and also posted a funny comment on this post.

I was very curious and so of course I sent him an e-mail.

His answer:

I'm sorry, I can't shed light on that at this time. :)

Best wishes though, Matt

Matt, I hope you are OK with me posting this :). Anyway, here is my theory:

The RK "Internal Live PR" values were not supposed to be public and thus Google has now set them all to 0.

/ Jim

March 24 update

Google has now set all RK values to zero for all URLs.

Either it is some temporary glitch OR Google didn't like that value being public ...

Let's wait and see.

/ Jim

Original Article

--------------------------

"PR prediction tools"

In May 2004 the checksum algorithm that Google uses to query a site's PageRank from its servers was cracked and released here.

This spread across the internet and the code was translated into PHP. People started to write scripts to get the PR value without a toolbar, and even though Google changed their checksum algorithm it was soon cracked again. Then the popular PageRank prediction tools started to emerge from different places.

None of these tools work, as nobody fully understood what the values are, not even the tool creators.

The discovery

In a WebmasterWorld thread of Feb 12, 2006 a member asked what the mysterious figures such as Rank_1:1:6 Rank_1:1:5 Rank_1:1:4 that are displayed in one of the "PR prediction" tools when clicking "check" actually meant.

The discussion starts in the thread and another member (arran) posts that if you drop &features=Rank from the URL you get an XML feed, visible here.

It was also found that you can add &start=0&num=10 at the end of the URL (protellix) to set where the listings start and how many you get, and that you can replace Rank in the &features=Rank part of the URL with one of the following features (phish):

We realised that this XML feed is in fact a search query of the Google SERPs.
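The URL changes described above can be sketched in a few lines of Python. This is only an illustration of dropping and setting query parameters: the host, path and q parameter below are made up, the helper name rewrite_query is my own, and the checksum parameter that the real toolbar query needs is left out on purpose.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def rewrite_query(url, drop=(), **updates):
    """Return url with the parameters in `drop` removed and `updates` set."""
    parts = urlsplit(url)
    # Keep every existing parameter that is neither dropped nor overwritten.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in drop and key not in updates
    ]
    # Append the new or changed parameters in the order they were given.
    params.extend((key, str(value)) for key, value in updates.items())
    return urlunsplit(parts._replace(query=urlencode(params)))


# Hypothetical starting URL (assumption): not the real toolbar query URL.
base = "http://datacenter.example/search?q=example&features=Rank"

# Dropping &features=Rank is what gave the XML feed in the thread.
xml_feed_url = rewrite_query(base, drop=("features",))

# Adding &start=0&num=10 at the end of the URL.
paged_url = rewrite_query(xml_feed_url, start=0, num=10)

print(xml_feed_url)
print(paged_url)
```

Swapping Rank for another feature would be rewrite_query(base, features="..."), with whatever feature name you want to try.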

In message 75 of that thread, member Hanu posts a script that has the checksum decoder built in and will get the XML feed including any of the above features. I will not post this tool here as it violates Google's TOS.

Anyway, when using that tool:

That will display the XML feed of the SERP in a standard way with 100 results on a BigDaddy datacenter. Kudos to Hanu for this tool!

Don't use toolbarqueries.google.com, as Google uses DNS-based load balancing and you effectively get a random DC served to you (the same happens when you check SERPs on Google.com, btw). Use the IP number of a DC instead.

Explaining the SERP XML Feed

Let's look at the above XML example.

This is what the abbreviations mean according to Google XML Tag Definitions, an official Google document:

It is the <rk> here that is interesting. Some recent "PR prediction tools" or "Live PageRank tools" have been using this value.

By looking at various SEO forums now during the PR update, we see that these kinds of tools are the most liked and the most accurate, but they are not 100% accurate (usually 1 value too high), and I found the reason.

Observations and tests of RK to see if it is Live PR

By looking at different Matt Cutts blog posts with tools that show you the PR value on multiple datacenters, we see that the PR now being exported to the toolbars across 40+ DCs is dated between Feb 4-7, 2006, which is around 20 days ago.

I then used a tool that shows you the RK values for a URL across 40+ DCs on the more recent blog posts by Cutts, and I found some incredible things.

Matt Cutts' post from Feb 17 shows RK 3 on the DCs that have found it and nothing on the rest, see the results from the tool here. A post from Feb 15 shows RK 6 on half of the DCs and RK 3 on the other half, see it yourself.

This means that the RK values are updated regularly.

And now this:

On the DCs where the RK of the blog post was 6, he ranked higher in the SERPs than on the DCs where it was 3!

Why do older blog posts by Cutts have a higher RK? Most probably because GoogleBot has not yet found the many RSS feeds and links that point to his newer posts.

I have also seen the RK values changing by the day, and according to the person who made the LivePR tool, RK values increase the same day as Google caches new backlinks on that particular datacenter!

If you run different search queries in the tool that Hanu provided, you will see that the RK value of a URL is static: it is the same no matter which query you use.
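If you want to check this more systematically than by eye, a small script can do the comparison. The sketch below assumes you have already collected (query, URL, RK) triples from one datacenter; the function name and the sample numbers are made up for illustration and are not real measurements.

```python
from collections import defaultdict


def rk_is_query_independent(observations):
    """Map each URL to True if every query returned the same RK for it.

    `observations` is an iterable of (query, url, rk) tuples collected
    from a single datacenter.
    """
    seen = defaultdict(set)
    for _query, url, rk in observations:
        seen[url].add(rk)
    # A URL passes when only one distinct RK value was ever seen for it.
    return {url: len(values) == 1 for url, values in seen.items()}


# Made-up sample data (assumption), only to show the shape of the input.
sample = [
    ("seo", "http://example.com/a", 6),
    ("pagerank", "http://example.com/a", 6),
    ("seo", "http://example.com/b", 3),
    ("blog", "http://example.com/b", 4),
]

print(rk_is_query_independent(sample))
```

A value that depended on relevance would differ from query to query for the same URL, which is what the False case would indicate.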

BigDaddy and non-BigDaddy RK values

The recent PR tools that use the RK query the BigDaddy datacenters, and those values are usually 1-2 higher than the toolbar PR. By using this tool we also see that the RK value on the BigDaddy datacenters differs from the one on the rest of the datacenters. The reason has to do with the new infrastructure, but I don't know exactly what.

I will update here later when I find out.

Difference of toolbar RK and SERP RK

I found something else very interesting.

Look at this article explaining how the PageRank toolbar works. After information is sent to a Google server, data in the form of an XML document with data about the URL is sent back.

There are values and info on a lot of things, and guess what the field holding the PR value is called?

<RK>

That is the same name as in the SERP XML, but in the SERP XML the RK is not the toolbar PR; it just has the same name.

Google definition of RK

Let's look at the definition of <rk> from Google.

From their official "Google XML Reference":

"Provides a general rating of the relevance of the search result"

Where does this come from? It seems totally wacko, and yes, it is a mistake.

First of all, it is the XML document for the Google Search Appliance, not the general search API.

I found a very interesting document from Google called: "Google's Search Results Protocols", hosted by some guy that mirrors controversial and important documents "that is in danger of censorship".

And there it says:

Definition of RK: "Google's rating of how good a single search result is"

But check this:

In that same document it defines what is a "single search result".

And it says:

"R - A single search result - Contains a U; an optional T; an RK; any number of F's; an optional S; and a HAS"

That is the SERP XML!

Every SERP listing in the XML starts with an <R>.
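To make that structure concrete, here is a minimal sketch that reads the U, T and RK of every R out of a feed. The sample XML is invented from the definition quoted above: the wrapper element name, the URLs and the values are my assumptions, not a copy of a real Google response.

```python
import xml.etree.ElementTree as ET

# Invented sample (assumption): only the R, U, T and RK tags come from the
# quoted definition; the <results> wrapper and all values are made up.
SAMPLE = """<results>
  <R><U>http://example.com/a</U><T>Page A</T><RK>6</RK></R>
  <R><U>http://example.com/b</U><RK>3</RK></R>
</results>"""


def listings(xml_text):
    """Return one dict per <R> element with its URL, title and RK."""
    root = ET.fromstring(xml_text)
    rows = []
    for result in root.iter("R"):
        rows.append({
            "url": result.findtext("U"),
            # T is optional per the definition, so this may be None.
            "title": result.findtext("T"),
            "rk": int(result.findtext("RK")),
        })
    return rows


for row in listings(SAMPLE):
    print(row["rk"], row["url"], row["title"])
```

The second listing has no T on purpose, since the definition says the title is optional.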

The old definition of R as per that same document is:

"A single search result"

The new definition from Google XML Tag Definitions is:

"Provides encapsulation for the details of an individual search result"

So the person who wrote the new version of this document, now called "Google XML Reference" and earlier called "Google's Search Results Protocols", translated RK:

From:

"Google's rating of how good a single search result is"

To:

"Provides a general rating of the relevance of the search result"

Which is totally wrong; the person didn't see that there was a special definition for "single search result".

And this has caused headaches for SEOs ever since ...

A "single search result" is meant to be a listing in the SERP.

Which means that RK is:

"Google's rating of how good a listing in the SERP is"

Which is: PageRank!

To further prove the point:

Old version:

U - The URL of a single search result
T - The title of a single search result
RK - Google's rating of how good a single search result is

New version:

U - The URL of the search result
T - The title of the search result
RK - Provides a general rating of the relevance of the search result

The RK is a static value and has nothing to do with relevance, check for yourself.

What does RK stand for?

My theory:

It is Rank. Why?

The RK values show up on "&features=Rank".

PR is not an official Google abbreviation, it is something SEOs made up. They have PageRank and use the Rank part of the word, simple.

Checking the example above of Cutts' blog posts shows that the DCs that have a higher RK have a higher rank as well.

What we now know

There are 3 kinds of values.

If RK is not the internal PR, which I now believe it is, it must be something very close to it.

What we now can do with this if RK is live internal PR

And more

Questions remaining

Update of this article

I will update this article regularly as soon as I find more evidence and findings. Please also comment so I get more info.

The article is discussed on cre8asite forums here.

And the original discovery thread on WMW from message 151, here.

I think this is very interesting.

EDIT: Nice forum post by Matt Cutts :)

24 Feb 2006

About the Author Jim Westergren Jim Westergren is a Swedish web entrepreneur currently living in Spain. He is happily married and has three lovely children. Some of his interests are web development, SEO and writing.
He is the Founder of DomainStats and N.nu.