On May 5, 2024, an anonymous source shared thousands of internal Google Search API documents that Google had apparently published by accident.
The details are covered thoroughly in articles by SparkToro’s Rand Fishkin and iPullRank’s Mike King:
- https://sparktoro.com/blog/an-anonymous-source-shared-thousands-of-leaked-google-search-api-documents-with-me-everyone-in-seo-should-see-them/
- https://sparktoro.com/blog/11-min-video-the-google-api-leak-should-change-how-marketers-and-publishers-do-seo/
- https://ipullrank.com/google-algo-leak
Documents that were never meant to be public are now available online. Disclaimer: I’m not a technical SEO expert, nor do I have programming skills. However, I have a curious mind and nearly a decade of experience in web content management, which includes both recovering from penalties and earning them (who hasn’t?).
You’ll find my insights and sarcastic remarks here. I hope they prompt you to form your own conclusions and help you understand how this massive search engine operates.
Summary and 20 Insights
Off Page
- All new sites start in the “sandbox”. Be prepared for your site to remain practically invisible for a year or more. During this time, you’ll still need to add content regularly and secure high-quality backlinks, essentially investing without expecting traffic growth for a long while. Google also uses a “siteAuthority” parameter, although it has officially denied having one. How siteAuthority works in practice is unclear, so this interpretation is loose.
- Click-through rate (CTR) affects ranking. Make sure your snippets and meta tags are appealing because Google shows sites more frequently if users click on them more. Google even tracks mouse clicks on pages. But if users just read without converting, your page may disappear from results. Work on conversion rates and hope for a comeback.
- New businesses won’t show up in search results. To gain visibility, set up a site, optimize pages, and trigger engagement signals. How? Use ads, branding, outdoor ads, or any method you find suitable.
- SmallPersonalSite tag — researchers found a “smallPersonalSite” label for small personal sites, which doesn’t seem promising. Developing a personal blog? Well, hope you like the sandbox [sarcasm]. The blog Marketing Link was busy molding sandcastles in the sandbox for almost three years before hitting 10,000 monthly visitors.
- Ads boost page rank and engagement metrics. We knew this, but now it’s official. Giants like HubSpot and SEMrush advertise their blog posts to boost traffic. Even when clicks are cheap, this means a separate ad budget with little hope of conversions from these posts. This approach can be tough on small business budgets.
- Three link quality segments: Google categorizes links as low, medium, or high quality. This means that when you create a new page, you may find the traffic is like a candle flame: it flares up initially, only to fade out. At its peak, the page might reach several thousand visits per month, or it might top out at just eight visits before it starts to decline. Eventually, as the CTR drops, the page will gradually disappear from search results, no matter how many backlinks you add. Once again: focus on engagement metrics.
- Sitelinks appear based on user clicks. You can’t manually add pages to Sitelinks, but you can work on engagement and traffic for key pages.
- EWOK platform — Google uses this to collect good and bad signals about pages. What Google does with this data is still unclear. Conspiracy theories circulate online, suggesting that live reviewers might sit there, rating sites for search ranking in exchange for payment. I, of course, don’t believe this—the reputational risks for Google would be too high, especially given the millions of dollars it already pays in various fines for lesser infractions.
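The CTR point in the list above comes down to simple arithmetic. Here is a minimal sketch, assuming the standard definition of CTR as clicks divided by impressions. The function name and the sample numbers are hypothetical and do not come from the leaked documents; the sketch only shows how you might compare two versions of a snippet.

```python
def click_through_rate(clicks: int, impressions: int) -> float:
    """Return CTR as a fraction (clicks / impressions).

    Returns 0.0 when there are no impressions, to avoid dividing by zero.
    """
    if clicks < 0 or impressions < 0:
        raise ValueError("clicks and impressions must be non-negative")
    if impressions == 0:
        return 0.0
    return clicks / impressions


# Hypothetical numbers for two versions of the same snippet.
snippets = {
    "old title": (30, 1000),  # (clicks, impressions)
    "new title": (55, 1000),
}

for name, (clicks, impressions) in snippets.items():
    print(f"{name}: CTR = {click_through_rate(clicks, impressions):.1%}")
```

Comparing CTR before and after a title or description change is one way to check whether a rewrite made the snippet more appealing.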
On Page
- Subdomains aren’t ranked separately — they’re grouped with main domains. If subdomains have unoptimized or irrelevant content, consider removing or optimizing them.
- Domain age matters (hostAge). The older the domain, the better, but age alone isn’t enough: it should be a well-established, legitimate domain with quality backlinks, not one that sat unused for years or hosted a sketchy site like a casino or adult content. So if you’re buying a dropped domain, check its history to make sure it’s clean: no blackjack and hookers.
- Google favors “whitelisted” sites. If you’re creating content for a particular keyword and a “white” site already has a page on that topic, you won’t get much traction. What to do? First, copying is bad—okay, that was a joke. No joke: don’t try to compete on high-volume keywords against giants; aim for smaller wins until you “dig out of the sandbox” or maybe get bought by Forbes.
- Of these factors (content length, CTR, keyword density, keywords), Google gives the most weight to CTR. An unexpected twist.
- EEAT isn’t as frightening as it seemed. What to do? Improve behavioral factors. Google “Your Business +Keyword” and scroll all the way to page 10 if needed until you find and click your site. If this happens often enough, from different IPs, Google will eventually show you for these queries. But that’s not a guarantee—just like everything in life.
- Meta tags matter. Google scores titles with titlematchScore. If you haven’t set up a Description, that’s bad news—no matter how hard you work on infographics or build backlinks to that page. Leaked documents say nothing about title length, so if your Titles and Descriptions are on the longer side, it won’t hurt your page rank. But there’s no proof it’ll help either.
- Google also pays attention to the page creation date. Any attempts to “tweak” it, like changing the date in the URL, won’t get you anywhere. So if you’re updating a publication, use two dates: “published” and “updated.”
- OriginalContentScore and token diversity. Google counts tokens and calculates the ratio of unique tokens to total words. In simpler terms: write with style, use synonyms and metaphors, and make your comparisons as surprising as a cupcake at a Christmas dinner. Google may not fully understand original content, but it does assess vocabulary variety. Find a copywriter who can write more than generic ChatGPT content and really bring the text to life, making it as enjoyable as chewing on honeycomb.
- chromeInTotal metric for user clicks. All user clicks are analyzed with the chromeInTotal parameter. The more a visitor interacts with your content—leaves contact info, highlights text—the better. A friend in SEO joked, “Well, let’s add more pop-ups so people will click.” It’s a risky suggestion, as it could trigger penalties for hidden content. A “whiter-hat” method is to add a chat tool; we use HubSpot, for instance. When users engage, chromeInTotal scores go up.
- PageRank and anchors aren’t as important as H1, H2, and meta tags. Good news: it’s cheaper to optimize on-page content than to compete in backlink bidding wars. Also, remember best practices: headers should be in the correct font size. For example, if your main text is Arial 12, H2 should be at least Arial 18 or bigger. Though you could joke that bigger isn’t always better. Page structure and style now matter more than EEAT—form has won over substance, unfortunately.
- Google hasn’t created an all-encompassing mystery algorithm, just “signals” for ranking. Maybe not just signals, but still far from a comprehensive algorithm.
- And Google will deny all of this, just like it has done for years.
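The token-diversity idea from the OriginalContentScore point above can be illustrated with a few lines of Python. This is only a sketch under stated assumptions: it splits text into lowercase word tokens with a simple regular expression and divides unique tokens by total tokens. How Google actually tokenizes or scores text is not known, and the function name `token_diversity` is hypothetical.

```python
import re


def token_diversity(text: str) -> float:
    """Ratio of unique tokens to total tokens (0.0 for empty text).

    Assumption: a token is a lowercase run of letters, digits, or apostrophes.
    """
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


# Hypothetical sample texts: one repetitive, one varied.
repetitive = "best seo tips best seo tips best seo tips"
varied = "write with style and surprise your readers"

print(f"repetitive: {token_diversity(repetitive):.2f}")
print(f"varied: {token_diversity(varied):.2f}")
```

The repetitive sample reuses the same three words, so its ratio is low, while the varied sample never repeats a word and scores the maximum. Real text falls in between, and longer texts naturally repeat common words, so the ratio is only meaningful when comparing texts of similar length.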
How Search Algorithms Work
Navboost (also known as Glue) is a ranking system that has been around since 2005. It filters clicks by categorizing them as either good or bad. For instance, if a user enters a query but isn’t satisfied with a result and immediately clicks back, Google flags that click as “bad” and stops suggesting the result.
My thoughts out loud: how many monkeys would you need to type in, say, “Your Business,” then instantly hit back, so that eventually your business name stops appearing in results altogether? Or, another example: how many monkeys would it take to constantly search for “Your Business + specialty (country, business)” and click on a site (either promoting or tarnishing your brand) optimized for that query, so that Navboost counts those clicks as “good” and ties the query to your site, not a competitor’s? Likely several times as many. Sound familiar? Well, it looks like monkey farms will thrive, all in the name of boosting or sinking your brand.
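To make the good/bad click idea concrete, here is a toy sketch. It is not Navboost: it assumes, purely for illustration, that a click counts as “bad” when the user returns to the results page within a few seconds. The `Click` type, the `QUICK_BACK_SECONDS` threshold, and the function names are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical threshold: returning to the results faster than this is a "bad" click.
QUICK_BACK_SECONDS = 5.0


@dataclass
class Click:
    query: str
    url: str
    dwell_seconds: float  # time on the page before going back to the results


def label_click(click: Click) -> str:
    """Label a single click as 'good' or 'bad' based on dwell time alone."""
    return "bad" if click.dwell_seconds < QUICK_BACK_SECONDS else "good"


def tally(clicks: list[Click]) -> dict[tuple[str, str], dict[str, int]]:
    """Count good and bad clicks for each (query, url) pair."""
    counts: dict[tuple[str, str], dict[str, int]] = {}
    for click in clicks:
        per_pair = counts.setdefault((click.query, click.url), {"good": 0, "bad": 0})
        per_pair[label_click(click)] += 1
    return counts


# Hypothetical click log.
log = [
    Click("your business", "example.com", 2.0),
    Click("your business", "example.com", 48.0),
    Click("your business", "example.com", 1.5),
]
print(tally(log))
```

Even in this toy version, the outcome depends entirely on aggregate counts per query and page, which is why coordinated clicking in either direction is the obvious way to try to game such a signal.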
Conclusion
We thought we could forget about behavior metrics after 2014, but here we are.
Content is king only if the king himself creates it. In other words, new sites go straight to the sandbox. Larger sites rank better: even if your article was written by a Nobel laureate, a Forbes writer’s article will outrank it.
Don’t lose hope: the journey is tough, but it’s worth it. If you give up, your competitors won’t.
Rand Fishkin’s top advice: “Create a recognizable, popular brand in your field outside of Google Search.”