Over the last five years, the concept of “machine learning” has entered the mainstream. Today the term is a lightning rod for discussion, particularly in the world of search marketing. Over the same period, interest in machine learning, as measured by Google Trends, has risen sharply.
Introduction to Machine Learning
Though the term is often used interchangeably with artificial intelligence (AI), machine learning is more accurately described as a subset of AI.
Hollywood representations of artificial intelligence, through creations like the Terminator and C-3PO, largely shape society’s impression of AI’s capabilities. However, “self-aware” machines capable of using reason and logic are still well beyond what existing AI can do today.
In 1950, a 13-foot-tall computing system dubbed “Bertie the Brain” proved unbeatable at Tic-Tac-Toe. The machine squared off against human competition at the Canadian National Exhibition. By the late 1990s, IBM’s “Deep Blue” beat the chess grandmaster Garry Kasparov using brute-force computation.
Machine learning goes a step further, creating algorithms that evolve on their own over time based on the data they acquire. Instead of following a hand-coded set of rules for a task, a machine learning system is fed large volumes of data and relies on algorithms that let it learn how to perform the task.
Product design and architecture are two industries already recognizing the value of machine learning. For example, robots developed by MX3D will soon design, print, and autonomously construct a bridge over an Amsterdam canal.
Applying the same machine learning concepts to search, Google’s RankBrain offers a preview of the possible, or perhaps probable, future of SEO.
Introduction to Deep Learning & Google’s RankBrain
In 2012, researcher Andrew Ng founded the Google Brain project. With access to significantly more computing power than had ever been used in machine learning, this research project resulted in what today is described as “deep learning.”
A 2015 profile of Ng by The Huffington Post highlights a key accomplishment of the original Google Brain program.
“Delightfully, one of its most important achievements came when computers analyzing scores of YouTube screenshots were able to recognize a cat. As Ng explained, ‘The remarkable thing was that [the system] had discovered the concept of a cat itself. No one had ever told it what a cat is. That was a milestone in machine learning.’”
Though Ng departed Google for rival Baidu in early 2014, Google’s research into the applications of machine learning continued. Today, users of Google Analytics 360 Suite can speak questions such as “how many visitors came to my site from organic search last month” to their browser and get answers thanks to machine learning.
— Danny Sullivan (@dannysullivan) May 24, 2016
But most notable for anyone in the field of search marketing was the announcement of RankBrain in late October 2015. As Bloomberg reported at the time:
“RankBrain uses artificial intelligence to embed vast amounts of written language into mathematical entities — called vectors — that the computer can understand. If RankBrain sees a word or phrase it isn’t familiar with, the machine can make a guess as to what words or phrases might have a similar meaning and filter the result accordingly, making it more effective at handling never-before-seen search queries.”
Following the announcement, attention quickly turned to a patent granted to Google in August of 2015. U.S. Patent #9104750 goes into great detail about a process involving “substitution data” and queries of “at least three sequential” queries. Many speculate that this patent is directly related to RankBrain.
More recently, Gary Illyes of Google shed light on the current uses and capabilities of RankBrain in an interview with Barry Schwartz and Danny Sullivan of Search Engine Land. In the discussion, Illyes confirmed that RankBrain does in fact use machine learning to substitute unrecognized query data. The unrecognized data is replaced with more common terms, in an attempt to autonomously determine a searcher’s intent.
In other words, if a user enters a lengthy, never-before-seen query into Google, the algorithm will attempt to understand the context of the search. When possible, through a complex pattern of substitution, Google will serve results for a more commonly searched query in which it has a greater degree of confidence.
For example, suppose I enter the following search into Google, “What is the drama from Brady vs Raiders playoffs where he fumbled?”
The first three results all prominently feature the phrase “Tuck Rule,” despite the fact that I used neither of those words in my search query. Clearly, Google has formed an association between searches that include the words “Brady” and “Raiders” and the infamous “Tuck Rule” game.
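To make the idea of “vectors” and substitution more concrete, here is a minimal sketch in Python. It is not how RankBrain is implemented, since Google has not published that. The three-number vectors, the phrases, and the function names are all made up for illustration. The sketch only shows the general technique of picking the known phrase whose vector points in the most similar direction to that of an unfamiliar query.

```python
import math

# Hypothetical toy vectors, invented for this example. Real systems learn
# vectors with hundreds of dimensions from large volumes of text.
TOY_VECTORS = {
    "tuck rule game": [0.9, 0.8, 0.1],
    "deflategate": [0.7, 0.1, 0.9],
    "super bowl halftime show": [0.1, 0.2, 0.3],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def closest_known_query(query_vector, known=TOY_VECTORS):
    """Return the known phrase whose vector is most similar to query_vector."""
    return max(known, key=lambda phrase: cosine_similarity(query_vector, known[phrase]))


# Pretend this vector represents the long Brady vs. Raiders query above.
unfamiliar_query_vector = [0.85, 0.75, 0.2]
print(closest_known_query(unfamiliar_query_vector))
```

With these invented numbers, the unfamiliar query lands closest to “tuck rule game,” which mirrors the kind of substitution described above.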
Optimizing for RankBrain: A Case Study
Assume Google substitutes data in long-tail search queries to retrieve search results with a higher degree of confidence in quality. How might we then optimize for that?
One idea would be to take a close look at what Google is providing through its autocomplete functionality for various keywords. Another is to scan the “suggested search terms” at the bottom of the results page for a long-tail search query. Oftentimes this indicates what Google takes the intent of the query to be, and it offers clues to the most common phrasing of that query.
Another technique is to take a deep dive into the search query data provided by Webmaster Tools. Look in particular for long-tail queries generating an unusually high number of impressions. These may be terms featured frequently in auto-complete suggestions, or terms that show up commonly in the suggested search results.
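As a rough sketch of that filtering step, the Python below scans exported query data for long queries with a high impression count. The field names (`query`, `impressions`), the thresholds, and the sample rows are all assumptions made up for illustration. They are not the Webmaster Tools export format, so adjust them to whatever your export actually contains.

```python
def long_tail_candidates(rows, min_words=6, min_impressions=100):
    """Return long queries with many impressions, highest impressions first.

    rows is a list of dicts with hypothetical keys "query" and "impressions".
    """
    candidates = [
        row
        for row in rows
        if len(row["query"].split()) >= min_words
        and row["impressions"] >= min_impressions
    ]
    return sorted(candidates, key=lambda row: row["impressions"], reverse=True)


# Made-up sample data, not real client numbers.
sample_rows = [
    {"query": "id checking guide", "impressions": 900},
    {"query": "how to verify an out of state drivers license", "impressions": 320},
    {"query": "what ids are accepted at a notary public office", "impressions": 12},
]

for row in long_tail_candidates(sample_rows):
    print(row["impressions"], row["query"])
```

The short head term is skipped because it is not long-tail, and the rarely seen query is skipped because its impressions are low. What remains is the kind of query worth checking against auto-complete and suggested search.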
For one client specializing in guide books for verifying government-issued forms of identification, I noticed an abnormally high volume of impressions for the phrase, “Which forms of ID should be verified with an ID Checking Guide?”
In testing, this 12-word query seems to get a boost from both auto-complete and suggested search. For example, by simply typing in “which forms of ID s…”, you can see that the full phrase appears via auto-complete.
Additionally, try typing in a relevant but restructured query like: “What kinds of identification do i need to check with guide?” This test will likely produce the original phrase as one of the “suggested search” results.
Looking at the data over the 28-day period from July 14th to August 10th, phrases containing the words “which forms of ID…” generated 10 clicks, several hundred impressions, and ranked in position 6.3 on average. The impressions and clicks were spread across 7 different pages, none of which truly answered the question in the target query directly.
As such, a new page was drafted and published on August 10th. The intent was to comprehensively answer the question: Which forms of ID should be verified with an id-checking guide?
The result? Over the 28-day period from September 21st to October 18th, Webmaster Tools shows that queries including the words “which forms of ID…” generated 138 clicks from just 216 impressions. The average position of 1.4 in the organic results led to an astounding 63.89% CTR. All of the activity comes from the newly developed page.
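For anyone who wants to check the arithmetic, click-through rate is simply clicks divided by impressions. The short Python sketch below reproduces the figure above. The function name is my own and is not part of any tool.

```python
def click_through_rate(clicks, impressions):
    """Return CTR as a percentage; 0.0 when there are no impressions."""
    if impressions == 0:
        return 0.0
    return 100.0 * clicks / impressions


# Figures from the September 21st to October 18th period above.
print(f"{click_through_rate(138, 216):.2f}%")  # prints 63.89%
```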
Disclaimer: I cannot say with total certainty that this case study is directly related to RankBrain. However, the idea is to identify patterns in the search queries generating impressions for your site or your client’s site. By attempting to determine what searchers are looking for, and how Google is most comfortable phrasing that query (i.e. how RankBrain would), you may uncover significant opportunities: new content that can rank extremely well for high-traffic, low-competition search terms.
The web is a constantly evolving ecosystem, and the only constant in search marketing is change. As Google and others invest more resources into machine learning and understanding the intent of web searchers, it is imperative to stay ahead of the curve.
Just as Google’s RankBrain algorithm continues to teach itself, search marketers must also continue to educate themselves. Probing how web users find your site and how they structure their searches is key. The result is a fuller understanding of how to optimize relevant products, services, and content.