Posted by AndrewMiller Too often, we assume that SEO best practices will work in any industry against any competitive set. But most best practices go untested and may not be “best” in every situation. We all know that tactics that worked in 2020 won’t necessarily move the needle in 2021 as Core Web Vitals (CWV) and other signals shuffle to the front. We have to do better for our businesses and our clients. I’m a data nerd at heart with lots of battle scars from 15 years in SEO. The idea of analyzing thousands of local SERPs sounded like too much fun to pass up. I found some surprising correlations, and just as importantly, built a methodology and data set that can be updated quarterly to show changes over time. I analyzed 50,000+ SERPs in the retail banking sector so I could make sense of the massive shifts in rankings and search behaviors during the lockdown period. We have a lot of historical data for bank websites, so comparing pre/post COVID data would be easier than starting from scratch. I’ll share how I did it below. But first, I want to share WHY I think sharing this type of research is so important for the SEO community. Why validate SEO best practices with data?It’s a great time to be an SEO. We have amazing tools and can gather more data than ever. We have thriving communities and excellent basic training materials. Yet, we often see our craft distilled into overly-simplified “best practices” that are assumed to be universally true. But if there’s one universal truth in SEO, it’s that there are no universal truths. A best practice can be misinterpreted or outdated, leading to missed opportunities or outright harm to a business. Using the increasing importance of CWV as an example, SEOs have an opportunity (and obligation) to separate fact from fiction. We need to know if, and by how much, CWV will impact rankings over time so we can prioritize our efforts. We can elevate our SEO game individually and collectively by testing and validating best practices with research. It just takes a curious mind, the right tools, and a willingness to accept the results rather than force a narrative. Failing to validate best practices is a liability for SEO practitioners and shows an unwillingness to challenge assumptions. In my experience, a lack of data can lead to a senior stakeholders’ opinions carrying more weight than an SEO expert’s recommendations. Start by asking the right questionsReal insight comes from combining data from multiple sources to answer critical questions and ensure your strategies are backed by valid data. In my analysis of local banks, I started by listing the questions I wanted to know the answers to:
You could run this same analysis by replacing “banks” with other local business categories. The list of potential questions is endless so you can adjust them based on your needs. Here’s an important reminder - be prepared to accept the answers even if they are inconclusive or contradictory to your assumptions. Data-driven SEOs have to avoid confirmation bias if we’re going to remain objective. Here’s how I analyzed 50,000 search results in a few hoursI combined three of my favorite tools to analyze SERPs at scale and gather the data needed to answer my questions:
Step 1: Determine your data needsI used US Census Bureau data to identify all cities with populations over 100,000, because I wanted a representation of local bank SERPs across the country. My list ended up including 314 separate cities, but you could customize your list to suit your needs. I also wanted to gather data for desktop and mobile searches to compare SERP differences between the device types. Step 2: Identify your keywordsI chose “banks near me” and “banks in {city, st}” based on their strong local intent and high search volumes, compared to more specific keywords for banking services. Step 3: Generate a STAT import file in .csv formatOnce you have your keywords and market list, it’s time to prepare the bulk upload for STAT. Use the template provided in the link to create a .csv file with the following fields:
Each market, location, and device type will multiply the number of keywords you must track. I ended up with 1,256 keywords (314 markets X 2 keywords X 2 devices) in my import file. Once your file is complete, you can import to STAT and begin tracking. Step 4: Run a Top 20 Report in STAT for all keywordsSTAT’s built-in Google SERP Top 20 Comparison report captures the top 20 organic results from each SERP at different intervals (daily, weekly, monthly, etc.) to look at changes over time. I did not need daily data so I simply let it run on two consecutive days and removed the data I did not need. I re-run the same report quarterly to track changes over time. Watch the video below to learn how to set up this report! My 1,256 keywords generated over 25,000 rows of data per day. Each row is a different organic listing and includes the keyword, monthly search volume, rank (includes the local pack), base rank (does not include the local pack), https/http protocol of the ranking URL, the ranking URL, and your tags. Here’s an example of the raw output in CSV format: It’s easy to see how useful this data is by itself but it becomes even more powerful when we clean it up and start crawling the ranking URLs. Step 5: Clean up and normalize your STAT URLs dataAt this point you may have invested 1-2 hours in gathering the initial data. This step is a bit more time consuming, but data cleansing allows you to run more advanced analysis and uncover more useful insights in Screaming Frog. Here are the changes I made to the STAT rankings data to prepare for the next steps in Screaming Frog and Power BI. You’ll end up with multiple columns of URLs. Each serves a purpose later.
The 25,000+ rows in my data set are well within Excel’s limitations, so I am able to manipulate the data easily in one place. You may need to use a database (I like BigQuery) as your data sets grow. Step 6: Categorize your SERP results by website typeSkimming through the SERP results, it’s easy to see that banks are not the only type of website that rank for keywords with local search intent. Since one of my initial questions was SERP composition, I had to identify all of the different types of websites and label each one for further analysis. This step is by far the most time consuming and insightful. I spent 3 hours categorizing the initial batch of 25,000+ URLs into one of the following categories:
Your local SERPs will likely contain many of these website types and other unrelated categories such as food banks. Speed up the process by sorting and filtering your TLD and Normalized URL columns to categorize multiple rows simultaneously. For example, all the yelp.com rankings can be categorized as “Reviews” with a quick copy/paste. At this point, your rankings data set is complete and you are ready to begin crawling the top-ranking sites in your industry to see what they have in common. Step 7: Crawl your target websites with Screaming FrogMy initial STAT data identified over 6,600 unique pages from local bank websites that ranked in the top 20 organic search results. This is far too many pages to evaluate manually. Enter Screaming Frog, a crawler that mimics Google’s web crawler and extracts tons of SEO data from websites. I configured Screaming Frog to crawl each of the 6,600 ranking pages for a larger analysis of characteristics shared by top-ranking bank websites. Don’t just let SF loose though. Be sure to configure it properly to save time and avoid crawling unnecessary pages. These settings ensure we’ll get all the info we need to answer our questions in one crawl: List Mode: Paste in a de-duplicated Full URL list from your STAT data. In my case, this was 6,600+ URLs. Database Storage Mode: It may be a bit slower than Memory (RAM) Storage, but saving your crawl results on your hard disk ensures you won’t lose your results if you make a mistake (like I have many times) and close your report before you finish analyzing the data. Limit Crawl Depth: Set this to 0 (zero) so the spider will only crawl the URLs on your list without following internal links to other pages on those domains. APIs: I highly recommend using the Pagespeed Insights Integration to pull Lighthouse speed metrics directly into your crawl data. If you have a Moz account with API access, you can also pull link and domain data from the Moz API with the built-in integration. Once you have configured the spider, let it rip! It could take several minutes to several hours depending on how many URLs you’re crawling and your computer’s speed and memory constraints. Just be patient! You might try running larger crawls overnight or on an extra computer to avoid bogging your primary machine down. Step 8: Export your Screaming Frog crawl data to ExcelDumping your crawl data into Excel is remarkably easy. Step 9: Join your data sets in Power BIAt this point, you should have two data sources in Excel: one for your STAT rankings data and another for your Screaming Frog crawl data. Our goal is to combine the two data sources to see how organic search rank may be influenced by on-page SEO elements and site performance. To do this, we must first merge the data. If you have access to a Windows PC, the free version of Power BI is powerful enough to get you started. Begin by loading your two data sources into a new project using the Get Data wizard. Once your data sets are loaded, it’s time to make the magic happen by creating relationships in your data to unlock correlations between rankings and site characteristics. To combine your data in Power BI, create a many-to-many relationship between your STAT Full URL and Screaming Frog Original URL fields. If you are new to BI tools and data visualization, don’t worry! There are lots of helpful tutorials and videos just a quick search away. At this point, it’s really hard to break anything and you can experiment with lots of ways to analyze your data and share insights with many types of charts and graphs. I should note that Power BI is my preferred data visualization tool but you may be able to use Tableau or some equally powerful. Google Data Studio was not an option for this analysis because it only allows for left outer joins of the multiple data sources and does not support “many-to-many” relationships. It’s a technical way of saying Data Studio isn’t flexible enough to create the data relationships that we need. Step 10: Analyze and visualize!Power BI’s built-in visualizations allow you to quickly summarize and present data. This is where we can start analyzing the data to answer the questions we came up with earlier. Results — what did we learn?Here are a couple examples of the insights gleaned from merging our rankings and crawl data. Spoiler alert — CWV doesn’t strongly impact organic rankings….yet! Who are banks actually competing against in the SERPs? Is it primarily other banks?On desktops, about 67% of organic search results belong to financial institutions (banks and credit unions) with heavy competition from reviews sites (7%) and online directories (22%). This information helps shape our SEO strategies for banks by exposing opportunities to monitor and maintain listings in relevant directories and reviews sites. Okay, now let’s mash up our data sources to see how the distribution of website categories varies by rank on desktop devices. Suddenly, we can see that financial institutions actually occupy the majority of the top 3 results while reviews sites and directories are more prevalent in positions 4-10. How important are Core Web Vitals (CWV) for rankings? How does this change over time?Site performance and site speed are hot topics in SEO and will only become more important as CWV becomes a ranking signal in May this year. We can begin to understand the relationships between site speed and rankings by comparing STAT rankings and Pagespeed Insights data from Screaming Frog reports. As of January 2021, sites with higher Lighthouse Performance Scores (i.e. they load faster) tend to rank better than sites with lower scores. This could help justify investments in site speed and site performance. Some CWV elements correlate more closely with better rankings and others are more scattered. This isn’t to say CWV aren’t important or meaningful, but rather it’s a starting point for further analysis after May. So what? What can we learn from this type of analysis?Separately, STAT and Screaming Frog are incredibly powerful SEO tools. The data they provide are useful if you happen to be an SEO but the ability to merge data and extract relationships will multiply your value in any organization that values data, and acts on insights. Besides validating some generally accepted SEO knowledge with data (“faster sites are rewarded with better rankings”), better use of relational data can also help us avoid spending valuable time on less important tactics (“improve Cumulative Layout Shift at all costs!”). Of course, correlation does not imply causation, and aggregated data does not guarantee an outcome for individual sites. But if you’re a bank marketing professional responsible for customer acquisition from organic channels, you’ll need to bring this type of data to your stakeholders to justify increased investments in SEO. By sharing the tools and methodology, I hope others will take it further by building and contributing their additional findings to the SEO community. What other datasets can we combine to deepen our understanding of SERPs on a larger scale? Let me know your thoughts in the comments! Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don't have time to hunt down but want to read! from The Moz Blog https://ift.tt/2OlUWJ1 More on
0 Comments
Leave a Reply. |
Authorhttps://seouk4.weebly.com/ Archives
September 2021
Categories |