Sooo Many Irrelevant Pages - Process?

Digitaldar

Member
Joined
Dec 8, 2014
Messages
132
I am working with a client that has a ton of content. I did an audit to identify content "that has not seen the light of day" in quite some time - e.g. low pageviews over the past year :)

There were too many URLs to process independently (I figured if we could group them by topic, then we could work with topic themes)... but I am finding that the queries associated with many of these URLs are not relevant.... so grouping is proving challenging - also the URLs are not telltale of the topic so I have to look up each URL in Google Search Console - very time consuming.

My overall goal is to tell another team what to do with these URLs - they can work at them over the course of a year, no problem. But what to do with each one differs. The content is not bad... it likely just needs to be integrated into another topic - e.g. there is an article about the future of sales... it talks about creating a sense of community.... this is not something people search for... but that content could be moved into content about retaining customers... this is difficult to "procedurize" for non SEOs as well...

My brain is tired and it could be that I am missing something obvious here :) If I were to hand you 200 URLs from your site that have low pageviews... what would you do next? I think that the team would benefit from having all my data put into a database so that it could be visually displayed better and they could work on it. :)
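If it helps, the grouping step described above (one theme per URL, driven by its top query) can be sketched in a few lines of Python. The rows below stand in for a hypothetical Search Console export of (url, query, clicks) — not real data:

```python
from collections import defaultdict

# Hypothetical rows from a GSC "Pages + Queries" export: (url, query, clicks).
rows = [
    ("/future-of-sales", "sense of community", 3),
    ("/future-of-sales", "future of selling", 1),
    ("/keep-customers", "customer retention", 40),
    ("/loyalty-tips", "customer retention", 12),
]

# For each URL, keep its single highest-click query...
top_query = {}
for url, query, clicks in rows:
    if url not in top_query or clicks > top_query[url][1]:
        top_query[url] = (query, clicks)

# ...then group URLs that share a top query, so the other team can
# review a whole theme at once instead of URL by URL.
themes = defaultdict(list)
for url, (query, _) in top_query.items():
    themes[query].append(url)
```

This won't catch pages whose queries are off-topic (the "future of sales" problem above), but it shrinks the pile that needs a manual look.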
 

Beris

Member
Joined
Mar 28, 2019
Messages
6
I struggle with this all the time... All of the spreadsheets and number-crunching only go so far.

I only have part of a process:
  • Nuke all outdated content (e.g. "We're doing this event in January 2017!")
  • Nuke content under XX words (some say 200, others higher)
  • Next, I look at Bounce rates (assuming posts got traffic at some time).
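The first two cuts above are easy to script. A minimal sketch, assuming hypothetical audit rows of (url, word count, date of last session, bounce rate):

```python
from datetime import date

# Hypothetical audit rows: (url, word_count, last_session_date, bounce_rate).
pages = [
    ("/event-jan-2017", 850, date(2017, 2, 1), 0.92),
    ("/thin-note", 120, date(2024, 5, 1), 0.40),
    ("/evergreen-guide", 1400, date(2024, 6, 3), 0.55),
]

MIN_WORDS = 200               # "some say 200, others higher"
STALE_BEFORE = date(2023, 1, 1)

# Nuke candidates: thin content, or nothing recent enough to matter.
nuke = [url for url, words, last, _ in pages
        if words < MIN_WORDS or last < STALE_BEFORE]

# Everything else moves to the bounce-rate review step.
review = [(url, bounce) for url, words, last, bounce in pages
          if url not in nuke]
```

The thresholds are placeholders — the word-count floor especially is a judgment call, as noted above.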

From there it bogs down. I try to go by client's current content need.

  • A tool like Screaming Frog can give you a hint about what the page content might be (depending on how well the post creators handled, or understood, H1s, metas, etc.).
  • If the site has a search box, I use it to find key terms. Sometimes it helps.

    The hard part is letting go of all those words (or getting the client to let go).

    Please see my DM
 

pony

Member
Joined
Jun 26, 2019
Messages
82
If it's only 200 URLs I would do something similar to what Beris suggested:

1.) Analyze traffic/rankings/backlinks etc to the URLs to create an understanding of how valued each page is

2.) Individually go through the list to determine whether the content should:
a.) be redirected to a related piece of content already on the site
b.) be updated so that it targets a topic worth competing for
c.) be combined with a separate piece of content so as not to create cannibalizing pieces of content on the same site
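Step 2's triage can even be roughed out as a function for the other team, though the thresholds here are guesses and the final call on each URL is still manual:

```python
# Hypothetical per-URL inputs: organic sessions over 12 months, referring
# domains, and whether a stronger page on the site targets the same topic.
def decide(sessions: int, ref_domains: int, has_stronger_sibling: bool) -> str:
    """Rough triage following steps 2a-2c; thresholds are illustrative."""
    if has_stronger_sibling:
        return "combine"      # 2c: merge to avoid cannibalization
    if sessions < 10 and ref_domains == 0:
        return "redirect"     # 2a: fold into a related existing page
    return "update"           # 2b: retarget a topic worth competing for
```

A "decision" column like this gives the non-SEO team a starting point they can override case by case.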

We just did a similar practice with a site that had thousands of URLs that needed to be migrated to another site after an acquisition.
 

brettmandoes

Member
Joined
Nov 2, 2018
Messages
67
Can you help me understand a couple things? Are we talking about only 200 URLs, or is it more like 200,000 URLs? I have different processes for each.

And two more questions:
1. Is there an ecommerce component to this?
2. Is any of the content auto-generated or parameterized?
 

brettmandoes

Member
Joined
Nov 2, 2018
Messages
67
Thank you, that simplifies things. I think 2200 is easily doable. Organize a spreadsheet with these columns: the URL, all sessions, organic entrances, word count, topic.

First, filter out anything that doesn't meet your threshold for each column. The rest is business critical content that you need to pay attention to. Copy that into a new sheet, and begin filling in the topic column. That part is manual. For some pages you may find having two or three topic columns is preferable. At the end of this you'll have content groupings that help inform how your new URLs should be structured with all the business critical content.
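A minimal sketch of that threshold filter, assuming a hypothetical CSV export with the columns named above (the numbers are made up):

```python
import csv
import io

# Hypothetical export: url, sessions, organic entrances, word count.
data = """url,sessions,organic_entrances,word_count
/pricing,5200,3100,900
/old-webinar,14,2,350
/guide-retention,880,610,2100
"""

# Per-column floors; anything meeting all of them is business critical.
THRESHOLDS = {"sessions": 50, "organic_entrances": 25, "word_count": 400}

keep, cut = [], []
for row in csv.DictReader(io.StringIO(data)):
    ok = all(int(row[col]) >= floor for col, floor in THRESHOLDS.items())
    (keep if ok else cut).append(row["url"])
```

The `keep` list is what gets copied to the new sheet for manual topic tagging; `cut` is the low-value pile.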

After that you break each topic into its own spreadsheet and conduct another manual content audit, this time deciding what to do with each URL. Mark a new column "decision" or something similar. This might be merge, rewrite, thin out, needs images, etc. A new information architecture will be born at this point. Business inputs are just as important here, not just SEO considerations.

From there just follow standard SEO best practices. Ensure top level pages are in navigation and deeper pages have solid internal linking to back them up, etc.
 

ClickThrive

Member
Joined
Sep 11, 2019
Messages
3
I just removed some 'zombie pages' for a current client with about 2k pages. Here is my process:
  • First I backed up the site so I can always grab the content I'm about to delete.
  • I added the sitemap URLs to a Google spreadsheet and extracted the slug from each page.
  • Skimmed the list and removed all pages I have no-followed (like thank-you pages), as well as important pages I wanted to keep, like the homepage and new pages.
  • I went into Google Analytics and grabbed pages that had traffic in the last 3 months. (Why 3 months? My personal choice. I think it would be a good start, and continue to slim over time.)
  • Extracted slugs from the pages w/traffic.
  • Placed all slugs into the same column and removed duplicates. This should leave you with pages with NO traffic in the past 3 months.
  • Re-skimmed the list and removed any pages I wanted to keep.
  • Grouped the pages into categories and 301'd them to relevant pages. This can take a while depending on how relevant you want your 301s to be.
  • deleted the pages.
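The slug-matching steps above boil down to a set difference. A minimal sketch with made-up slugs:

```python
# Hypothetical slug lists: everything in the sitemap, slugs that had at
# least one GA session in the last 3 months, and manual keepers from
# the re-skim step.
sitemap_slugs = {"/about", "/old-promo", "/zombie-post", "/services"}
traffic_slugs = {"/about", "/services"}
keep_anyway = {"/old-promo"}

# "Same column, remove duplicates" is just a set difference:
zombies = sitemap_slugs - traffic_slugs - keep_anyway
```

Doing it as sets also guards against the duplicate-slug mistakes that creep in when this is done by hand in a spreadsheet.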
Like Rich Ownings said: don't forget to check for backlinks! I don't believe any pages had backlinks, but I am relying on an Ahrefs site audit to let me know if I done messed up.

Would like to hear any opinions on this tactic. It's been a week and I've already seen improvements in some keywords.
 

BipperMedia

Member
Joined
Jan 13, 2019
Messages
142
I haven't seen anyone mention the use of site query yet... and I apologize if someone hinted at it and I may have missed it...

We refer to this clean up / optimization process as defragmenting (cleaning up, optimizing) an otherwise fragmented environment (your website, large scale of redundant pages).

I like the analogy (and this is going to give away my age a bit here...) of defragmenting a Windows XP computer. Do you remember the defragmentation tool? You could go in there, run a defrag process, and supposedly this would reorganize and optimize your entire hard drive -- not sure how effective it was... but I think the analogy is relevant!

The process goes something like this:

First, we use a combination of a Google sheet for logging all of our data, and second we start with using Google itself as our research tool.

1) we identify all of the main top level keyword phrases / top level topics / top category phrases, etc... the main keyword phrases relevant to the business, products, and services

2) we start with site querying the domain for these top level keyword phrases

By site querying the domain, we are now using Google as a research tool to tell us what Google sees as the highest authority pages within the site for any given query

Also, we make sure the search settings are set to show 100 results (the first 10 pages' worth) on a single page... this will become more important in the later steps.

3) We look at the top ranked page within the site query results to determine if it's the most relevant page to label our "top category page" for that query (product, service, etc...)

If not, then we continue looking down through the results of the site query to identify the most relevant page.

4) Once the top level page is identified, we add it as the top level page in our sheet

5) then we use "cmd + f" (Mac) or "ctrl + f" (PC) to search the search results page... we simply type the top level keyword phrase into the little search bar, and this will highlight our keyword / topic within all of the pages in the search results of the site query

6) we start by identifying all of the pages that have the keyword phrase in the title (meta title) within the search results, and then we start logging these URLs on our sheet.

We refer to these URLs as "redundant" pages because they are... well... redundant, and they are directly stealing equity away from your main (one / single) top tier targeted page.

7) depending on the scale of the site, we typically have a cut off point, like 10 or 20 results deep into the site query... if the site only has thousands of pages (and not tens of thousands), going further than that means you'll start to over-analyze pages.

Plus, since we are using Google as the tool in this case, based on your site query, Google is ranking the pages within your site according to authority.

So once you get beyond the top 10 or 20 for any given site query, you are really getting down into thin relevance of authority anyway.

8) once we have our defined list of redundant pages that are stealing equity from our main page, we start analyzing the content on these redundant pages to identify pieces of content we can move over to the top tier page.

9) then we start migrating content from redundant pages over to the top tier page, and we implement onsite structuring techniques such as a table of contents with jump links, <h2> subtitles, lists, etc... to organize the page.

At this point we are also looking for Schema / JSON-LD markup opportunities, especially the amazing and powerful markup for FAQs.

10) WE NEVER DELETE PAGES... please don't ever delete your pages! We simply 301 redirect all of the redundant pages up to the top tier page. By 301'ing the URLs (vs. deleting them) you then channel any existing page authority up to your new top tier page, which helps to drive authority.
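If the site runs on Apache, the 301s in step 10 can be generated straight from the sheet. A minimal sketch with hypothetical URLs (an nginx config or a redirect plugin would look different):

```python
# Hypothetical redirect map: redundant URLs roll up to one top tier page.
top_tier = "/plumbing-services"
redundant = [
    "/emergency-plumber-2016",
    "/plumber-faq",
    "/why-hire-a-plumber",
]

# Emit Apache mod_alias rules, one per redundant URL.
rules = [f"Redirect 301 {src} {top_tier}" for src in redundant]
print("\n".join(rules))
```

Generating the rules from the sheet keeps the redirect map and the audit in sync, so nothing gets 301'd to the wrong place by a typo.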

The net result with all of the above is:
  • a clean, highly structured website
  • long form, high quality content on a single top tier / authoritative page
  • increased featured snippet presence from the top tier page
  • increased presence of site links from top tier page
  • and massive channeling of authority up to your "few" top tier pages by 301'ing the redundant URLs to the top tier URL
The channeling of your authority is, in my opinion, the magic that unlocks the power of this entire process.

Here's a screenshot of one of the sites we ran through this defragmentation process... just to give some encouragement and trust behind this workflow:

 

Local Search Forum

