Search console strange results?


    Search console strange results?

    Hi,

    In Google Search Console, the "Alternate page with proper canonical tag" section shows a lot of strange results: URLs with one .html page appended after another.

    For example: https://www.reverseosmosisworld.co.u...-Monitors.html

    The user-declared canonical is https://www.reverseosmosisworld.co.u...r-Filters.html

    Why is this happening and what is the fix please?
    Reverse Osmosis Water Filters

    #2
    I've no idea where Google is getting those page links from. They're not on any of your pages and not in your sitemap xml file and it looks to me as if your search function seems to be working OK.

    The other odd thing is that your server is even serving up a page for a URL like "Countertop-Water-Filters.html/Flow-Monitors.html" (note the xxx.html/xxx.html pattern). If you fix that so that the server returns a 404 error, the problem will go away. Your host would be the best bet to fix that problem.
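
    To see how many of the reported URLs follow the xxx.html/xxx.html pattern described above, a list of URLs (for example, one copied out of the Search Console report) can be filtered with a few lines of Python. This is only a minimal sketch: the function name `has_stacked_html_path` and the example.com URLs are made up for illustration and are not taken from any site in this thread.

```python
from urllib.parse import urlsplit


def has_stacked_html_path(url: str) -> bool:
    """Return True when a '.html' path segment is followed by further segments."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    # If any segment except the last ends in .html, one page has been
    # appended to another, as in 'A.html/B.html'.
    return any(seg.lower().endswith(".html") for seg in segments[:-1])


# Hypothetical URLs; example.com is a placeholder domain.
urls = [
    "https://example.com/shop/Countertop-Water-Filters.html",
    "https://example.com/shop/Countertop-Water-Filters.html/Flow-Monitors.html",
]

stacked = [u for u in urls if has_stacked_html_path(u)]
print(stacked)
```

    Only the second URL is flagged. This identifies the odd URLs but does not explain where Google found them, and it does not replace the server-side fix of returning a 404.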

    Mike

    PS. You're also missing an image on your server: uv-g12_steriliser.jpg
    -----------------------------------------

    First Tackle - Fly Fishing and Game Angling

    -----------------------------------------



      #3
      Originally posted by mythandmagic View Post

      In Google Search Console, the "Alternate page with proper canonical tag" section shows a lot of strange results: URLs with one .html page appended after another.
      I have been seeing similar issues emerge since August 2023, when I upgraded from v16.1.2 to v18.2.2. I cannot say whether the v16 site import required for the upgrade to v18 had an impact.

      After upgrading, I noticed that the "google-products.xml" file no longer contained duplicate products. More recently I found that the same was true of the "sitemap.xml" file, which I had not noticed before.

      Both these issues were fixed with Sellerdeck support when first noticed but this has made no difference to the indexing issues.

      After seeing this post, I checked again. The number of pages shown as indexed (green) in Google Search Console for our site roughly corresponds to the products listed in the "google-products.xml" file, whereas the "sitemap.xml" file lists fewer products, duplicates included.

      I have not fathomed out yet why this is occurring but it does not seem to be impacting search performance for the 'green' indexed products.

      Strangely, our site now has more 'grey' not-indexed pages similar to your example than 'green' indexed ones in Google Search Console. It appears that section/sub-section .html pages are somehow being joined together to create not-indexed URLs for pages that do not exist on the server.

      Our site is set up using many duplicated products. Do you also have duplicated products, and have you recently set up any page redirects that could confuse Google's crawlers?

      Is anybody else experiencing similar problems with indexing on Google Search Console?
      Martin
      Mantra Audio



        #4
        Back with Sellerdeck support on this one!

        What I think is happening is that when the Google Search Console crawl bot finds a product that exists within a section / sub-section on the website but is not included in sitemap.xml, it throws a wobbly and flags the 'xxx.html/xxx.html' URL as "Not indexed - Alternate page with proper canonical tag - Failed".

        Example from my site:
        https://shop.mantra-audio.co.uk/acat....html/C-E.html

        https://shop.mantra-audio.co.uk/acat...ge-MA1638.html - exists on the website but is not included in sitemap.xml, so apparently the product is not indexed in Google Search Console.

        I need to ensure that all active products (including duplicates) are included in the sitemap.xml file first, to see if this fixes the problem.
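
        One way to check which product pages are missing from sitemap.xml is to compare the sitemap's `<loc>` entries against a list of expected page URLs. The sketch below assumes a standard sitemaps.org-format file; the function names, the sample XML and the example.com URLs are illustrative only and do not come from Sellerdeck or from either site in this thread.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap files (sitemaps.org protocol).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return the set of <loc> URLs listed in a sitemap document."""
    root = ET.fromstring(xml_text)
    return {
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    }


def missing_from_sitemap(xml_text, expected_urls):
    """Return the expected URLs that the sitemap does not list, sorted."""
    listed = sitemap_urls(xml_text)
    return sorted(u for u in expected_urls if u not in listed)


# Hypothetical sitemap content and product URLs (placeholders).
sample_sitemap = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://example.com/shop/product-a.html</loc></url>"
    "<url><loc>https://example.com/shop/product-c.html</loc></url>"
    "</urlset>"
)
expected = [
    "https://example.com/shop/product-a.html",
    "https://example.com/shop/product-b.html",
    "https://example.com/shop/product-c.html",
]

missing = missing_from_sitemap(sample_sitemap, expected)
print(missing)
```

        In practice the expected list would have to come from somewhere else, such as the product feed or a crawl of the live site; that part is left out here.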
        Martin
        Mantra Audio



          #5
          I too have seen the grey line of unindexed pages in Search Console increase dramatically recently. My green indexed pages still look stable.
          I have not upgraded for a couple of years, so it is nothing I have done.
          I assume it's a change in how Google is seeing the site or sitemap, which is worrying if it is harming performance.
          I am hosted with Sellerdeck and wonder if the same is true of the others affected.
          It seems to be related to how Sellerdeck treats canonical tags and pagination.

          See attached screenshot

          [Attached screenshot: Untitled.png]
          https://www.harrisontelescopes.co.uk/

          Ed Harrison - Menmuir Scotland



            #6
            I raised a ticket for this yesterday and sent a support snapshot and the Search Console report. It has been passed on to development, so I'll update the thread once I hear back.
            https://www.harrisontelescopes.co.uk/

            Ed Harrison - Menmuir Scotland



              #7
              I have now got all products (originals plus duplicates) included in the "sitemap.html / sitemap.xml" feed submitted to Google Search Console, and these are all now shown as properly indexed (green) in Google Search Console.

              However, this has made no difference to the grey "not-indexed" pages, which now significantly outnumber the green properly "indexed" pages.

              When I click on an example "Not indexed - Alternate page with proper canonical tag" and click "INSPECT URL", I get an image of the issue like the screenshot below:

              [Screenshot: GoogleNotIndexedPageScreenshotExample.jpg]

              Not sure why the "/cgi-bin/*.pl..." perl script bits are impeding indexing.

              There is an option to index these pages, but I cannot see the point of the duplication involved in doing this, as I consider that the Perl script bits should not be parsed by the Google bots in the first place.

              I have sent a copy of the above snapshot to Sellerdeck support on the ticket I have open.

              I also see that there is an option to enable a link to share the report with a third party, and I have offered to enable this for Sellerdeck support. I am awaiting a response.
              Martin
              Mantra Audio



                #8
                Support have sent a possible fix but suggested 18.2.3 is the answer. I'm snookered with that option, as I'd lose my payment method for taking cards.
                I get a preferential rate from PayPal for putting all transactions through them.
                Splitting it between ClearAccept and PayPal would cost dearly and make my simple accounting much more complicated.
                I do not know of any other website provider who offers only one (their own) card payment method...
                So, is anyone on 18.2.3 seeing a big increase in the grey unindexed pages?
                https://www.harrisontelescopes.co.uk/

                Ed Harrison - Menmuir Scotland



                  #9
                  Martin

                  I think the problem in your latest post might be that you have 'Highlight Located Text' set in your search settings (in the 'Results' tab).

                  To do this, Sellerdeck shows the products found in a page generated by the scripts rather than the normal static page, so that it can highlight the text found. Presumably Sellerdeck is using exactly the same page content, just changing the text colour where it finds the search words.

                  So for example, if you do a search on your site for 'Cartridge' and then click one of the products you get taken to the generated page using the sh00000001.pl script with 'cartridge' highlighted rather than the original html one.

                  This is not impeding indexing though. All the Google search console is saying is that Google found the generated page but isn't indexing it as it knows from the canonical where the real page is.

                  You could try disabling 'Highlight located text' and see if the results page then takes you directly to the html product page.
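
                  As a rough illustration of the canonical check described above: if a page declares a canonical URL that differs from the URL it was fetched from, a crawler can treat it as an alternate of the canonical page rather than as a page to index in its own right. The sketch below uses Python's standard `html.parser`; the class and function names, the sample HTML and the example.com URLs (including the script path and query string) are hypothetical, and this is not a description of how Google or Sellerdeck actually implement it.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel can hold several space-separated tokens, so split before testing.
        if "canonical" in (attr.get("rel") or "").lower().split():
            self.canonical = attr.get("href")


def canonical_of(html_text):
    """Return the declared canonical URL, or None if the page declares none."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.canonical


def is_alternate(page_url, html_text):
    """True when the page declares a canonical URL different from its own URL."""
    canonical = canonical_of(html_text)
    return canonical is not None and canonical != page_url


# Hypothetical page content shared by the static page and the generated one.
sample_html = (
    "<html><head>"
    '<link rel="canonical" href="https://example.com/shop/cartridge.html">'
    "</head><body>Cartridge</body></html>"
)
static_url = "https://example.com/shop/cartridge.html"
generated_url = "https://example.com/cgi-bin/search.pl?term=cartridge"

print(is_alternate(generated_url, sample_html))
print(is_alternate(static_url, sample_html))
```

                  With this sample data the script-generated URL is reported as an alternate and the static page is not, which matches the "Alternate page with proper canonical tag" status being informational rather than an error.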

                  Mike
                  -----------------------------------------

                  First Tackle - Fly Fishing and Game Angling

                  -----------------------------------------



                    #10
                    Ed,

                    Support have sent a possible fix but suggested 18.2.3 is the answer.
                    Aha, I thought. If Sellerdeck have made a fix for this in 18.2.3 or earlier then it will be detailed in the release notes. I certainly couldn't spot anything but maybe I just missed it.

                    I wonder if this might be the same issue Martin is having? In which case see my answer above.

                    Mike

                    -----------------------------------------

                    First Tackle - Fly Fishing and Game Angling

                    -----------------------------------------



                      #11
                      Thanks Mike, I'll try that and the SD fix and watch closely...
                      https://www.harrisontelescopes.co.uk/

                      Ed Harrison - Menmuir Scotland



                        #12
                        Originally posted by Mike Hughes View Post

                        You could try disabling 'Highlight located text' and see if the results page then takes you directly to the html product page.
                        Mike

                        Thank you. I have made the change suggested and uploaded the site, and the search results now take me directly to the respective html product pages without the sh000001.pl script appearing in the URL link(s).

                        I don't understand why the not-indexed pages containing the sh000001.pl script are being flagged up now by the Google Search Console bots, as they do not exist on the hosted site server. I also had the same Search settings | Results | Highlight located text option enabled under v16.1.2, before upgrading to v18.2.2, without Google Search Console reporting any problems.

                        It will be interesting to see if this change makes a difference.

                        BTW, there is an entry in the v18.2.3 Release Notes concerning the search script, but I don't have any detail of this update and it may not be relevant to the issue I have raised:
                        Fixed a potential vulnerability in the search script, SD-9102
                        Martin
                        Mantra Audio



                          #13
                          You could try disabling 'Highlight located text' and see if the results page then takes you directly to the html product page.

                          I am also trying this and will post back soon with some results.
                          Reverse Osmosis Water Filters



                            #14
                            My highlight option was already unchecked. The majority of my recent rise in unindexed pages is due to pagination.
                            https://www.harrisontelescopes.co.uk/

                            Ed Harrison - Menmuir Scotland



                              #15
                              The changes I tried have made no difference to the number of "not indexed" pages appearing in Google Search Console, so I raised the issue "Googlebot smartphone failing" on the Google Search Central Community. The response I received today is copied below for information:
                              Validation is for when the status is wrong, you've fixed the technical issue that caused it, and hence expect the status to change next time crawled. Validation is checking that the status DOES change.

                              These sound like 'bogus' URLs (don't represent real pages), and so correct to not be indexed.

                              ... so validation is inappropriate. if you use it will fail. (because the status is not changing, nothing was fixed, they still not valid URLs)

                              No action is needed on these. No 'solution' needed. You don't need to 'fix' everything in the not-indexed, section. Its NORMAL to have URLs crawled, not indexed (for many reasons)

                              Having a large number of 'not indexed' URLs is not in itself bad.

                              https://support.google.com/webmasters/thread/202184638/do-i-need-to-fix-all-pages-that-aren-t-indexed-aka-page-indexing-issues?hl=en
                              The linked page, dated Feb 16, 2023, is worth a look, as it helps to distinguish "not indexed" pages that need to be fixed from those where no 'solution' is needed, which is the majority, if not all, in my case.
                              Martin
                              Mantra Audio
