Search.pm mod to replace characters

    Hello,

    Looking at http://community.actinic.com/showthread.php?t=41320 and the KB I can see it is possible to amend simple searches.

    Would anyone know how I might amend Search.pm to be able to substitute foreign characters for example replace an 'a' for an 'ä'?

    I can see this might be problematic though as searches for an 'a' could be for an 'â', 'ä', 'à' or an 'å'!

    This may not be the solution to my dilemma...

    Another option would be to have a hidden property containing a 'plain character' version of the product title and to allow the search to cover this as well. From what I understand this is not currently possible unless from the simple search. If only customers knew their alt codes. Is this still the case, or is there now a way to include this?

    Alex
    Blog, Twitter, Facebook
    Actinic Ecommerce, CMS and Video production

    www.petraboase.com
    www.progrow.co.uk
    www.christopherpiperwines.co.uk
    www.cheeksandcherries.co.uk
    www.skatewarehouse.co.uk

    #2
    Hmm, interesting problem. I'd have thought the best way to attack it is to have JavaScript convert the search phrase entered by the user into a plain string and then proceed as normal with the search. Products are then set up with the plain word(s).



      #3
      Hi Lee, thanks for your help.

      Unfortunately it is the other way around. For example a visitor may search for a product that is called "Estèphe" but actually search for "Estephe". As far as Actinic simple search is concerned there are no matches (and rightly so)...

      So if the Search.pm or a JavaScript hack were implemented, it would simply not know that it should replace the second "e" but not the first or third.
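
To put a number on that ambiguity, here is a minimal Perl sketch that expands a plain search term into every accented spelling it could stand for. The `%VARIANTS` table and the `accented_candidates` helper are hypothetical names invented for this illustration and are not part of Actinic; the table is deliberately incomplete, and `use utf8` assumes the script itself is saved as UTF-8.

```perl
use strict;
use warnings;
use utf8;                                # this script contains literal accented characters
binmode(STDOUT, ':encoding(UTF-8)');

# Hypothetical (and incomplete) table: plain letter => spellings it might stand for.
my %VARIANTS = (
    a => [qw(a à á â ã ä å)],
    e => [qw(e è é ê ë)],
    o => [qw(o ò ó ô õ ö)],
);

# Expand a plain word into every candidate spelling, one character at a time.
sub accented_candidates {
    my ($word) = @_;
    my @candidates = ('');
    for my $char (split //, $word) {
        my @alternatives = @{ $VARIANTS{$char} || [$char] };
        my @next;
        for my $prefix (@candidates) {
            push @next, $prefix . $_ for @alternatives;
        }
        @candidates = @next;
    }
    return @candidates;
}

my @candidates = accented_candidates('estephe');
# Three 'e' characters with five variants each: 5 * 5 * 5 = 125
print scalar(@candidates), " possible spellings for 'estephe'\n";
```

Only one of those candidates is the real product name, which is why guessing the accents from the plain search term alone is impractical.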

      The only way I can think that might work is to create a custom property called "plain english version" and find a hack to get Actinic to also search that field. That way it could work for both the proper version and the plain english version. Otherwise the only option left would be to write the plain english version in the page and that would be wrong on many levels.

      Out of interest Google manages this rather well and from what I can work out does not seem to differentiate either way from an SEO point of view. I have not tested this scientifically (well I tried it twice...).

      If anyone has any cunning ideas how I might include said custom field in my simple search, I will be as happy as a mouse in a cheese factory.

      Alex


        #4
        I think that to be indexed you may have to include the base (accent-free) versions of the words in the product description, styled with display:none. Not ideal for SEO but it may help with searching.

        I also have instances where I would very much like CVs to be indexed as well, but it doesn't seem possible as far as I know.

        The other option is of course to use Google search restricted to your own site, similar to the one used on the forum, but that's certainly not ideal either.



          #5
          Very hard to fix. Reason: Catalog.exe creates the search dictionary of words and their locations and uploads it to the server (in a very compressed format). Only Actinic can change that code so at the moment we're stuck with the accented characters being in the dictionary.

          You can limit certain characters from being indexed (via Settings / Search Settings / Search Options / Valid Characters) but that's not going to help as it removes such characters from the word and you won't be able to match with the standard character in the search term.

          A true solution would probably need input from Actinic. Perhaps a list of accented letters and their plain text counterparts. Actinic when creating the search dictionary would replace all accented letters with their plain equivalents. Likewise when parsing the customer entered search terms. More code would perhaps be needed for the search results highlighting so it correctly highlights the true words. Big job.
          Norman - www.drillpine.biz
          Edinburgh, U K / Bitez, Turkey
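
The "list of accented letters and their plain text counterparts" idea can be sketched in a few lines of Perl. This is only an illustration under stated assumptions: `%PLAIN_FOR` and `to_plain` are hypothetical names, the table is a small sample, and the real search dictionary is built by Catalog.exe in a compressed format that this sketch does not touch. The point it shows is that the same normalisation is applied to the indexed words and to the customer's search term.

```perl
use strict;
use warnings;
use utf8;                                # this script contains literal accented characters
binmode(STDOUT, ':encoding(UTF-8)');

# Hypothetical sample table: accented letter => plain-text counterpart.
# A hash (unlike tr///) allows multi-character replacements such as 'æ' => 'ae'.
my %PLAIN_FOR = (
    'à' => 'a', 'á' => 'a', 'â' => 'a', 'ä' => 'a', 'å' => 'a',
    'æ' => 'ae', 'ç' => 'c',
    'è' => 'e', 'é' => 'e', 'ê' => 'e', 'ë' => 'e',
    'ö' => 'o', 'ø' => 'o', 'ü' => 'u',
);

# Replace each character that has a table entry; leave everything else alone.
sub to_plain {
    my ($text) = @_;
    return join '', map { exists $PLAIN_FOR{$_} ? $PLAIN_FOR{$_} : $_ } split //, $text;
}

# Normalise the (made-up) dictionary words and the search term the same way.
my @dictionary = map { to_plain($_) } ('estèphe', 'château', 'rosé');
my $term       = to_plain('estephe');
my @hits       = grep { $_ eq $term } @dictionary;
print "matches: @hits\n";
```

Highlighting would still need the original, accented words, which is the extra work mentioned at the end of the post.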



            #6
            I seem to remember from a thread a while back that one option for including additional search terms was to put them in the extended info text.

            So one option might be to put the 'plain' text version into the extended info box (as long as you aren't using the extended info already).

            Mike
            -----------------------------------------

            First Tackle - Fly Fishing and Game Angling

            -----------------------------------------



              #7
              Annoyingly, we are probably about to unselect this option, as the extended info description contains references to other products, causing them to show incorrectly in search results (at least in the context of searching for specific products).

              Bit of a teaser, this one. Duncan's plan seems the only workable one, but it feels a bit underhand with respect to SEO.

              You would think that, as it is possible to include the extended info text in searches, it would not be that much of a stretch to also include a specific custom field. Hey ho. Thanks for your ideas.

              Alex


                #8
                I wouldn't use display:none. The search engines don't like hidden text and this could easily trip their spam systems.

                Do you use Norman's Tabber or something like that? One option would be to have someplace in your layout for relevant 'Tags' to help the search. This could be a separate tab or just some space at the end of your layout.

                Mike


                  #9
                  Hang on. Just done an experiment and it seems to be possible to tweak Search.pm to intercept the dictionary words after they've been fetched and replace special characters with plain ones. Do the same for the search terms and it appears to work. Try this:

                  Edit Search.pm (in your Site folder - back it up first).

                  Look for the line:
                  Code:
                  		my $sQuotedFragment = quotemeta($sFragment);
                  Immediately above it put:
                  Code:
                  		$sWord =~ tr/àáâãäåæçèéêëìíîïòóôõöøùúûüý/aaaaaaaceeeeiiiioooooouuuuy/;	# replace special characters with plain ones (tr takes plain character lists, no brackets needed)
                  		$sFragment =~ tr/àáâãäåæçèéêëìíîïòóôõöøùúûüý/aaaaaaaceeeeiiiioooooouuuuy/;	# same replacement for the search term
                  This will have the side effect that search highlighting will only work if the search term uses the same accents as the word. It's probably better to turn highlighting off via Settings / Search Settings / Results.
                  Norman - www.drillpine.biz
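
For anyone who wants to try the idea outside Actinic first, here is a small standalone Perl sketch of what the two-line patch does. It is not Search.pm: `fold_accents`, the `@words` list and the regex comparison are assumptions made for the demo, and `use utf8` assumes this script is saved as UTF-8, whereas the real Search.pm may well be working on single-byte text.

```perl
use strict;
use warnings;
use utf8;                                # this script contains literal accented characters
binmode(STDOUT, ':encoding(UTF-8)');

# Stand-in for the patch: fold accented letters to plain ones.
# tr/// takes plain character lists, so no square brackets are needed.
sub fold_accents {
    my ($s) = @_;
    $s =~ tr/àáâãäåæçèéêëìíîïòóôõöøùúûüý/aaaaaaaceeeeiiiioooooouuuuy/;
    return $s;
}

# Made-up dictionary words, as they might come back from the index.
my @words = ('estèphe', 'médoc', 'merlot');

for my $fragment ('estephe', 'estèphe', 'medoc') {
    # Fold the search term, then quote it, in the same order as the patch.
    my $quoted = quotemeta(fold_accents($fragment));
    # Fold each dictionary word before comparing (the comparison itself is an assumption).
    my @hits = grep { fold_accents($_) =~ /$quoted/ } @words;
    print "$fragment -> @hits\n";
}
```

Because both sides are folded, a plain search term and an accented one should both find the accented dictionary word in this demo.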


                    #10
                    Wow that looks ideal. I will give it a try and see how it works. Thanks very much for this!

                    Alex


                      #11
                      Good luck. Here's a note from Actinic.pm that explains why I'm not going to try to fix the highlighting:
                      # HighlightWords - highlight the specified words in the HTML

                      ####### WARNING WARNING WARNING WARNING WARNING ###############
                      #
                      # The following code does not follow normal Actinic coding standards.
                      # This is code for Perl experts at 11AM after a strong pot of coffee!
                      # It is a special case self-modifying code with run-time generation.
                      #
                      # It is strongly recommended that you review pages 72-73 of the Blue
                      # Camel. This uses "s'PATTERN'CODE_TO_CREATE_REPLACEMENT'gesi" and
                      # exploits the special properties of a "single quote" as a delimiter.
                      #
                      ####### WARNING WARNING WARNING WARNING WARNING ###############
                      Norman - www.drillpine.biz
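
For readers who have not met the construct that warning mentions, here is a tiny Perl sketch of a substitution with single-quote delimiters and the `e` modifier. It is only an illustration of the delimiter behaviour, not Actinic's HighlightWords code, and the sample text and `<b>` markup are made up.

```perl
use strict;
use warnings;

my $html = 'Saint Estephe and saint julien';

# With single-quote delimiters nothing is interpolated into the pattern or
# replacement when the statement is parsed; the "e" modifier then evaluates
# the replacement as Perl code for every match ("g" = all matches,
# "i" = ignore case, "s" = let "." match newlines).
(my $marked = $html) =~ s'(saint)'"<b>" . $1 . "</b>"'gesi;

print "$marked\n";
```

The real routine generates its replacement code at run time, which is a good deal more involved than this.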


                        #12
                        Blimey, sounds like a hard hat and hi-vis vest job.

                        Off to ponder and drink tea...

                        Thanks again


                          #13
                          I read this problem differently:
                          "able to substitute foreign characters for example replace an 'a' for an 'ä'?"
                          so I thought we needed to replace 'a' with 'ä',
                          but Norman's solution replaces 'ä' with 'a'.

                          Replacing foreign characters with their English equivalents would be easy. As the poster said, working out which 'e' needs replacing with 'è' is the whole issue.

                          So I'm now a little bit confused: which is the solution we are looking for?



                            #14
                            Norman's patch replaces the foreign characters in both the indexed keywords and the search term to find any matches. => Customer will find the product if available.

                            Mike


                              #15
                              Hi Kevin,

                              Originally I was wondering if it was possible to substitute a plain English simple search for their correct foreign characters. As you pointed out this is not really an option as there are multiple permutations of what characters might equate to.

                              Then I was looking into an option of displaying the 'plain English' versions of a product name somewhere (ideally in a custom field that didn't need to be displayed anywhere) so that plain English searches would still show results for the correct foreign version. Seems to be a dead end.

                              Now Norman has suggested (with code!) that we could substitute the foreign characters for plain English versions when the search index is created. I am assuming that this would mean that plain English searches would be matched against a plain English search index, thus resulting in matches. I am not sure if searches for the original product title (i.e. using foreign characters) would still work. TBH that is less of a problem, as most people don't know the alt codes anyway. It would be more likely to affect cut-and-paste searches.

                              Alex
