Search script deficiencies


    We experienced two DoS (denial of service) attacks today that exploited a weakness in the Actinic search scripts.

    The attack was caused by a robot or user agent that ignores the robots.txt file; as a result, it repeatedly called the Actinic search script ss000002.pl on the target site. This script uses a large amount of RAM on larger sites, and repeated calls deplete physical RAM on the server fairly quickly, forcing the machine to use virtual memory (which is many thousands of times slower than physical RAM).

    The first time this occurred, we had to reboot the server to get it back under control. The second time, we were able to identify the culprit, block their IP (which was somewhere in Poland), then disable the search script temporarily while we installed a fix for the problem.

    The fix we installed was inside the client's .htaccess file.

    The .htaccess file now contains a list of user-agents which will no longer be served - this particular user-agent was WebStripper 2.53.

    Once we had installed this "fix", we re-enabled the Actinic search script.

    We should point out that this DoS attack was more a set of circumstances than a deliberate attempt to bring the server down, but we brought this problem to the attention of Actinic Software many months ago. To date, no really workable solution has been provided by Actinic.

    This message is being published so that site owners can install their own blocking .htaccess script to prevent their site from causing their hosting servers to melt down. This fix will ONLY work if you have the ability to rewrite URLs using .htaccess, which implies that your server has mod_rewrite or its equivalent. Please ask your web hosting provider if you have this, or ask Actinic to provide a more permanent solution.

    Example .htaccess script follows:

    RewriteEngine On
    # Each RewriteCond matches a User-Agent string that starts with the given name.
    # Spaces inside a name must be escaped with a backslash.
    RewriteCond %{HTTP_USER_AGENT} ^DISCo\ Pump.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Drip.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^EirGrabber.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^FlashGet.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^GetRight.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Gets.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Grafula.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^IBrowse.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^InterGET.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^JetCar.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^JustView.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Mister\ PiX.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NearSite.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NetSpider.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^PageGrabber.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Pockey.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ReGet.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Slurp.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SpaceBison.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Teleport.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebAuto.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebCopier.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebFetch.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebReaper.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebSauger.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebStripper.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebWhacker.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebZIP.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Webster.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Wget.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^eCatch.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ia_archiver.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^lftp.* [OR]
    RewriteCond %{HTTP_USER_AGENT} ^tAkeOut.*
    # As written, this pattern only matches request paths ending in "jp" or "jpg",
    # so requests for a .pl script are not redirected by it; adjust it to suit your site.
    RewriteRule .[Jj][Pp][Gg]*$ http://www.yourdomain.com/leeches.html [L]


    You should also create an appropriate leeches.html file in your domain, and change the reference to www.yourdomain.com to YOUR DOMAIN.
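Before relying on a list like this, it can help to check which User-Agent strings it would actually catch. The following is a minimal Python sketch, not part of the original fix: it applies a small, hypothetical subset of the patterns to some made-up User-Agent strings. It assumes multi-word names are written with an escaped space, and it mimics mod_rewrite's default case-sensitive matching (no [NC] flag).

```python
import re

# Hypothetical subset of the blocklist, written the way a RewriteCond pattern
# would be: anchored at the start of the string, spaces escaped.
BLOCKED_PATTERNS = [
    r"^WebStripper",
    r"^Wget",
    r"^Offline\ Explorer",
]


def is_blocked(user_agent, patterns=BLOCKED_PATTERNS):
    """Return True if any pattern matches the User-Agent string.

    Matching is case-sensitive, as a RewriteCond is without the [NC] flag.
    """
    return any(re.search(pattern, user_agent) for pattern in patterns)


if __name__ == "__main__":
    # Made-up sample strings, for illustration only.
    samples = [
        "WebStripper/2.53",
        "Offline Explorer/2.0",
        "Mozilla/4.0 (compatible; MSIE 6.0)",
        "Mozilla/4.0 (compatible; WebStripper)",
    ]
    for ua in samples:
        print("blocked" if is_blocked(ua) else "allowed", "-", ua)
```

Because every pattern is anchored with ^, an agent is only caught when its name is at the very start of the string; a robot that reports its name later in the string, or in a different case, gets through.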


    regards

    Web Your Business Support
    Web Design & Ecommerce - Affordable Web Hosting
    Free and low cost Merchant Accounts coming soon..
    NOD32 Antivirus - Reciprocal Links for Actinic Sites ONLY

    #2
    I've been seeing some webstrippers passing through, and not being a mod_rewrite "expert", I've been experimenting with the .htaccess file - the current version is:

    RewriteEngine On
    # Each RewriteCond matches a User-Agent string that starts with the given name.
    RewriteCond %{HTTP_USER_AGENT} "^DISCo\ Pump" [OR]
    RewriteCond %{HTTP_USER_AGENT} "^Drip" [OR]
    RewriteCond %{HTTP_USER_AGENT} "^EirGrabber" [OR]
    RewriteCond %{HTTP_USER_AGENT} "^ExtractorPro" [OR]
    RewriteCond %{HTTP_USER_AGENT} "^EyeNetIE" [OR]
    RewriteCond %{HTTP_USER_AGENT} "^FlashGet" [OR]
    RewriteCond %{HTTP_USER_AGENT} "^GetRight" [OR]
    RewriteCond %{HTTP_USER_AGENT} "^Gets" [OR]
    RewriteCond %{HTTP_USER_AGENT} "^Go!Zilla" [OR]
    RewriteCond %{HTTP_USER_AGENT} "^Go-Ahead-Got-It" [OR]
    RewriteCond %{HTTP_USER_AGENT} "^Grafula" [OR]
    RewriteCond %{HTTP_USER_AGENT} "^IBrowse" [OR]
    RewriteCond %{HTTP_USER_AGENT} "^InterGET" [OR]
    RewriteCond %{HTTP_USER_AGENT} "^Internet\ Ninja" [OR]
    RewriteCond %{HTTP_USER_AGENT} "^JetCar" [OR]
    RewriteCond %{HTTP_USER_AGENT} "^JustView" [OR]
    RewriteCond %{HTTP_USER_AGENT} "^MIDown\ tool" [OR]
    RewriteCond %{HTTP_USER_AGENT} "^Mister\ PiX" [OR]
    RewriteCond %{HTTP_USER_AGENT} "^NearSite" [OR]
    RewriteCond %{HTTP_USER_AGENT} "^NetSpider" [OR]
    RewriteCond %{HTTP_USER_AGENT} "^Offline\ Explorer" [OR]
    RewriteCond %{HTTP_USER_AGENT} "^PageGrabber" [OR]
    RewriteCond %{HTTP_USER_AGENT} "^Papa\ Foto" [OR]
    RewriteCond %{HTTP_USER_AGENT} "^Pockey" [OR]
    RewriteCond %{HTTP_USER_AGENT} "^ReGet" [OR]
    RewriteCond %{HTTP_USER_AGENT} "^Slurp" [OR]
    RewriteCond %{HTTP_USER_AGENT} "^SpaceBison" [OR]
    RewriteCond %{HTTP_USER_AGENT} "^SuperHTTP" [OR]
    RewriteCond %{HTTP_USER_AGENT} "^Teleport" [OR]
    RewriteCond %{HTTP_USER_AGENT} "^WebAuto" [OR]
    RewriteCond %{HTTP_USER_AGENT} "^WebCopier" [OR]
    RewriteCond %{HTTP_USER_AGENT} "^WebFetch" [OR]
    RewriteCond %{HTTP_USER_AGENT} "^WebReaper" [OR]
    RewriteCond %{HTTP_USER_AGENT} "^WebSauger" [OR]
    RewriteCond %{HTTP_USER_AGENT} "^WebStripper" [OR]
    RewriteCond %{HTTP_USER_AGENT} "^WebWhacker" [OR]
    RewriteCond %{HTTP_USER_AGENT} "^WebZIP" [OR]
    RewriteCond %{HTTP_USER_AGENT} "^Web\ Image\ Collector" [OR]
    RewriteCond %{HTTP_USER_AGENT} "^Web\ Sucker" [OR]
    RewriteCond %{HTTP_USER_AGENT} "^Webster" [OR]
    RewriteCond %{HTTP_USER_AGENT} "^Wget" [OR]
    RewriteCond %{HTTP_USER_AGENT} "^eCatch" [OR]
    RewriteCond %{HTTP_USER_AGENT} "^ia_archiver" [OR]
    RewriteCond %{HTTP_USER_AGENT} "^lftp" [OR]
    RewriteCond %{HTTP_USER_AGENT} "^tAkeOut"
    # As written, this pattern only matches request paths ending in "jp" or "jpg".
    RewriteRule .[Jj][Pp][Gg]*$ http://www.yourdomain.com/leeches.html [L]


    If anyone on the list has an improved list or improved format for this .htaccess blocklist, please contribute it!

    regards

    Web Your Business Support



      #3
      I would suggest you don't block Slurp, as it is the Inktomi bot - we pay for listings at Inktomi, so we wouldn't want to prevent it from spidering us!
      Matt
      Actinic User since v.3

      Custom Actinic Site Specialist:
      GlowSticksDirect.co.uk (http://www.glowsticksdirect.co.uk/)
      DigiShopDirect.co.uk (http://www.digishopdirect.co.uk/)
      CalibreShopping.co.uk (http://www.calibreshopping.co.uk/)



        #4
        That's a good point. Slurp is in the list because our findings are that it ignores robots.txt (it retrieves it, and promptly ignores it in our experience).

        Having banned Slurp for almost 2 years, our own site still ranks extremely well in Inktomi-powered engines. You are HIGHLY advised to tailor the list to your own uses, accepting or rejecting ANY of the lines within the list.

        Slurp may become more important since Yahoo acquired Inktomi - you should make your own judgement call on each and every line within the .htaccess file we provided.
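One way to keep a tailored list manageable is to generate the RewriteCond lines from a plain list of agent names. The sketch below is only an illustration under stated assumptions: the function name build_conditions and its skip parameter are hypothetical, and it assumes agent names contain no regex metacharacters other than spaces. It escapes spaces, drops duplicates and any names you choose to allow, and puts [OR] on every line except the last.

```python
def build_conditions(agents, skip=()):
    """Build RewriteCond lines for the given User-Agent name prefixes.

    agents: iterable of agent names, e.g. "WebStripper" or "Offline Explorer".
    skip:   names to leave out of the blocklist (e.g. a crawler you want to allow).
    Assumes names contain no regex metacharacters other than spaces.
    """
    # dict.fromkeys removes duplicates while preserving the original order.
    kept = [name for name in dict.fromkeys(agents) if name not in skip]
    lines = []
    for index, name in enumerate(kept):
        pattern = "^" + name.replace(" ", "\\ ")
        # Every condition except the last is chained with [OR].
        flag = " [OR]" if index < len(kept) - 1 else ""
        lines.append(f"RewriteCond %{{HTTP_USER_AGENT}} {pattern}{flag}")
    return lines


if __name__ == "__main__":
    names = ["WebStripper", "Slurp", "Offline Explorer", "WebStripper"]
    # Leave Slurp out of the generated blocklist.
    print("\n".join(build_conditions(names, skip={"Slurp"})))
```

The generated lines still need a RewriteEngine On line before them and a RewriteRule after them, as in the example posted earlier in this thread.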



          #5
          Very effective if you really want your site removed from all the major search engines. On the whole, though, checking HTTP_USER_AGENT doesn't work, because the robots likely to cause the problem usually have the option to hide or forge their user-agent string!
