URBAN
Mainframe

User Comments

(for: Throttling Down)
1 | Posted by: Gabriel Mihalache (Registered User) | ~ 1 year, 4 months ago |

Who was it? Lately, MSNbot has requested pretty much all of my content in a matter of hours. At least I know that its searches will return correct paths, unlike Google, which still returns URIs that haven’t been valid for a year or more!

2 | Posted by: DarkBlue (Registered User) | ~ 1 year, 4 months ago |

Who was it?

It appeared to be a home-grown bot, Gabriel, built with LWP. More than 20 IP addresses were used during the spidering. I should have run whois on the IPs, but I didn’t; I was panicking as I tried to get the server back up.
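
For anyone wanting to do the follow-up that got skipped here, a rough sketch of pulling the offending addresses out of an access log might look like the following. It assumes the server writes Apache's Combined Log Format and that the bot kept LWP's default `libwww-perl` User-Agent string (a home-grown bot could easily have changed it). The function name `requests_per_ip` is made up for illustration.

```python
import re
from collections import Counter

# Combined Log Format:
#   ip ident user [time] "request" status bytes "referer" "user-agent"
# Assumption: no escaped double quotes inside the quoted fields.
LOG_LINE = re.compile(
    r'^(\S+) \S+ \S+ \[[^\]]*\] "[^"]*" \d{3} \S+ "[^"]*" "([^"]*)"'
)

def requests_per_ip(lines, agent_substring="libwww-perl"):
    """Count requests per client IP whose User-Agent contains agent_substring."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and agent_substring.lower() in match.group(2).lower():
            counts[match.group(1)] += 1
    return counts

if __name__ == "__main__":
    import sys
    # Usage: python count_bot_ips.py < access.log
    for ip, hits in requests_per_ip(sys.stdin).most_common(25):
        print(f"{hits:8d}  {ip}")
```

The addresses it prints are the ones worth passing to whois.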

3 | Posted by: DarkBlue (Registered User) | ~ 1 year, 4 months ago |

My current throttle profile is as follows:

   # Keep track of up to 1024 IP addresses at a time;
   # Use a time period of 100 seconds;
   # Suspend service after 50 page requests
   ThrottleClientIP 1024 Document 50 100
   # No single client can make more than 20 requests per second:
   ThrottlePolicy Request 20 1

Whilst everything seems to be working okay presently, I’m not sure if the above policy is suitable.

Any advice will be gratefully received.
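
To make the numbers in the first directive easier to reason about, here is a small Python model of what its comments describe: at most 1024 tracked addresses, and service refused once an address exceeds 50 requests within a 100-second period. This is only a sketch of that description, not mod_throttle's actual algorithm; the class name `ClientThrottle`, the fixed-window counting, and the evict-least-recently-seen rule are all assumptions made for illustration.

```python
from collections import OrderedDict

class ClientThrottle:
    """Toy per-IP limiter: `limit` requests per `period` seconds per address."""

    def __init__(self, max_clients=1024, limit=50, period=100.0):
        self.max_clients = max_clients
        self.limit = limit
        self.period = period
        # ip -> [window_start, request_count], ordered by last activity
        self._clients = OrderedDict()

    def allow(self, ip, now):
        """Record one request from `ip` at time `now`; return True if served."""
        entry = self._clients.get(ip)
        if entry is None or now - entry[0] >= self.period:
            # First request from this address, or its window has expired.
            entry = [now, 0]
            self._clients[ip] = entry
        self._clients.move_to_end(ip)
        if len(self._clients) > self.max_clients:
            # Table is full: forget the least recently seen address.
            self._clients.popitem(last=False)
        entry[1] += 1
        return entry[1] <= self.limit
```

Feeding it timestamps from a real log shows how a profile behaves. With the defaults, a spider spread over 20 addresses can still fetch up to 1000 pages per 100 seconds before any single address is refused, so the per-address limit on its own may not be enough against a distributed crawl.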
