
Saturday, September 28 2013

Wuala

As you might guess a guy like me ought to be, I'm absolutely compulsive about avoiding data loss. I provide an off-site backup service for my clients, and of course I back up my own stuff to the same server on which I keep client data, and I keep an external drive sync'd with the most important stuff from my workstation. But that's not enough, because I need off-site backups for my own data, too. I also need to keep a few devices synchronized. And I can't use the most popular services because they don't provide the level of security that I demand, the same level I provide for my clients.

Enter Wuala. It's similar to Google Drive and Dropbox, but with one vital distinction: it's encrypted end-to-end, not just in transit. That means that the good folks of Wuala can't get at my data. It's mine. They won't be indexing my data to figure out which ads to serve to me. Anyone who gets hold of the files I've stored there will find them encrypted, and won't be reading them any time soon.

The synchronization service ensures that my netbook has the files I consider most important on my workstation, up to date, when I hit the road. Any work I do while away is already on my workstation when I get back to the office -- no more fooling around with USB sticks, no more rsync'ing to get everything caught up. I just keep the Wuala client running, and the rest happens automagically.

And they give you 5GB of storage for free.

If you need something like that, give Wuala a try.

→ committed: 9/28/2013 00:34:38

[ / internet] permanent link


Monday, August 06 2012

Bad Bots Must Be Punished.

I periodically look through my web server logs to pick out things that are not as they should be. You might recall from previous blog entries that I operate spam traps and so on -- last night I picked out of my server logs that some critter calling itself MJ12bot was going where no legitimate bots belong. But it's apparently trying to be a good bot because it leaves its calling card:

173.242.125.206 - - [06/Aug/2012:01:28:42 -0600] "GET /robots.txt HTTP/1.0" 200 1247 "-" "Mozilla/5.0 (compatible; MJ12bot/v1.4.3; http://www.majestic12.co.uk/bot.php?+)"

So off I go to that URI, and find that the folks who run the thing have said "If you have reason to believe that MJ12bot did NOT obey your robots.txt commands, then please let us know via email..." And so I did.
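For anyone who wants to do the same kind of log digging, here is a minimal sketch in Python that picks apart a line like the one above and filters by User-Agent. It assumes the server writes the common "combined" log format; the names `LOG_PATTERN`, `parse_line` and `hits_by_agent` are made up for this illustration and are not from any tool mentioned here.

```python
import re

# Combined log format (assumed):
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>.*)"$'
)

def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    return match.groupdict() if match else None

def hits_by_agent(lines, needle):
    """Yield parsed entries whose User-Agent contains `needle` (case-insensitive)."""
    needle = needle.lower()
    for line in lines:
        entry = parse_line(line)
        if entry and needle in entry["agent"].lower():
            yield entry
```

Feeding it an open log file, for example `hits_by_agent(open(path), "mj12bot")` with `path` pointing at your own access log, gives one dict per request from that bot, which makes it easy to list the paths it touched.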

We discussed the matter via email a bit, and it seems probable that their bot encountered some kind of network error when it tried to grab my robots.txt file. Not an error response from my server, but a failure to even contact my server. To my way of thinking, in a case like that a properly designed bot will try again to get that file, and will not crawl the site until it gets either the file or a verifiable 404 Not Found. Not MJ12bot, though. The network failure is treated as if it were a 404, and is taken to mean that the whole darn site is wide open to them. Here's what their guy Alex said:

Sadly it's very difficult for us to diagnose this case - as you can see from your logs our bot grabs robots.txt, so we are not intentionally breaking your directives, it's just if bot could not get robots.txt then it could not obey it :(

Huh? Your bot encounters a network error and that gives you license to crawl my site in violation of my terms of service? It seems to me that if you know your crawler is broken in that way, which you do now, and you continue to run it knowing it's broken in that way, then what you're doing is willful negligence and that makes it intentional.
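The behavior argued for above (retry on a network failure, and crawl freely only on a verifiable 404) can be sketched roughly as follows. This is an illustration of the policy, not anyone's actual crawler: `robots_policy`, `default_fetch`, the three decision constants, and the retry count and delay are all assumptions made for the example. Treating other HTTP errors such as a 5xx the same as a network failure is also my assumption.

```python
import time
import urllib.error
import urllib.request

RULES = "rules"                # robots.txt was fetched; obey its contents
ALLOW_ALL = "allow_all"        # verifiable 404: the site has no robots.txt
DO_NOT_CRAWL = "do_not_crawl"  # rules could not be determined; stay away

def default_fetch(url):
    """Fetch a URL and return its body as text (raises on HTTP or network errors)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")

def robots_policy(url, fetch=default_fetch, attempts=3, delay=5.0, sleep=time.sleep):
    """Return (decision, body) for a robots.txt URL.

    A network failure is never treated as a 404: it is retried, and if every
    attempt fails the answer is DO_NOT_CRAWL.
    """
    for attempt in range(attempts):
        try:
            return (RULES, fetch(url))
        except urllib.error.HTTPError as err:
            if err.code == 404:
                return (ALLOW_ALL, None)
            # Any other HTTP error (assumption: handle like a failure and retry).
        except OSError:
            pass  # URLError, timeouts, refused connections: never reached the server
        if attempt < attempts - 1:
            sleep(delay)
    return (DO_NOT_CRAWL, None)
```

The `fetch` and `sleep` parameters exist only so the logic can be exercised without touching the network; a real crawler would presumably also cache the result and come back later rather than give up for good.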

No worries here. I've informed the folks behind the thing that their bot is no longer welcome here and any connections it makes will be considered trespass. The fun part? When their bot comes around it will not see my web site. It will instead see a very, very long joke that will be delivered very, very slowly. How slowly? From start to finish will take from an hour and a half to more than six hours.
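The general idea of a slow-drip response is simple enough to sketch. What follows is not the application described above, just a minimal illustration of the technique; the function names, chunk size and delays are invented for the example.

```python
import random
import time

def drip(text, chunk_size=8, min_delay=1.0, max_delay=4.0, sleep=time.sleep, rng=random):
    """Yield `text` in small chunks, pausing a random interval before each one."""
    for start in range(0, len(text), chunk_size):
        sleep(rng.uniform(min_delay, max_delay))
        yield text[start:start + chunk_size]

def duration_bounds(text_length, chunk_size=8, min_delay=1.0, max_delay=4.0):
    """Return (shortest, longest) total delivery time in seconds."""
    chunks = -(-text_length // chunk_size)  # ceiling division
    return chunks * min_delay, chunks * max_delay
```

With these made-up defaults, a 43,200-character joke goes out in 5,400 chunks, so delivery takes somewhere between 5,400 seconds (an hour and a half) and 21,600 seconds (six hours). A generator like `drip` could serve as the body of a streaming response in whatever web framework you use, as long as the server flushes each chunk as it is produced.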

If you've seen a bad bot in your logs and want to punish it in this way, feel free to hit my contact form to inquire about it. It's a freebie if all you need is the application itself and very minimal installation/configuration instructions. After all: Bad Bots Must Be Punished!

→ committed: 8/6/2012 17:36:54

[ / internet / web_weirdness] permanent link
