Keep up to date with your favourite Rails bloggers in context.
regenerate paperclip thumbnails
@xentek Watch, learn, weep tha…
Here’s an awesome zipcode find…
Coffee Links #3
Today over coffee I'm reading Rails Envy. The most interesting items for me:
And to finish, a job opening:
iftop - Find out who is eating your bandwidth
Quote of the day
A Rails 2.1 case study [del.icio.us]
A couple people have asked me how I'm hosting the Sinatra-based pastie service we wrote in yesterday's revised tutorial. The previous version ran on a mongrel handler frontended by nginx, but for this version I decided to try something a little different.
One of the big announcements at RailsConf last month was that Passenger (aka mod_rails) would be releasing a v2.0 with support not only for Rails applications but also for Rack, meaning that any Rack-based Ruby web framework can also run on it. Yay for deployment options, right? So anyway, I figured we'd give that a shot.
For those of you who haven't yet mucked with it, Passenger is dead simple to set up. Run gem install passenger to pull down the gem, then execute the passenger-install-apache2-module command to build and install the Apache 2.2 module (you'll need the proper Apache libraries to be present, of course). The command output will show you how to configure Apache to load the module.
Getting Passenger to run a Sinatra-based application also turned out to be remarkably easy. All you need to do is create a regular old Rackup script. The file will need to be named config.ru and should contain all the logic necessary to initialize our app:
require 'rubygems'
require 'sinatra'

Sinatra::Application.default_options.merge!(
  :run => false,
  :env => ENV['RACK_ENV']
)

require 'toopaste'
run Sinatra.application
Place this file in the folder on your server where toopaste.rb and the views directory reside.
Next, create a public directory. This is where any static images, JavaScripts, or stylesheets would be kept (we're not using any, in this simple example). Point your vhost's DocumentRoot here:
<VirtualHost *:80>
  ServerName paste.zerosum.org
  DocumentRoot /var/www/apps/toopaste/public
  ...
</VirtualHost>
You might as well create a tmp directory too. You can place a restart.txt file in this directory to tell Passenger it needs to reload the app without restarting Apache (you can use this in your cap restart tasks, too).
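As a minimal sketch of that restart step, assuming plain Ruby with only the standard library: the helper name request_passenger_restart is hypothetical (Passenger does not provide it), and all it does is create tmp/restart.txt under the application root or refresh the file's timestamp.

```ruby
require 'fileutils'

# Hypothetical helper, not part of Passenger: ask for an app reload by
# touching tmp/restart.txt under the given application root.
def request_passenger_restart(app_root)
  tmp_dir = File.join(app_root, 'tmp')
  FileUtils.mkdir_p(tmp_dir)                # create tmp/ if it is missing
  restart_file = File.join(tmp_dir, 'restart.txt')
  FileUtils.touch(restart_file)             # create the file or update its mtime
  restart_file                              # return the path that was touched
end
```

With the vhost above, the call would be request_passenger_restart('/var/www/apps/toopaste'). The same one-liner could be wrapped in whatever restart task your deployment tool uses.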
Couldn't be much easier. For more information on deploying other Rack-based frameworks (Merb, Camping, Ramaze, etc) and various other config options, check Passenger's user guide.
"OH: the trick is to write your unit tests while you’re sober. then write the code while..."
@xentek weird. Just do zennawa…
As so often happens in the world of software, someone acquires a copy of Dreamweaver and instantly thinks that the website they have spent hours working on is now the bee's knees. Everyone has had this feeling at one time or another; I certainly have (blinking lights and MS clip art were all the rage back in the '90s).
A guy I knew during my time in Dalian, China, got back in contact today, proclaiming that he now has an outsourcing team assembled, working as soldiers of fortune. Their mission: to crack open the mythical giant of the US & EU software industry and reap the rewards. Anyhow, we got to chatting about this venture of his and his plans for worldwide domination, and then he showed me his website.

My eyes are still bleeding. It looks terrible, it is terrible; small children would run away from this colourful behemoth, and grown adults would roll into the fetal position and cry out for their mothers. Let's just say it was bad. So anyhow, I proceeded to give him the benefit of the doubt. I remember saying that blue backgrounds full of ships and bright yellow links have no place on an outsourcing website, that the '90s want their webpage back, that messages to customers should never contain the phrase "Just do it, man!!!" and, above all, that the practice of harvesting links pointing to 90% of the world's spam resources is just one very bad idea.
To top it all off, while browsing the site (I was wearing protective glasses at the time), I was infected with a number of nasty trojans and viruses. My friend has become a spam king. He was wondering why he had no US & EU clients, so I gave it to him straight. A company usually makes its first impression via its website, and getting infected with anything is always a negative in a visitor's view. These days so many sources of information are inspected before a deal is done to ensure that the opposite party is legit. My first Google search for my friend's page yielded results for Viagra, Erectile Dysfunction and JS_DLOADER.JS; it's a very weird combination.
One thing I always noticed was a huge divide between website design in the western world and design in China. It's not a bad thing; it's a matter of culture. I always equated it with walking down the main streets of the respective cities: in the Middle Kingdom you are faced with glaring neon lights proclaiming KTV bars, electronics and noodles, while Ireland, thankfully, is a bit more quaint in that regard. So when I see websites in China with all the bells and whistles I don't pay much notice, but I do spend a lot of time trying to see past the junk for the nuggets of gold that are off to some side. In the western world, things are much more minimalistic: each page focuses on a specific area, one that automatically draws the user's attention. At the end of the day, it is chalk and cheese; I would have great difficulty designing any Chinese site, and vice versa.
I do hope that he takes some of the advice I gave him, but I doubt he will. I'm a minimalist type of guy: I like clean lines, some curves and good content. Flash, Flex, Silverlight, well, those are things I am not into. Sure, they have their uses and can look very eye-catching, but mostly they are just used for evil purposes. I know what I like and can definitely tell you what I hate. Bells and whistles = FAIL. Spam URLs = EPIC FAIL....

Viacom had astonishing balls to ask for the source code for the search functions that power Google and YouTube, the source code for YouTube's new "Video ID" program, a complete set of every video ever removed from the site, databases containing information on every video ever hosted at YouTube, and a copy of every private video.
So the ruling by a judge in the District of New York awarded Viacom's representatives access to all of YouTube's logs. I have a number of concerns about this whole debacle, which I'll talk about here. What the hell is Google doing with over 12 terabytes of data just containing logs? I can really sense EU lawyers salivating over this; the EU have unbelievably strict privacy laws.
Why was Viacom looking for access to source code? There are hundreds of cloned services out there that mirror YouTube, and YouTube was not the first to create this kind of service; obtaining and possibly leaking the code would just enable more clones to be made available.
Whatever happened to YouTube's amazing technology that would automatically weed out copyrighted content from the site? Was this implemented? If so, how many videos were removed as a result? Who are Viacom looking for information on, uploaders or the actual content consumers?
Lastly, I personally find it hilarious that Viacom were requesting the search technology behind Google and YouTube. They are the same technology; can you imagine for one second Viacom actually acquiring the most intimate details of Google's core business? That's right, neither can I. What is really astonishing is that a division of Viacom, CBS, has major deals with YouTube that involve having a large CBS channel within the site.
It's fast becoming very clear that, despite all the press releases and lawsuits, there is a major battle being waged between old media and new. Content owners are trying to figure out what to do and how to do it before the old medium of television dies a lonely death (not for some time, I expect). YouTube, Justin.tv, and many more of its ilk divide and conquer.
Google may shortly begin regretting buying the service. Mark Cuban once famously commented “Only a ‘moron’ would buy YouTube”; he may be right after all. These types of lawsuit are only the tip of the iceberg. It happened with book publishers, it has happened with radio, it's happening with record labels and artists. Video is the new battleground...
Off-Topic: New Plans at Locaweb
Folks, we have started launching some new things here at Locaweb. There is much more to come. But to begin with, the new shared hosting plans have gone live.
In short, for the same prices, starting at R$ 18, we now offer 25 times more space, 10 times more domains, 10 times the bandwidth and unlimited MySQL databases on all plans. See more details here.
As an example, on the Expresso plan this means 5 GB of disk space, 100 GB of monthly transfer and 50 domains. I think this should make the accounts more attractive to Brazilians.
Speaking of which, I am quite busy over here :-) We have a lot of cool stuff in the oven. The Rails Trial is still under way, but sign-ups have already been closed because we have more people than we expected! I apologize for the delay in releasing some accounts, but I am finishing the last activations, phew! I look forward to your feedback!
Perfect weather out on the patio

java weka.classifiers.trees.J48 -t /some/where/train.arff -d /other/place/j48.model
java weka.classifiers.trees.J48 -l /other/place/j48.model -T /some/where/test.arff
In response to 37Signals announcing that they will stop supporting IE 6, I checked my Google Analytics and discovered something surprising: 7% of EnfranchisedMind readers still use IE 6.
To them, I say — PLEASE UPGRADE. And let me know why you’re still using it — I’m really, genuinely curious.
So, why should you upgrade? I’ll quote 37 Signals:
The Internet Explorer 6 browser was released back in 2001, and Internet Explorer 7, the replacement, was released nearly two years ago in 2006. Modern web browsers such as IE 7, Firefox, and Safari provide significantly better online experiences. Since IE 6 usage has finally dipped below a small minority threshold of our customers, it’s time to finally move beyond IE 6.
[...]
IE 6 is a last-generation browser. This means that IE 6 can’t provide the same web experience that modern browsers can. Continued support of IE 6 means that we can’t optimize our interfaces or provide an enhanced customer experience in our apps. Supporting IE 6 means slower progress, less progress, and, in some places, no progress. We want to make sure the experience is the best it can be for the vast majority of our customers, and continuing to support IE 6 holds us back.
More information can be found at the Stop IE6 Campaign. Specifically, see the Top 10 Reasons (there are actually 12 of them…).
As Internet Explorer 8 cruises into being, can we please agree to put to death this ancient, buggy, insecure piece of code?


@blaix You need to go get that…
@xentek Another SVN client for…
Refactoring an ActiveRecord callback
Inspired by a few articles and presentations (1 2 3 4), I decided it was time to clean up some of the logic in my Post model related to a particular ActiveRecord callback. The fact that I needed some comments to explain what it is doing should be a red flag.
before_validation :update_published_at_if_necessary

def published?
  self.is_published == true
end

def unpublished?
  !published?
end

protected

# Ensure that published_at is set accordingly.
#
# * Unpublished posts should not have this set
# * Published posts should have it set to the current time
def update_published_at_if_necessary
  if new_record? && published? && published_at.nil?
    self.published_at = Time.now
  end
  if !published? && !published_at.nil?
    self.published_at = nil
  end
end
Fortunately, I have tests in place that exercise this logic, so as long as my tests are passing, the refactorings must work (hopefully!).
My first impression is that the callback is doing too much work. Let's split it up into two pieces.
before_validation :set_published_at_to_now_if_necessary
before_validation :unset_published_at_if_necessary

protected

def set_published_at_to_now_if_necessary
  if new_record? && published? && published_at.nil?
    self.published_at = Time.now
  end
end

def unset_published_at_if_necessary
  if !published? && !published_at.nil?
    self.published_at = nil
  end
end
That's somewhat better. At least the methods are more focused.
Hmm, I probably don't need to check the existing published_at value when something isn't published. I can also make use of unpublished?
def unset_published_at_if_necessary
  if unpublished?
    self.published_at = nil
  end
end
I would do something similar for set_published_at_to_now_if_necessary, but I don't want to override published_at if it was explicitly set. Maybe I post something in the future, or the past. I go a little crazy sometimes with that.
I could probably simplify the conditional logic in set_published_at_to_now_if_necessary by making a new method.
protected

def set_published_at_to_now_if_necessary
  if new_published_post_without_published_at?
    self.published_at = Time.now
  end
end

def new_published_post_without_published_at?
  new_record? && published? && published_at.nil?
end
Having if_necessary in the method names is kind of bugging me. before_validation supports :if and :unless options, so we should use those, and remove the conditionals from the callback methods.
before_validation :set_published_at_to_now, :if => :new_published_post_without_published_at?
before_validation :unset_published_at, :if => :unpublished?

protected

def set_published_at_to_now
  self.published_at = Time.now
end

def unset_published_at
  self.published_at = nil
end
Looking good. Looking pretty, pretty good.
Let's see the finished product:
before_validation :set_published_at_to_now, :if => :new_published_post_without_published_at?
before_validation :unset_published_at, :if => :unpublished?

def published?
  self.is_published == true
end

def unpublished?
  !published?
end

protected

def set_published_at_to_now
  self.published_at = Time.now
end

def unset_published_at
  self.published_at = nil
end

def new_published_post_without_published_at?
  new_record? && published? && published_at.nil?
end
I'm pretty happy with this. Reads really well. Some of the methods have kind of long names, but I can deal with that.
The only part I don't really like is published? and unpublished?, but that's for another day.
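For anyone who wants to poke at this logic outside a Rails app, here is a rough plain-Ruby sketch. It is an assumption-laden stand-in, not the real thing: SketchPost is a hypothetical class, and its before_validation macro is a simplified imitation of the ActiveRecord one that only understands the :if option and runs the callbacks from valid?.

```ruby
# Plain-Ruby stand-in for the Post model; no Rails required.
# The before_validation macro below is a hypothetical, simplified imitation
# of the ActiveRecord one and supports only the :if option.
class SketchPost
  attr_accessor :is_published, :published_at

  # Registered callbacks, stored as [method_name, condition_name] pairs.
  def self.callbacks
    @callbacks ||= []
  end

  def self.before_validation(method_name, options = {})
    callbacks << [method_name, options[:if]]
  end

  before_validation :set_published_at_to_now, :if => :new_published_post_without_published_at?
  before_validation :unset_published_at, :if => :unpublished?

  def initialize(attributes = {})
    @is_published = attributes[:is_published]
    @published_at = attributes[:published_at]
    @new_record = true # this sketch never saves, so every post is "new"
  end

  def new_record?
    @new_record
  end

  def published?
    is_published == true
  end

  def unpublished?
    !published?
  end

  # Run each callback whose :if condition holds, in registration order.
  def valid?
    self.class.callbacks.each do |method_name, condition|
      send(method_name) if condition.nil? || send(condition)
    end
    true
  end

  protected

  def set_published_at_to_now
    self.published_at = Time.now
  end

  def unset_published_at
    self.published_at = nil
  end

  def new_published_post_without_published_at?
    new_record? && published? && published_at.nil?
  end
end
```

Calling valid? on SketchPost.new(:is_published => true) fills in published_at, an explicitly supplied published_at is left alone on a published post, and an unpublished post always ends up with published_at cleared.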
This September, I’ll be presenting at RailsConf Europe on EC2, MapReduce, and Distributed Processing. The talk will explain the MapReduce approach to distributed processing, will show a few example implementations, and will discuss MapReduce vs. other distributed processing techniques.
Whether you’ll be there or not, if you’re interested in learning more about MapReduce, here are some resources. I’ll write a few more posts on the subject before the conference, so watch this space as well.
Cluster Computing and MapReduce is a great series of video lectures given to Google interns in 2007. The first two are the most appropriate: the first introduces distributed processing concepts, while the second covers MapReduce itself.
MapReduce: Simplified Data Processing on Large Clusters is the paper by Jeffrey Dean and Sanjay Ghemawat of Google that got things going in the first place.
MapReduce for Ruby: Ridiculously Easy Distributed Programming discusses MapReduce and introduces Starfish, a Ruby library for distributed processing. Starfish is not a MapReduce implementation, however – it takes a somewhat different approach to distributed processing.
Skynet (a few writeups: InfoQ, Dion Almaer) is another Ruby-based distributed processing system inspired by MapReduce.
Writing Ruby Map-Reduce programs for Hadoop discusses using Ruby to wrap Hadoop, a MapReduce-like system built in Java.
Introduction to Parallel Programming and MapReduce at Google Code University, a good overview of distributed processing and the MapReduce approach.
And finally, one article that you should avoid:
MapReduce: A major step backwards compares MapReduce to relational databases, and says that MapReduce loses out because it doesn’t support database indices, database views, Crystal Reports, etc. Basically, the complaint is that MapReduce isn’t SQL compliant. WTF? Clearly, the author(s) didn’t understand what MapReduce is. The problem, as explained elsewhere, is that the authors thought that MapReduce == CouchDB/SimpleDB, which is obviously not true. Run %s/MapReduce/SimpleDB on the original article and it makes some sense. But long story short, this article will teach you nothing about MapReduce, and will likely confuse you further. So stay away.
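If you just want a feel for the approach before digging into the resources above, here is a toy single-process sketch in Ruby. It is only an illustration under loud assumptions: the map_reduce helper and the word-count lambdas are hypothetical names of my own, nothing is distributed, and it is not taken from Starfish, Skynet or Hadoop. It shows the three conceptual phases: map each input to key/value pairs, group the values by key, then reduce each group.

```ruby
# Hypothetical single-process illustration of the MapReduce flow.
def map_reduce(inputs, mapper, reducer)
  # Map phase: every input produces a list of [key, value] pairs.
  pairs = inputs.flat_map { |input| mapper.call(input) }

  # Shuffle phase: collect all values that share a key.
  grouped = Hash.new { |hash, key| hash[key] = [] }
  pairs.each { |key, value| grouped[key] << value }

  # Reduce phase: fold each key's values into a single result.
  result = {}
  grouped.each { |key, values| result[key] = reducer.call(key, values) }
  result
end

# Word count, the usual introductory example.
word_mapper = lambda { |line| line.downcase.scan(/\w+/).map { |word| [word, 1] } }
sum_reducer = lambda { |_word, counts| counts.inject(0) { |sum, n| sum + n } }

counts = map_reduce(["the quick fox", "the lazy dog"], word_mapper, sum_reducer)
```

In a real system the map and reduce calls would be farmed out to many machines; keeping the mapper and reducer free of shared state is what makes that possible.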
Cattle Brands

During my road trip across the United States we took a short rest
somewhere in the middle of Montana and I saw this interesting display
of local cattle brands.
My favorite is the “Lazy P Swinging 9”. I think I see the beginnings of
the Weezer =W= in there too!
SproutCore - a javascript framework