Stories from December 20th, 2010

Google Ngram Viewer

Google has a neat and simple online tool called the ‘Ngram Viewer’, part of their massive Google Books undertaking. Simply enter a comma-delimited list of words and see how frequently those words appear in printed literature going back to the 1800s and earlier. In the example above I compare ‘napoleon’ to ‘caesar’, and you can see that prior to 1920 Napoleon was pretty popular, but afterwards Caesar takes over.

The “Google Million”. All are in English with dates ranging from 1500 to 2008. No more than about 6000 books were chosen from any one year, which means that all of the scanned books from early years are present, and books from later years are randomly sampled. The random samplings reflect the subject distributions for the year (so there are more computer books in 2000 than 1980). Books with low OCR quality were removed, and serials were removed.

Google Ngram Viewer.

Science

 
Stories from December 16th, 2010

Google Map Foreclosure Tricks

Barry Ritholtz at The Big Picture economic blog points us to a nice trick that you can do with Google Maps to see foreclosures across the nation, and in your neighborhood.

  1. Go to Google Maps.
  2. On the right-hand side of the map, select “More”
  3. This will pull up a drop-down menu from which you can select “Real Estate”
  4. On the left-hand side, select “Foreclosure”

Note: This map does not reveal any of the millions of REOs that have already been sold by the banks that hold them.

But the maps do reveal an entire nation littered with foreclosure sales. It is an ugly and graphic depiction of how much inventory is out there, and why housing is still many years away from being healthy.

via: Google Map Foreclosure Tricks @ The Big Picture

Graphics

 
Stories from November 16th, 2010

Clean Up your Messy Data with Google Refine

Google has released a new open-source tool called ‘Google Refine’ that aims to make cleaning up messy datasets a breeze.  Their description is a bit sparse:

Google Refine is a power tool for working with messy data, cleaning it up, transforming it from one format into another, extending it with web services, and linking it to databases like Freebase.

Essentially it’s desktop software that you access through a web browser, which means you don’t have to upload your data to Google to benefit from it. It uses lots of ‘intelligent’ algorithms to fix common problems with data, like bad field alignment, format inconsistencies, and mangled input. It’s really meant for database-style inputs (rows and columns), but could be great for cleaning up user-survey responses or large downloaded datasets. The main workflow is to first apply a filter to select just the part of the data you want, then apply a single operation to that group. That operation can be anything from ‘delete’ (to remove offending rows from a cut-and-pasted Wiki entry) to ‘reformat’ (to convert lines into tables). Watch the videos after the break for more details.
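To make the filter-then-operate idea concrete, here is a minimal Python sketch. It is not Google Refine’s code or API; the function name `apply_to_matching` and the sample rows are made up purely for illustration. The filter selects a group of rows, and one operation is applied to every row in that group (returning `None` stands in for ‘delete’).

```python
def apply_to_matching(rows, matches, operation):
    """Apply one operation to every row selected by the filter.

    Hypothetical helper for illustration only (not part of Google Refine).
    `matches` is the filter; `operation` returns a new row, or None to delete.
    Rows that do not match the filter pass through unchanged.
    """
    result = []
    for row in rows:
        if matches(row):
            new_row = operation(row)
            if new_row is not None:
                result.append(new_row)
        else:
            result.append(row)
    return result


# Made-up messy data: an empty row and inconsistent spellings of one city.
rows = [
    {"name": "Alice", "city": "new york "},
    {"name": "", "city": ""},
    {"name": "Bob", "city": "NEW YORK"},
    {"name": "Carol", "city": "Boston"},
]

# Step 1: filter for rows with no name, then 'delete' that group.
cleaned = apply_to_matching(
    rows,
    matches=lambda r: not r["name"].strip(),
    operation=lambda r: None,
)

# Step 2: filter for any spelling of New York, then 'reformat' that group.
cleaned = apply_to_matching(
    cleaned,
    matches=lambda r: r["city"].strip().lower() == "new york",
    operation=lambda r: {**r, "city": "New York"},
)
```

Each pass is one filter plus one operation, which mirrors the workflow described above: narrow down to the offending rows first, then fix them all at once.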

via google-refine – Project Hosting on Google Code.

Read more…

Science

 
Stories from November 5th, 2010

The Dangers of Visualization: Nicaragua Raids Costa Rica

Shown above is the border of Nicaragua and Costa Rica (the grey line) on both Google Maps and Bing Maps. They’re not identical, but they’re close enough, right? Let me point you to a piece on Search Engine Land about a recent military ‘event’.

A Nicaraguan military commander, relying on Google Maps, moved troops into an area near San Juan Lake along the border between his country and Costa Rica. The troops are accused of setting up camp there, taking down a Costa Rican flag and raising the Nicaraguan flag, doing work to clean up a nearby river, and dumping the sediment in Costa Rican territory.

There’s a lot to take away from this, most of which people in the field already know. People rely very heavily on a visualization without properly checking the data behind it. That doesn’t normally result in military action, but it frequently results in misrepresentation and a poor understanding of the facts. Being an expert in visualization is only partly about isosurfaces and algorithms; a large part of it is understanding how the resulting visualization is perceived on both a physical (human vision) and psychological level.

Let this be a lesson to you: What you think may be a simple data glitch could actually result in a war between neighboring countries.

via Nicaragua Raids Costa Rica, Blames Google Maps.

Science

 
Stories from September 30th, 2010

Google Proposes replacing JPEG with WebP

Today, Google announced a possible replacement for JPEG, the worldwide accepted standard for web images. Just like JPEG, Google’s new ‘WebP’ format is a lossy compression standard; it is based on the guts of VP8, the video codec used in their WebM video format.

WebP uses predictive coding to encode an image, the same methodology used by the VP8 video codec to compress keyframes in videos. Predictive coding uses the values in neighboring blocks of pixels to predict the values in a block, and then encodes only the difference (residual) between the actual values and the prediction. The residuals typically contain many zero values, which can be compressed much more effectively. The residuals are then transformed, quantized and entropy-coded as usual. WebP also uses variable block sizes.
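The prediction-plus-residual step described above can be sketched in a few lines of Python. This is a toy illustration, not the WebP or VP8 algorithm: it assumes a single fixed block size, a single ‘horizontal’ predictor (repeat the pixel column just left of the block), image dimensions that are multiples of the block size, and it skips the transform, quantization and entropy-coding stages entirely. All function names are made up for this example.

```python
def predict_block(pixels, top, left, size):
    """Predict a block by repeating the pixel column immediately to its left.

    At the left edge of the image there is no neighbor, so fall back to
    mid-grey (128). This is one simple predictor chosen for illustration.
    """
    prediction = []
    for r in range(size):
        value = pixels[top + r][left - 1] if left > 0 else 128
        prediction.append([value] * size)
    return prediction


def encode_residuals(image, size):
    """Return actual-minus-predicted values for every pixel, block by block."""
    height, width = len(image), len(image[0])
    residuals = [[0] * width for _ in range(height)]
    for top in range(0, height, size):
        for left in range(0, width, size):
            prediction = predict_block(image, top, left, size)
            for r in range(size):
                for c in range(size):
                    residuals[top + r][left + c] = (
                        image[top + r][left + c] - prediction[r][c]
                    )
    return residuals


def decode_residuals(residuals, size):
    """Rebuild the image by adding each residual to the same prediction.

    Blocks are decoded left to right, so the left neighbor that the
    predictor needs has always been reconstructed already.
    """
    height, width = len(residuals), len(residuals[0])
    image = [[0] * width for _ in range(height)]
    for top in range(0, height, size):
        for left in range(0, width, size):
            prediction = predict_block(image, top, left, size)
            for r in range(size):
                for c in range(size):
                    image[top + r][left + c] = (
                        prediction[r][c] + residuals[top + r][left + c]
                    )
    return image


# A smooth 4x8 test image: every row holds one constant value.
image = [[10 * (r + 1)] * 8 for r in range(4)]
residuals = encode_residuals(image, 4)
zero_count = sum(value == 0 for row in residuals for value in row)
restored = decode_residuals(residuals, 4)
```

On this smooth image the second block is predicted perfectly from the first, so all of its residuals are zero. That is the property the quote refers to: where neighboring blocks resemble each other, the residuals are mostly zeros and compress well.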

So why try to change the world? For a 40% filesize reduction, that’s why. Shrinking most of the images in the world by that much would be a huge win. According to the Pingdom tools, downloading the main page of VizWorld.com takes 360K of data, and 300K of it is images. Cutting that 300K down to roughly 180K would be a nice start, and doing it worldwide would be huge.
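As a quick back-of-the-envelope check, the snippet below works through those numbers in Python. It takes the 40% figure and the Pingdom measurements at face value and assumes the reduction applies uniformly to every image on the page, which real pages will not match exactly; the function name is made up for this example.

```python
def page_size_after(total_kb, image_kb, reduction_percent):
    """Estimate page weight if every image shrinks by the same percentage.

    Illustrative helper only; assumes non-image content is unchanged.
    Returns (new image size, new page size, percent saved on the whole page).
    """
    new_image_kb = image_kb * (100 - reduction_percent) / 100
    new_total_kb = total_kb - image_kb + new_image_kb
    saved_percent = (total_kb - new_total_kb) / total_kb * 100
    return new_image_kb, new_total_kb, saved_percent


# Figures from the post: 360K page, 300K of images, 40% claimed reduction.
new_image_kb, new_total_kb, saved_percent = page_size_after(360, 300, 40)
print(f"images: {new_image_kb:.0f}K, page: {new_total_kb:.0f}K, "
      f"saved: {saved_percent:.0f}% of the page")
```

Under those assumptions the images drop from 300K to 180K and the whole page from 360K to 240K, so about a third of the total download disappears.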

However, JPEG has huge support from both software and browsers, and not all is wine and roses in the WebP world: encoding an image into WebP format takes an average of 8 times longer. On the other hand, due to its similarity to WebM and VP8, hardware designed to accelerate those standards is suitable for WebP as well.

via WebP Home and CNet News

Science

 
Stories from September 13th, 2010

Google This – 13 Years of World Domination Visualized

Tiago Veloso’s latest contribution on InspiredMag is a collection of visualizations of the growth of Google and their various properties (such as YouTube) to world-dominating proportions.

September 15, 1997. That was the day Larry Page and Sergey Brin officially registered the domain google.com, and the internet was never the same.

Because true inspiration – and innovation – can really transform the World, today we bring a selection of data-visualizations that show how Google stepped up to be the giant that we all know today. I also recommend a visit at Wikipedia’s article for a more “text” version of this incredible path.

via Google This – 13 Years of World Domination Visualized | Inspired Magazine.

Graphics

Infographic: Just how Massive Is Google anyway?

 
Stories from September 2nd, 2010

Google releases SketchUp v8

It seems Google has quietly released a new major upgrade for SketchUp that includes model geolocation, color imagery, and a new tool called ‘Building Maker’.  They’ve added improved tools for photo matching, and all around seem to have done a good job of adapting SketchUp to be a great tool for architectural modeling.

The pro version adds support for basic constructive solid geometry operations like union, intersection, and subtraction, as well as some CAD-like features such as angular dimensions and volume calculation.

Google SketchUp.

Graphics

 
Stories from August 24th, 2010

Tracking Google’s Acquisitions

 
Stories from August 10th, 2010

Understanding Google PageRank

VizWorld.com is a production of VizWorld, LLC © 2009