Do Your Photos Leak?

August 16, 2010

One of the side effects of having so many digital devices in our lives that collect so much information, even incidentally, is that it can be difficult to be sure what information might be getting disclosed, and to whom.  The New York Times, in a recent article, pointed out another potential risk, probably unknown to most users, that arises when photographs or videos are posted on the Web.

Many modern digital cameras, and cameras embedded in devices like smart phones, produce output files that include not only the image itself, but also metadata: data that give information about the image.  One of the most common formats for this metadata is EXIF, used with JPEG and TIFF image files.  It was originally introduced so that information about a photo (such as the date, time, shutter speed, and aperture) could be attached to the image.  Subsequently, it has been extended to incorporate other types of data as well, including (and this is the kernel of the problem) geographic location, so-called “geotagging”.   The location data may be derived from a GPS receiver in the device, from cell-tower triangulation, or some combination of methods.  Other image file formats, including video formats, have their own methods of recording metadata.
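To make the geotagging data concrete: EXIF stores latitude and longitude as degrees, minutes, and seconds, each written as a rational number (numerator and denominator), together with a hemisphere letter (N/S or E/W).  The sketch below shows how such values might be turned into the signed decimal degrees that mapping tools expect.  The function name and the sample tag values are made up for illustration, and the code assumes the values have already been read out of the file by some EXIF tool.

```python
from fractions import Fraction


def dms_to_decimal(dms, ref):
    """Convert EXIF-style degrees/minutes/seconds to signed decimal degrees.

    dms is three (numerator, denominator) pairs; ref is 'N', 'S', 'E' or 'W'.
    """
    degrees, minutes, seconds = (Fraction(n, d) for n, d in dms)
    value = degrees + minutes / 60 + seconds / 3600
    # Southern and western hemispheres are negative in decimal notation.
    if ref in ("S", "W"):
        value = -value
    return float(value)


# Hypothetical tag values (a point in Washington, DC), not from a real photo.
lat = dms_to_decimal(((38, 1), (53, 1), (2323, 100)), "N")
lon = dms_to_decimal(((77, 1), (2, 1), (1180, 100)), "W")
print(round(lat, 5), round(lon, 5))
```

Pasting a decimal pair like this into any online map is all it takes to see where a photo was taken.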

When photos or videos taken with these devices are uploaded to a Web site, the geotagging information is often preserved; this allows anyone viewing the photo with appropriate tools (which are readily available) to see the exact location where the photo was taken.   (There are some sites that typically do not preserve the metadata.)  The possibilities were explored in a paper [PDF] by Gerald Friedland and Robin Sommer, of the University of California, Berkeley, who presented the research in the “HotSec ’10” workshop at the 2010 USENIX Security Symposium, held last week in Washington DC.  The authors explore several different scenarios that use Web-posted photos to electronically “case the joint”.

  • Craigslist ads often contain images of items for sale.  Depending on how the images were posted, the metadata, including location data, may be accessible.  The authors note that some of these images are of valuable objects, such as jewelry, that were obviously taken at home, providing an inviting target for burglars.
  • Twitter users may post photos associated with their “tweets”, typically on TwitPic, which preserves location data.  The researchers were able to find the home address of a well-known Twitter user using this data.  They were also able to discover a non-published Twitter feed for another celebrity, by searching for images tagged with his address.
  • YouTube also provides some opportunities for mischief.  The researchers started by searching YouTube for videos located within a certain radius of a chosen location.  They then looked for vacation videos posted by those same users from different locations: for example, at least 1,000 miles away.  This technique also provides an excellent automated “shopping list” for burglars.
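The YouTube scenario boils down to two distance checks, which shows how little effort the correlation takes.  The following is only a rough sketch of the idea, not the researchers' actual code: the record format, the function names, the 5-mile "near" radius, and the sample data are all assumptions made for illustration, and "far away" is measured from the chosen location rather than from each user's home.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(h))


def near_and_far_users(videos, target, near_miles=5, far_miles=1000):
    """Return users with one video close to target and another far from it."""
    near, far = set(), set()
    for user, coords in videos:
        distance = haversine_miles(target, coords)
        if distance <= near_miles:
            near.add(user)
        elif distance >= far_miles:
            far.add(user)
    return near & far


# Made-up records: (user, (lat, lon)) for each geotagged video.
target = (37.8716, -122.2727)
sample = [
    ("alice", (37.8700, -122.2700)),  # close to the chosen location
    ("alice", (21.3069, -157.8583)),  # far away
    ("bob", (37.8800, -122.2600)),    # close only
    ("carol", (40.7128, -74.0060)),   # far only
]
print(near_and_far_users(sample, target))
```

Only the user who appears both near the chosen location and far from it is reported, which is exactly the pattern the authors warn about.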

Some of the devices used, such as the Apple iPhone 3GS, can achieve better locational accuracy than GPS alone, by combining GPS data with triangulation data from cellphone towers.  The authors estimate that, under favorable conditions, an accuracy of ± 1 meter is possible.  Even under poorer conditions, localization to a specific address is usually easy.

The potential for harm is amplified by the fact that most users probably do not know how much data they are actually revealing.  Even for those who do, navigating the combination of device configuration menus and Web site data sharing controls is not easy.  The authors suggest the development of a common framework for these settings; they also suggest that allowing locational data to be made “fuzzy” (for example, by including location data, but only with enough resolution to identify the city) could go a long way toward preventing the worst abuses.
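One simple way to picture “fuzzy” location data is to round the coordinates before they are published.  The sketch below is an illustration of that idea only, not the authors' proposal: the function name and the choice of one decimal place are assumptions.  A tenth of a degree of latitude is roughly 11 kilometers, which is about city-level resolution.

```python
def fuzz_location(lat, lon, decimals=1):
    """Round coordinates so they identify a general area, not an address.

    One decimal place keeps about 11 km of resolution in latitude;
    the default is an assumed setting, not a recommendation from the paper.
    """
    return round(lat, decimals), round(lon, decimals)


# Hypothetical precise coordinates, reduced to city-level resolution.
print(fuzz_location(38.889786, -77.036611))
```

Rounding is crude (every point in the same grid cell maps to the same value), but it shows that coarsening the data is easy once a device or site chooses to do it.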

Google Bike Maps Revisited

August 16, 2010

Back in March, I posted a note about a new feature of Google Maps: the addition of routes and directions for bicyclists.  This was introduced as a “beta” service, as is customary with Google, but my initial reactions, and some reported later, were mostly favorable.  It is almost a given that any such service would exhibit some early problems, mostly data-related, because the availability of cycling route information is much more limited than for auto routes.  On balance, it seemed like a positive development to have this information available at all.

This past week’s New York Times had an article with some more recent reactions to Google’s service.  For a beta version just introduced earlier this year, it is a fairly ambitious undertaking:

The beta version for bicyclists is just a few months old, but it is already reshaping how bike enthusiasts travel. Spanning more than 200 cities nationwide — and with plans to roll out bicycle routes internationally — Google Maps relies on a mash-up of data, from publicly available sources like bike maps to user-generated information.

The Google service, or any other service, is not likely to match the knowledge embedded in a locally-produced bike map, augmented by advice from local cyclists.  The quality of the basic data that Google uses varies by city, too.  Some “bike friendly” places, like Portland, Oregon, have extensive data available on local routes and trails.  Still, the quality of the routes should improve as more user data is submitted.

Mr. Barth [Dave Barth, Google Maps product manager] says that as Google Maps software becomes more user-generated — it has already been deluged with over 20,000 suggested corrections — bikers will be able to edit the data on a hand-held device as they actually cycle.

Any route-finding service is bound to miss some of the tricks that you find by exploring, and sometimes exploring is half the fun.  But it’s handy to have an easily-accessible resource that you can use when visiting a place for the first time.
