Seam Carving for Content-Aware Image Resizing

RayAlome shared this video with me: Advanced Photo Resizing

Don’t let the boring title fool you, this video is amazing.

Technology by Ariel Shamir of the Efi Arazi School of Computer Science in Israel.

I’ve got to say, this software is rather amazing. It appears to have been presented at SIGGRAPH 2007 by Shai Avidan (Mitsubishi Electric Research Lab) and Ariel Shamir (The Interdisciplinary Center & MERL). As the disclaimer suggests, I wasn’t too attracted to the title and it sounded somewhat boring. However, as I watched the video, it never stopped amazing me how well the software/algorithm they wrote performed.

Image resizing has always been a problem. When you scale down an image, you lose too much important information and when you enlarge images you get pixelation. This new technique appears to be able to fix both those problems.

It’s hard for me to explain exactly what’s going on, but by looking at levels, gradients, and other ways a computer sees images that humans don’t, they label areas of the picture as high energy or low energy. When you resize an image today, you generally remove one vertical or horizontal line every few lines. However, that leaves your image jagged and funky looking, especially if you’re not resizing to the same aspect ratio.
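From what I can tell, the "energy" of a pixel is basically how much detail is around it. A minimal sketch of what such an energy map might look like (this is my own guess at a simple gradient-magnitude energy, not the exact function from the paper):

```python
import numpy as np

def energy_map(img):
    """Estimate per-pixel 'energy' as gradient magnitude.

    High values mark detailed, important regions; low values mark
    smooth regions that can be removed with little visible damage.
    """
    gray = img.astype(np.float64)
    if gray.ndim == 3:          # collapse color channels to intensity
        gray = gray.mean(axis=2)
    dy, dx = np.gradient(gray)  # finite differences in y and x
    return np.abs(dx) + np.abs(dy)
```

On a picture with a sharp vertical edge, the columns along the edge light up with high energy while the flat areas stay near zero, which matches the high/low energy zones shown in the video.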

By finding these high and low energy zones, they are able to remove a seam (a non-straight line of pixels) with the lowest possible energy as defined by the user. So as you stretch or shrink the image, the algorithm automatically calculates what to add and what to remove, leaving the image in the crispest and sharpest state possible. I also really enjoyed the ability to give positive or negative weights to areas I want to remove or keep, allowing me to remove people I don’t want in an image and keep those I do. Take that, you bystander who always seems to appear randomly in my digipix.
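As best I understand it, finding that lowest-energy seam is a dynamic programming problem: each row's cost is its own energy plus the cheapest of the three cells above it, and you backtrack from the cheapest bottom cell. Here's a rough sketch of that idea, given an energy map (the function names and details are my own, not from the paper; user weights would simply be added to the energy map before this step, large positive to protect an area, large negative to target it for removal):

```python
import numpy as np

def find_vertical_seam(energy):
    """Return, for each row, the column of the cheapest connected
    top-to-bottom seam: cost[i, j] = energy[i, j] + min of the three
    cells above (j-1, j, j+1), then backtrack from the bottom."""
    h, w = energy.shape
    cost = energy.astype(np.float64).copy()
    for i in range(1, h):
        left = np.r_[np.inf, cost[i - 1, :-1]]   # cell above-left
        right = np.r_[cost[i - 1, 1:], np.inf]   # cell above-right
        cost[i] += np.minimum(np.minimum(left, cost[i - 1]), right)
    seam = np.empty(h, dtype=int)
    seam[-1] = int(np.argmin(cost[-1]))
    for i in range(h - 2, -1, -1):               # walk back up
        j = seam[i + 1]
        lo, hi = max(0, j - 1), min(w, j + 2)
        seam[i] = lo + int(np.argmin(cost[i, lo:hi]))
    return seam

def remove_vertical_seam(img, seam):
    """Drop one pixel per row along the seam, shrinking width by 1."""
    h, w = img.shape[:2]
    mask = np.ones((h, w), dtype=bool)
    mask[np.arange(h), seam] = False
    return img[mask].reshape(h, w - 1, *img.shape[2:])
```

Shrinking an image by N pixels would just repeat this N times, recomputing the energy map as you go, which is presumably why the seams wander around the important content instead of cutting straight through it.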

I don’t really get how the algorithm works for enlarging pictures. It mentioned it finds the lowest energy seam, expands it, and interpolates the colors to match the area. I’d assume that if you’re adding to the lowest energy area, it would remain the lowest energy area. Yet during the demo of stretching the image, you could clearly see the lowest energy seam doesn’t stay the same. Oh well, guess we’ll have to wait for the paper to be released to see what’s going on.
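My best guess at the resolution: to enlarge by k pixels, you could compute the first k seams that removal would have picked, in order, and duplicate all of them at once, averaging each inserted pixel with its neighbors. Inserting into a single seam over and over would indeed keep selecting the same low-energy path, which is exactly the problem I was puzzling over. A minimal sketch of the insertion step (the function name and the neighbor-averaging rule are my assumptions, not confirmed by the video):

```python
import numpy as np

def insert_vertical_seam(img, seam):
    """Duplicate a seam, growing width by 1. Each inserted pixel is the
    average of its horizontal neighbors so the copy blends in rather
    than producing a visible repeated column."""
    h, w = img.shape[:2]
    out = np.empty((h, w + 1) + img.shape[2:], dtype=img.dtype)
    for i in range(h):
        j = seam[i]
        out[i, :j] = img[i, :j]
        left = img[i, max(j - 1, 0)].astype(np.float64)
        out[i, j] = ((left + img[i, j]) / 2).astype(img.dtype)  # blend
        out[i, j + 1:] = img[i, j:]
    return out
```

If the paper really does batch the k seams up front, that would explain why the demo’s seams move around while stretching instead of piling onto one spot.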

But can you imagine if browsers, photo viewers, or even thumbnail generators utilized this algorithm? That would be so sweet!

I brought up the point of applying this to video, as it sounded like a very cool area for it. Imagine watching YouTube videos without the blurriness, where shrinking only removes the useless areas. Or stretching a video from 640×480 to 1920×1080 to make it HD. Haha. I would really want to see what a movie would look like then.

Anyway, another interesting SIGGRAPH 2007 presentation was the Scene Completion Using Millions of Photographs by James Hays and Alexei A. Efros from Carnegie Mellon University.

Abstract

What can you do with a million images? In this paper we present a new image completion algorithm powered by a huge database of photographs gathered from the Web. The algorithm patches up holes in images by finding similar image regions in the database that are not only seamless but also semantically valid. Our chief insight is that while the space of images is effectively infinite, the space of semantically differentiable scenes is actually not that large. For many image completion tasks we are able to find similar scenes which contain image fragments that will convincingly complete the image. Our algorithm is entirely data-driven, requiring no annotations or labelling by the user. Unlike existing image completion methods, our algorithm can generate a diverse set of image completions and we allow users to select among them. We demonstrate the superiority of our algorithm over existing image completion approaches.

Yet another really interesting algorithm. This one allows you to fill in a missing area or remove unwanted objects from a photo and have it filled in with something else. It solves quite a different problem from the former, yet I was really astonished by the results in their paper and presentation.

The algorithm scans through thousands, if not millions, of images to see if there’s a match that would fill in the hole in your image. I like the results. Building doesn’t match the surroundings? Remove it. Construction vehicles obstructing the view? Remove them. This is fun!
