
Google’s Clips camera offers a snapshot of things to come

The most revolutionary product of the year so far is Google’s ‘trivial’ Clips camera. The reason might surprise you.
  • Mike Elgan (Computerworld (US))
  • 07 October, 2017 22:00

Google held a big hardware event this week, announcing a couple of new Pixel-branded smartphones, two Google Home devices, a new Pixelbook laptop, new earbuds called Pixel Buds, and a consumer camera called Google Clips.

Of all the new Google products announced, Google Clips is the most interesting by far — which is to say that it represents the most interesting trend. This consumer device represents the future of enterprise A.I.

But wait, you might say. Isn’t Google’s Pixel Buds product the most revolutionary? Its ability to translate language in real time is something out of science fiction, and the elimination of language barriers surely has major implications for the future of mankind.

All that’s true — sort of. Many companies (including Google) have been building real-time translation software and delivering it to smartphones for years. Google Translate is amazing, and I’ve relied on it as I travel around the world.

The only translation innovation in the Pixel Buds is that the earbuds have external, outward-facing speakers in addition to the inward-facing ones, and outgoing translations play through those speakers while incoming translations play through the regular earbud speakers.

In other words, Pixel Buds simply play the audio from Google Translate, intelligently choosing between the two sets of speakers for playback.

The effect is mind-blowing, but the “innovation” of speaker selection ... not so much.
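To make the point concrete, the routing decision amounts to something like the sketch below. Every name here is invented for illustration; this has nothing to do with Google’s actual firmware.

    # Hypothetical sketch of the Pixel Buds speaker routing described above.
    # All names are invented; this is illustration, not Google's code.
    def route_translation(direction: str, audio: bytes, buds) -> None:
        if direction == "outgoing":
            buds.external_speaker.play(audio)   # the other person hears it
        else:
            buds.earbud_speaker.play(audio)     # the wearer hears it

That’s the whole trick.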

Google Clips, on the other hand, is the real revolution.

Why Google Clips changes everything

Google Clips is a $249 camera for parents.

I’ll speculate as to why Google chose to target that particular demographic in a moment. But first, a few facts about the camera itself.

Clips is a 12-megapixel camera. The housing is two inches square, with a clip on the back. The front features a round, black wide-angle lens housing (the lens captures a 130-degree field of view) and a light that blinks while it takes pictures, so it’s obviously a camera — it’s not a spy camera.

The Clips camera itself has no screen. Instead, you use a smartphone both to review pictures and control the camera in other ways. The camera does have a button for taking pictures, but that’s not supposed to be the main way pictures are taken.

So far, the camera I’ve described sounds like any number of existing products, including the “lifelogging” cams I’ve told you about previously in this space.

But the revolutionary part is the software. Google Clips uses artificial intelligence (A.I.) to choose when to take pictures. To “use” the camera, you twist the lens to get it started, place it somewhere, then forget about it.

It learns familiar faces, then favors those people (and pets!) when deciding when to take pictures. It looks for smiles and action, novel situations and other criteria. It discards blurry shots.

Each time it shoots, it captures a burst of photos at 15 frames per second, which you can save as a GIF or from which you can cherry-pick your favorite still photographs.
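As a thought experiment, the selection logic might look something like the sketch below. Google hasn’t published how Clips actually scores frames, so every name, weight and threshold here is an assumption made for illustration.

    # A hedged, minimal sketch of an on-device "shoot or skip" decision in
    # the spirit of Clips. All weights and thresholds are assumptions.
    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Frame:
        sharpness: float       # 0..1, e.g. estimated from image gradients
        familiar_faces: int    # faces matching the on-device face model
        smile_score: float     # 0..1, from an expression classifier
        motion_score: float    # 0..1, frame-to-frame change

    BURST_FPS = 15             # Clips shoots bursts at 15 frames per second
    BURST_SECONDS = 3          # assumed burst length for this sketch
    CAPTURE_THRESHOLD = 1.5    # assumed; a real device would tune this

    def interest(f: Frame) -> float:
        """Combine the cues described above: familiar faces, smiles, action."""
        return 1.0 * f.familiar_faces + 0.8 * f.smile_score + 0.5 * f.motion_score

    def maybe_capture(f: Frame) -> Optional[int]:
        if f.sharpness < 0.4:             # discard blurry shots outright
            return None
        if interest(f) < CAPTURE_THRESHOLD:
            return None
        return BURST_FPS * BURST_SECONDS  # frames in one captured burst

The point isn’t the particulars; it’s that the decision to shoot happens on the device, with no human in the loop.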

The Clips has no microphone, and it cannot record sound.

In short, the A.I. is designed to take great pictures and GIFs, with the added advantage of shooting when no photographer is around to change the behavior of the people being photographed.

And here’s the revolution: The face recognition takes place on the device, not in the cloud. Pictures are stored on the device, not in the cloud.

These are the attributes Google touts as ensuring privacy. No sound. No automatic uploading.

Of course, you can use the app to choose clips for uploading to Google Photos. Once uploaded, the pictures are processed again for face recognition, this time with names attached if you’ve used Google Photos’ name-to-face feature.
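That flow (on-device by default, cloud only on request) reduces to a simple gate. Here’s a minimal sketch, with invented names, of what such a sync step would enforce:

    # Sketch of the selective-upload gate described above; the names are
    # assumptions, not Google's API. Nothing leaves the device unless the
    # user explicitly picked it in the app.
    def sync(stored_clips, user_selected_ids, upload_to_google_photos):
        for clip in stored_clips:
            if clip.id in user_selected_ids:
                upload_to_google_photos(clip)  # only now does it reach the cloud
            # everything else stays in on-device storage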

Why Google targeted Clips at parents

I’m speculating here, but I believe Google arrived at parents through a process of elimination.

Google was slammed hard over its Google Glass experiment, because that device put a camera on people’s faces, which made many in the public and press uncomfortable.

A large number of startups have since come out with little, square, wearable clip-on cameras, most of which fizzled in the market due to high price, low picture quality and the fact that wearing a camera can be socially awkward.

Google Clips looks externally like any number of these cameras, and my guess is that Google’s initial intent was to both join ’em and beat ’em, by offering a clip-on, wearable camera powered by A.I.

But Google was right about this in its announcement: Clip-on cameras produce especially horrible photos. They tend to be blurry. The angle’s all wrong. It’s not a good experience.

For that or some other reason, Google decided to discourage clipping the Clips camera on clothing.

The Clips camera as-is makes an inferior security camera and an inferior action camera.

But parents are a perfect target market. The reason is that they just can’t take enough pictures of their kids. Taking pictures of children often disrupts normal life: Kids know their parents are taking pictures, so they either stop what they’re doing, or they pose, or they complain about having their picture taken.

The Clips camera is a solution. Parents can set it and forget it. When they come back later, maybe the camera captured some amazing moment without any disruption.

Best of all, it’s private and secure; nothing gets uploaded unless expressly selected for uploading.

The real revolution: A.I. on the camera

It’s clear that Google’s mission has little to do with selling cameras. It’s all about finding a path to camera-based A.I.

Google arrives at products like Clips by working backwards from the goal of using data in new ways to benefit users, customers, mankind — whatever.

A.I. in general and machine learning in particular offer new opportunities to generate action and insights from sensor-driven data.

Cameras are the Mother of All Sensors, in part because of the quality of the data and in part because of the ubiquity of cameras.

The public is squeamish about privacy invasion, and in fact privacy invasion is real and rampant. How to apply A.I. to camera data securely? Build it into the camera!

Start with parent cams, move later to dash cams, webcams, and security cams, and eventually put cameras everywhere. With the A.I. built into the cameras, the “product” of industrial-grade cameras never has to be pictures or videos — just insights.

This idea will change everything for enterprises.

What’s the idea? Imagine what’s possible with A.I.-based cameras where the output is data, rather than images.

For example, imagine cameras all over a big warehouse. Their output could be minute-by-minute lists of who comes and goes, how many widgets are stored in the facility at any given time, and other useful data.
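A hedged sketch of what such a camera’s output loop might look like follows; the model call and the event schema are assumptions invented for illustration, not any real product’s API.

    # Sketch of the "output is data, not images" idea. No pixels ever
    # leave the device; only structured events cross the network.
    import json
    import time

    def run_on_device_model(frame) -> dict:
        # Stand-in for a detector running on the camera's inference chip.
        return {"people": 3, "widgets": 1240}

    def emit_events(frames):
        last = None
        for frame in frames:
            counts = run_on_device_model(frame)
            if counts != last:          # report only when something changes
                print(json.dumps({"ts": time.time(), **counts}))
                last = counts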

Other applications could involve data and images, working together.

Security cameras could work in the opposite way from Google Clips. Just like the human security guard at the front desk, they could get to know familiar faces and ignore them, while zooming in on, tracking and recording the behavior and movement of unfamiliar faces. The A.I. could identify suspicious behavior and report it. The images could later be extracted as evidence.
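In code, that inversion of the Clips logic is almost a one-liner. Here’s a sketch under obvious assumptions (an on-device face matcher and a camera object with track/record controls), with all names invented:

    # Hypothetical inverse of Clips: ignore enrolled faces, record strangers.
    ENROLLED = {"alice", "bob"}      # identities known to the on-device model

    def handle_face(identity, camera):
        if identity in ENROLLED:
            return                   # familiar face: ignore, like the guard
        camera.track(identity)       # stranger: zoom in and follow
        camera.record()              # keep footage as potential evidence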

Another important function could be to gauge consumer “feelings” — literally.

A company called Silver Logic Labs is working on an algorithm that watches video to identify how people feel. One application is to replace Nielsen TV ratings with real-time data about how audiences feel about what they’re watching. The technology can work through an ordinary laptop webcam.

Consumers probably won’t accept actual video being uploaded to the cloud, where their faces would be stored, recognized and processed. But if the video never left the camera chip, and only encrypted, anonymized data were transmitted, consumers could maintain privacy, and TV studios and advertisers could get the ultimate audience measurement system.
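A minimal sketch of that split follows, with an assumed on-device affect model and the encryption step left as a placeholder; none of this is Silver Logic’s actual pipeline.

    # Sketch of the privacy split described above: video stays on the chip;
    # only an anonymized score leaves. Model and crypto details are assumed.
    import hashlib
    import json

    def engagement_score(frames) -> float:
        return 0.72                  # stand-in for an on-device affect model

    def outbound_payload(frames, session_salt: bytes) -> bytes:
        record = {
            # a salted hash: useful for de-duplicating sessions, useless
            # for recovering who was actually watching
            "viewer": hashlib.sha256(session_salt).hexdigest()[:16],
            "score": engagement_score(frames),
        }
        return json.dumps(record).encode()  # encrypt before transmitting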

Lives could be saved. Silver Logic’s technology could be used to predict strokes or to help police judge who is a threat and who isn’t. The trick is getting over the privacy hurdle. A camera that can’t transmit pictures at all, or that shares them only under a court order, is a shortcut past that hurdle.

The bottom line is that cameras are the ultimate sensor for data that will be processed via A.I. But right now, the need for privacy is blocking the revolution.

The solution can be found in Google’s Clips camera. By putting A.I. on the camera itself, it’s possible to gain the benefits of camera data without the privacy risks.

And that’s why Google is doing this. It’s all the benefits of A.I.-processed camera data — without the privacy invasion.