CIO

How to Ensure Privacy in the Age of HTML5

HTML5, the latest version of the language of the Web, was designed with Web applications in mind. It contains a slew of new application programming interfaces (APIs) designed to allow the Web developer to access device hardware and software using JavaScript.

Some of the more exciting HTML5 specifications include the following:

Read this list and you could conclude that HTML5 is being designed specifically for hackers and identity thieves. The reality, however, is that the authors of HTML5 take privacy very seriously.

Concerns over HTML5 weakening privacy protections were most famously and visibly expressed in a front-page New York Times article on Oct. 10, 2010. "New Web Code Draws Concern Over Privacy Risks" talks mostly about the additional tracking enabled by new HTML5 browser storage capabilities. Samy Kamkar's Evercookie application is singled out as a particularly sinister example. Evercookie is a JavaScript app that writes tracking data to numerous places in a user's browser, making the data difficult to remove through normal means. Even worse, Evercookie will recreate all cookies if it discovers that they've been removed.
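The respawning behavior is easier to see in a small simulation. This is only a sketch, not Evercookie's actual code: the three Map objects are hypothetical stand-ins for separate browser storage locations, and syncTrackingId is an invented name.

```javascript
// Hypothetical stand-ins for separate browser storage locations.
const stores = {
  cookie: new Map(),
  localStorage: new Map(),
  flashStorage: new Map(),
};

// Write the identifier to every store. If any store still holds an
// identifier, that surviving copy is used to recreate the deleted ones.
function syncTrackingId(stores, freshId) {
  const all = Object.values(stores);
  const survivor = all.find((s) => s.has("id"));
  const id = survivor ? survivor.get("id") : freshId;
  for (const s of all) s.set("id", id);
  return id;
}

syncTrackingId(stores, "abc123");     // first visit: ID written everywhere
stores.cookie.delete("id");           // user clears cookies only
syncTrackingId(stores, "zzz999");     // next visit: the old ID comes back
console.log(stores.cookie.get("id")); // prints abc123
```

In this model, clearing a single store accomplishes nothing; the identifier disappears only when every copy is removed before the next visit.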

Kamkar created Evercookie to demonstrate the ease with which new storage mechanisms could be exploited by marketers to track users. Marketers paid attention and quickly adopted Evercookie to track users.

Scared yet? You should be.

But HTML5 isn't the problem. In fact, HTML5 is part of the solution.

HTML5 Improves Web Security, Eliminates Need for Plug-ins

The current state of the Web, even leaving HTML5 completely out of it, includes tracking cookies, Flash cookies and hacked Web sites distributing malware. Moreover, 6.3 percent of Web surfers worldwide (many of them in China) still use the notably insecure Microsoft Internet Explorer 6.

HTML5 aims to make the Web more secure, in part, by eliminating the need for browser plug-ins. This is a great start. Two of the most commonly installed browser plug-ins, Java and Flash, are also the two biggest security holes in any Web browser.

Simply by being installed, plug-ins make the browser less secure. Not only that, but plug-ins are generally written for multiple operating systems; a vulnerability in a plug-in such as Java or Flash is a vulnerability in Windows, MacOS and Linux. Another wrinkle is that a large percentage of installed plug-ins don't have the latest security patches. Overall, plug-ins represent a major problem.


Many of HTML5's new features (built-in video and audio playback, vector and bitmap animation, device access and Web storage, for example) are designed to eliminate the need for plug-ins. By bringing what was once considered "extra" functionality under the roof of the browser and, more importantly, under the roof of approved standards, security and privacy can be integrated in a much more coherent, careful way.

HTML5 Device Access APIs and Privacy Preferences

The broad category of device access APIs presents another potential HTML5 privacy issue. It seems only natural to many of us that the continuing expansion of the Web and the webification of all sorts of computing devices will create many innovative products and services. Just as desktop Web applications are taking over many tasks that used to be the sole domain of packaged software, mobile computing is also increasingly shifting towards the Web.

The biggest weakness of today's mobile Web apps is how little device access mobile browsers offer compared to native mobile apps. Mobile Web apps can't, for example, cause your phone to vibrate, read the current state of the battery or measure ambient light. Most new mobile Web browsers can, however, access your current location and your camera. As these new capabilities are baked into browsers, privacy is a major concern.

In native apps, device access privacy preferences are typically managed through the installation process. When you install an Android app, for example, you receive notification of the types of access that the app requests. At that point, you can choose to allow or disallow the requested access. After you install the app, the permissions are set, and that app can access your camera, contacts or whatever you approved.


Mobile Web app privacy and security is tricky, since a Web app may change at any time and upgrades don't require your active involvement. Most of the time, this is a big benefit of Web apps; you get constant upgrades without the annoying upgrade process that native apps require. The downside is that any change may cause a previously secure and trustworthy app to become less so, even harmful.

To understand how browsers deal with this potential problem, we need to first define some terms:

Notice is the requirement that an API notify a user that data is being collected. Currently, browsers have slightly different mechanisms for giving notice, but the notification bar at the top of the browser window is becoming the most common method. You can see an example of an API triggering a notice by visiting a site that uses HTML5 geolocation with a browser that supports geolocation; the latest versions of all the major browsers do.
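To make this concrete, here is a minimal sketch of the kind of call that triggers the notice. The requestPosition wrapper is a hypothetical helper, not part of the API; getCurrentPosition and its success and error callbacks are the standard Geolocation API.

```javascript
// Wrap the callback-style Geolocation API in a Promise.
// `geo` is expected to look like navigator.geolocation.
function requestPosition(geo, options = {}) {
  return new Promise((resolve, reject) => {
    if (!geo || typeof geo.getCurrentPosition !== "function") {
      reject(new Error("Geolocation is not supported"));
      return;
    }
    geo.getCurrentPosition(resolve, reject, options);
  });
}

// In a browser, this call is what makes the permission notice appear.
// Outside a browser the guard is false and nothing happens.
if (typeof navigator !== "undefined" && navigator.geolocation) {
  requestPosition(navigator.geolocation)
    .then((pos) => console.log(pos.coords.latitude, pos.coords.longitude))
    .catch((err) => console.log("Location unavailable:", err.message));
}
```

If the user declines, the error callback runs instead of the success callback, so an app should always be prepared to work without a position.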


Consent is the process of obtaining user permission for the API to access the device. Consent can be implicit or explicit. For example, if you press a "Take a Picture" button, you're implicitly giving permission for the app to use the camera. On the other hand, if you click an "Email a Friend" button, you're not implicitly giving the Contacts API permission to spam everyone in your contacts database. Each HTML5 API assumes that explicit permission is required by default but defines circumstances in which implicit permission is acceptable.

Minimization is the requirement that APIs make it easy to gather as little information as is required for the task at hand.
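On the developer's side, minimization might look something like the sketch below. The function name and the choice of two decimal places are illustrative assumptions; enableHighAccuracy and maximumAge are standard Geolocation API options.

```javascript
// Keep only an approximate latitude and longitude, and drop every other
// field (altitude, speed, heading) that the task does not need.
// Two decimal places is an illustrative choice, roughly 1 km of latitude.
function coarsenCoords(coords, decimals = 2) {
  const factor = 10 ** decimals;
  return {
    latitude: Math.round(coords.latitude * factor) / factor,
    longitude: Math.round(coords.longitude * factor) / factor,
  };
}

// Options that ask the browser for no more precision than necessary and
// accept a cached position up to ten minutes old.
const lowPrecisionOptions = { enableHighAccuracy: false, maximumAge: 600000 };

console.log(coarsenCoords({ latitude: 37.774929, longitude: -122.419416 }));
// prints { latitude: 37.77, longitude: -122.42 }
```

A store locator, for instance, rarely needs more than this, whereas turn-by-turn directions would.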

Control is the capability of the user to manage permission choices. Users must be able to revoke a browser's access to a device after they've granted it. Optionally, they should be allowed to whitelist and blacklist applications.
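An app can respect these choices by checking the current permission state before it asks for anything. The sketch below assumes the Permissions API, a separate and later W3C specification that this article does not cover; geolocationState and the "unknown" fallback label are hypothetical.

```javascript
// Ask the browser what the user has currently decided for geolocation.
// `permissions` is expected to look like navigator.permissions.
async function geolocationState(permissions) {
  if (!permissions || typeof permissions.query !== "function") {
    return "unknown"; // hypothetical fallback label, not a spec value
  }
  const status = await permissions.query({ name: "geolocation" });
  return status.state; // "granted", "denied" or "prompt"
}

// In a browser that supports the Permissions API:
if (typeof navigator !== "undefined" && navigator.permissions) {
  geolocationState(navigator.permissions).then((state) => {
    console.log("Geolocation permission:", state);
  });
}
```

A state of "denied" after an earlier "granted" means the user has revoked access, and the app should stop asking rather than prompt again.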

Finally, access is the capability of the user to view and delete his history of sharing device access with applications.

Geolocation is perhaps the most potentially privacy-invading of the HTML5 APIs. Interestingly, it's also one of the most widely implemented APIs. To get an idea of privacy measures for individual device APIs, it's helpful to look at how Geolocation did it.


Section 4 of the proposed Geolocation API Specification is dedicated to privacy. It divides privacy concerns into two areas of focus: considerations for implementers of the Geolocation API (browser creators) and for recipients of location information (software developers).

The job of designing the actual mechanisms and user interfaces for requesting, obtaining and managing permissions is left up to the browser. The specification simply says that, in order for a user agent (a Web browser) to comply with the spec, location information must not be obtained without permission.

The specification places additional requirements on recipients of location data, too. They must disclose that they are collecting the data, protect the data against unauthorized access, allow the user to update and delete any data they store, tell the user how long the data will be stored, tell the user whether the data will be retransmitted and disclose how the data is secured.
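A few of these obligations, namely a stated retention period, a stated purpose and deletion on request, can be sketched in code. Everything here is an assumption made for illustration: LocationStore and its methods are hypothetical names, and a real application would use a secured database rather than an in-memory Map.

```javascript
// Hypothetical in-memory store for location records received by a Web app.
class LocationStore {
  constructor(retentionMs) {
    this.retentionMs = retentionMs; // retention period disclosed to the user
    this.records = new Map();       // userId -> { coords, purpose, storedAt }
  }

  // Record why the data was collected along with when it was stored.
  save(userId, coords, purpose, now = Date.now()) {
    this.records.set(userId, { coords, purpose, storedAt: now });
  }

  // Let a user remove their own data on request.
  deleteFor(userId) {
    return this.records.delete(userId);
  }

  // Drop every record older than the disclosed retention period.
  purgeExpired(now = Date.now()) {
    for (const [userId, rec] of this.records) {
      if (now - rec.storedAt > this.retentionMs) this.records.delete(userId);
    }
  }
}

const store = new LocationStore(24 * 60 * 60 * 1000); // keep data for one day
store.save("user-1", { latitude: 37.77, longitude: -122.42 }, "store locator");
store.purgeExpired();
```

Storing the purpose next to the data makes it possible to check later that the data is used only for what the user approved.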


The browser part of the equation isn't worrisome. Browser makers take their responsibility to provide a secure environment very seriously. The recipient (Web app developer) part of the picture should worry you, though. The processes by which a recipient of location data (or data from any device, for that matter) meets the requirements of the specification are currently up to the individual developer. Some developers aren't even aware of the requirements, and there's no enforcement mechanism.

Even though the browser specifically asks if your location data may be obtained, you might have little, if any, knowledge or assurance that the data won't be stored or used for purposes other than the one you approved. This is the next front in the battle for Web privacy.

Solutions for Protecting Privacy Few and Far Between

The World Wide Web Consortium's Platform for Privacy Preferences Project (P3P) was designed several years ago to tackle just this sort of problem by creating a standard language websites could use to communicate their privacy policies. With P3P, browsers could inform users of site policies and even let them opt out of visiting sites with policies they weren't comfortable with. P3P never caught on with browser makers, however, and its work has been suspended.

Nowadays, the W3C's Privacy Interest Group and Tracking Protection Working Group represent just two of the ongoing efforts to increase and standardize security and privacy on the Web, and in HTML5.

Perhaps the most notable advance in browser privacy in recent months is the implementation of the Do Not Track (DNT) specification by all major browser makers. Some browsers, including Internet Explorer 10, have gone so far as to enable DNT by default.


DNT is a browser preference sent to Web sites in an HTTP request header. As protections go, it's actually pretty weak, as websites currently must voluntarily abide by a user's preference to not be tracked.
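Honoring the preference on the server takes very little code, which underlines that the obstacle is willingness rather than difficulty. In this sketch the function names and the cookie format are hypothetical; the header itself is DNT, with a value of 1 meaning the user does not want to be tracked.

```javascript
// Return true when the request carries the header "DNT: 1".
// `headers` is a plain object with lower-cased names, which is how
// Node's http module exposes request headers.
function prefersNoTracking(headers) {
  return headers["dnt"] === "1";
}

// Hypothetical handler step: only issue a tracking cookie when allowed.
function trackingCookieFor(headers, visitorId) {
  return prefersNoTracking(headers) ? null : `visitor=${visitorId}`;
}

console.log(trackingCookieFor({ dnt: "1" }, "abc")); // prints null
console.log(trackingCookieFor({}, "abc"));           // prints visitor=abc
```

Nothing forces a site to include a check like this, which is exactly the weakness described above.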

Although the advertising industry has generally said it would respect DNT preferences, little has been done about it. The proposed California Right to Know Act of 2013, for example, would allow people to ask businesses for a report of what the business knows about them. After being opposed by Internet industry lobbying groups, the Right to Know Act has been withdrawn from consideration for at least the rest of 2013.

Absent a viable voluntary mechanism for websites to disclose their policies, legislation looks like the only good solution to a problem that's only getting worse as marketers gather more data about users. The head of the Federal Trade Commission, Edith Ramirez, recently urged the ad industry to make good on its DNT promise. In the meantime, Do Not Track legislation has been proposed in Congress, and the issue receives more attention as the standard continues to be hashed out.

Chris Minnick runs a Web design and development company and regularly teaches HTML5 classes for Ed2Go. Ed Tittel is a full-time freelance writer and consultant who specializes in Web markup languages, information security and Windows OSes. Together, Minnick and Tittel are the authors of the forthcoming book Beginning Programming with HTML5 and CSS3 For Dummies, as well as numerous other books.
