Data Aggregation as a Method of Protecting Privacy in Smart Grid Networks
Written by Gelareh Taban and Alvaro A. Cárdenas
Data collection in Advanced Metering Infrastructures (AMI) presents new opportunities for utilities but, at the same time, can compromise the privacy of electricity consumers. Data aggregation can alleviate this challenge by combining collected sensitive data into a single representation; however, the accountability of individual smart meter data can be lost because an attacker can falsify aggregates without being easily detected.
While the modernization of the power grid promises many social benefits, the data collected by advanced metering infrastructures (AMI) introduce new and fundamental issues for electricity consumers, involving personal privacy and the integrity of the collected data. Electricity usage is sensed and collected frequently and on a large scale; however, because sensing is passive, consumers have little awareness of their exposure. Utilities, for their part, need to be confident in the soundness of the data they collect from AMI systems, as accurate electricity load information helps prevent electricity fraud and ward off attacks against the electricity distribution infrastructure. Accurate data also matters to consumers, because it underpins correct billing.
Monitoring energy consumption at high granularity makes it possible to infer detailed information about electricity users' lives. Because of the value of this information, many parties will try to gain access to AMI data, including advertising companies (for targeted advertisement based on consumer profiles), law enforcement (to identify irregular activities such as electricity used for the growing of marijuana), and criminals (to identify which houses are unoccupied so as to plan break-ins).
Within the current regulatory and judicial environment, it is possible to re-purpose the household consumption data gathered by AMI projects to reveal and exploit personal identifying information. While utility companies have the responsibility of protecting AMI records, this data can be shared with third parties, either by consumers unaware of their privacy exposure, or by the utility companies. In Oklahoma, one of the few states to have adopted AMI data protection legislation so far (HB 1079), the utility company owns the meter data and it is "authorized to share customer data without customer consent with third parties who assist the utility in its business and services, as required by law, in emergency situations, or in a business transaction such as a merger."
Not only can data be shared voluntarily, but information collected by smart meters can also be the target of attackers: either outsiders exploiting vulnerabilities in the systems where the data is stored, or rogue employees working for a utility company or one of its contractors.
In response to these concerns, governments and standards organizations are working on privacy standards and policies to guide AMI deployments. In the United States, the Obama Administration examined privacy issues in its June 2011 smart grid policy framework report. The report recommends that state and Federal regulators should consider, as a starting point, methods to ensure that consumers' detailed energy usage data are protected in a manner consistent with Federal Fair Information Practice (FIP) principles.
Handling energy-use data is at the core of many privacy use cases, including data aggregation. Data aggregation of AMI data can be used to avoid "fingerprinting" consumer behavior—being able to associate specific consumption patterns with specific individuals—by collecting and storing aggregate behavior. While utilities require individual consumer data for billing, that data can be collected over long intervals (several hours); data that needs to be accessed more frequently than that—data needed for operational purposes such as load forecasting or demand-response—can be aggregated across customers at intervals of less than an hour before being processed and used.
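The split between billing data and operational data described above can be sketched in a few lines of Python. This is only an illustration: the meter names, the readings, and the 240-minute and 15-minute interval lengths are assumptions chosen for the example, not values from any AMI standard.

```python
from collections import defaultdict

# Hypothetical readings: (meter_id, minute_of_day, kWh used in that slot).
readings = [
    ("meter_a", 0, 0.25), ("meter_a", 15, 0.5), ("meter_a", 240, 0.75),
    ("meter_b", 0, 0.5), ("meter_b", 15, 0.25), ("meter_b", 240, 1.0),
]

def billing_totals(readings, interval_minutes=240):
    """Per-customer totals over long intervals (what billing needs)."""
    totals = defaultdict(float)
    for meter_id, minute, kwh in readings:
        window = minute // interval_minutes
        totals[(meter_id, window)] += kwh
    return dict(totals)

def operational_load(readings, interval_minutes=15):
    """Short-interval totals summed across customers.

    The meter identity is discarded, so the result cannot be tied
    back to one household's consumption pattern.
    """
    totals = defaultdict(float)
    for _meter_id, minute, kwh in readings:
        totals[minute // interval_minutes] += kwh
    return dict(totals)

billing = billing_totals(readings)
load = operational_load(readings)
```

Here `billing` keeps one coarse total per meter and four-hour window, while `load` keeps fine-grained totals that no longer name any meter.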
Data aggregation can be done at the utility, but an even more efficient and secure alternative is to perform data aggregation directly in AMI mesh networks. Data aggregation is an important basic technology used in wireless sensor networks: data are collected from different sources and expressed, based on specific variables, in a summarized format. In general, sensor nodes are logically organized as a tree, called the aggregation tree, which is rooted at a collection point. The leaf nodes on the aggregation tree act as sensing devices that measure the environment. The internal nodes on the tree act as aggregator devices that combine the data they receive from their descendants before forwarding the aggregate to their parent nodes. The collection point acts as an intermediary between the sensor network and a user outside the network, querying the data collected by the sensor network.
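As a minimal sketch of the aggregation tree just described, assuming the aggregation function is a simple sum, the following Python models leaf meters, internal aggregators, and a collection point. The `Node` class, the node names, and the readings are hypothetical and serve only to illustrate the structure.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    name: str
    reading: Optional[float] = None          # set only on leaf (sensing) nodes
    children: List["Node"] = field(default_factory=list)

def aggregate(node: Node) -> float:
    """Sum of all leaf readings in the subtree rooted at `node`."""
    if not node.children:
        return node.reading
    # An internal node combines what its children report and forwards
    # a single value to its own parent.
    return sum(aggregate(child) for child in node.children)

# Four meters (leaves), two aggregators (internal nodes), one collection point.
meters = [Node(f"meter_{i}", reading=r) for i, r in enumerate([1.5, 0.5, 2.0, 1.0])]
agg_left = Node("aggregator_left", children=meters[:2])
agg_right = Node("aggregator_right", children=meters[2:])
collection_point = Node("collection_point", children=[agg_left, agg_right])

total = aggregate(collection_point)
# The collection point only ever sees one value per child aggregator.
received_at_root = [aggregate(child) for child in collection_point.children]
```

In this toy tree the collection point receives two partial sums rather than four individual readings, which is the source of both the traffic reduction and the privacy benefit.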
By eliminating redundant or unnecessary information from transmitted data streams, data aggregation can achieve two fundamental objectives. First, aggregation can drastically reduce network traffic, thus improving the efficiency of the AMI network. Second, since only the necessary information is retained, aggregation protects customers' privacy not only from entities outside the network but also, to a lesser extent, from entities within it.
A significant risk of data aggregation, however, is a potential lack of accountability. A node that is captured by an adversary can report arbitrary values as its aggregation result; the adversary thereby corrupts the measurements of all the nodes in its entire aggregation sub-tree. In this kind of powerful and yet cost-effective integrity attack, an adversary can control the measurements of a large portion of the network by compromising a selected number of well-positioned aggregating nodes.
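The effect of a single captured aggregator can be illustrated with a small simulation. This is a sketch under simplifying assumptions: the aggregate is a sum, the tree is encoded as nested tuples, and the node names and readings are invented for the example.

```python
def aggregate(tree, compromised=None):
    """Sum a tree whose nodes are either a number (a leaf reading)
    or a (name, [children]) tuple (an aggregator).

    `compromised` maps aggregator names to the value an adversary
    reports in place of the true subtree sum.
    """
    compromised = compromised or {}
    if isinstance(tree, (int, float)):
        return tree
    name, children = tree
    if name in compromised:
        # The captured node ignores its whole subtree and lies.
        return compromised[name]
    return sum(aggregate(child, compromised) for child in children)

tree = ("root", [
    ("agg_1", [1.0, 2.0, 1.5, 0.5]),   # four meters behind one aggregator
    ("agg_2", [1.0, 1.0]),
])

honest_total = aggregate(tree)
forged_total = aggregate(tree, compromised={"agg_1": 0.0})
```

Capturing `agg_1` alone erases the contribution of four meters, and nothing in the forwarded value reveals that the total was altered.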
Previous work on integrity-assured aggregation has not fully explored how to maintain the privacy of sensor data. In particular, the most common approach for a utility to verify the integrity of an alleged data aggregate is to recompute the aggregate, or an approximation of it, using the original sensed data, and then check that the alleged aggregate is identical or close enough to the recomputed value. However, if the utility is given access to the original data for verification, the privacy of the sensed data is breached.
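The recompute-and-compare check can be sketched as follows, which also makes the privacy cost visible: the verifier must hold every raw reading. The function name `verify_aggregate`, the tolerance, and the readings are assumptions made for illustration.

```python
import math

def verify_aggregate(claimed_total, raw_readings, tolerance=1e-9):
    """Recompute the sum from the original readings and compare.

    Note that `raw_readings` must contain each meter's individual
    value, which is exactly the data aggregation was meant to hide.
    """
    recomputed = math.fsum(raw_readings)
    return abs(claimed_total - recomputed) <= tolerance

# Hypothetical per-meter readings that the utility would need to see.
raw = {"meter_a": 1.5, "meter_b": 0.5, "meter_c": 2.0}

honest_ok = verify_aggregate(4.0, raw.values())
forged_ok = verify_aggregate(3.0, raw.values())
```

The check catches the forged total, but only because the utility was handed the per-meter values in `raw`.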
Thus, we need further work on practical Privacy and Integrity Assurance (PIA) algorithms for AMI networks. Solutions need to avoid costly sensor operations or functions (such as zero-knowledge proofs of knowledge or secure multi-party computation between the sensors), because the wide deployment of smart meters makes more demanding hardware requirements for meters an unattractive proposition. Practical proposals can be achieved by leveraging efficient algorithms that can only compute a limited set of aggregation functionalities (including the sum). A practical solution to the PIA problem can have a significant impact on the way AMI networks are deployed and will achieve privacy-by-design goals for smart grid deployments.
Gelareh Taban is a security engineer working in Silicon Valley. She received her M.S. and Ph.D. degrees from the University of Maryland, College Park, and a B.S. in Computer Engineering from the University of Wollongong, Australia. Her research interests include security and privacy in networks, applied cryptography, and digital rights management.
Alvaro A. Cárdenas, an IEEE member, is a research staff engineer at Fujitsu Laboratories of America. Prior to this he was a postdoctoral fellow at the University of California, Berkeley, where he worked on security of critical infrastructure systems. His current research focuses on "big data" analytics for security, smart grid, network security, cyber-physical systems, and wireless communications for embedded systems and the Internet of Things. He holds M.S. and Ph.D. degrees from the University of Maryland, College Park, and a B.S. from Universidad de los Andes.