Title: Dynamic Privacy Pricing: A Multi-Armed Bandit Approach With Time-Variant Rewards
Authors: Lei Xu; Chunxiao Jiang; Yi Qian; Youjian Zhao; Jianhua Li; Yong Ren
Year: 2017
Publisher: IEEE
Abstract: The conflict between exploiting the value of personal data and protecting individuals' privacy has recently attracted much attention. A personal data market offers a promising solution to this conflict, but determining the price of privacy is difficult. In this paper, we study the pricing problem in a setting where a data collector sequentially buys data from multiple data owners whose valuations of privacy are randomly drawn from an unknown distribution. To maximize the total payoff, the collector needs to dynamically adjust the prices offered to owners. We model the collector's sequential decision-making problem as a multi-armed bandit problem in which each arm represents a candidate price. Specifically, the privacy protection technique adopted by the collector is taken into account. Protecting privacy generally reduces the value of data, and this effect is embodied by the time-variant distributions of the rewards associated with the arms. Based on the classic upper confidence bound policy, we propose two learning policies for the bandit problem. The first policy estimates the expected reward of a price by counting how many times the price has been accepted by data owners. The second policy treats the time-variant data value as a context and uses ridge regression to estimate the rewards in different contexts. Simulation results on real-world data demonstrate that, by applying the proposed policies, the collector can obtain a payoff close to the one obtained by offering all data owners a single fixed price that is the best in hindsight.
URI: http://localhost/handle/Hannan/228231
Volume: 12
Issue: 2
Pages: 271-285
Appears in Collections: 2017

Files in This Item:
File: 7572170.pdf
Size: 1.89 MB
Format: Adobe PDF
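
Illustrative sketch (not taken from the paper): the abstract describes a first policy that builds on the upper confidence bound idea and estimates a price's expected reward by counting acceptances. A minimal Python sketch of that general idea follows. Everything specific in it is an assumption made for illustration: the function names `ucb_price_index` and `simulate`, the standard UCB1 exploration bonus, the rule that an owner accepts when the offered price is at least their valuation, the payoff of `data_value - price` per accepted offer, and a constant `data_value` (the paper treats the data value as time-variant).

```python
import math
import random


def ucb_price_index(accepts, offers, t, prices, data_value):
    """Return the index of the candidate price with the highest optimistic payoff.

    accepts[i] / offers[i] is the empirical acceptance rate of prices[i].
    The sqrt(2 ln t / n) term is the standard UCB1 exploration bonus
    (an assumption here, not necessarily the paper's exact bound).
    """
    best_index, best_score = 0, -math.inf
    for i, price in enumerate(prices):
        if offers[i] == 0:
            return i  # offer every candidate price once before comparing
        accept_rate = accepts[i] / offers[i]
        bonus = math.sqrt(2.0 * math.log(t) / offers[i])
        optimistic_rate = min(1.0, accept_rate + bonus)
        score = optimistic_rate * (data_value - price)
        if score > best_score:
            best_index, best_score = i, score
    return best_index


def simulate(prices, valuations, data_value=1.0):
    """Offer one price per owner; return total payoff and per-price offer counts.

    Hypothetical model: an owner accepts when the price is at least their
    private valuation, and the collector then earns data_value - price.
    """
    accepts = [0] * len(prices)
    offers = [0] * len(prices)
    payoff = 0.0
    for t, valuation in enumerate(valuations, start=1):
        i = ucb_price_index(accepts, offers, t, prices, data_value)
        offers[i] += 1
        if prices[i] >= valuation:
            accepts[i] += 1
            payoff += data_value - prices[i]
    return payoff, offers


if __name__ == "__main__":
    rng = random.Random(0)
    owner_valuations = [rng.random() for _ in range(1000)]  # synthetic, not real data
    total, counts = simulate([0.2, 0.4, 0.6, 0.8], owner_valuations)
    print(total, counts)
```

The sketch assumes every candidate price is below `data_value`, so that an accepted offer never yields a negative payoff. A context-based variant along the lines of the second policy would replace the acceptance count with a regression estimate and is not shown here.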