Privacy Preservation in Data Mining using Anonymization Technique

December 27, 2017 | Author: IJSTE | Category: Privacy, Data Mining, Personally Identifiable Information, Data Management, Securities



Description

Data mining is the process of extracting interesting patterns or knowledge from huge amounts of data. In recent years, there has been tremendous growth in the amount of personal data that organizations can collect and analyze. As hardware costs go down, organizations find it easier than ever to keep any piece of information acquired from the ongoing activities of their clients. These organizations constantly seek to make better use of the data they possess, and they use data mining tools to extract useful knowledge and patterns from it. In addition, the current trend in business collaboration is to share data and mining results for mutual benefit [2]. Such data does not include explicit identifiers of an individual, such as name or address, but it does contain attributes such as date of birth, PIN code, sex and marital status which, when combined with other publicly released data such as voter registration records, can identify an individual. The previous literature on privacy-preserving data publication has focused on "one-time" releases. Specifically, none of the existing solutions supports re-publication of the microdata, that is, publishing it multiple times after it has been updated with insertions and deletions. This is a serious drawback, because a publisher currently cannot provide researchers with the most recent dataset on a continuous basis. Based on a survey of theoretical analyses, we develop a new generalization principle, l-scarcity, that effectively limits the risk of privacy disclosure in re-publication. It is a new method that modifies l-diversity and m-invariance by combining the two, in order to provide privacy when the microdata is re-published. We consider a more realistic setting of sequential releases that involves insertions, deletions and updates, as well as transient and permanent values. The existing privacy models cannot simply be adapted to this realistic setting.
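The two principles that the abstract builds on can be illustrated with a small sketch. It assumes the commonly used "distinct" form of l-diversity (every group of records sharing the same generalized quasi-identifier values contains at least l distinct sensitive values) and only the signature condition of m-invariance (a record that appears in two releases must sit in groups with the same set of sensitive values). The record layout, function names and sample values are hypothetical, and l-scarcity itself is not implemented because this description does not define it.

```python
from collections import defaultdict

# Each record is (record_id, generalized_quasi_identifiers, sensitive_value).
# This layout is an assumption made for illustration only.

def groups_by_qi(release):
    """Group (record_id, sensitive) pairs by their generalized quasi-identifiers."""
    groups = defaultdict(list)
    for record_id, qi, sensitive in release:
        groups[qi].append((record_id, sensitive))
    return groups

def is_l_diverse(release, l):
    """Distinct l-diversity: every QI group holds at least l distinct sensitive values."""
    return all(len({s for _, s in members}) >= l
               for members in groups_by_qi(release).values())

def signatures(release):
    """Map each record id to the set of sensitive values found in its QI group."""
    sig = {}
    for members in groups_by_qi(release).values():
        values = frozenset(s for _, s in members)
        for record_id, _ in members:
            sig[record_id] = values
    return sig

def keeps_signatures(previous, current):
    """Signature condition of m-invariance for records present in both releases."""
    before, after = signatures(previous), signatures(current)
    return all(before[r] == after[r] for r in before.keys() & after.keys())

young, older = ("[20-30]", "411*"), ("[30-40]", "412*")
release_1 = [(1, young, "flu"), (2, young, "asthma"),
             (3, older, "diabetes"), (4, older, "flu")]
# Record 2 is deleted and record 5 is inserted before the second release.
release_2_ok = [(1, young, "flu"), (5, young, "asthma"),
                (3, older, "diabetes"), (4, older, "flu")]
release_2_bad = [(1, young, "flu"), (5, young, "bronchitis"),
                 (3, older, "diabetes"), (4, older, "flu")]

print(is_l_diverse(release_1, 2))                  # True
print(keeps_signatures(release_1, release_2_ok))   # True
print(keeps_signatures(release_1, release_2_bad))  # False
```

In the last case each release is 2-diverse on its own, yet record 1's group changes from {flu, asthma} to {flu, bronchitis}; an observer who sees both releases can intersect the two sets and infer that record 1 has flu. This is the kind of disclosure that one-time models do not address under insertions and deletions.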
