Differential Privacy Based Solution for Protecting Privacy of Big Data
Y. Sowmya, M. NagaRatna, C. Shoba Bindhu
Section: Research Paper, Product Type: Journal Paper
Volume 6, Issue 6, Page no. 707-713, Jun-2018
CrossRef-DOI: https://doi.org/10.26438/ijcse/v6i6.707713
Online published on Jun 30, 2018
Copyright © Y. Sowmya, M. NagaRatna, C. Shoba Bindhu. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
How to Cite this Paper
IEEE Style Citation: Y. Sowmya, M. NagaRatna, C. Shoba Bindhu, “Differential Privacy Based Solution for Protecting Privacy of Big Data,” International Journal of Computer Sciences and Engineering, Vol.6, Issue.6, pp.707-713, 2018.
MLA Style Citation: Y. Sowmya, M. NagaRatna, C. Shoba Bindhu "Differential Privacy Based Solution for Protecting Privacy of Big Data." International Journal of Computer Sciences and Engineering 6.6 (2018): 707-713.
APA Style Citation: Y. Sowmya, M. NagaRatna, C. Shoba Bindhu (2018). Differential Privacy Based Solution for Protecting Privacy of Big Data. International Journal of Computer Sciences and Engineering, 6(6), 707-713.
BibTex Style Citation:
@article{Sowmya_2018,
author = {Y. Sowmya and M. NagaRatna and C. Shoba Bindhu},
title = {Differential Privacy Based Solution for Protecting Privacy of Big Data},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {6 2018},
volume = {6},
number = {6},
month = {6},
year = {2018},
issn = {2347-2693},
pages = {707--713},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=2242},
doi = {10.26438/ijcse/v6i6.707713},
publisher = {IJCSE, Indore, INDIA},
}
RIS Style Citation:
TY  - JOUR
DO  - 10.26438/ijcse/v6i6.707713
UR  - https://www.ijcseonline.org/full_paper_view.php?paper_id=2242
TI  - Differential Privacy Based Solution for Protecting Privacy of Big Data
T2  - International Journal of Computer Sciences and Engineering
AU  - Y. Sowmya
AU  - M. NagaRatna
AU  - C. Shoba Bindhu
PY  - 2018
DA  - 2018/06/30
PB  - IJCSE, Indore, INDIA
SP  - 707
EP  - 713
IS  - 6
VL  - 6
SN  - 2347-2693
ER  -
Abstract
With the emergence of distributed programming frameworks such as Hadoop and of cloud computing technology, big data and its analytics have become a reality. Because big data requires large amounts of storage and computing resources, the cloud has become the natural platform for it. However, big data must be protected from privacy attacks. Disclosure of the identity of an entity, organization, or person in the data is one example of a loss of privacy. In other words, preserving the privacy of big data means not disclosing certain sensitive attributes. As traditional computing is replaced by Internet-based computing, dealing with the privacy of big data has become essential, and many techniques have been developed to protect it. In this paper, we consider a specific case in which an adversary launches an attack to learn whether an entity is present in or absent from the big data. We propose an algorithm based on differential privacy to withstand this attack on big data workloads in the MapReduce programming paradigm. We built a prototype application and deployed it on Elastic MapReduce (EMR) on Amazon Elastic Compute Cloud (EC2). The experimental results demonstrate the utility of the proposed algorithm and serve as a proof of concept.
Keywords / Index Terms
Big data, big data privacy, differential privacy, Elastic MapReduce (EMR)
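The abstract does not give the details of the proposed algorithm, so the following is only an illustrative sketch of the general idea it builds on: releasing MapReduce-style counts with Laplace noise so that the presence or absence of any single record is hard to infer. It uses the standard Laplace mechanism, not the authors' algorithm. The function names (map_phase, reduce_phase, laplace_noise, private_histogram), the fixed public key domain, and the assumption that each entity contributes at most one record (sensitivity 1) are all assumptions made for this example, and the code runs in a single process rather than on Hadoop or EMR.

```python
import random
from collections import Counter


def map_phase(records):
    """Mapper (hypothetical): emit a (key, 1) pair for every input record."""
    for key in records:
        yield key, 1


def reduce_phase(pairs):
    """Reducer (hypothetical): sum the values emitted for each key."""
    counts = Counter()
    for key, value in pairs:
        counts[key] += value
    return counts


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_histogram(records, domain, epsilon, rng=None):
    """Return noisy counts for every key in a fixed, public domain.

    Assumes each entity contributes at most one record, so adding or
    removing one entity changes exactly one count by 1 (sensitivity 1).
    Noise is added to zero counts as well, so the set of released keys
    does not reveal which keys are present. Records whose key is not in
    the domain are ignored.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    scale = 1.0 / epsilon  # Laplace scale = sensitivity / epsilon
    true_counts = reduce_phase(map_phase(records))
    return {key: true_counts[key] + laplace_noise(scale, rng) for key in domain}


if __name__ == "__main__":
    records = ["alice", "bob", "alice", "carol"]
    domain = ["alice", "bob", "carol", "dave"]
    print(private_histogram(records, domain, epsilon=0.5))
```

A smaller epsilon gives stronger privacy but noisier counts; the value 0.5 in the usage example is arbitrary and is not taken from the paper.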