Open Access Article

AN EFFICIENT DEDUPLICATION MECHANISM FOR BIG DATA ANALYSIS IN CLOUD ENVIRONMENTS

M. Murugesan1, A. Kalaiyarasi2

  1. Dept. of CSE, M. Kumarasamy College of Engineering, Karur, India.
  2. Dept. of CSE, M. Kumarasamy College of Engineering, Karur, India.

Section: Research Paper, Product Type: Journal Paper
Volume-6, Issue-4, Page no. 389-395, Apr-2018

CrossRef-DOI:   https://doi.org/10.26438/ijcse/v6i4.389395

Online published on Apr 30, 2018

Copyright © M. Murugesan, A. Kalaiyarasi. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to Cite this Paper

IEEE Style Citation: M. Murugesan, A. Kalaiyarasi, “AN EFFICIENT DEDUPLICATION MECHANISM FOR BIG DATA ANALYSIS IN CLOUD ENVIRONMENTS,” International Journal of Computer Sciences and Engineering, Vol. 6, Issue 4, pp. 389-395, 2018.

MLA Style Citation: M. Murugesan, A. Kalaiyarasi. "AN EFFICIENT DEDUPLICATION MECHANISM FOR BIG DATA ANALYSIS IN CLOUD ENVIRONMENTS." International Journal of Computer Sciences and Engineering 6.4 (2018): 389-395.

APA Style Citation: M. Murugesan, A. Kalaiyarasi (2018). AN EFFICIENT DEDUPLICATION MECHANISM FOR BIG DATA ANALYSIS IN CLOUD ENVIRONMENTS. International Journal of Computer Sciences and Engineering, 6(4), 389-395.

BibTex Style Citation:
@article{Kalaiyarasi_2018,
author = {M. Murugesan and A. Kalaiyarasi},
title = {AN EFFICIENT DEDUPLICATION MECHANISM FOR BIG DATA ANALYSIS IN CLOUD ENVIRONMENTS},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {4 2018},
volume = {6},
number = {4},
month = {4},
year = {2018},
issn = {2347-2693},
pages = {389-395},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=1907},
doi = {10.26438/ijcse/v6i4.389395},
publisher = {IJCSE, Indore, INDIA},
}

RIS Style Citation:
TY  - JOUR
DO  - 10.26438/ijcse/v6i4.389395
UR  - https://www.ijcseonline.org/full_paper_view.php?paper_id=1907
TI  - AN EFFICIENT DEDUPLICATION MECHANISM FOR BIG DATA ANALYSIS IN CLOUD ENVIRONMENTS
T2  - International Journal of Computer Sciences and Engineering
AU  - Murugesan, M.
AU  - Kalaiyarasi, A.
PY  - 2018
DA  - 2018/04/30
PB  - IJCSE, Indore, INDIA
SP  - 389
EP  - 395
IS  - 4
VL  - 6
SN  - 2347-2693
ER  - 


Abstract

With the steady, exponential growth in the number of users and the size of their data, data deduplication is increasingly a necessity for cloud storage providers. By storing a single unique copy of duplicate data, cloud providers greatly reduce their storage and data transfer costs. These huge volumes of data require practical platforms for storage, processing, and availability, and cloud technology offers the means to meet these requirements. Data deduplication is a technique that allows cloud storage providers (CSPs) to eliminate duplicate data and keep only a single unique copy of it in order to save storage space. In this paper, we present a scheme that permits a more fine-grained trade-off. The intuition is that outsourced data may require different levels of protection depending on how popular it is, that is, on whether the content is shared by many users. We present a novel idea that differentiates data according to its popularity. Based on this idea, we design an encryption scheme that guarantees semantic security for unpopular data and provides weaker security but better storage and bandwidth benefits for popular data. In this way, data deduplication can be effective for popular data, while semantically secure encryption protects unpopular content. The scheme can also use a backup recovery system at the time of blocking and analyze frequent login access.
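
To make the idea in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It shows chunk-level deduplication by content fingerprint, plus a popularity check that separates data shared by many users ("popular") from data held by few ("unpopular"). The class name `DedupStore`, the fixed chunk size, the threshold value, and the two protection labels are all assumptions introduced for illustration. Encryption itself is omitted, so the sketch only indicates which protection class a chunk would fall into.

```python
import hashlib
from collections import defaultdict

CHUNK_SIZE = 4            # bytes; tiny on purpose for illustration (assumption)
POPULARITY_THRESHOLD = 2  # distinct owners needed to count as popular (assumption)


class DedupStore:
    """Toy content-addressed store that keeps one copy per distinct chunk."""

    def __init__(self):
        self.chunks = {}                # fingerprint -> chunk bytes
        self.owners = defaultdict(set)  # fingerprint -> ids of users holding it

    def put(self, user, data):
        """Split data into fixed-size chunks, store unseen ones, return the recipe."""
        recipe = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            fp = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(fp, chunk)  # a duplicate chunk is not stored again
            self.owners[fp].add(user)
            recipe.append(fp)
        return recipe

    def get(self, recipe):
        """Rebuild the original bytes from a list of fingerprints."""
        return b"".join(self.chunks[fp] for fp in recipe)

    def protection(self, fp):
        """Choose a protection class from the number of distinct owners."""
        if len(self.owners[fp]) >= POPULARITY_THRESHOLD:
            return "deduplicable"  # popular: weaker, deduplication-friendly protection
        return "semantic"          # unpopular: semantically secure protection


store = DedupStore()
alice_recipe = store.put("alice", b"AAAABBBB")
bob_recipe = store.put("bob", b"AAAACCCC")
print(len(store.chunks))                      # number of unique chunks stored
print([store.protection(fp) for fp in alice_recipe])
```

In this sketch the two uploads share their first chunk, so it is stored once and becomes popular once a second user holds it. A real system would also need client-side encryption and a way to move a chunk between protection classes as its popularity changes, neither of which is shown here.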

Keywords / Index Terms

Cloud storage, Chunks, Similarity matching, Data security, Backup Recovery
