Open Access Article

A Deduplication-Aware Similarity Finding and Removal System for Cloud Provider and Its Users

K. Reddy Pradeep, G. Sreehitha

Section: Research Paper, Product Type: Journal Paper
Volume-6, Issue-9, Page no. 732-736, Sep-2018

CrossRef-DOI: https://doi.org/10.26438/ijcse/v6i9.732736

Online published on Sep 30, 2018

Copyright © K. Reddy Pradeep, G. Sreehitha. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to Cite this Paper

IEEE Style Citation: K. Reddy Pradeep, G. Sreehitha, “A Deduplication-Aware similarity finding and removal system for Cloud Provider and Its Users,” International Journal of Computer Sciences and Engineering, Vol.6, Issue.9, pp.732-736, 2018.

MLA Style Citation: K. Reddy Pradeep, G. Sreehitha. "A Deduplication-Aware similarity finding and removal system for Cloud Provider and Its Users." International Journal of Computer Sciences and Engineering 6.9 (2018): 732-736.

APA Style Citation: K. Reddy Pradeep, G. Sreehitha (2018). A Deduplication-Aware similarity finding and removal system for Cloud Provider and Its Users. International Journal of Computer Sciences and Engineering, 6(9), 732-736.

BibTex Style Citation:
@article{Pradeep_2018,
author = {K. Reddy Pradeep and G. Sreehitha},
title = {A Deduplication-Aware similarity finding and removal system for Cloud Provider and Its Users},
journal = {International Journal of Computer Sciences and Engineering},
issue_date = {9 2018},
volume = {6},
number = {9},
month = {9},
year = {2018},
issn = {2347-2693},
pages = {732--736},
url = {https://www.ijcseonline.org/full_paper_view.php?paper_id=2935},
doi = {10.26438/ijcse/v6i9.732736},
publisher = {IJCSE, Indore, INDIA}
}

RIS Style Citation:
TY  - JOUR
DO  - 10.26438/ijcse/v6i9.732736
UR  - https://www.ijcseonline.org/full_paper_view.php?paper_id=2935
TI  - A Deduplication-Aware similarity finding and removal system for Cloud Provider and Its Users
T2  - International Journal of Computer Sciences and Engineering
AU  - K. Reddy Pradeep
AU  - G. Sreehitha
PY  - 2018
DA  - 2018/09/30
PB  - IJCSE, Indore, INDIA
SP  - 732
EP  - 736
IS  - 9
VL  - 6
SN  - 2347-2693
ER  - 

Abstract

Data reduction has become increasingly important in storage systems owing to the explosive growth of digital data worldwide, which has ushered in the big data era. In existing systems, cloud providers offer limited processing capability and thus dissatisfy their users with poor service quality. It is therefore vital for a cloud provider to select appropriate servers to provide services, so that it reduces cost as much as possible while still satisfying its users. The main drawback of this setting is data duplication, and to overcome these problems we adopt the proposed model. In this paper, we present DARE, a low-overhead Deduplication-Aware Resemblance detection and Elimination scheme that effectively exploits existing duplicate-adjacency information for highly efficient resemblance detection in data-deduplication-based backup/archiving storage systems. Our experimental results on backup data sets show that DARE consumes only about 1/4 and 1/2, respectively, of the computation and indexing overheads required by conventional super-feature approaches, while detecting 2-10% more redundancy and achieving higher throughput, by exploiting existing duplicate-adjacency information for resemblance detection and finding the “sweet spot” for the super-feature approach.
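
The idea of exploiting duplicate-adjacency information can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `fingerprint` and `dupadj_candidates`, the use of SHA-1 fingerprints, and the restriction to immediate neighbours are all assumptions made for illustration. The sketch assumes that when a chunk in a new backup stream is an exact duplicate of a chunk in an earlier stream, the non-duplicate chunks next to it are likely to resemble the chunks next to its duplicate, so they can be proposed as delta-compression candidates without computing any super-features.

```python
import hashlib


def fingerprint(chunk: bytes) -> str:
    """Content fingerprint used to detect exact duplicate chunks (assumed SHA-1)."""
    return hashlib.sha1(chunk).hexdigest()


def dupadj_candidates(base_chunks, new_chunks):
    """Return {new_index: base_index} pairs of possibly similar chunks.

    A non-duplicate chunk in new_chunks is paired with the base chunk that
    sits next to the duplicate of its own immediate neighbour.
    Hypothetical helper for illustration only.
    """
    # Index the earlier stream by fingerprint (first occurrence wins).
    base_index = {}
    for i, chunk in enumerate(base_chunks):
        base_index.setdefault(fingerprint(chunk), i)

    # dup[j] is the base position of an identical chunk, or None.
    dup = [base_index.get(fingerprint(c)) for c in new_chunks]
    dup_targets = {i for i in dup if i is not None}

    pairs = {}
    for j, match in enumerate(dup):
        if match is not None:
            continue  # exact duplicate: nothing left to delta-compress
        # Try the left neighbour first, then the right neighbour.
        for step in (-1, 1):
            k = j + step
            if 0 <= k < len(dup) and dup[k] is not None:
                candidate = dup[k] - step  # base chunk in the same relative position
                in_range = 0 <= candidate < len(base_chunks)
                if in_range and candidate not in dup_targets:
                    pairs[j] = candidate
                    break
    return pairs
```

For example, with `base = [b"aaaa", b"bbbb", b"cccc", b"dddd"]` and `new = [b"aaaa", b"bbXb", b"cccc", b"zzzz"]`, the call `dupadj_candidates(base, new)` proposes pairing new chunk 1 with base chunk 1 and new chunk 3 with base chunk 3. These are only candidates: adjacency does not guarantee similarity, so a real system would still need to confirm each pair, for instance by checking how small the resulting delta is.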

Keywords / Index Terms

Data deduplication, delta compression, storage system, index structure, performance evaluation.
