GoodSync is a cross-platform file backup and synchronization platform that supports all major file synchronization protocols and a number of cloud file storage services.
We often get questions from our customers about GoodSync's support for file de-duplication and compression as ways to optimize backups.
In our design process, we have considered a number of optimization options, including bandwidth throttling (to avoid network clogging), file transfer through multiple parallel
channels (for fast networks), block-level synchronization (for large files that change only slightly), and "trusting" one of the sides to have the latest data in its sync logs
(when all changes to that side are made remotely through a synchronization/backup procedure) to avoid lengthy analysis.
Among the potential optimization options, we have also considered de-duplication and compression. We would like to share some results of our internal investigations concerning these two processes.
De-duplication is a process of removing identical copies of files, leaving only one "master copy" of each file.
Depending on the nature of the business and its local policies and procedures, we have heard of company-wide de-duplication reducing the total size of stored files by a factor of 5 (80%).
Most of our clients backing up desktop files (excluding email message storage) achieve company-wide de-duplication savings only in the low single-digit percentages. De-duplication normally happens on the server side, which requires installing a backup server.
The main benefit of de-duplication is saving file server space.
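To estimate how much space de-duplication would actually save on a given set of files, one option is to group files by content hash and compare the total size with the size of one copy per group. The sketch below is only an illustration under stated assumptions: it treats two files as identical when their SHA-256 digests match, and the names `file_digest` and `dedup_report` are hypothetical helpers, not part of GoodSync.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def dedup_report(root):
    """Group files under root by content and estimate de-duplication savings.

    Returns (groups, total_bytes, unique_bytes, saved_fraction), where
    groups maps each content digest to the list of paths sharing it.
    """
    groups = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            groups[file_digest(path)].append(path)

    # Size of everything as stored today.
    total = sum(os.path.getsize(p) for paths in groups.values() for p in paths)
    # Size if only one "master copy" per distinct content were kept.
    unique = sum(os.path.getsize(paths[0]) for paths in groups.values())
    saved = 1 - unique / total if total else 0.0
    return groups, total, unique, saved
```

A folder holding five identical copies of one file yields a saved fraction of 0.8, which corresponds to the factor-of-5 case. Running a report like this on typical desktop backup data is a quick way to check whether the savings are large enough to justify the extra server.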
De-duplication has various drawbacks:
- It is impossible to perform de-duplication on cloud file storage, because a file server needs to be installed.
- Where it is possible, the server incurs complex bookkeeping overhead to keep track of removed copies.
- Once the master copy of a file is corrupted, it is impossible to recover it from its copies, because they no longer exist. The more copies were removed, the more people are affected by the corruption.
- The complexity of managing de-duplicated files creates a huge overhead.
On the other hand, as it turns out, de-duplication can be easily and effectively replaced by a single configuration step: if a large number of people are using the same files, create a common folder and synchronize all files in that folder between a central location and individual computers. GoodSync supports this setup.
Compression has drawbacks of its own. First, most modern file formats are already compressed: Office documents, images, movies, PDFs, etc.
Second, compression takes a lot of computing resources, so there is an unavoidable trade-off: saving network traffic and file server space comes at the cost of slowing down the employee's workstation.
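The first point is easy to check: compressing data that is already compressed gains very little. The following is a minimal sketch, assuming `zlib` as the compressor and using generated text in place of real documents. The word list, the sample size, and the helper `compression_ratio` are illustrative assumptions; the once-compressed text stands in for an already-compressed format such as an Office document or a JPEG image.

```python
import random
import zlib


def compression_ratio(data, level=6):
    """Return compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data, level)) / len(data)


# Hypothetical sample data: pseudo-random words standing in for plain text.
rng = random.Random(0)
words = ["backup", "sync", "file", "server", "folder", "cloud", "network", "desktop"]
plain = " ".join(rng.choice(words) for _ in range(20000)).encode("ascii")

# Compressing once stands in for saving the data in a compressed file format.
already_compressed = zlib.compress(plain)

first_pass = compression_ratio(plain)
second_pass = compression_ratio(already_compressed)
print(f"plain text:         {first_pass:.2f}")
print(f"already compressed: {second_pass:.2f}")
```

The plain text shrinks substantially, while the second pass leaves the data at roughly its original size, so the CPU time spent on it buys almost nothing.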
Overall, considering the file formats most commonly used today and the constraints of desktop computing power, compression is counterproductive for desktop backup optimization.
In summary, based on our research, both de-duplication and compression create more problems than they solve when applied to the backup of user files.