Hadoop Create Archive

Utility to create an archive in Hadoop.

It builds the command line for the hadoop archive command.



Usage: hadoop archive -archiveName name -p <parent> [-r <replication factor>] <src>* <dest>

-archiveName is the name of the archive you would like to create, for example foo.har. The name should have a .har extension. The -p (parent) argument specifies the parent path; the source paths are given relative to it.

For example:

-p /foo/bar a/b/c e/f/g

Here /foo/bar is the parent path, and a/b/c and e/f/g are paths relative to the parent. Note that the archive is created by a Map/Reduce job, so you need a Map/Reduce cluster to run this. For a detailed example, see the later sections.
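As a rough sketch, the -p fragment above can be placed into a complete command line as follows. The archive name foo.har is taken from the earlier example; the destination /outputdir is an assumed value chosen only for illustration. The script prints the command rather than running it, so it can be reviewed before it is submitted to a cluster.

```shell
#!/bin/sh
# Sketch: assemble a full "hadoop archive" command around the -p example.
# DEST is an assumed, illustrative destination; adjust it for your cluster.
ARCHIVE_NAME="foo.har"
PARENT="/foo/bar"
DEST="/outputdir"

# a/b/c and e/f/g are resolved relative to PARENT,
# i.e. /foo/bar/a/b/c and /foo/bar/e/f/g.
HAR_CMD="hadoop archive -archiveName $ARCHIVE_NAME -p $PARENT a/b/c e/f/g $DEST"

# Print the command instead of executing it.
echo "$HAR_CMD"
```

Because -r is omitted here, the default replication factor applies.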

-r indicates the desired replication factor; if this optional argument is not specified, a replication factor of 10 will be used.

To archive a single directory, /foo/bar, you can use:

hadoop archive -archiveName zoo.har -p /foo/bar -r 3 /outputdir

Archive Name:
Replication Factor:
Source Directory:
Destination Directory:
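The four fields above could be turned into a command line with a small helper like the one below. This is only a sketch: build_har_cmd is a hypothetical function name, and it assumes that the Source Directory field is passed as the -p parent, as in the single-directory example. It checks the .har extension and prints the command without running it.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): build the archive command
# from the four form fields.
# Usage: build_har_cmd ARCHIVE_NAME REPLICATION SRC_DIR DEST_DIR
build_har_cmd() {
  name=$1
  repl=$2
  src=$3
  dest=$4

  # The archive name should end in .har.
  case "$name" in
    *.har) ;;
    *) echo "archive name must end in .har: $name" >&2; return 1 ;;
  esac

  # Assumption: the source directory is used as the -p parent path.
  echo "hadoop archive -archiveName $name -p $src -r $repl $dest"
}

# Example with the values from the single-directory command above.
build_har_cmd zoo.har 3 /foo/bar /outputdir
```

Printing the command first makes it easy to check the paths before the Map/Reduce job is started.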