Processing ZIP Files With Java
Zip is an archive file format, and its most common file extension is .zip. You can think of the format as a container for other files and folders. Besides acting as a container, it can optionally compress the files inside it. The compression algorithm most commonly used with the zip format is DEFLATE. The format was created by Phil Katz and was first implemented in the PKZIP utility from PKWARE, Inc. Since then it has been adopted by many different software tools and operating systems. Windows and macOS have built-in support for zip. Many Linux distributions come with built-in zip support, and where they do not, zip utility packages are easy to install.
Many other file formats are actually zip files under the hood. For example, a Java jar package is nothing but a zip file. The Android application package format, apk, is also a zip file in reality. Plenty of other commonly used file formats are zip files as well.
Java provides handy classes for working with zip files. The java.util.zip package contains all the necessary classes for working with zip. You can read and write zip files with the help of these classes and associated methods.
How Zip Works
Every programmer should know how zip works before they start coding for it. I am not going to write another technical specification paper here but will rather describe in plain English how the zip file system works.
A zip file can contain any number of files and folders. Files and folders can be nested inside other folders, so a zip file emulates a file system. Every file inside the zip file can have a file name, a file size, creation and modification date-times and other meta information. Every file can also be stored compressed or uncompressed. All the information about each file and folder is stored in a separate portion of the zip file.
The files and folders you want to keep inside a zip file are copied byte by byte, and they can be stored compressed or uncompressed according to your preference. Those bytes are written to the zip file along with their metadata. All the common meta information that the zip spec requires is also copied, packed and placed as a table in another part of the file (the spec calls it the central directory) so that a zip utility can look it up later. Do not confuse that table with a database table or anything similar. It is just zip's own way of storing things.
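To make the "compressed or uncompressed according to your preference" point concrete, here is a minimal sketch that uses ZipOutputStream from java.util.zip, a class covered later in this article. It packs the same bytes into an in-memory archive twice, once with the deflate level set to no compression and once with the best compression, and prints both archive sizes. The class name CompressionChoice, the helper archiveSize and the entry name sample.txt are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class CompressionChoice {

    // Packs the given bytes into an in-memory zip at the given deflate level
    // and returns the size of the resulting archive in bytes.
    static int archiveSize(byte[] data, int level) throws IOException {
        ByteArrayOutputStream archive = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(archive)) {
            zos.setLevel(level);                          // 0 = no compression, 9 = best
            zos.putNextEntry(new ZipEntry("sample.txt")); // hypothetical entry name
            zos.write(data);
            zos.closeEntry();
        }
        return archive.size();
    }

    public static void main(String[] args) throws IOException {
        // Repetitive sample text, so the compressor has something to work with.
        StringBuilder text = new StringBuilder();
        for (int i = 0; i < 200; i++) {
            text.append("zip zip zip zip\n");
        }
        byte[] data = text.toString().getBytes(StandardCharsets.UTF_8);

        System.out.println("Original bytes:       " + data.length);
        System.out.println("Archive, level 0:     "
                + archiveSize(data, Deflater.NO_COMPRESSION));
        System.out.println("Archive, best level:  "
                + archiveSize(data, Deflater.BEST_COMPRESSION));
    }
}
```

With level 0 the archive should come out slightly larger than the input, because the entry headers and the table of metadata are added on top of the untouched bytes.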
Java Classes for Working with Zip
ZipInputStream, ZipOutputStream and ZipEntry are the most important classes for working with zip files in Java. There are other classes that provide a higher degree of abstraction. ZipInputStream is responsible for reading zip files, and ZipOutputStream is responsible for writing them. ZipEntry represents an individual file or folder entry in a zip file.
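One of the higher-level classes is java.util.zip.ZipFile, which opens an archive on disk and hands out its entries from the metadata table. The sketch below is only an illustration: it writes a tiny archive to a temporary file and then lists each ZipEntry with its kind and size. The class name ZipListing, the helpers writeSample and describe, and the entry names docs/ and docs/readme.txt are all made up for this example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

public class ZipListing {

    // Writes a tiny archive with one folder entry and one file inside it.
    static void writeSample(Path zipPath) throws IOException {
        try (ZipOutputStream zos = new ZipOutputStream(Files.newOutputStream(zipPath))) {
            zos.putNextEntry(new ZipEntry("docs/"));   // a trailing slash marks a folder
            zos.closeEntry();
            zos.putNextEntry(new ZipEntry("docs/readme.txt"));
            zos.write("hello zip".getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
        }
    }

    // Returns one descriptive line per entry found in the archive.
    static List<String> describe(Path zipPath) throws IOException {
        List<String> lines = new ArrayList<>();
        try (ZipFile zip = new ZipFile(zipPath.toFile())) {
            for (ZipEntry entry : Collections.list(zip.entries())) {
                String kind = entry.isDirectory() ? "folder" : "file";
                lines.add(kind + " " + entry.getName()
                        + " (" + entry.getSize() + " bytes)");
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        Path zipPath = Files.createTempFile("sample", ".zip");
        try {
            writeSample(zipPath);
            for (String line : describe(zipPath)) {
                System.out.println(line);
            }
        } finally {
            Files.deleteIfExists(zipPath);   // clean up the temporary archive
        }
    }
}
```

ZipFile suits archives that already exist as files, while the stream classes above also work with any other input or output stream.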
Getting Started with Code
To get started with coding for zip files in Java, you can choose any IDE or code editor you like. Create a Java project with your IDE, then create a public class with a main method to hold the code. I am calling the public class JavaZip, but you can call it whatever you like. Import the necessary classes for working with zip and with file streams. We need some classes from the java.io package for working with binary files. So, our initial code should look like the following.
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class JavaZip {
    public static void main(String[] args) {
    }
}
Not all of the imported classes are necessary at this moment.
Writing a File to a Zip File
Let's say we have a file named file1.txt in the current working directory, and we want to zip it. Our ZipOutputStream object demands a ZipEntry object before we can copy the bytes from file1.txt. So, we create a zip entry and pass the file name to its constructor.
ZipEntry zen = new ZipEntry("file1.txt");
If you want, you can give the entry a different name than the original file. You could have written filxxxx.txt instead of file1.txt, or something else entirely. There is no real connection between the name of the ZipEntry and the name of the file on disk. For now, though, I do not see any need to use a name other than the original file name.
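The following small sketch illustrates that a ZipEntry is only a label plus metadata. Creating one never touches the disk, so the name does not have to match any real file. The class name EntryNames, the helper describe and the second entry name are hypothetical and chosen only for this example.

```java
import java.util.zip.ZipEntry;

public class EntryNames {

    // Summarizes an entry that so far exists only in memory.
    static String describe(ZipEntry entry) {
        return entry.getName()
                + " | folder: " + entry.isDirectory()
                + " | size: " + entry.getSize();   // -1 means "not known yet"
    }

    public static void main(String[] args) {
        // Neither name has to exist on disk for these constructors to succeed.
        ZipEntry sameName = new ZipEntry("file1.txt");
        ZipEntry otherName = new ZipEntry("backup/renamed.txt"); // hypothetical name

        System.out.println(describe(sameName));
        System.out.println(describe(otherName));
    }
}
```

The size stays unknown until the entry's bytes have actually been written to an archive.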
Before we can put the ZipEntry inside the ZipOutputStream, we need to create the ZipOutputStream object. ZipOutputStream in turn needs an underlying output stream to write to. Since we want the archive to end up on disk, we are going to create a FileOutputStream for that. I want to create a new file named my_archive.zip in the current working directory.
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class JavaZip {
    public static void main(String[] args) throws Exception {
        // try-with-resources closes the streams in reverse order, even if an exception is thrown
        try (FileInputStream fin = new FileInputStream("file1.txt");
             FileOutputStream fos = new FileOutputStream("my_archive.zip");
             ZipOutputStream zos = new ZipOutputStream(fos)) {
            // Register the entry before writing its bytes
            ZipEntry zen = new ZipEntry("file1.txt");
            zos.putNextEntry(zen);

            // Copy file1.txt into the archive one buffer at a time
            byte[] byteBuffer = new byte[200];
            int size;
            while ((size = fin.read(byteBuffer)) != -1) {
                zos.write(byteBuffer, 0, size);
            }
            zos.closeEntry();
        }
    }
}
Hit build and run to see a zip file created in your current working directory. Open the file with a zip viewer or extract it. You will see that our intended file is living inside the zip in peace!
Reading Files from a Zip File
Reading a zip file and extracting data is quite similar to writing to a zip file. You need to open a ZipInputStream on top of a FileInputStream and then get an entry from the ZipInputStream object. You can call getNextEntry() repeatedly until it returns null to go through all the entries in the zip file.
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

public class JavaZip {
    public static void main(String[] args) throws Exception {
        try (FileInputStream fin = new FileInputStream("my_archive.zip");
             ZipInputStream zis = new ZipInputStream(fin)) {
            // Only the first entry is read here; getNextEntry() returns null for an empty archive
            ZipEntry zipEntry = zis.getNextEntry();
            if (zipEntry == null) {
                System.out.println("The archive has no entries.");
                return;
            }
            // getSize() may print -1 here: a streamed archive can record the size after the entry's data
            System.out.println("File Size: " + zipEntry.getSize());
            System.out.println("Last Modification Date: " + zipEntry.getLastModifiedTime());

            // Copy the entry's bytes to a new file in the current working directory
            try (FileOutputStream fout = new FileOutputStream("Extracted - " + zipEntry.getName())) {
                byte[] byteBuffer = new byte[200];
                int size;
                while ((size = zis.read(byteBuffer)) != -1) {
                    fout.write(byteBuffer, 0, size);
                }
            }
            zis.closeEntry();
        }
    }
}
Look at the current working directory and you will see a new file there. In the above code I did not iterate over all the entries, and again I did not handle exceptions. I kept it that way to make this lesson easier to follow while you are just beginning to work with zip files.
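For readers who want to go one step further, here is a hedged sketch of what iterating over every entry and handling exceptions could look like. It is not the only way to do it. The class name ReadAllEntries and the helper readAll are hypothetical, and the entries are collected in memory rather than written to disk to keep the example short.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

public class ReadAllEntries {

    // Reads every file entry from a zip stream into a name -> bytes map.
    static Map<String, byte[]> readAll(InputStream in) throws IOException {
        Map<String, byte[]> contents = new LinkedHashMap<>();
        byte[] buffer = new byte[4096];
        try (ZipInputStream zis = new ZipInputStream(in)) {
            ZipEntry entry;
            // getNextEntry() returns null once there are no more entries
            while ((entry = zis.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue;   // folder entries carry no data
                }
                ByteArrayOutputStream data = new ByteArrayOutputStream();
                int size;
                while ((size = zis.read(buffer)) != -1) {
                    data.write(buffer, 0, size);
                }
                contents.put(entry.getName(), data.toByteArray());
            }
        }
        return contents;
    }

    public static void main(String[] args) {
        // my_archive.zip is the archive created earlier in this article
        try (FileInputStream fin = new FileInputStream("my_archive.zip")) {
            for (Map.Entry<String, byte[]> e : readAll(fin).entrySet()) {
                System.out.println(e.getKey() + ": " + e.getValue().length + " bytes");
            }
        } catch (IOException ex) {
            // Covers a missing archive as well as read errors
            System.err.println("Could not read my_archive.zip: " + ex.getMessage());
        }
    }
}
```

Keeping everything in memory is only reasonable for small archives. For large ones you would copy each entry to a file as it is read.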
Java has some other convenient utility classes for working with zip files. Not everything can be discussed in a single article, and I plan to write more articles on this topic in the future. In the meantime, you are advised to keep practicing and to take a look at the official Java documentation for these classes.