MongoDB - Using GridFS
In the last article we looked at indexes in MongoDB. In this article we’ll be learning about one of the coolest features in Mongo: storing any and every file type out there! It’s true! And while we can do this in MySQL, file storage in MongoDB has several important advantages, the most important being that MongoDB is better suited for scalability and for cases where we have more than one machine supporting the application. In cases like this you won’t have to drive yourself crazy trying to distribute files among various computers. Mongo does it for you.
You may not really need the contents of this article. We’re all working with MongoDB via some form of abstraction that the programming language gives us, whether it’s Java, PHP, Node.js, or .NET. At the end of the day, uploading and handling files via that abstraction is pretty straightforward. But it is worthwhile to learn how GridFS works for when something goes wrong.
Dealing with files (and documents over 16MB) is done by GridFS, a feature included with MongoDB whether you’re working in Linux or Windows. It comes with a command-line tool called mongofiles. If you run the following in the Linux console...
mongofiles --version
version 2.6.4
you’ll be able to see the version of the mongofiles tool. If you have Windows, open cmd in the folder where you put your Mongo files and run mongofiles.exe --version. This should do the trick. All the examples here are geared toward Linux; to run them on Windows, just add .exe to mongofiles and they should work. (Or consider making the switch to Linux already. It’s super easy!)
In order to upload a file to MongoDB, we need to start with a file. If you don’t have one ready, just create one with this command:
dd if=/dev/zero of=demo.test bs=524288 count=1
This will create demo.test, a file of half a megabyte. But any ol’ file will do, even an image of Bar Rafaeli (if you need a little early morning SEO). Now that we’ve screwed around with files, we need to upload one to GridFS. This we’ll do via:
mongofiles -d test put demo.test
demo.test is the name of our file. Hold up, what’s going on here? First we’ve got mongofiles, then the -d test option, which is important because it states which database we’ll be putting the file in (in this case, test). Then we’ve got the word put and then the file name.
The result should be something like:
connected to: 127.0.0.1
added file: { _id: ObjectId('541fd00c54af6edf6f9cb818'), filename: "demo.test", chunkSize: 261120, uploadDate: new Date(1411371020875), md5: "59071590099d21dd439896592338bf95", length: 524288 }
done!
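To get a feel for what those numbers mean, here is a small sketch that works out how many chunks a file of this size needs. It only does arithmetic on the length and chunkSize values from the output above and does not talk to MongoDB; the variable names are made up for illustration.

```shell
# Values taken from the "added file" output above.
length=524288      # total file size in bytes
chunk_size=261120  # bytes per chunk (the chunkSize field)

# Ceiling division: how many chunks are needed to hold the whole file.
chunks=$(( (length + chunk_size - 1) / chunk_size ))

# Size of the last chunk, which is usually only partly filled.
last_chunk=$(( length - (chunks - 1) * chunk_size ))

echo "chunks: $chunks, last chunk: $last_chunk bytes"
```

So our half-megabyte file should end up as two full chunks plus a small third one.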
If we go into mongo and have a quick peek at the collections in the db that we put the file in, we’ll find we have two new collections!
fs.chunks
fs.files
The fs.chunks collection contains the parts of the file, as the name suggests. The file is divided into several parts, with each part connected to the file’s metadata in the fs.files collection. If we look at fs.files we’ll see all the files there, or in this case our one single file.
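If the idea of a file being divided into parts feels abstract, the sketch below imitates it on the local disk with the standard split and cat tools. This is only an analogy: GridFS keeps its parts as documents inside the database, not as files, and the part. prefix and the scratch directory are assumptions made for this example.

```shell
# Use a scratch directory so nothing in the current folder is touched.
workdir=$(mktemp -d)

# Recreate the half-megabyte file from earlier in the article.
dd if=/dev/zero of="$workdir/demo.test" bs=524288 count=1 2>/dev/null

# Cut it into 261120-byte pieces named part.aa, part.ab, ...
split -b 261120 "$workdir/demo.test" "$workdir/part."

# Count the pieces and print the size of each one.
parts=$(ls "$workdir"/part.* | wc -l | tr -d ' ')
echo "parts: $parts"
for p in "$workdir"/part.*; do
  echo "$(basename "$p"): $(wc -c < "$p") bytes"
done

# Joining the pieces in order gives back the original file.
cat "$workdir"/part.* > "$workdir/rebuilt.test"
cmp -s "$workdir/demo.test" "$workdir/rebuilt.test" && echo "rebuilt file matches"
```

The metadata document plays the role of the label that tells you which pieces belong together and in what order.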
db.fs.files.find({})
{ "_id" : ObjectId("541fd00c54af6edf6f9cb818"), "filename" : "demo.test", "chunkSize" : 261120, "uploadDate" : ISODate("2014-09-22T07:30:20.875Z"), "md5" : "59071590099d21dd439896592338bf95", "length" : 524288 }
We can also see the size of the chunks that the file is divided into and the total size of the file. What’s important here is the ObjectId.
We can search for the files, and since they have an ObjectId we can easily reference them from other collections. We can also extract a file by its ObjectId, which is useful with text files, but less so with binary files.
And how do we get the file back? Also with mongofiles:
mongofiles get demo.test
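One way to check that a retrieved file is intact is to compare its MD5 hash with the md5 value that was printed at upload time. The sketch below assumes md5sum is installed, and it recreates the same half megabyte of zeros as a stand-in so that it runs without a database; in practice you would point it at the file written by mongofiles get.

```shell
# md5 value reported by mongofiles when demo.test was uploaded.
expected="59071590099d21dd439896592338bf95"

# Stand-in for the retrieved file: the same half megabyte of zeros.
# In practice, use the file written by "mongofiles get" instead.
retrieved=$(mktemp)
dd if=/dev/zero of="$retrieved" bs=524288 count=1 2>/dev/null

# md5sum prints "<hash>  <filename>"; keep only the hash.
actual=$(md5sum "$retrieved" | awk '{print $1}')

if [ "$actual" = "$expected" ]; then
  echo "md5 matches"
else
  echo "md5 differs: got $actual"
fi
```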
To see all the files we can use:
mongofiles list
There are a few other things that we can do with mongofiles (take a few minutes and check out its help section), but most of us won’t need more than this, since if you use GridFS you’ll do so by way of a class that will take care of it for you. What’s important to remember is that in the end, the files go into two collections: one called chunks that contains the parts of the file, and the other called files that contains the data about the actual file, including the ObjectId. In fs.files there is already an automatic index on the filename, which makes the whole process of searching for files faster.
In the next article we’ll learn about importing and exporting files in MongoDB.
About the author: Ran Bar-Zik is an experienced web developer whose personal blog, Internet Israel, features articles and guides on Node.js, MongoDB, Git, SASS, jQuery, HTML 5, MySQL, and more. Translation of the original article by Aaron Raizen.