TL;DR - Skip the theory - Take me to the code
Prerequisites
Note: This article requires a working installation of Node.js on your machine. You will also need an HTTP client for request handling; for this purpose, I will use Postman.
What are streams for Node.js?
Streams are a very basic method of data transmission. In a nutshell, they divide your data into smaller chunks and transfer (pipe) them, one by one, from one place to another. Whenever you're watching a video on Netflix, you're experiencing them first hand - not the whole video is initially sent to your browser, but only parts of it, piece by piece.
A lot of npm and native node modules are using them under the hood, as they come with a few neat features:
- Sending requests and responses asynchronously
- Reading data from, and writing data to, a physical location
- Processing data without loading it all into memory
The processing part makes streams particularly charming, as it makes dealing with bigger files more efficient and embodies the spirit of Node's event loop and non-blocking I/O.
To visualize streams, consider the following example.
You have a single file with a size of 4 GB. When processing this file conventionally, it is loaded into your computer's memory in one piece. That would be quite a boulder to digest all at once.
Buffering means loading data into RAM. Only after the full file has been buffered will it be sent to a server.
Streams, in comparison to the example above, would not read/write the file as a whole, but rather split it into smaller chunks. These can then be sent, consumed or worked through one by one, lowering stress for the hardware during runtime. And that's exactly what we'll build now.
Instead of loading the whole file, streams process parts (chunks) of it one by one.
In a nutshell, streams split a computer resource into smaller pieces and work through these one by one, instead of processing it as a whole.
Get started
... or skip to the full example right away
Let's formulate the features we'd like to have:
- To keep it simple, we will work with a single index file that opens an express server.
- Inside of it, there's a route that reacts to POST requests, in which the streaming will take place.
- The file sent will be uploaded to the project's root directory.
- (Optional): We are able to monitor the streaming progress while the upload takes place.
Also, let's do the following to get started:
- Open up your favourite text editor and create a new folder.
- Initialize a npm project and install the necessary modules.
- Add an index.js file, which we'll populate with our code in a moment.
# Initialize the project
$ npm init -y
# Install the express module
$ npm i express
# Optionally add nodemon as dev dependency
$ npm i -D nodemon
# Create the index.js file
# On Windows PowerShell, use: New-Item index.js
$ touch index.js
When everything is done, you should have a folder structure that looks like this:
project-directory
| - node_modules
| - package.json
| - index.js
Create the server
Add the following to your index.js file to create a server that listens for requests:
// Load the necessary modules and define a port
const app = require('express')();
const fs = require('fs');
const path = require('path');
const port = process.env.PORT || 3000;
// Add a basic route to check if server's up
app.get('/', (req, res) => {
res.status(200).send(`Server up and running`);
});
// Mount the app to a port
app.listen(port, () => {
console.log(`Server running at http://127.0.0.1:${port}/`);
});
Then open the project directory in a terminal / shell and start the server up.
# If you're using nodemon, go with this
# in the package.json:
# { ...
# "scripts": {
# "dev": "nodemon index.js"
# }
# ... }
# Then, run the dev script
$ npm run dev
# Else, start it up with the node command
$ node index.js
Navigate to http://localhost:3000. You should see the expected response.
Writing a basic stream to save data to a file
There are two types of streaming methods - one for reading, and one for writing. A very simplistic example of how to use them looks like this, where whereFrom and whereTo are the respective paths from and to which the stream should operate. This can either be a physical path on your hard drive, a memory buffer or a URL.
const fs = require("fs");
const readStream = fs.createReadStream(whereFrom)
const writeStream = fs.createWriteStream(whereTo)
// You could achieve the same with destructuring:
const {createReadStream, createWriteStream} = require("fs");
After being created and till it closes, the stream emits a series of events that we can use to hook up callback functions. One of these events is 'open', which fires right after the stream is instantiated.
Add the following below the app.get() method in the index.js file:
app.post('/', (req, res) => {
const filePath = path.join(__dirname, `/image.jpg`);
const stream = fs.createWriteStream(filePath);
  stream.on('open', () => req.pipe(stream));
});
What I found particularly interesting about this one is: why does the req argument have a pipe method?
The answer is noted in the documentation of the http module, which express builds on - a request itself is an object that inherits from the parent 'Stream' class and therefore has all of its methods available.
Having added the stream, let us now reload the server, move to Postman and do the following:
- Change the request method to POST and add the URL localhost:3000.
- Select the 'Body' tab, check the binary option and choose a file you would like to upload. As we've hardcoded the name to be 'image.jpg', an actual image would be preferable.
- Click on 'Send' and check back to the code editor.
If everything went well, you'll notice the file you just chose is now available in the project's root directory. Try to open it and check whether the streaming was successful.
If that was the functionality you were looking for, you could stop reading here. If you're curious to see what else a stream has in stock, read ahead.
Use stream events and methods
Streams, after being created, emit events. In the code above, we use the 'open' event to pipe data from the request to its destination only after the stream has opened. These events work very similarly to the ones you know from app.use() and make use of Node's event loop. Let's now take a look at some of them, which can be used to control the code flow.
Event 'open'
As soon as the stream is declared and starts its job, it fires the open event. That is the perfect opportunity to start processing data, just as we've done previously.
Event 'drain'
Whenever a data chunk is processed, it is 'drained' to or from somewhere. You can use this event, for example, to monitor how many bytes have been streamed.
Event 'close'
After all data has been sent, the stream closes. A simple use case for 'close' is to notify a calling function that the file has been completely processed and can be considered available for further operations.
Event 'error'
If things go sideways, the error event can be used to perform an action to catch exceptions.
Let us now integrate the three new events with some basic features. Add the following to your index.js file, below the 'open' event handler:
  stream.on('drain', () => {
    // Calculate how much data has been piped so far
    const written = parseInt(stream.bytesWritten);
    const total = parseInt(req.headers['content-length']);
    const pWritten = ((written / total) * 100).toFixed(2);
    console.log(`Processing ... ${pWritten}% done`);
  });
stream.on('close', () => {
// Send a success response back to the client
const msg = `Data uploaded to ${filePath}`;
console.log('Processing ... 100%');
console.log(msg);
res.status(200).send({ status: 'success', msg });
});
stream.on('error', err => {
// Send an error message to the client
console.error(err);
res.status(500).send({ status: 'error', err });
});
Wrap up & modularization
Since you probably would not drop your functions right into a .post() callback, let's go ahead and wrap the logic in a function of its own to finish this article up. I'll spare you the details; you can find the finalized code below.
Also, if you skipped from above, the following is happening here:
- The code below creates an express server that handles incoming post requests.
- When a client sends a file stream to the route, its contents are uploaded.
- During the upload, four events are fired.
- In these, functions are called to process the file's content and provide basic feedback on the upload progress.
Now it's your turn: how about building a user interface that takes over the job of sending a file to the root path? To make it more interesting, try using the browser's FileReader API and send the file asynchronously, instead of using a form. Or use a module like Sharp to process an image before streaming it back to the client.
PS: In case you try the former method, make sure to send the file as an ArrayBuffer.
// Load the necessary modules and define a port
const app = require('express')();
const fs = require('fs');
const path = require('path');
const port = process.env.PORT || 3000;
// Take in the request & filepath, stream the file to the filePath
const uploadFile = (req, filePath) => {
return new Promise((resolve, reject) => {
const stream = fs.createWriteStream(filePath);
// With the open - event, data will start being written
// from the request to the stream's destination path
stream.on('open', () => {
console.log('Stream open ... 0.00%');
req.pipe(stream);
});
// Drain is fired whenever a data chunk is written.
// When that happens, print how much data has been written yet.
stream.on('drain', () => {
const written = parseInt(stream.bytesWritten);
const total = parseInt(req.headers['content-length']);
const pWritten = ((written / total) * 100).toFixed(2);
console.log(`Processing ... ${pWritten}% done`);
});
// When the stream is finished, print a final message
// Also, resolve the location of the file to calling function
stream.on('close', () => {
console.log('Processing ... 100%');
resolve(filePath);
});
    // If something goes wrong, reject the promise
stream.on('error', err => {
console.error(err);
reject(err);
});
});
};
// Add a basic GET route to check if the server's up
app.get('/', (req, res) => {
res.status(200).send(`Server up and running`);
});
// Add a route to accept incoming post requests for the file upload.
// Also, attach two callback functions to handle the response.
app.post('/', (req, res) => {
const filePath = path.join(__dirname, `/image.jpg`);
uploadFile(req, filePath)
.then(path => res.send({ status: 'success', path }))
.catch(err => res.send({ status: 'error', err }));
});
// Mount the app to a port
app.listen(port, () => {
console.log(`Server running at http://127.0.0.1:${port}/`);
});
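If you attempt the browser exercise suggested above, a client-side sketch might look like the following. This is only one possible approach under stated assumptions: the endpoint URL and the input selector are placeholders, and file.arrayBuffer() is used as a modern shorthand for what the FileReader API does:

```javascript
// Hypothetical client-side upload: read the chosen file as an
// ArrayBuffer and POST its raw bytes to the express server.
// fetchImpl is injectable so the function is easy to test.
async function uploadAsArrayBuffer(file, fetchImpl = fetch) {
  const buffer = await file.arrayBuffer(); // the file's raw binary contents
  const res = await fetchImpl('http://localhost:3000/', {
    method: 'POST',
    body: buffer, // sent as-is; the browser sets Content-Length for us
  });
  return res.json();
}

// In the browser, wire it to a plain file input:
// document.querySelector('input[type="file"]').addEventListener('change',
//   e => uploadAsArrayBuffer(e.target.files[0]).then(console.log));
```

Sending the raw ArrayBuffer (rather than FormData) matters here, because the server above expects the bare binary body without a multipart wrapper.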
This post was originally published at https://q-bit.me/use-node-streams-to-upload-files/
Thank you for reading. If you enjoyed this article, let's stay in touch on Twitter @qbitme
Top comments (10)
This article is really valuable, I also want to ask you, just as @longbotton_dev did, to write more of such amazing articles on Node.js core/fundamentals. I have a few questions though:
Question 1. When I do the above, the file, which is an image, is uploaded successfully, and the size of the upload file matches the original file as well, but when I open the uploaded image (the one on the server) it's just blank. I tried with a few different images to make sure it's not an issue of a certain image.
My code:
Client:
Server:
Question 2. Towards the end of the article you've said:
But according to this SO answer:
So it seems that using FileReader is not interesting at all... or maybe I'm getting it wrong.
I hope you're still checking dev.to. Thanks a lot.
Hi there. Thank you for your reply :-)
I'm still here, will try and replicate your first case.
Good point on the filereader as well. This one was one of the first posts I made when learning Javascript & Node.js and wasn't very familiar with what's good for performance and what's not. If you have an input field available, you don't necessarily need the file reader. I just figured it'd be helpful to include because it'd be the next thing I took a look at.
PS: Out of curiosity: what Node.js core content would you like to read? I thought about writing an article on how the http module works, but I feel like that'd be a bit trivial.
This might be ridiculous (or funny?), but after I added this comment here, I read a chapter on streams from a Node.js book, and now that I check the Node.js docs again, I don't see much more that I personally need to learn about Node.js itself. But I think the main reason I asked for more is that the first part of your article (the theory) was so well explained that it excited me :D and I thought your articles would be valuable for future readers, whatever the topic.
Did you manage to replicate the issue (the first case)?
I also want to recommend a few things about the article:
An <input type="file"> does not load the files into RAM; it is things like fetch that create the stream automatically and internally as soon as they make the request (so there's no need to use the FileReader API explicitly). I know it's not directly related to your article, but realizing this connected some vagueness dots for me personally.
Thanks for your response.
Noted. I do try to make my articles easily graspable. Sometimes it works, sometimes it doesn't.
Yes. All you have to do is leave the form data out. Or implement a form parser on the backend. You're basically handling the raw binary data without the form wrapper. I'm not exactly sure how form data is parsed, but I did change your code so it looks like so and it worked (same server code). I attached the full statically served index.html file:
I found the answer to my third point too, I'll add it here for future readers: based on the documentation, createWriteStream returns an instance of <fs.WriteStream>, which has an 'open' event that is emitted when the <fs.WriteStream>'s file is opened: nodejs.org/api/fs.html#event-open_1
(btw, this is weird, I did the exact same thing as you and passed the data leaving the form data out, but it's still not working for me, but thanks anyway).
Just to be on the same page: many file-upload websites preview the input. I believe the file has to be loaded into RAM for that preview, right? (I mean, if the preview is enabled.)
Yes. Images are always loaded into memory when rendering a page. Instead of instantly uploading the img and providing a link, some pages store images as base64 on the user's computer and permit uploading only after a confirmation.
In case you guys who commented here are still with me - I've written another article on Node.js fundamentals. It somewhat builds on streams. I intend to write more under the series 'Node.js fundamentals'. Again, thank you so much for your feedback, I really appreciate it.
Check out how to implement Server-Sent Events with Node here: dev.to/tqbit/how-to-use-nodejs-for...
Thanks for the article, it is well written!
This is amazing
Please keep making more core node js content