Multithreading and multiprocess in Node.js
Refs
- Node.js Worker Threads
- Deep dive into threads and processes in Node.js
- How do cluster and worker threads work in node.js
Multithreading: Worker thread
What and why
A worker thread lets Node.js execute JavaScript in parallel with the main thread. It is useful for CPU-intensive jobs.
How-to
Create a worker file => create a promise in the caller file => define the message/error/exit handlers
worker.js file:
const { workerData, parentPort } = require('worker_threads')
index.js file:
const { Worker } = require('worker_threads')
Key points:
- The worker uses workerData to receive data from the caller and parentPort to post data back to the caller.
- The worker uses parentPort.postMessage to send data to the caller, and the caller uses on('message') to receive it.
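The steps and key points above can be sketched as follows. This is a minimal illustration, not the original worker.js/index.js: to keep it in one file, the worker source is passed as a string with `eval: true`, and the names `workerSource` and `runWorker` and the Fibonacci job are assumptions made for the example.

```javascript
const { Worker } = require('worker_threads');

// Hypothetical worker body. In a real project this would live in worker.js
// and be loaded with new Worker('./worker.js', { workerData }).
const workerSource = `
  const { workerData, parentPort } = require('worker_threads');
  // Placeholder CPU-bound job: iterative Fibonacci of workerData.n.
  let a = 0, b = 1;
  for (let i = 0; i < workerData.n; i++) {
    [a, b] = [b, a + b];
  }
  // Post the result back to the caller.
  parentPort.postMessage(a);
`;

// Caller side: wrap the worker in a promise and attach the three handlers.
function runWorker(n) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: { n } });
    worker.on('message', resolve); // result sent via parentPort.postMessage
    worker.on('error', reject);    // uncaught exception inside the worker
    worker.on('exit', (code) => {
      // Rejecting after a successful resolve is a no-op.
      if (code !== 0) reject(new Error(`Worker stopped with exit code ${code}`));
    });
  });
}
```

A caller would use it as `runWorker(10).then(console.log)`; the main thread stays free to handle other events while the worker computes.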
Multiprocess: Fork and Cluster
An example of the problem with single-threaded Node.js
Single-threaded Node.js blocks while it handles a time-consuming request.
Example:
const http = require('http');
Calling localhost:3000/compute blocks the server: no other request is handled until the computation finishes.
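A minimal sketch of such a blocking server is shown below. It is not the original example: the function names `slowSum` and `createBlockingServer`, the loop size, and the response text are assumptions chosen for illustration.

```javascript
const http = require('http');

// Hypothetical CPU-bound job: a long synchronous loop.
function slowSum(n) {
  let sum = 0;
  for (let i = 0; i < n; i++) {
    sum += i;
  }
  return sum;
}

// Builds the server without starting it; the caller decides when to listen.
function createBlockingServer() {
  return http.createServer((req, res) => {
    if (req.url === '/compute') {
      // Runs on the main thread, so the event loop is stuck until it returns.
      res.end(`sum is ${slowSum(1e10)}\n`);
    } else {
      res.end('ok\n');
    }
  });
}
```

After `createBlockingServer().listen(3000)`, a request to localhost:3000/compute followed immediately by one to localhost:3000/ should show the second request waiting until the first finishes.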
Pros and cons of single-threaded node.js
Pros:
- Simple, no creation and switching of threads.
- The event loop and non-blocking asynchronous I/O give good performance under high concurrency.
Cons:
- A CPU-intensive calculation may block the entire Node.js app.
- An uncaught error may kill the thread and, with it, the entire app. A daemon process that supervises and restarts the app should be considered.
- A single thread does not take advantage of a multi-core CPU.
Fork
Use child_process.fork to create a new process.
Main.js:
const http = require('http');
compute.js:
const computation = () => {
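The fork pattern can be sketched roughly as follows. This is not the original Main.js/compute.js: the HTTP part is left out, the child script is written to a temporary file only so the example runs from a single file, and the names `childSource` and `computeInChild` are assumptions.

```javascript
const { fork } = require('child_process');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical child script. Normally this would be a separate compute.js.
const childSource = `
  process.on('message', (n) => {
    let sum = 0;
    for (let i = 0; i < n; i++) {
      sum += i;
    }
    process.send(sum); // reply to the parent over the IPC channel
  });
`;

function computeInChild(n) {
  // Write the child script to a temp directory so fork() has a module path.
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'fork-demo-'));
  const script = path.join(dir, 'compute.js');
  fs.writeFileSync(script, childSource);

  return new Promise((resolve, reject) => {
    const child = fork(script);
    child.once('message', (sum) => {
      child.kill(); // the child would otherwise stay alive waiting for messages
      resolve(sum);
    });
    child.once('error', reject);
    child.once('exit', () => {
      fs.rmSync(dir, { recursive: true, force: true });
      // No-op if the promise has already been resolved.
      reject(new Error('child exited before replying'));
    });
    child.send(n); // hand the job to the child process
  });
}
```

Inside a request handler, awaiting `computeInChild(...)` keeps the heavy loop out of the main process, so other requests continue to be served while the child works.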
Cluster
cluster can create worker processes from a single file.
Example:
const http = require('http');
Running the code above, we get:
Master process id is 18428, cpu number 4
These processes are visible in Task Manager.
Now send a request from the browser, and the pid of one of the four server processes is returned.
Refresh several times and different pids may come back (it depends; one unlucky process may end up shouldering all the workload).
Now kill worker 1460 in Task Manager, and we get:
worker process died,id 1460
Refresh the browser and the result is a pid other than 1460.
The server is now much more robust than before: there were four worker processes, and after killing one of them three are still working.
Under the hood, cluster calls the same fork method from the child_process module. Cluster follows a master-worker model, in which the master manages and schedules the workers.
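The cluster setup described in this section can be sketched as below. It is a reconstruction under assumptions, not the original listing: the function names `startPrimary`, `startWorker`, and `main`, the port, and the response text are hypothetical, and the log messages only mirror the output quoted above.

```javascript
const cluster = require('cluster');
const http = require('http');
const os = require('os');

// isPrimary is the newer name; isMaster is kept as a fallback for older Node.js.
const isPrimary = cluster.isPrimary ?? cluster.isMaster;

// Master side: fork one worker per CPU and report workers that die.
function startPrimary(workerCount = os.cpus().length) {
  console.log(`Master process id is ${process.pid}, cpu number ${workerCount}`);
  for (let i = 0; i < workerCount; i++) {
    cluster.fork(); // re-runs this same file as a worker process
  }
  cluster.on('exit', (worker) => {
    console.log(`worker process died,id ${worker.process.pid}`);
  });
}

// Worker side: every worker calls listen() on the same port.
function startWorker(port = 3000) {
  return http
    .createServer((req, res) => {
      res.end(`handled by process ${process.pid}\n`);
    })
    .listen(port);
}

// Entry point: the same file acts as master or worker depending on its role.
function main() {
  if (isPrimary) {
    startPrimary();
  } else {
    startWorker();
  }
}
```

Calling `main()` at the bottom of the file starts the cluster. This sketch only logs a dead worker; calling `cluster.fork()` again inside the `exit` handler would be one way to replace it.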
Why is there no Error: EADDRINUSE when multiple processes listen on the same port?
The child processes aren’t listening on the same port. Incoming socket connections to the master process are delegated to the child processes. There is special handling for clustered processes in server.listen(): in some circumstances it calls a method named listenInCluster(). See explanation here
Multithreading vs multiprocess
cluster
- One process is launched on each CPU and can communicate via IPC.
- Each process has its own memory and its own Node (V8) instance. Creating a large number of them may cause memory issues.
- Great for spawning many HTTP servers that share the same port, because the master process distributes incoming requests among the child processes.
worker threads
- One process total
- Creates multiple threads, each with its own Node instance (one event loop, one JS engine). Most Node APIs are available to each thread, with a few exceptions. So essentially Node is embedding itself and creating a new thread.
- Can share memory with other threads (e.g. via SharedArrayBuffer).
- Great for CPU-intensive tasks like processing data. Because Node.js runs JavaScript on a single thread, long synchronous tasks can be moved to workers so they do not block the main thread. Workers help little with I/O-bound work such as file system access, which Node.js already handles asynchronously.
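The shared-memory point can be illustrated with a small sketch in which several workers increment one counter stored in a SharedArrayBuffer. The names `workerSource` and `countWithWorkers` are assumptions, and the worker source is passed as a string with `eval: true` only to keep the example in one file.

```javascript
const { Worker } = require('worker_threads');

// Hypothetical worker body: increments a shared counter atomically.
const workerSource = `
  const { workerData, parentPort } = require('worker_threads');
  const counter = new Int32Array(workerData.shared);
  for (let i = 0; i < workerData.increments; i++) {
    Atomics.add(counter, 0, 1); // atomic, so concurrent updates are not lost
  }
  parentPort.postMessage('done');
`;

function countWithWorkers(workerCount, increments) {
  // One 32-bit slot, visible to every thread that receives the buffer.
  const shared = new SharedArrayBuffer(Int32Array.BYTES_PER_ELEMENT);
  const counter = new Int32Array(shared);

  const jobs = [];
  for (let i = 0; i < workerCount; i++) {
    jobs.push(new Promise((resolve, reject) => {
      // A SharedArrayBuffer in workerData is shared with the worker, not copied.
      const worker = new Worker(workerSource, {
        eval: true,
        workerData: { shared, increments },
      });
      worker.once('message', resolve);
      worker.once('error', reject);
    }));
  }
  // Read the final value once every worker has reported back.
  return Promise.all(jobs).then(() => Atomics.load(counter, 0));
}
```

With cluster or fork, the same counter would have to be kept in one process and updated through IPC messages, since each process has its own memory.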
https://blog-cdt1.vercel.app/2022/07/15/Multithreading-and-multiprocess-in-Node-js/