> [!IMPORTANT]
> Recently, I started working on the fourth generation of Jenny, a Telegram bot with a quirky personality. Jenny can see the environment around her through the lens of a 2 MP USB webcam. It has always been my dream to build a robot that can recognize me and interact with me. That's why I have been exploring ways to use facial recognition in Node.js. This is a basic demonstration of how I intend to integrate it into my robot.
Luckily, I found this YouTube video by Robert Bunch (@robertbunch), in which he briefly introduced me to face-api.js and TensorFlow.js.
Later, I found that face-api.js is no longer maintained; the last update was about four years ago. So, I switched to one of its forks by Vladimir Mandic (@vladmandic). However, that repository also seems to be archived at the time of writing.
Run the following command in the terminal:

```
npm install
```
This will install the following NPM dependencies:

- `@vladmandic/face-api`: One of the forks of face-api.js
- `canvas`: Required for polyfilling `Canvas` in Node.js
- `@tensorflow/tfjs` and `@tensorflow/tfjs-node`: Required for this fork of face-api.js
- `node-screenshots`: Required for taking screenshots using Node.js (used for testing)
- `node-webcam`: Required for capturing the webcam feed using Node.js (used for testing)

Update: This will also install these additional NPM packages:

- `express`: Required for creating a web server for the webcam demonstration
- `open`: Required for opening the webcam demonstration in the browser
- `colorjs.io`: Required for manipulating the colors of the detection labels
- `html_colors`: Required to get a list of predefined colors
Before testing the demo, we need to train the model using our own images. To do this, we can create one or more subfolders in the `data` directory, each containing images of a specific person's face. For instance, if we want to train the model for two people named Kaniz and Shahriar, the directory structure would be as follows:
```
/
└── 📁 data
    ├── 📁 kaniz
    └── 📁 shahriar
```
These subfolders can contain one or more supported images (`jpg`/`jpeg` or `png`) with the person's face visible:
```
/
└── 📁 data
    ├── 📁 kaniz
    │   ├── 📄 DSC_0233.jpg
    │   ├── 📄 FB_IMG_1715114037413.jpg
    │   ├── 📄 IMG_20240711_224231_495.jpg
    │   ├── 📄 basis_club_pfp.png
    │   └── ...
    └── 📁 shahriar
        ├── 📄 IMG_20240715_214759_834.jpg
        ├── 📄 IMG_20240718_111745_269.jpg
        ├── 📄 IMG_20240715_001924_628.jpg
        ├── 📄 IMG_20240715_062111_313.jpg
        ├── 📄 IMG_20240507_014108_611.jpg
        └── ...
```
> [!NOTE]
> The images can have any name; it is NOT necessary to rename them as `img_01.jpg`, `img_02.jpg`, and so on.
When the files are in place, we can run:

```
node train.js
```
It may take some time to train the model, depending on the number of images. Once the model is trained, it will create a `trained.json` file containing the descriptors:
```json
[
  {
    "label": "kaniz",
    "descriptors": [[...], [...], ...]
  },
  {
    "label": "shahriar",
    "descriptors": [[...], [...], ...]
  }
]
```
> [!NOTE]
>
> - It is possible to train the model with ONLY 1 image per person, but to improve accuracy, it is recommended to use 20-30 images.
> - Each image should contain a single well-lit and clearly visible face. Higher-resolution photos (e.g., 2 MP/4 MP) may improve detection accuracy.
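For reference, here is a minimal sketch of what `train.js` might do under the hood: detect one face per image, collect the 128-dimensional descriptors per label, and dump everything to `trained.json`. The `./model` weights folder is an assumption, not taken from the repo:

```js
// Training sketch (assumptions: model weights in ./model, images in ./data/<label>/)
require("@tensorflow/tfjs-node") // register the native TensorFlow backend
const fs = require("fs")
const path = require("path")
const { Canvas, Image, ImageData, loadImage } = require("canvas")
const faceapi = require("@vladmandic/face-api")

// face-api expects the browser's Canvas classes; polyfill them with node-canvas
faceapi.env.monkeyPatch({ Canvas, Image, ImageData })

async function train() {
  await faceapi.nets.ssdMobilenetv1.loadFromDisk("./model")
  await faceapi.nets.faceLandmark68Net.loadFromDisk("./model")
  await faceapi.nets.faceRecognitionNet.loadFromDisk("./model")

  const output = []
  for (const label of fs.readdirSync("data")) {
    const descriptors = []
    for (const file of fs.readdirSync(path.join("data", label))) {
      if (!/\.(jpe?g|png)$/i.test(file)) continue // supported formats only
      const img = await loadImage(path.join("data", label, file))
      // one 128-dimensional descriptor per detected face
      const result = await faceapi
        .detectSingleFace(img)
        .withFaceLandmarks()
        .withFaceDescriptor()
      if (result) descriptors.push(Array.from(result.descriptor))
    }
    output.push({ label, descriptors })
  }
  fs.writeFileSync("trained.json", JSON.stringify(output))
}

train()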
Once the model is successfully trained, we can test the demonstration.

Create an image named `test.jpg` in the current folder to detect faces from the image. Then run:

```
node .
```
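Conceptually, the recognition step boils down to rebuilding a `FaceMatcher` from `trained.json` and matching each detected face against it. A minimal sketch, assuming the same `./model` path as above and a common 0.6 distance threshold (both assumptions):

```js
// Recognition sketch for a still image
require("@tensorflow/tfjs-node")
const fs = require("fs")
const { Canvas, Image, ImageData, loadImage } = require("canvas")
const faceapi = require("@vladmandic/face-api")

faceapi.env.monkeyPatch({ Canvas, Image, ImageData })

async function main() {
  await faceapi.nets.ssdMobilenetv1.loadFromDisk("./model")
  await faceapi.nets.faceLandmark68Net.loadFromDisk("./model")
  await faceapi.nets.faceRecognitionNet.loadFromDisk("./model")

  // rebuild labeled descriptors from trained.json (descriptors must be Float32Array)
  const trained = JSON.parse(fs.readFileSync("trained.json", "utf8"))
  const labeled = trained.map(
    (p) =>
      new faceapi.LabeledFaceDescriptors(
        p.label,
        p.descriptors.map((d) => new Float32Array(d))
      )
  )
  const matcher = new faceapi.FaceMatcher(labeled, 0.6) // 0.6 = max match distance

  const img = await loadImage("test.jpg")
  const detections = await faceapi
    .detectAllFaces(img)
    .withFaceLandmarks()
    .withFaceDescriptors()
  for (const d of detections) {
    console.log(matcher.findBestMatch(d.descriptor).toString()) // e.g. "shahriar (0.42)"
  }
}

main()
```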
It is possible to detect faces from whatever is currently visible on the screen. To do this, change the `source` variable to `"screenshot"` in the `index.js` file. Then run `node .`
If you want to detect faces within a specific window, you need to provide a `sourceID`. We can obtain a list of available windows by running:

```
node metadata.js
```

Copy the ID of your desired window and paste it into the `sourceID` variable. If the `sourceID` is `null`, the entire monitor will be used to detect faces.
> [!IMPORTANT]
> There is a better way to detect faces from webcam video.

We can use the `navigator.mediaDevices.getUserMedia(...)` function to get webcam video directly in the browser. To see the webcam demo, run:

```
node webcam.js
```
You can switch to your preferred webcam by setting the `webcamID` variable in the `public/script.js` file. There are a couple of other video options that can be adjusted. Here are the defaults:
```js
const webcamID = 2 // can be obtained by running `node metadata.js`
const portrait = true // set to false for landscape mode
const ratio = portrait ? 9 / 16 : 16 / 9 // aspect ratio
const height = 720 // can be 720, 1080, 1440
const frameRate = 30
```
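These options presumably feed into the `getUserMedia()` constraints. A sketch of how that could look, assuming `webcamID` indexes the browser's video inputs (the real `public/script.js` may wire this differently):

```js
// Browser-side webcam sketch; uses the constants defined above
async function startWebcam(videoElement) {
  // pick the webcamID-th video input reported by the browser
  const devices = await navigator.mediaDevices.enumerateDevices()
  const cams = devices.filter((d) => d.kind === "videoinput")
  const stream = await navigator.mediaDevices.getUserMedia({
    video: {
      deviceId: cams[webcamID] && cams[webcamID].deviceId,
      height: { ideal: height },
      aspectRatio: { ideal: ratio },
      frameRate: { ideal: frameRate },
    },
  })
  videoElement.srcObject = stream // render the live feed in a <video> element
}
```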
Original: It is also possible to detect faces from the webcam feed. To do this, change the `source` variable to `"webcam"` in the `index.js` file. Get a list of available webcams by running `node metadata.js` and then update the `webcamDeviceID` variable.
In any case, it will generate a `result.jpg` file and draw boxes around the recognized faces.
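The box drawing itself can be done with node-canvas plus face-api's drawing helpers. Continuing the recognition sketch from earlier (`img`, `detections`, `matcher`, `fs`, and `faceapi` as defined there):

```js
// Drawing sketch: paint labeled boxes onto a copy of the input and save result.jpg
const { createCanvas } = require("canvas")

const out = createCanvas(img.width, img.height)
out.getContext("2d").drawImage(img, 0, 0)

for (const d of detections) {
  const label = matcher.findBestMatch(d.descriptor).toString()
  // DrawBox renders a rectangle with an attached text label
  new faceapi.draw.DrawBox(d.detection.box, { label }).draw(out)
}

fs.writeFileSync("result.jpg", out.toBuffer("image/jpeg"))
```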
We can obtain a list of available devices by running:

```
node metadata.js
```
It should yield something similar to the following:

```
✅ 1 monitor found
🔷 ID: 65537 \\.\DISPLAY1 (Primary)

✅ 4 windows found
🔷 ID: 65742 (Windows Explorer)
🔷 ID: 853460 (MPC-HC x64)
🔷 ID: 460142 (Visual Studio Code)
🔷 ID: 1246018 (Firefox)

✅ Available cameras:
🔷 ID: "1"
🔷 ID: "2"
🔷 ID: "3"
```
Here are a few key points I would like to address:

- face-api.js (and its forks) depends on node-canvas for polyfilling `Canvas` in Node.js. I did not need to build it from source because I was on a Windows machine. However, if you plan to use it on an unsupported machine (e.g., Raspberry Pi), you might need to build it from source. Click here for instructions.
- `@vladmandic/face-api` requires both `@tensorflow/tfjs` and `@tensorflow/tfjs-node` to work correctly. We must install the same version of both packages.
- If you encounter any errors while importing `@tensorflow/tfjs`, you might need to copy these two files:

  ```
  node_modules/@tensorflow/tfjs-node/deps/lib/tensorflow.dll
  node_modules/@tensorflow/tfjs-node/deps/lib/tensorflow.lib
  ```

  and paste them into:

  ```
  node_modules/@tensorflow/tfjs-node/lib/napi-v8
  ```

  (A small copy helper is sketched after this list.)
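If you prefer to script that copy step instead of doing it by hand, a one-off helper could look like this (paths taken from the note above):

```js
// One-off helper: copy the TensorFlow binaries next to the napi bindings
const fs = require("fs")
const path = require("path")

const src = "node_modules/@tensorflow/tfjs-node/deps/lib"
const dest = "node_modules/@tensorflow/tfjs-node/lib/napi-v8"
for (const file of ["tensorflow.dll", "tensorflow.lib"]) {
  fs.copyFileSync(path.join(src, file), path.join(dest, file))
}
```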
The source code is licensed under the MIT License.