Welcome to my weekend project, KhataLook!
For those unfamiliar with Hindi, "KhataLook" combines two words: Khata, the Hindi word for ledger, and Look, as in to search. So, KhataLook translates to "looking through the ledger", but with a modern twist.
Borrowed money is a significant challenge for retail shop owners, and keeping track of it can be a cumbersome task. This inspired me to create KhataLook, a face recognition system designed to simplify this process. With KhataLook, shop owners can register faces of borrowers in the system, and when a recognized face enters the store, the system automatically announces the amount pending 💀
So, without procrastinating for too long, I built a prototype for this idea, and here it is!
GitHub Repository: KhataLook
Made the Logo using Canva
Project Overview
KhataLook relieves shop owners of the burden of remembering everyone who owes them money.
For the prototype, I decided to build a web application.
Tech Stack:
- Frontend: React
- Face Recognition: face-api.js
- Database: Cloud Firestore
- Text-to-Speech conversion: Google Cloud TTS API
Project Setup and Dependencies
As soon as I initialized the React project, the first step was to install the dependencies:
- Material UI (I just love it)
- Axios
- face-api.js
- Firebase
- React-Router
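If you want to grab them all in one go, the command looks something like this (assuming the usual npm package names, with Emotion as Material UI's styling engine and react-router-dom for the web):

npm install @mui/material @emotion/react @emotion/styled axios face-api.js firebase react-router-dom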
Database Setup
I chose to go with a NoSQL database, and as I am a Google Cloud enthusiast, I went ahead with Cloud Firestore.
- Created a Firebase project
- Went to Run > Cloud Firestore and created a database
My database was ready within minutes
- Went to Project Settings > Add app and copied my Firebase Configuration
// Sample Configuration
import { initializeApp } from "firebase/app";
import { getFirestore } from "firebase/firestore";

const firebaseConfig = {
  apiKey: "YOUR_API_KEY",
  authDomain: "YOUR_PROJECT_ID.firebaseapp.com",
  projectId: "YOUR_PROJECT_ID",
  storageBucket: "YOUR_PROJECT_ID.appspot.com",
  messagingSenderId: "YOUR_MESSAGING_SENDER_ID",
  appId: "YOUR_APP_ID",
};

const app = initializeApp(firebaseConfig);
const db = getFirestore(app);

export { db };
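For context, every borrower ends up stored as a document in a users collection. Based on the registration helper shown later, each document looks roughly like this (the values here are made up):

// Shape of a document in the 'users' collection
{
  name: "Ramesh",                // borrower's name (hypothetical value)
  mobileNumber: "9876543210",    // contact number (hypothetical value)
  amount_pending: 500,           // outstanding amount in rupees
  faceDescriptor: [0.12, -0.04 /* ... */], // 128-dimensional face embedding, stored as a plain array
}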
Face Registration
- Started by downloading the models and putting them in the public/models folder
- Loaded the models

// Load face-api.js models
import * as faceapi from 'face-api.js';

export const loadModels = async () => {
  await faceapi.nets.ssdMobilenetv1.loadFromUri('/models');
  await faceapi.nets.faceLandmark68Net.loadFromUri('/models');
  await faceapi.nets.faceRecognitionNet.loadFromUri('/models');
};
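In a React component, one way to wire this up (my sketch, not necessarily how the repo does it) is to load the models once on mount:

import { useEffect } from 'react';

// Inside the component: load the models once when it mounts
useEffect(() => {
  loadModels().catch((err) => console.error('Error loading models: ', err));
}, []);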
- Fired up the video stream
const startVideo = () => {
  navigator.mediaDevices
    .getUserMedia({ video: true })
    .then((stream) => {
      videoRef.current.srcObject = stream;
      videoRef.current.play();
      setCapturing(true);
    })
    .catch((err) => {
      console.error("Error accessing the webcam: ", err);
    });
};
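A matching stop function is worth having so the browser releases the camera when you're done; here's a small sketch of my own (not from the original code):

const stopVideo = () => {
  const stream = videoRef.current?.srcObject;
  if (stream) {
    // Stop every track so the browser releases the webcam
    stream.getTracks().forEach((track) => track.stop());
    videoRef.current.srcObject = null;
  }
  setCapturing(false);
};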
- Helper function to push data
import { addDoc, collection } from 'firebase/firestore';
import { db } from './firebase'; // adjust to wherever db is exported

export const registerFace = async (name, mobileNumber, amount_pending, descriptor) => {
  try {
    await addDoc(collection(db, 'users'), {
      name,
      mobileNumber,
      amount_pending,
      faceDescriptor: Array.from(descriptor), // Convert Float32Array to a plain array for Firestore
    });
    alert('Face registered successfully!');
  } catch (e) {
    console.error('Error adding document: ', e);
  }
};
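The descriptor passed into registerFace comes from face-api.js. Here's a minimal capture sketch, assuming the standard face-api.js detection chain (the repo may do this slightly differently):

// Detect a single face in the current video frame and compute its descriptor
const captureDescriptor = async () => {
  const detection = await faceapi
    .detectSingleFace(videoRef.current)
    .withFaceLandmarks()
    .withFaceDescriptor();
  if (!detection) return null; // no face in frame
  return detection.descriptor;  // Float32Array of length 128
};

// Usage:
// const descriptor = await captureDescriptor();
// if (descriptor) await registerFace(name, mobileNumber, amountPending, descriptor);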
- And ta-da!
Face Recognition
- Fired up the video stream from the camera, pulled all the registered faces from the database, and searched through them
const recognizeFace = async () => {
  // Draw the current video frame onto the canvas
  const context = canvasRef.current.getContext('2d');
  context.drawImage(videoRef.current, 0, 0, canvasRef.current.width, canvasRef.current.height);

  // detectFace is a helper that returns the descriptor for the detected face (or null)
  const descriptor = await detectFace(canvasRef.current);
  if (!descriptor) return;

  // Compare the detected face with registered faces
  const faceMatcher = new faceapi.FaceMatcher(
    registeredFacesRef.current.map(
      (face) => new faceapi.LabeledFaceDescriptors(face.name, [face.faceDescriptor])
    ),
    0.6 // Distance threshold for a match
  );

  const bestMatch = faceMatcher.findBestMatch(descriptor);
  if (bestMatch.label !== 'unknown') {
    const recognizedFace = registeredFacesRef.current.find((face) => face.name === bestMatch.label);
    setRecognizedName(recognizedFace.name);
    setAmountPending(recognizedFace.amount_pending);
    playAudioMessage(recognizedFace.name, recognizedFace.amount_pending);
  } else {
    setRecognizedName('Face not recognized');
    setAmountPending(0);
  }
};
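The snippet above assumes registeredFacesRef has already been populated from Firestore. A rough sketch of that fetch (my assumption, using the standard modular Firestore reads) could look like this:

import { collection, getDocs } from 'firebase/firestore';
import { db } from './firebase'; // adjust to wherever db is exported

// Pull all registered borrowers and convert the stored arrays back to Float32Array
const fetchRegisteredFaces = async () => {
  const snapshot = await getDocs(collection(db, 'users'));
  registeredFacesRef.current = snapshot.docs.map((doc) => {
    const data = doc.data();
    return { ...data, faceDescriptor: new Float32Array(data.faceDescriptor) };
  });
};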
- And here's the result!
Audio Announcement
- Went straight to the Google Cloud Text-to-Speech API and created an API key
- Pasted the API key into this helper function
import axios from 'axios';

const getSpeechAudio = async (text) => {
  try {
    const response = await axios.post(
      `https://texttospeech.googleapis.com/v1/text:synthesize?key=YOUR_API_KEY_HERE`,
      {
        input: { text: text },
        voice: { languageCode: 'hi-IN', ssmlGender: 'FEMALE' }, // Language and gender of the voice
        audioConfig: { audioEncoding: 'MP3' },
      }
    );

    // The API returns base64-encoded audio; decode it into an ArrayBuffer
    const binaryString = window.atob(response.data.audioContent);
    const binaryLen = binaryString.length;
    const bytes = new Uint8Array(binaryLen);
    for (let i = 0; i < binaryLen; i++) {
      bytes[i] = binaryString.charCodeAt(i);
    }
    return bytes.buffer;
  } catch (error) {
    console.error('Error generating speech:', error);
    return null;
  }
};
- Finally played the audio
const playAudioMessage = async (name, amount) => {
  // This message is in Hindi; modify it as per your preference
  const message = `${name} Ji, aapke ${amount} rupye udhaar hai.`;

  const audioContent = await getSpeechAudio(message);
  if (audioContent && audioRef.current) { // Check that the audio element and content are available
    const audioBlob = new Blob([audioContent], { type: 'audio/mp3' });
    const audioUrl = URL.createObjectURL(audioBlob);
    audioRef.current.src = audioUrl;
    audioRef.current.play();
  } else {
    console.error('Audio element is not available or audio content is invalid.');
  }
};
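For completeness, audioRef just needs a plain audio element rendered somewhere in the component's JSX (my assumption, since that part isn't shown here):

<audio ref={audioRef} />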
Clone the project, set it up on your local system, listen to the audio, and try the application for yourself.
Conclusion
While this was just a simple prototype of a random, spontaneous idea, it could definitely be taken further.
If anyone is interested in doing so, feel free to contribute by submitting a pull request.
I’d love to hear your thoughts in the comments! Let me know if you think this idea is feasible.
It could turn out to be just another fun project, or maybe even a popular product!