In part 5 we've created our first QR Code. Our code is not complete, but rather it's specific to our case (version 2, byte mode) and we've applied a fixed mask, violating the specification that tell us to choose the best mask among the predefined 8.
Anyway, this is the result:
In order to choose the best mask, we need to compute the penalty score that we get from each one, and select the mask with the lowest score. The penalty score is the sum of the penalties obtained using 4 rules, as follows.
Rule 1
The first rule says that each sequence of 5 or more consecutive dark/light modules in a row or column gets a penalty of the length of the sequence minus 2.
This is the penalties for the QR Code above just for the horizontal sequences:
It sums up to 102. When adding up the penalties for vertical sequences, we should get 138 more, for a total score of 240 for rule 1.
In code
Let's start with a pretty straightforward helper function to compute the penalty in a certain sequence of modules according to rule 1:
function getLinePenalty(line) {
let count = 0;
let counting = 0; // To keep trick of which modules we're counting
let penalty = 0;
for (const cell of line) {
if (cell !== counting) {
counting = cell;
count = 1;
} else {
count++;
if (count === 5) {
penalty += 3;
} else if (count > 5) {
penalty++;
}
}
}
return penalty;
}
Now we're going to use it to compute the total penalty score of rows and columns (remember that matrix
is supposed to be an array of Uint8Array
s containing either 0
or 1
- our QR Code):
let totalPenalty = 0;
const rowPenalty = matrix.reduce((sum, row) => sum + getLinePenalty(row), 0);
totalPenalty += rowPenalty;
const columnPenalty = matrix.reduce((sum, _, columnIndex) => {
const column = matrix.map(row => row[columnIndex]);
return sum + getLinePenalty(column);
}, 0);
totalPenalty += columnPenalty;
Rule 2
The second rule says that each rectangular region of dark/light modules of size m×n gets a penalty of 3×(m - 1)×(n - 1).
… ok. But how should we identify such rectangles, if there are several ways to split a certain area?
Fortunately, we have an equivalent strategy: just add a penalty of 3 for each 2×2 square of dark/light modules, including overlapping ones.
There are 60 of such 2×2 squares in the picture above, for a total penalty score of 180.
In code
This rule is quite simple: all you have to do is checking if three adjacent modules are equal to the current one:
let blocks = 0;
const size = matrix.length
for (let row = 0; row < size - 1; row++) {
for (let column = 0; column < size - 1; column++) {
const module = matrix[row][column];
if (
matrix[row][column + 1] === module &&
matrix[row + 1][column] === module &&
matrix[row + 1][column + 1] === module
) {
blocks++;
}
}
}
totalPenalty += blocks * 3;
Rule 3
The third rules says that each sequence of dark-light-dark-dark-dark-light-dark-light-light-light-light modules (⬛⬜⬛⬛⬛⬜⬛⬜⬜⬜⬜) or its reverse (⬜⬜⬜⬜⬛⬜⬛⬛⬛⬜⬛), found on any row or column, adds a penalty of 40.
I have no idea! Seriously, if someone has any info about this, please speak up! As user @latestversion points out in the comments, the reason can be found in section 7.8.1 of the QR Code ISO 18004 specification:
The module pattern 1011101 particularly found in the finder pattern should be avoided in other areas of the symbol as much as possible.
Anyway, here are said patterns highlighted in our QR Code:
Four patterns, for a penalty of 160.
In code
First of all, let's put those sequences in constants:
const RULE_3_PATTERN = new Uint8Array([1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0]);
const RULE_3_REVERSED_PATTERN = RULE_3_PATTERN.slice().reverse();
Then, for each row and column, let's check how many times they contain the pattern or its reverse:
let patterns = 0;
for (let index = 0; index < size; index++) {
// Checking the rows...
const row = matrix[index];
for (let columnIndex = 0; columnIndex < size - 11; columnIndex++) {
if ([RULE_3_PATTERN, RULE_3_REVERSED_PATTERN].some(
pattern => pattern.every(
(cell, ptr) => cell === row[columnIndex + ptr]
)
)) {
patterns++;
}
}
// Checking the columns...
for (let rowIndex = 0; rowIndex < size - 11; rowIndex++) {
if ([RULE_3_PATTERN, RULE_3_REVERSED_PATTERN].some(
pattern => pattern.every(
(cell, ptr) => cell === matrix[rowIndex + ptr][index]
)
)) {
patterns++;
}
}
}
totalPenalty += patterns * 40;
Rule 4
Rule 4 is mostly computational. Follow these steps:
- compute the percentage of dark modules;
- if the percentage is greater than 50, round down to the nearest multiple of 5; otherwise, round it up;
- subtract 50 and double the absolute value of the difference: that's our penalty for rule 4.
In our case, 50.4% of the modules (315 out of 625) are dark, so we round down to 50, subtract 50 and double the difference: it's 0.
If we had, for example, a percentage of 42%, we'd have rounded up to 45%, then get a penalty of 10. For 67%, we'd round down to 65% and get a penalty of 30.
Note that you can actually make the computation based on the light modules instead of the dark: it's the same thing, if you check the formula.
In code
First, let's compute the amount of dark (or light) modules:
const totalModules = size * size;
const darkModules = matrix.reduce(
(sum, line) => sum + line.reduce(
(lineSum, cell) => lineSum + cell
, 0)
, 0);
const percentage = darkModules * 100 / totalModules;
In order to round down to the nearest multiple of n, we divide by n, round down to the nearest integer and multiply back by n. It's similar when rounding up.
const roundedPercentage = percentage > 50
? Math.floor(percentage / 5) * 5
: Math.ceil(percentage / 5) * 5;
Finally, let's compute the penalty score:
const mixPenalty = Math.abs(roundedPercentage - 50) * 2;
totalPenalty += maxPenalty;
Since basically Math.trunc = x => x > 0 ? Math.floor(x) : Math.ceil(x)
(MDN, we can come up with a more compact formula like this:
const mixPenalty = Math.abs(Math.trunc(percentage / 5 - 10)) * 10;
The complete penalty score function
Let's gather all the code above in a single function:
const RULE_3_PATTERN = new Uint8Array([1, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0]);
const RULE_3_REVERSED_PATTERN = RULE_3_PATTERN.slice().reverse();
function getLinePenalty(line) {
let count = 0;
let counting = 0;
let penalty = 0;
for (const cell of line) {
if (cell !== counting) {
counting = cell;
count = 1;
} else {
count++;
if (count === 5) {
penalty += 3;
} else if (count > 5) {
penalty++;
}
}
}
return penalty;
}
function getPenaltyScore(matrix) {
let totalPenalty = 0;
// Rule 1
const rowPenalty = matrix.reduce(
(sum, row) => sum + getLinePenalty(row)
, 0);
totalPenalty += rowPenalty;
const columnPenalty = matrix.reduce((sum, _, columnIndex) => {
const column = matrix.map(row => row[columnIndex]);
return sum + getLinePenalty(column);
}, 0);
totalPenalty += columnPenalty;
// Rule 2
let blocks = 0;
const size = matrix.length
for (let row = 0; row < size - 1; row++) {
for (let column = 0; column < size - 1; column++) {
const module = matrix[row][column];
if (
matrix[row][column + 1] === module &&
matrix[row + 1][column] === module &&
matrix[row + 1][column + 1] === module
) {
blocks++;
}
}
}
totalPenalty += blocks * 3;
// Rule 3
let patterns = 0;
for (let index = 0; index < size; index++) {
const row = matrix[index];
for (let columnIndex = 0; columnIndex < size - 11; columnIndex++) {
if ([RULE_3_PATTERN, RULE_3_REVERSED_PATTERN].some(
pattern => pattern.every(
(cell, ptr) => cell === row[columnIndex + ptr]
)
)) {
patterns++;
}
}
for (let rowIndex = 0; rowIndex < size - 11; rowIndex++) {
if ([RULE_3_PATTERN, RULE_3_REVERSED_PATTERN].some(
pattern => pattern.every(
(cell, ptr) => cell === matrix[rowIndex + ptr][index]
)
)) {
patterns++;
}
}
}
totalPenalty += patterns * 40;
// Rule 4
const totalModules = size * size;
const darkModules = matrix.reduce(
(sum, line) => sum + line.reduce(
(lineSum, cell) => lineSum + cell
, 0)
, 0);
const percentage = darkModules * 100 / totalModules;
const mixPenalty = Math.abs(Math.trunc(percentage / 5 - 10)) * 10;
return totalPenalty + mixPenalty;
}
Total penalty score
The total penalty score in our case is therefore 240 + 180 + 160 + 0 = 580. New we need to find which mask yields the lowest penaly, and this function should do the trick:
function getOptimalMask(version, codewords, errorLevel) {
let bestMatrix;
let bestScore = Infinity;
let bestMask = -1;
for (let index = 0; index < 8; index++) {
const matrix = getMaskedQRCode(version, codewords, errorLevel, index);
const penaltyScore = getPenaltyScore(matrix);
if (penaltyScore < bestScore) {
bestScore = penaltyScore;
bestMatrix = matrix;
bestMask = index;
}
}
return [bestMatrix, bestMask];
}
The other masks yield a penalty score of, respectively, 495, 415, 575, 597, 579, 472, 779. So the best mask to apply is #2, with this final result:
And that's it! We finally have our final QR Code 🎉. But still, we assumed quite some things and cut corners in order to reach a result as soon as possible (and still needed 5-6 parts anyway!):
- the content is a plain Latin-1 encoded string;
- the length of the content fits in a smaller QR Code;
- no need to split the data in multiple code blocks.
In the next parts, we're going to solve these issue, so we can actually develop a fully functional QR Code generator.
Top comments (10)
Presumably rule 3 is to try to remove patterns in the QR code that might look like the finder pattern. Ie, if you take a 1-pixel-high cross section of the finder pattern it will look like 1011101. Note that in the spec it talks about ratios, not the actual pattern. So presumably you are also supposed to check for 11001111110011 and so on. For example there could be a huge scaled-up finder pattern in the middle of the QR code that would confuse the reader, but this rule would penalize that.
It’s not totally clear to me from the spec whether the quiet zone around the QR code is supposed to count towards the 4 modules of white space, but I think it would make a lot of sense if it did.
Oh and btw, yes, rule 3 iseems very much to be there to prevent finder pattern-like constructs. Section 7.8.1: "The module pattern 1011101 particularly found in the finder pattern should be avoided in other areas of the symbol as much as possible."
Good catch! I've updated the article accordingly.
Your reflection on ratios is very interesting. Given how that wording stands out compared to the rest of the formulations in the standard, I think you are right.
Interestingly, the author of Project Nayuki (nayuki.io/page/creating-a-qr-code-...) has taken the approach of including the quiet zone in the evaluation. It makes sense, as you say, but at the same time I think it would have been mentioned in the standard if it was deemed necessary. While things still pop up that I have missed, I have not yet seen anything hinting towards including the quiet zones being necessary.
Overall, I think the standard is too vague on a range of issues.
ETA: Although I think excluding quiet zones is totally justified as the standard to my knowledge never mentions them in regards to masking.
I didn't consider the quiet zone as part of the field to search for pattern, since it would produce 18 matches out of the box for any QR code, because of the finder patterns. Also because the quiet zone is there just to make it easier for reader to identify a QR code, nothing more. It delivers no information, and while it's specified that it should be 4 modules wide at least, common readers are fine with narrower borders.
But it might actually create legit matches of the pattern, soooo... Unclear? Fortunately it's all for just an optimization and doesn't affect the correctness of the resulting matrix.
Edit: thinking about it again, that specific sequence would actually match the middle part of the finder patterns including the quiet zone, and we don't want to find it in other parts. If that's the point, we don't want it to be anywhere in the final result except for the finder patterns, and that would actually include the quiet zone.
So, in the end, I think this article and the corresponding code need to be amended for that part.
It is unclear whether we should include the format (and version) information before we do the mask evaluations or not.
What speaks for it is of course that it makes sense to reduce unwanted patterns rising from the combination of format version bits and data/ECC bit.
On the other hand, section 7.1, in the description of steps 6 and 7, makes it clear that the format and version info should be generated after a mask has been selected.
Further, section 7.7.2 says that "Modules positions for the format information and version information shall be left temporarily blank" (when placing the functional patterns). The unmasked symbol in Figure 23 (as an example of masking) is also without format information.
There is also no mentioning of how to place the format information in between the section on data placement and masking. The specifics regarding format information comes after masking.
All in all, I (currently, ha!) believe the format information can (should) be placed after mask evaluation.
One question regarding rule 3:
Does'nt Row 2 Coloumn 14 - 24 (counting from 0) contain the reverse pattern? (green box)
I wrote my own code for this Rule in Java and it found a total of 4 patterns including the 3 from your example. And when checking manually the area on the QR-code seems to fit the reverse pattern.
Or did I miss something like that we are only checking for either for the normal XOR the reversed pattern?
I think this is because the code in the tutorial has a bug. The upperbound of
columnIndex
androwIndex
:By example of
version
= 2, i.e.size
= 25, thecolumnIndex
loops from 0 to 13. The last 14-24 part will miss. Try changing the code tocolumnIndex <= size -11
orcolumnIndex < size - 10
, androwIndex
as well.After adjustment the penalty score will be 580, 535, 415, 615, 677, 579, 512, 819.
The code definitely has bugs. I hope I'll fine the patience to come back and fix everything.
Another problem is surely cause by my (probably) incorrect interpretation of the role of the "quiet zone", so there's also that.
Skipping column 6 (the vertical timing pattern) prodcues the same result as the example in the standard, but to be honest I can't really justify it from the rules in the standard. Apparently the standard author has taken the same approach, but I don't understand which rule that specifies that behaviour. In my view, we could as well have a single file going downwards the side of the vertical timing pattern. Just like we can have single files going up (only up?) an alignment pattern. Very frustrating.