Explain what is DB indexing like I'm five

#discuss #explainlikeimfive

Top comments (9)

You like to play with your toys all around the house, and you have a big house. PS: now the parents regret buying all these toys, but is too late.

But how do you find a specific toy when you need it? Lucky for you, all the toys have GPS locators in them.

You have a smart watch (we are in the future of course), and you can talk with it. You can ask it to find your toys only using 2 criterias

type (puzzle, car, ball etc)
color

You can say to your watch:

where is the blue car? The smart watch will respond instantaneous "near the living room TV"

Your watch is smart, he knows where every toy is by using a simple technique, when a toy is moved in the house, he updates it's location.

Every time

your parents buy a toy (INSERT) - he adds to its memory with 2 labels (type and color)
a toy is repainted (UPDATE) - the "color tag" is updated
a toy is trashed (DELETE) - it is removed from the memory

Each toy has an unique identifier (like the serial number from the toy factory) that the smart watch needs to know in order to identify it.
The watch keeps this connections (ID - type, color, position) in it's memory and persistent storage.

smart watch - the database and database index
toys - records in a table database
type and color - columns that have index
searching for a toy - SELECT queries USING at least an index
a toy is moved in the house - the database storage system may move the data stored on the DISK around, or if is shard on multiple servers.

Extra: each toy can have it's own unique nickname, and you can search by it.
That would be an UNIQUE index, means that 1 nickname cannot be used for 2 toys.

Kate Komar • Oct 24 '17

That's gorgeous, thanks

Daniel J. Summers • Oct 25 '17

Yep - I was going to go back to my kids, and their Hot Wheels/Matchbox car cases where they could see the side of each car. This is better than that. :)

Ben Sinclair • Oct 30 '17

This doesn't really explain indexing. It's a prerequisite for it, though.

Adrian B.G. • Oct 30 '17

True, it explains more of what is it how is it used more than how it is implemented, but they are 5y old so ....

Like the index, the smart watch keeps a copy of the data (toys characteristics) in a way that is efficiently to search for a specific value.

We could add the there are many types of smart watches, you have to choose it depending on the toys you have and the queries you want to do.

Girish Budhwani • Oct 29 '17 • Edited

Imagine it’s 1930s and you are among the first few to buy a home telephone. Some of your rich friends have it and you keep a book to track their numbers as Telephone directory did not exist. You write the entries in the order they were introduced.

John 7354
Anna 3211
Natalie 2345
Bob 8739
.. ..
.. ..

Yeah I know… numbers were that short! In few years your list has grown to more than 100 and it has become painful to find the number. Then you think of rewriting the entries in alphabetical order.

Anna 3211
Bob 8739
John 7354
Natalie 2345
… ..
… ..

Suddenly, finding a number has become easier. Guess what you did there? You indexed the entries by name!!

Certainly, rewriting is not an efficient way to do indexing in the computer world. But the crux is we do indexing (on a column, in this case “name”) in some form so to make the search faster.

emptyother • Oct 24 '17

Every person in my country is registered in a database. This database is just a long list, sorted by a National ID. It's sequential, and starts with the persons birth date. But the NID isn't something other people usually know, and so everyone who wanted to call someone would need to look trough the entire list to find a phone number.

The "Telephone directory" is an index of all these people, but sorted by surname instead of national ID. Whenever someone gets added to the first list, they also gets added to the phone directory and sorted.

If you want to select a long list of people where their surname starts with "V", you can find them a lot faster by just selecting everything between page "V" and page "W". But if you want to find every person that was born in 2015, you would have to check the first list instead. And if you wanted to create a combined list of people which surname starts with "V" and was born in 2015, you would either have to make a new index or manually search trough the shortest of these two lists.