In databases, what’s an index and how/why does it improve performance? |
||
Imagine that you have a database table called Employee with 2 columns – Employee_Name and Employee_Age – and millions of rows, and suppose that the people in this table have very unique names. Now, what would happen if you want to find out the Age of an employee named ‘Jesus’? Well, once you run the query the database software would have to search through the entire table (by doing what’s called a full table scan) to find any employees with the name ‘Jesus’. But, if we have an index for that table, then the search for the employee name will be done a lot faster. So, what’s an index? Well, it’s basically a data structure (usually a binary tree) that stores the values for a given column or columns. By using a sorted data structure to store the database values, looking up those values becomes much faster. Here’s what the SQL would look like to create an index on the Employee_Name column:
We could also create an index on both of the columns in the Employee, as shown in this SQL:
A good analogy is to think of the database index like the index in a book. Rather than search through the entire book for a particular subject, you look at the index because it contains all the different subject titles of the book in a sorted format, and it gives you the page number. Obviously this is much faster than scanning the entire book. Similarly, as a book index contains a page #, a database index contains a pointer to the row containing the value that you are looking for in your sql. So, what’s the cost of having a database index? Well, for one thing it takes up space – and the larger your table, the larger your index. Also, whenever you add, delete, or update rows in your table, the same operations will have to be done to your index. Remember that an index needs to contain the same data as whatever is in your actual table. It’s worth noting that indexes are invisible to users, and that SQL has no syntax for referring to an index in a query. As a general rule, an index should only be created on a table if the data in the indexed column will be queried frequently. |
Subscribe to our newsletter for more free interview questions.
Follow @programmerintvwVaroon Sahgal, Author of ProgrammerInterviewon

