Do you have ranked, sorted or ordered data and are wondering how to save it in a database? That’s a common problem where the most common approaches are to use either an indexed or linked list.

I’ll briefly go through the different approaches and will list the pros/cons. Then I’ll introduce a hybrid approach that mitigates most of the cons.

I haven’t posted code this time, but let me know if you need a sample demonstrating the different approaches, I’d be glad to help you out.

The operations that I’m going to evaluate for each method are:

  • Add (at the end)
  • Insert (in the middle)
  • Delete
  • Read a sorted list
  • Read a filtered sorted list
  • Move an item

Indexed list

In this approach, you are saving the position of the item based on the order. The ID of the item never changes, but you can update the index to match the order that you want.

Sample data

ID Data Index
1 xxx 1
2 xxx 2
xxx xxx xxx
100000 xxx 100000
100001 xxx 100001

Evaluation

Add Easy 1 operation
Insert Hard You need to re-index all the items after the inserted one
Delete Easy 1 operation
Read a sorted list Easy 1 operation
Read a filtered sorted list Easy 1 operation
Move an item Hard You need to update the index and re-index all the items that are after the one that you moved

This is a relatively easy implementation and it’s widely used. If you are mostly adding items and not reordering a lot that’s an “ok” solution or if the database is not very big and allows easy re-indexing. For example, in SQL it can be as easy as:
UPDATE Table SET Index = Index + 1 WHERE Index > 500
However, in NoSQL databases, this will be an expensive operation, as you would have to update every document.

A common way to mitigate the re-indexing problem is to leave a “gap” between the indexes. For example, every index is by 1024 greater than the previous one.

ID Data Index
1 xxx 1024
2 xxx 2048
xxx xxx xxx
100000 xxx 102400000
100001 xxx 102401024

Linked List

The main benefit of a linked list is that the items can easily be inserted or removed without reallocation or reorganization of the entire structure. There is detailed information in Wikipedia

Sample data

ID Data NextId
1 xxx 2
2 xxx 3
xxx xxx xxx
100000 xxx 100001
100001 xxx LAST

Evaluation

Add Easy 2 operations (insert the new item and update the last one to point to the new one)
Insert Easy 2 operations (very similar to add, with the only difference that the new item should point to the item where it’s inserted)
Delete Easy 2 operations (delete the item and update the previous to point to the next one)
Read a sorted list Hard You need to get the items one by one to be able to recreate the list in the order
Read a filtered sorted list Very hard You need to get the items one by one while applying a filter to be able to recreate the list in the order
Move an item Easy 4 operations (combination of Delete and Insert)



A little bit harder of an implementation, but just as straightforward.
Linked List is a good solution if you don’t have a lot of data, but you reorder, insert, and move items often.
Another scenario could be if you are reading the items one by one or in small batches.

Hybrid approach

In this approach, I’ll combine the previous two. Additionally, to optimize the indexing I’ll use a decimal instead of an integer. That will improve the Insert operation, as there is theoretically always a number between any two numbers. Of course, that approach has limitations and you’ll need to see what they are in your database. However, re-indexing would not have to be done as often.
Keeping a pointer to the next item will help with both finding the position of insertion and re-indexing later on.

Sample data

ID Data Index NextId
1 xxx 1 2
2 xxx 2 3
xxx xxx xxx xxx
100000 xxx 100000 100001
100001 xxx 100001 LAST

Evaluation

Add Easy 2 operations (insert the new item and update the last one to point to the new one)
Insert Medium 2 operations (very similar to add, but in addition the new item should point to the item where it’s inserted)
Delete Easy 2 operations (delete the item and update the previous to point to the next one)
Read a sorted list Easy 1 operation
Read a filtered sorted list Easy 1 operation
Move an item Medium 4 operations (combination of delete and insert)

This is a little bit harder because you need to keep both index and link in sync. However, it’s a good compromise if you need to read, write, AND reorder fast
There are different ways to optimize reordering even further, similar to how you’d manage the indexes in a hash table. For example to evenly distribute the gaps only between the closest items.

Sample data

ID Data Index NextId
1 xxx 1 2
2 xxx 1.5 3
3 xxx 2 4
xxx xxx xxx xxx
100000 xxx 100000 100001
100001 xxx 100001 LAST

Let me know if you have any questions or comments on how I can improve the blog.

Thanks,
George

Categories: NoSQLQuick tipsSQL

Leave a Reply

Your email address will not be published. Required fields are marked *