This post is one in a series about stuff formally trained programmers know – the rest of the series can be found here.
In the previous post on Array, we saw that all read operations are Θ(1), which is awesome. An important reality of programming is everything is a trade off, so when you get fast reads with Array adding items when you don't know the collection size is expensive.
Lets say you create an array of ints, named X, and set the length to 5 (currently that is using 20 bytes). Now we want to add a 6th item, so the solution is to create a second array, named Y, with a larger length. If we just want to handle one more, it means Y is now taking up 24 bytes of memory. Then we need to do a bunch of copy operations as we copy items from X to Y, which is really slow. By the end of the process, just adding one item was really expensive.
The solution is to change the way we store the data structure in memory. With a Linked List which each value is wrapped with metadata and stored separately in memory (compared to an Array which stores all values in a single continuous block of memory). The reason each item is wrapped, is that it then gets pointer to the next item in the collection, so that you can still navigate through the collection.
The big advantage to Linked List is that since the values can go anywhere in memory the collection can be expanded indefinitely until you run out of memory for very little cost, either Θ(n) or Θ(1). The difference is if the collection implementation keeps a pointer to the final item in or not; if it does not then it needs to navigate through each item, Θ(n), and if it knows the location of the last item then it just needs just go directly to it and set its pointer to the next item.
Removing and reordering items is also much faster than an array since you just need to find the items before/after and change where their pointers point to.
What is the downside then? Navigation through the collection is slower than an array. For example if we create an integer array and I want to access the fifth item can be done with simple math:
(start of array in memory) + (int size in memory * offset) - that will give us the location of the integer value we want to read, basically an Θ(1) operation.
With Linked List though, I need to ask the first item where the second is; then ask the second where the third is; then ask the third where the forth is; then ask the forth where the fifth is. So Θ(n) operation for reading.
Linked lists also use more memory since you aren't just storing values, we are storing the values and one or two pointers with each value. This is marginal when storing types without a constant size, like a class since an array then needs to store the pointers to the values, but it is worth remembering.
The interesting thing about linked list compared to array is that it is very flexible in its implementation. The simplest version is to just have a pointer to the first item and each item in the collection needs to point the next item. This is known as a singly linked list, as each item is linked to one other.
The linked list may also store a pointer to the last item to make adding faster.
Most common implementations though use a doubly linked list where each item in the collection not only points to the next item in the collection, but also points to the previous item in the collection. At the trade off of memory (for the extra pointer) and potentially more expensive operations (like a insert now impacts two items and not just one) you gain the ability to navigate forwards and in reverse.
Java has a doubly linked list implementation with LinkedList