Skip to main content

MythBusters: Arrays versus Collections

Code now available at https://github.com/rmaclean/ArrayFighter. If you wish to suggest a flaw or make a suggestion to improve it please create a pull request OR give me a detailed comment explaining what is needed, why it will help/hinder etc... ideally with links.

The myths I want to tackle are:

  • Arrays are not collections
  • Arrays are faster than collections

These are two statements I have heard recently and I initially felt are wrong, so to check I myself I decided to research this myth. I am assuming when people say collections they mean classes in System.Collections or System.Collections.Generics.

Arrays are not collections

In computer science, an array type is a data type that is meant to describe a collection of elements

Source: http://en.wikipedia.org/wiki/Array_data_type

So we have established that they are collections but how to they compare to say .NET collection classes like List<T>, ArrayList & LinkedList<T>?

List<T>

Official Documentation: http://msdn.microsoft.com/en-us/library/6sh2ey19.aspx

What it implements

The first aspect is what interfaces are implemented in each, and you can see below both implement the same interfaces that deal with collections and lists. Note that since List<T> is generic we have both generic & non-generic interfaces.

Where is the data?

Where does List>T? store data? In an internal field: _items which is just an array of T.

So List<T> is just an array? Yup! So how does it handle inserting new items? It increases the array size when needed by creating a new array and copying the pointers to the items into it.

Complexity of the operations

  • Add: O(1) operation. If the capacity needs to be increased to accommodate the new element, this method becomes an O(n) operation, where n is Count.
  • Insert: O(n) – where n is count
  • Remove: O(n) – where n is count

ArrayList

Official Documentation: http://msdn.microsoft.com/en-us/library/system.collections.arraylist.aspx

Let me warn you about ArrayList, if you are using .NET 2.0 or higher and you are using this class, I have the full right to smack you across the face with a glove. There is NO REASON TO USE THIS ANYMORE! Use List<T> instead, it is faster & safer.

What it implements

The first aspect is what interfaces are implemented in each, and you can see below both implement the same interfaces that deal with collections and lists.

  • Array implement: ICloneable, IList, ICollection, IEnumerable, IStructuralComparable, IStructuralEquatable
  • ArrayList implement: IList, ICollection, IEnumerable, ICloneable

Where is the data?

Where does ArrayList store data? In an internal field: _items which is just an object[].

So ArrayList is just an array? Yup! So how does it handle inserting new items? It increases the array size when needed by creating a new array and copying the pointers to the items into it.

Complexity of the operations

  • Add: O(1) operation. If the capacity needs to be increased to accommodate the new element, this method becomes an O(n) operation, where n is Count.
  • Insert: O(n) – where n is count
  • Remove: O(n) – where n is count

Did you copy and paste List<T> for ArrayList?

Yip – because they are virtually the same, in fact the only core difference is that the internal storage is object[] versus T[]. That has side effects, like the Add methods taking in T versus object but that is it!

LinkedList<T>

Official Documentation: http://msdn.microsoft.com/en-us/library/he2s3bh7.aspx

So now we are onto something really different. An Array (and as we have seen ArrayList & List<T>) assign a continuous block of memory (number of items * item size) and then put the items in there are the correct offset.

This has the benefit for reading, for example if I need item 6, all I do is work out 6 * item size + start of array in memory and boom I am at the item. However, the problem is when I run out of space – List<T> & ArrayList solve this by making new bigger arrays! The cost of this is very high indeed (two arrays, having to copy all the items from one array to another, cleanup of the first array).

LinkedList is different, it knows about the first item in the collection, and that item has a property to the next item in the collection and so on. This means that I can add infinitely and never have issues as all I do is get the last item, and set it’s property to point to the my new item. Inserts are also way faster compared to arrays – rather than having to move all items down (as Arrays must) we just need to adjust the properties of the item before & the item I am inserting. So it is awesome? Nope – Reading is way slower.

If I need to get to item 6? I need to go to item 1, then item 2 and so on – moving along the entire way. It is slow, VERY much so. However moving backwards may be faster! This is because the node has not only a property that shows the next item but also a property that shows the previous item. So if I need to move one back it is very quick and easy, compare to arrays where I need to go back to the start and apply some math. Performance of moving backwards will really depend on how many steps backwards & the size, so in some cases array may still be faster.

What it implements

The first aspect is what interfaces are implemented in each, and you can see below both implement the same interfaces that deal with collections but not lists. So it is a collection, in the .NET sense, but not a list. It also contains the generic and non-generic versions.

  • Array implement: ICloneable, IList, ICollection, IEnumerable, IStructuralComparable, IStructuralEquatable
  • LinkedList<T> implement: ICollection<T>, IEnumerable<T>, ICollection, IEnumerable, ISerializable, IDeserializationCallback

Where is the data?

LinkedList<T> stores the first item in the list in a field called head which is of type LinkedListNode<T> which points to the first node in the collection. That is it, thee rest of the items in the collection are just linked to the head.

Complexity of the operations

Myth Outcome?

BUSTED! Arrays are collections, in both CS & .NET definitions.

Arrays are faster than collections

imageSo let’s see if they are with some micro-benchmarks & the code, if you want to dispute, is below.

I ran 6 tests on Array, ArrayList, List<T>, LinkedList<T> & Dictionary<T,K>

The tests were:

  • If we know we have 1000 items ahead of time – optimise and insert 1000 ints.
  • Insert 1000 ints, but do not preallocate any space more than 4.
  • Read every item in an collection of 100 and validate it.
  • Get the 100th item and validate it.
  • Get the 100th item, and then the 99th item and validate it.
  • Get an item based on a key
  • Insert an item at position 0.

Results

9 July 2012: As I added a new test I re-tested everything (so numbers & chart has been updated) – however the winner did not change on the tests.

Inserting into a known size?

Winner: Array

No surprise here.

LinkedList disqualified because you can’t preallocate.

Disqualified

  • Dictionary: Needs a key – we only have values in this test.

Inserting into an unknown size?

Winner: List<T>.

Interesting is that I would’ve assumed LinkedList to be faster & ArrayList to be closer to List<T>.

In hindsight it makes sense that LinkedList is slower – it needs to find the last item first as we we are adding at the end.

Disqualified

  • Array: Disqualified since it can’t magically expand like the others.
  • Dictionary: Needs a key – we only have values in this test.

Read every item?

Winner: Array

Disqualified

  • Dictionary: Needs a key – we only have values in this test.

Get Item 100?

Winner: ArrayList.

Interesting here since I would’ve guessed Array but these are all so close it likely is just background noise interfering.

Disqualified

  • Dictionary: Needs a key – we only have values in this test.

Get Item 100 & 99

Winner: Array

This test was designed for LinkedList, so very surprised it didn’t win but these are all so close it likely is just background noise interfering

No surprise – it is perfect for this situation.

Disqualified

  • Dictionary: Needs a key – we only have values in this test.

Find with a key

Winner: Dictionary.

No surprise – it is perfect for this situation.

Insert At Zero

Winner: LinkedList<T>.

No surprise – it is perfect for this situation.

Disqualified

  • Array: No way to insert at zero.
  • Dictionary: Needs a key – we only have values in this test.

Raw Numbers

Test Array ArrayList List<T> LinkedList<T> Dictionary<T,K>
Known Size Insert Speed 0,0067952 0,0251845 0,0125625    
Unknown Size Insert Speed   0,0322676 0,0178365 0,0324339  
Read Every Item Forward 0,0081916 0,0135402 0,0124702 0,03132  
Get Item 100 0,0023351 0,0021007 0,0021014 0,0033034  
Get Item 100 Then Get Item99 0,0020214 0,0021889 0,002163 0,0032026  
Find Item With Key 0,0059856 0,0042258 0,0063673 0,0067397 0,0026851
Insert At Zero   0,0047753 0,0034568 0,0024224  

Myth Outcome?

BUSTED! Arrays are fast, taking half the awards for speed, but they are not universally the fastest (depends on situation) and also the other options are so close that performance is likely not the issue when considering which structure to choose.

Where do I start to learn Windows 8?

I have been asked at the talks I do, where can I go to learn Windows 8? What material is available? There is a LOT of content available for Windows 8. For me personally I learnt initially from the Windows 8 Camp in a box and building my own test apps. However the ever amazing Bruce Nicholson provided me with a fantastic list recently (so go thank him):

Presentation Dump - Mid 2012: .NET 4.5, Visual Studio, psake, JavaScript & Coded UI

With 2012 flying along it is time for the bi-annual presentation dump, where I upload most of the slides I have done in the last six months to share with you! I say most, as some presentations are NDA and those, unfortunately, I can’t share out – but where I can upload slides I do! I will also sometimes edit slides to exclude information that is NDA – so if you have questions let me know!

In this presentation dump we have:

  • What is new in .NET 4.5: This talk gives a big brain dump & a few demos of what is new & shiny in .NET 4.5. Focuses on the framework so it is mostly ignoring the language features (like async) & tooling features.
  • Visual Studio 2012 Preview: I gave this talk when it was still Visual Studio “11”, so you will see that a bit. Really covers what is new, improved and removed from the next release of Visual Studio. Definitely not a complete look, more a high light reel!
  • psake: A look at the psake tool and how it fits in the build/ci/deployment landscape.
  • JavaScript proven practises: One of my favourite talks so far this year – covers 12½ suggestions to improve the quality of your JavaScript! I am using this a lot as a reference guide when I build my code!
  • Automated UI Testing with Microsoft Technologies: A very interesting talk I gave about what Microsoft offers from a UI automation perspective for testers.

It seems like quieter period than ever before but it was not! In terms of number of unique slide shows that I can share it is very small as there is a fair amount of ones I couldn’t (even edited) and I also already uploaded two this year and finally a LOT of talks have been around Windows 8 and based on the camp in a box.

You can get all the slides and details by clicking “read more” below!

What is new in .NET 4.5

Visual Studio 2012 Preview

psake

JavaScript proven practises

Automated UI Testing with Microsoft Technologies

Windows Phone 8 Announcement - My cheat sheet of what was announced!

Windows Phone 8 Announced
Source: http://live.theverge.com/windows-phone-developer-summit-2012/

Out by Christmas (America fall really)

Same core as Windows 8 (i.e. desktop & phone are one platform!)
+ Hardware-wise, should mean a wider ranges of form factors, price points, and capabilities.
+ That means a bunch of stuff: kernel, networking, multimedia, driver support — that stuff will be shared with the two platforms.
+ IPv6
+ Improved Bluetooth
+ Hardware accelerated Direct3d
+ Manufacturers will be able to re-use the same hardware drivers they build for Windows 8 on Windows Phone 8.
+ "Dual-core & more...." CPU support

New resolutions
+ 800x480 (what we have today)
+ 1280x768
+ 1280x720
+ All 7.5 apps will run without changes on all resolutions!
+ Can optimise if you want to - OPTIONAL

Removable MicroSD support
+ Phones, music, videos & apps can all be put on the microSD

IE 10 built in
+ Same HTM rendering as desktop version

Native game development
+ DirectX is mentioned
+ I am reading that to be C++ support
+ Same drivers as Windows 8 so games can be ported quickly & easily

NFC support
+ Be able to share between devices
+ Will have a "mobile wallet"
+ Credit & debit cards
+ Loyalty & membership cards
+ Mobile wallet can work without NFC for some scenarios
+ All marketplace purchases now in the wallet
+ In App Purchases Supported :)
+ can do operator billing
+ Needs carrier "secure sim"
+ Tap & share (contact, files etc...) between devices built in!

Nokia mapping technology included in in Windows Phone 8
+ It will use NAVTEQ data, offline map support, map control for developers, and Turn-by-Turn directions.
+ Includes My Commute - remembers your daily route and shows alternatives that can improve performance

Enterprise
+ Encryption & Secure Boot
+ LOB app dev
+ Businesses can run custom secure app stores
+ Device management
+ Same tools we use today for Windows can be used to manage the phone!
+ Familiar office apps
+ I read that as normal office

New Menu UI
+ No more right hand side (apps list)
+ Just the main screen
+ Icons can be "normal", double-wide & small (one quater size)
+ Any tile can be any size
+ double wide for custom apps finally :)

Windows Phone Upgrade Story
+ 7.5 will get a bump to 7.8
+ 7.8 will include new start screen
+ Updates sent via Windows Update - BYPASSING OPERATORS!!
+ All devices to get 18 months of updates
+ Can register to get updates early too!
+ Source: http://www.winsupersite.com/article/windowsphone75/windows-phone-78-pre…
+ No 7.5 device to 8 though :(

Dev Features
+ Built in SQLite DB!
+ Speech platform will be open to all!
+ Can integrate any app into voice search
+ Can respond with voice prompts (can you say siri?)
+ Can develop in C#, VB, C++ or HTML 5!
+ Can mix & match

Built in VoIP technology
+ Skype fully integrated, looks & feels like a normal call
+ VOIP apps, not just Skype, can run in the background

Apps that use location can run in the background
+ Think map (direction apps!)

Camera
+ Self timer
+ Action shot for burst mode
+ Panorama
+ Group shot (take multiple photos and swop the faces - get rid of those closed eyes)

Who can see my tweets to a friend?

A while ago I wrote a post about an interesting Twitter behaviour – if you start a tweet with an @<username> only people who follow you & that person can see the tweet (if you unsure see the post which explains it). The question I had today was to find out who is in that list – or put who follows us both on Twitter?

Oddly, I could not find anything (besides some tools that cost money) to do this?! So I built my own awesome little tester you can use below:

First username
Second username

    For the developers among you who want to see how this all works? Check it out on bitbucket: https://bitbucket.org/rmaclean/twittercoalesce

    Known issues:

    • My website uses an older version of jQuery, and I use a newer one in this code. If the loading gives an error or gets stuck on the screen, try refreshing the page (seems to solve it).
    • If you have too many people shared between the two names - it will break. I am looking into how to solve this. If this happens - loading will get stuck :|
    • Twitter limits clients to 150 calls per hour. If the rate limit is exceeded the loading will get stuck.

    Lightswitch & the HTML Client - what does this mean?

    Lightswitch Recap

    For you to understand the rest of this post it is vital you have a high level understanding of Lightswitch and how it works. Lightswitch is a RAPID development platform from Microsoft that makes development of line of business (LOB) apps really easy. The team at Microsoft often talk about citizen developers – i.e. people who are not full time developers, but are the “IT guy” in the department or company that need to put together a great looking solution. The team also talk about no-code solutions – where you can build great systems without code.

    imageBoth statements from the team are true and false at the same time. Sure your accountant can build a CRM system with no code in Lightswitch, but Lightswitches true value is that it is a professional development tool, and in reality unless it is a really simple solution you will need a touch of code.

    What is great is that Lightswitch allows the citizen developer to write a system that can be matured by professional developers later on – it’s power is that it does not lock you into being too simple or too complex a development system.

    For me the value proposition is that you get REAL rapid development, that citizen developers can put together and extend solutions that are well architected and that when the need is there a professional developer can extend that solution and hand it back over to the citizen developer – it is the circle of Lightswitch.

    Architecture

    When you craft a (avoiding the development term here on purpose) Lightswitch create a multi-tier architecture, that is either two tier (client & database) or three tier (client, server & database). Two tier is really three tier but the server & client are just one package.

    The database can be any support by Lightswitch, the middle tier is OData and the front end is Silverlight. The choice of front end has recently hurt Lightswitch because Silverlight is dying. However if you step back for a second and think about it Lightswitch provides the easiest and fastest way to build a complete (and I mean complete, authentication, methods, proper designed) OData solution… you could always ignore the client portion and build on top of the OData server.

    Making a HTML Client

    The HTML client mode for Lightswitch is a recently announced new feature that allows you to build a client that runs in a browser, and not just Internet Explorer on Windows (Dynamics CRM I am looking at your shameful behaviour) but pretty much any browser, say on an iPad:

    Deployed app on ipad -1

    This is possible because of two things, the OData server which allows really any technology to connect to it, and the second piece of the Lightswitch system the LSML file.

    I hope you have never heard of the LSML file, as it is not a nice place to go to – it is a MASSIVE (even simple demo’s I build are thousands of lines) XML file that stores ALL of the Lightswitch system in a “Lightswitch domain language”. This enables the team to take that information, parse it and produce output based on it. So the concept of producing a parser that creates HTML rather than Silverlight is really simple… just build the parser.

    What do we know about this HTML client so far?

    It is early days, in fact there are no bits available yet, but we do know some things from the demo’s and screen shots that are available.

    • Multiple themes will be supported (their is a dark & a light at least) – thanks to the jQuery Mobile that powers it.
    • It is a separate client – so you will have a Silverlight experience and then also have the HTML experience added in.
    • It follows the true Lightswitch model of being easy to build with no code, but if you need that little extra, the JavaScript can be edited.

    Light theme - 2Dark theme -  2imageCustomizing UI with JavaScript

    The Important Two Issues

    To wrap this up it is a very exciting point in time for the Lightswitch world with so much happening that I think it is important to take a step back and find a few key aspects about this amazing feature that will help position it. There are two that really stand out to me from all the announcements:

    Separate Client

    This is not a Silverlight to HTML generator – it is separate. This means that awesome Silverlight chart you use today will not magically work in the HTML client. This has both advantages and disadvantages, but if you think about the dying of Silverlight I am very glad that they have a whole new growth path.

    It also allows for the real scenario of supporting a rich experience in Silverlight in a company (where we control all the machines and know we can run Silverlight for a long time still) and having a mobile or companion experience in HTML for those people on the road. Sure they do not get the great sales forecast chart but they can still capture their sales on their iPad.

    Web Developers

    A recent did an survey of app developers looked at what they are building today, what they were building and what they intend to build in the future (future = one year in this survey). Interestingly there are only TWO platforms that are getting growth in the future? HTML & Windows Phone. Android, iPhone and many others are all expected to decline.

    If you think about those numbers and add in the MASSIVE investments in HTML development that are in Windows 8, it should not surprise you that web development is a MAJOR area in the future of all developers. It also means that web developers can start to have way more opportunities in the market outside of building websites & portals, and that is very exciting as that little garage web designer company today could be a major line of business developer in a few years.

    Visual Studio Extension Manager Error 417

    imageWith Visual Studio 2012 there is an increase importance placed on the Extension Manager component, not only does it provide a great integrated experience to the Visual Studio gallery for downloading and updating extensions but it will also be used to delivery updates for Visual Studio itself!

    At work it would constantly fail to work with error 417 – expectation failed, so working with our facilities team we were able to identify the problem as an issue with the proxy server we use, Squid.

    Squid seems unable to handle the HTTP status code 100, and will then fail with the error 417. To solve this you simple need to add the following to your squid.conf file: ignore_expect_100 on

    Windows 8 Boot Camp: Johannesburg 24 May

    imageYesterday Rudi Grobler & I had awesome fun with a full room of amazing people who took time off work to attend a full day of free Windows 8 training. The audience was amazing, breaking a lot of my expectations of how audiences react at free events, which really honoured Rudi & I to have most people stay to the very end of the day.

    5b

    For those people who attended the training, or those who didn’t but want the content too:

    File attachments
    Windows 8 Demos (551.64 KB)

    .NET 4.5 Baby Steps, Part 8: AppDomain wide culture settings

    Other posts in this series can be found on the Series Index Page

    Introduction

    Culture settings in .NET are a very important but often ignored part of development, they define how numbers, dates and currencies are displayed and parsed and you application can easily break when it is exposed to a new culture.

    In .NET 4, we could do three things:

    • Ignore it
    • Set it manually everywhere
    • If we were using threads, we could set the thread culture.

    So lets see how that works:

    // uses the system settings
    Console.WriteLine(string.Format("1) {0:C}", 182.23));
    
    // uses the provided culture
    Console.WriteLine(string.Format(CultureInfo.InvariantCulture, "2) {0:C}", 182.23));
    
    // spin up a thread - uses system settings
    new Thread(() =>
        {
            Console.WriteLine(string.Format("3) {0:C}", 182.23));
        }).Start();
    
    // spin up a thread - uses thread settings
    var t = new Thread(() =>
    {
        Console.WriteLine(string.Format("4) {0:C}", 182.23));
    });
    
    t.CurrentCulture = new CultureInfo("en-us");
    
    t.Start();
    
    Console.ReadLine();
    

    CropperCapture[2]

    You can see in the capture above lines 1 & 3 use the South African culture settings it gets from the operating system. What if I want to force say an American culture globally? There was no way before .NET 4.5 to do that.

    What .NET 4.5 adds?

    By merely adding one line to the top of the project, all threads, including the one we are in get the same culture:

    CultureInfo.DefaultThreadCurrentCulture = new CultureInfo("en-us");
    

    CropperCapture[1]

    File attachments
    Culture Info Demo (33.41 KB)

    .NET 4.5 Baby Steps, Part 7: Regular Expression Timeouts

    Other posts in this series can be found on the Series Index Page

    Introduction

    While the regular expression passing in .NET is damn fast, there are times where it can take too long for your needs. Until now there hasn’t been much you can do but wait. In .NET 4.5 we get the ability to timeout regular expressions if they took too long.

    Problem

    So lets look at a really silly example to start off with, checking a string fifty million characters (where only one is different) against regular expression which is looking for fifty million letters. As I said it is silly, but to get a truly slow reg ex is pretty hard.

    static Regex match = new Regex(@"\w{50000000}", RegexOptions.None);
    static void Main(string[] args)
    {
        var sw = Stopwatch.StartNew();
        Console.WriteLine(match.IsMatch(String.Empty.PadRight(49999999, 'a') + "!"));
        sw.Stop();
        Console.WriteLine(sw.Elapsed);
        Console.ReadLine();
    }
    

    This 13.5secs on my machine!

    Solution

    imageAll we need to do to take advantage of the new timeouts is modify the constructor of the Regex, by adding a third parameter.

    static Regex match = new Regex(@"\w{50 000 000}", RegexOptions.None, TimeSpan.FromSeconds(5));
    

    Now after five seconds a RegexMatchTimeoutException is raised.

    File attachments
    RegEx timeout demo (32.73 KB)