06 Aug 2013

GitHub: why it is right for open source and wrong for you

For most of the software development projects in the world, GitHub is the wrong choice for their hosting needs. This is because most of the world is NOT developing open source software, so the GitHub pricing model fights against you: it is designed to charge the daylights out of those who do not develop open source software.

To help me explain this – let us look at two examples.

An open source example

We will use Git itself on GitHub as our example for open source. Here is a project that is open source and makes use of three public repositories. Looking at the git repository, there have been 702 contributors to it.

When you look at the GitHub pricing model, this could be run on any plan, including the free one. The only constraint on the free plan is that you get no private repositories; you can have unlimited public repositories and unlimited contributors, which makes it perfect for open source.

When you apply the recommendation I mentioned in yesterday’s post about lightweight repositories to this model (with the free unlimited public repositories), it makes even more sense for open source projects.

A corporate example

The other side of the coin is a company that is doing consulting or maybe in-house development work that needs to be stored somewhere and (for whatever reason) that work cannot be open sourced. Thus you will need private repositories & you will need a lot of them to follow the lightweight repository principles. Even if you do not intend to follow those principles, you may find that splitting up your repositories in an organisational way still creates a lot of repositories.

For example: I worked at a small company (< 30 developers) that would deploy and develop on top of Microsoft CRM & SharePoint. There we split things based on customer (so each customer got a repository) and after 3 years we had almost 200 repositories! For them to use GitHub by simply moving over would mean a custom price negotiated with GitHub, as GitHub doesn’t offer standard pricing at that level. If they changed their structure to properly follow the lightweight repository principle, they could easily have tripled the repository count!

So if not GitHub, then who?

In a company, you need to view the issue from the other way around – you need loads of repositories. However you also have another difference to open source: you have limits on something that open source cannot, and really does not want to, have limited: contributors. In a company you really are only going to have staff or people who are working on the project contributing – so limiting the number of people who are involved is useful. It also gives you an easy way to cost the hosting, because it really boils down to per head costing.

So who offers Git hosting with unlimited private repositories and charges per head? There are two main players in this space; I have used both and found both to be great solutions. BTW I also use GitHub & think it is great too – but more for collaboration in open source.

BitBucket

BitBucket, which started off life as the Mercurial hosting to GitHub’s Git hosting, nowadays offers Git too. Their model is pretty simple: unlimited private repositories and basically a cost of $1 per user per month. It is that simple.
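To make the per head costing concrete, here is a minimal Python sketch. It uses the rough $1 per user per month figure quoted for BitBucket; the function name and the team size of 25 are made up for illustration, and real pricing tiers will differ.

```python
def monthly_hosting_cost(users, price_per_user=1.0):
    """Per head hosting cost: depends on head count only, never on repository count."""
    if users < 0:
        raise ValueError("users cannot be negative")
    return users * price_per_user


team_size = 25  # hypothetical team size, not a figure from this post

for repositories in (10, 200, 600):
    # The repository count is deliberately not an input to the cost.
    cost = monthly_hosting_cost(team_size)
    print(f"{repositories} repositories, {team_size} users: ${cost:.2f} per month")
```

The point of the sketch is that the bill stays flat no matter how many lightweight repositories you create.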

TF-Service

Who expected Microsoft to understand Git hosting so well?! They have understood it marvellously with their TF-Service offering, which offers unlimited private Git repositories. TF-Service is currently free, but it won’t stay that way and my gut says Microsoft can’t compete with BitBucket’s $1 per user cost.

The reason it can’t compete with BitBucket’s pricing is that TF-Service offers more than code hosting. Microsoft offers build servers, work item tracking and more! Plus if you have Visual Studio Premium or Ultimate then you also have an MSDN subscription, which means TF-Service will be covered by that subscription and so it will be free.

Summary

This post is not about knocking GitHub – it is an amazing service & has changed the way the world interacts with code for the better. What this is meant to do is show you that even though it is popular, it may not be the best solution for you!

The REAL number one reason to avoid GitHub

While this discussion is about the pricing model (which is often ignored when choosing a platform), I do need to say what the number one blocker for GitHub is. This is also the number one blocker for TF-Service & BitBucket. Simply put it is my customers. Many of my customers will not give permission for me to host outside the company and in a foreign country and thus we need to run our own infrastructure to accommodate those customers.

05 Aug 2013

Git: A happy repository is a lightweight repository

I have had the joy of working with some great teams and learning how they use Git, but I have also seen it misused, and the number one issue I have helped teams resolve when using Git is what I call the Fat Repository. The fat repository (not to be confused with the funny & NSFW Fat Git Enterprises) is an especially easy trap to fall into for teams who have come from the client/server source control world (SVN, TFS, CVS etc…) because in their world a fat repository isn’t a harmful thing and may even be a good thing – but in the DVCS world the fat repository is a major problem.

Fat Repository

A fat repository can have one of the following two characteristics, or it can have both of them. I think both of these characteristics will be familiar to most developers who have used some sort of source control for any period of time.

Characteristic One: Multi-tasked Repository

The multi-tasked repository is where you have a single repository which has multiple root folders and each folder is for a different customer or project. This is very common with internal focused items. For example a consulting company may develop a set of common libraries that are used in a variety of projects and put all the common libraries code in a single repository, so they are easily accessible for everyone in the company. Another example would be a department in a company, which has all the projects that single department has developed in a single repository.

Many people, myself included, have multiple Visual Studio projects inside a single repository and that is no problem – where it becomes a fat repository is where those items/folders/projects are not related to each other from a delivery perspective. Often those projects are only lumped together because they share some common organisational relationship.

The reason this is called multi-tasked repository, is because the single repository is responsible for multiple unrelated deliverables.

Characteristic Two: Multi-focused repository

A repository that has not fallen into the multi-tasked trap can still become fat. This often happens when a single repository is used to handle many requirements beyond the code of a single deliverable. A common example of this characteristic is when you have a documents folder or database backup folder in your code repository. Here the core issue is that while everything is related to the software development, not everything relates to the building of the code.

This is named a multi-focused repository because the focus is split between specs, database backups, virtual machines and the code (for example). I find this issue hits not only repositories, but tools like DropBox where you get a folder shared because you need one document and you end up syncing 100’s of other documents, backups, install files and what-nots that you don’t care about too!

Fat repository – Summary

In short a fat repository is one where there is more going on in a repository than just code & just the code that is needed for a single project or deliverable.

There is a simple test to identify a fat repository. Just ask: is everything in here vital to my deliverable? If there are items that are not vital – you have a fat repository.

Client/Server and the fat repository

In the client/server world of source control, like SVN or TFS, this isn’t actually an issue and it is very common to find fat repositories. In fact I would say if you are using a client/server source control – having a fat repository is an advantage since there is a single place to go to get everything you need.

The reason this isn’t an issue in the client/server world, like we will find out in the DVCS world, is because of two key differences:

  1. In client/server you have only a working set, not the entire repo. So you initially need to pull a smaller set down to your machine, compared to pulling down the working set & all the history in DVCS.
  2. In client/server, because you are working with a server, you can check out a single sub-folder or even a single file without needing the rest of the files in the repository.

DVCS and the fat repository

The DVCS world of Mercurial & Git works completely differently to the client/server world. Firstly the repository is comparatively cheap to create – just type git init and voilà, a repo. In client/server you need a server admin to create one. This relative cheapness encourages you to create loads of repositories – some may last & get a remote version on a server and some may live short lives just on your machine – that is perfectly fine.

Secondly, when you grab a repository for a team you not only get the one folder you want in a working set version – you get EVERYTHING. Every folder, every file & ALL the history for it.

This is where it becomes painful to deal with a multi-tasked repository or a multi-focused repository, because you are forced to pull down content that you do not care about.

Issues a fat repository causes

Initial check out times that are ridiculously long. Most of the time this is a once-off pain, but it is still a pain. I say most of the time because I went to help a team that had each deliverable in a separate multi-focused repository. While the working sets were small and the team members had no issue with the one long initial check out, the senior developers who floated between the teams to help out found that pulling the repositories down was ruining their productivity, especially if they were just trying to help with a single bug. This team’s repositories were so huge that most people couldn’t keep more than two around, even though the working sets were less than a gig.

Disk space usage. Your developers want an SSD, and while it will make them more productive, it also means that space is at a premium or you will be forking out a lot for those massive SSDs. With DVCS, they get everything in the repository including all the history. Check in a 1GB database backup, then delete it… in the client/server world the working set wouldn’t be impacted, but in the DVCS world everyone will pull that 1GB file down initially and that means a lot more disk space usage.

Solution: The Lightweight repository

So when using a DVCS, the solution is simple:

  1. Keep the repository as lightweight as possible – only check in the things you NEED. This means taking care about what is checked in & using functionality like .gitignore & using staging to ensure you only check in what you should.
  2. Since DVCS repositories are cheap, have loads of them – have one for documentation, have one for code, have one for artwork etc… That way those who need something can cherry pick the repositories they need.
  3. Avoid the Law of the instrument – just because you can put things in repos doesn’t mean you should. Do you need the history on every DB backup? Not likely – a file share or DropBox may be enough for those! Or why not use a tool that is better suited to Word documents, with the ability to index their contents so that they can be searched, for example SharePoint. TFS does this with the full version – where every repository gets a SharePoint site to put the documents in.
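One way to take care about what gets checked in is to look for large files that are probably not part of the deliverable before you commit them. The following Python sketch walks a folder and lists the files over a size limit. The 10 MB threshold and the function name are my assumptions for illustration, not a Git rule, so tune them to your project.

```python
import os


def find_heavy_files(root, limit_bytes=10 * 1024 * 1024):
    """Return (relative_path, size_in_bytes) for files under root bigger than limit_bytes."""
    heavy = []
    for folder, subfolders, files in os.walk(root):
        # Do not descend into Git's own metadata folder.
        if ".git" in subfolders:
            subfolders.remove(".git")
        for name in files:
            path = os.path.join(folder, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                # Unreadable file or broken link: skip it rather than crash.
                continue
            if size > limit_bytes:
                heavy.append((os.path.relpath(path, root), size))
    # Biggest offenders first.
    return sorted(heavy, key=lambda item: item[1], reverse=True)
```

Calling find_heavy_files(".") from the root of a working folder gives you a list of candidates to move to a file share, or to add to .gitignore, before they ever land in the history.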

In short a happy repository is a lightweight repository!

30 Jul 2013

An Aphelion

Aphelion is the point in the orbit of a planet or comet where it is farthest from the Sun.

-- Wikipedia

Tomorrow, July 31st will mark an aphelion in my career – I am leaving BBD today after 5 fantastic years in the ATC team.

BBD

This was not a simple decision to make; BBD is not only an amazing company but some of my best friends work there. The ATC team has enabled me to do a very special job where I did desk research, guidance to many customers & teams within BBD, key customer engagements & promotion of the company – all of that was amazingly fun and grew me in the most wondrous ways (including growing my waist - damn you free vending machines).

So if everything is awesome there, why leave? Because I have a unique and amazing opportunity to build something on a scale that being in a team that is internally focused would not ever enable me to have.

Aphelion

Where am I heading? Aphelion (see what I did there – loads of jokes that you didn’t get until the end, now re-read this from the start & laugh heartily with me… sigh, that joke isn’t as funny as I hoped, and doesn’t make much sense). I’ll talk more about the company & work I am involved in at a later date.

How does this change my involvement in x?

Where x is

It doesn’t – that all remains the same! In fact the people at Aphelion are regular speakers at events, run their own user groups & there is already an MVP working there, so I expect to feel very much at home there!

25 Jul 2013

The ungrateful behaviour award

This week’s award for most ungrateful behaviour goes to Greg Young, who seems to think that being a speaker at an event makes him special. The story is that Greg went to TechEd North America as a speaker, with multiple talks to present. His wife joined him on the day of his first talk, but the organisers wouldn't let her into his talk (to take photos of him) for free. He then cancelled all his talks at TechEd NA and also cancelled his talk at TechEd Europe, in response to the on-the-ground staff not accommodating him.

All that is right

I FULLY agree with him that the TechEd NA people on the ground could've & should've handled it way better (I would've just let his wife in) and his suggestions on how they could've done it better (find his wife and take her to the speakers room/Starbucks for example) are totally correct.

As he said in his post, this was the last in a bad experience for him as a speaker:

  • Problems with flights (costs & times)
  • Booking for other events that didn't take into account flights etc...

And so on, and you know what? HE IS RIGHT, that was a bad experience, which no one (speaker, delegate or staff) should have to deal with. It seems that TechED NA was badly organised and I believe he is correct in his view that it is caused by it having many parts working independently and those parts not having authority to make things happen without 900 billion meters of red tape. His first three points of issues are spot on and ALL conferences should learn from him.

All that is wrong

If I agree so much with him, why do I have an issue with his behaviour? It is four points:

  1. Greg says "Speakers are not commodities, they are people who are giving much to help the conference" - I am sorry Greg, but speakers are commodities. We are there to help the conference succeed & serve the audience. Nothing is more important than serving the audience & cancelling your talk shows that you think your situation is more important than that. And if you are not willing to do that, tough - you will be replaced... just like a commodity.
  2. Speaking at an event like TechED NA, plus talking on Channel 9 is an honour & should be approached as such. There are hundreds if not thousands that would put up with a lot more bad experiences to have that opportunity. Wasting it is a slap in the face to the many people that could've & would've done the talk.
  3. Being a speaker is about planning - talks that fail are mostly due to lack of planning. Greg has a responsibility to ensure his planning, which includes where his family will be during his talk, is done ahead of time with the right people & not with the people on the ground at the last minute. Planning is a cornerstone of professional speaking. I am sure someone involved knew his wife was coming (they booked her flights) but that doesn’t mean everyone involved knew (a conference like this is massive, with hundreds of moving parts). I can also see the staff making the assumption that his wife joined him in the city but wouldn’t attend the conference. At the end of the day it is his responsibility to plan with the right people ahead of time to make sure everything works out.
  4. Now, the most important point of this entire post: Let's assume his only choice was to cancel that talk, as he had to do what is right for his wife and go and tell her (leaving her alone for 2 hours would be rough). I would not see that cancelled talk as an issue, and I would not have written this. The single biggest reason I believe his behaviour is award-winningly bad is that he cancelled the rest of his talks too! There is no reason to do that other than to stamp his authority and his view of being special. He has even cancelled his talk at a different Microsoft conference?! Who is that hurting? The people that are measured on the success of TechEd NA & the staff of TechEd Europe are different! That behaviour is unprofessional and deserving of an award. Cancel the one talk, sort things out with the wife & then go and serve the audience - that is what he should’ve done.

Final Thoughts

I speak at Microsoft events all the time but I have never spoken at TechEd NA. I have never met Greg and only know him from a few tweets, his amazing CodeBetter posts and this story and I think he must be amazingly bright if he gets the chance to speak at TechEd. I am not suggesting otherwise, I am talking about his behaviour in this situation. I am not even suggesting this is his normal behaviour, it may just be him grumpy & irrational from the jetlag & lack of sleep he mentions at the start of the blog post, but in that case rather admit you were wrong & don’t post a blog post like that.

24 Jul 2013

Windows Store App Logo Pack

When I develop a Windows Store app, one of the core things I need to worry about is all the logos, splash screens, file icons & lock screen badges. It can be a pain to create these every time – so I have created a logo pack which provides you with ALL thirty three (33) of the images you need.

All of these images (except 3, which are for badges) have a bright pink background, making it easy to see which of your logos are still the original placeholders compared to those you have updated. They also carry their details (type, dimensions & scale) on them, so you can easily see when one is being used. The three that are for badges are monochromatic, as that is a requirement for the badge logos.

The idea is that you will use them as placeholders for your real logos, and also as a set you can provide to your designer or graphic artist, who now has a base to work with and doesn’t need to figure it out themselves.

The usage of the logos is totally free and there are no rights on their usage, so GO WILD! You can download the logo pack below the video, which gives you a quick tutorial on how to use these.

 


Attachment: Windows Store App Logo Pack (80.42 KB)
11 Jul 2013

Amazing Lock Screen + Windows 8.1

When the Windows 8.1 Preview came out a few weeks ago I was in no place to test it, because I needed my machine stable for a bunch of conferences I was speaking at. However that didn’t stop a lot of amazing users of the Amazing Lock Screen app, trying it out & finding it did not work!

Since then I have gotten VS 2013 & Windows 8.1 and started to dig into the issue, which has confused me for DAYS on end – but yesterday I found out that Microsoft has acknowledged there is a bug in lock screens with Windows 8.1. It is this bug that directly breaks Amazing Lock Screen’s ability to change the lock screen.

Unfortunately until Microsoft issues a patch or the RTM comes out there is very little I can do to correct the issue. This bug is NOT stopping development on the Amazing Lock Screen though & the next release will be the biggest release in a long time with a couple of brand new features in it!

In summary: There is a bug in the OS, Microsoft will fix the bug, the app is still being worked on, is not forgotten & will work on Windows 8.1 in the near future!

10 Jul 2013

Netflorist & the plain text problem

Two years ago I used Netflorist to buy some flowers, but first I needed to log in. I had forgotten my password, so I used the "Forgot Password" option.

In the email I got was my actual password - which shows a MASSIVE problem in the design of the system that Netflorist uses. The password is either:

  1. Stored in plain text
  2. Encrypted

Why am I talking about it now? Because after two years, Netflorist has not fixed it! They have had the time to fix it, so let’s talk about it & if this info helps some horrible person hack them (& I am not suggesting people do that) then tough for them.

What is plain text?

This is plain text - it is just the text. Why is this a problem for passwords? The reason is that if someone gets access to the database (physically, remotely, via hack, restoring a backup etc...) they can see ALL the passwords.

This has low risk for Netflorist since credit card details are not stored but this has a MASSIVE risk for Netflorist customers.

The sad truth is that most people are lazy & reuse the same password across multiple websites, which means the details on Netflorist can be used to commit fraud & theft elsewhere.

Scenario

Netflorist, being a good company, keeps 5 years’ worth of backups off site. Someone at the offsite company accesses those files, restores the DB and gets all the email addresses & passwords for everyone. They then go to TakeALot and log in with those details. Since TakeALot's credit card provider stores credit card numbers, the criminal then purchases tons of stuff!

Just imagine the damage that could be done if someone uses the same password for their email & their second factor bank authentication goes to email – all your money is stolen… thanks to Netflorist not doing it right. If that happens and since the bank wasn’t at fault you wouldn’t be able to get the money back from the bank!

Encryption is enough! Right?

So we are going to get a little technical now. There are actually many types of encryption (two-way, public/private key), but the core here is that in all cases there is encrypted data & a key (or secret or password, loosely speaking) that is used to decrypt the data.

So if we store the password encrypted in the database, we also need to put the key somewhere so it can be decrypted when the mail is sent out. The issue here is that if someone can get access to the database, there is a high chance they can get access to the key too. Once someone has the data & the key, it is plain text.

Yes, this is tougher to do than plain text - but tougher is not the same as impossible.

So how should Netflorist fix this?

This is not simple, because we are working with security & doing it correctly isn’t easy. Thankfully OWASP has created some guides to help with this.

In short they should do three things. Note this is the SUPER simplified version. If you are doing this, read the OWASP guides for all the details.

  • We do not store the password in plain text or even encrypted. We hash it. Hashing can be thought of as one-way encryption: we can take the password + salt and get a result (the hash), but we can never go from the hash back to the original password.
  • The hash relies on a salt too, so we should use a salt that is unique per user. This solves the possibility of rainbow table attacks.
  • Lastly there is the “Forgot Password” system. Since we can never get the password back from the hash, the system cannot send it to the user requesting it. The solution is to have a password reset option, where a user who has forgotten their password puts in some unique info and then, using a secure channel, sets a new password.
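The three points above can be sketched in a few lines of Python using only the standard library. This is an illustration under my own assumptions, not what Netflorist or OWASP prescribe: I picked PBKDF2 with SHA-256 as the hash, the iteration count is a placeholder to check against current guidance, and all the function names are made up.

```python
import hashlib
import hmac
import os
import secrets

ITERATIONS = 600_000  # assumed work factor; check current guidance and your hardware


def hash_password(password, salt=None):
    """Return (salt, digest). A fresh random salt is made per user when none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Re-hash the attempt with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


def new_reset_token():
    """Return (token, token_hash): mail the token to the user, store only its hash."""
    token = secrets.token_urlsafe(32)
    token_hash = hashlib.sha256(token.encode("utf-8")).hexdigest()
    return token, token_hash
```

The database only ever holds the salt, the digest and (briefly) the reset token hash, so there is nothing in it that can be mailed back to the user as their password.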

What can you do?

First, communicate with Netflorist (Twitter, email) about this risk and hopefully they fix it. Second, you can lower your personal risk by never sharing passwords across websites. This can easily be done with tools like LastPass (which manages the passwords for you & ensures strong & unique passwords are used) or by finding a trick that enables you to easily remember a unique password for each website. For example use a pass phrase like:

  • Netflorist could be: This is netflorist222
  • TakeALot could be: This is takealot222

It is easy to remember the unique password plus it is a pretty strong password. This is just an example, so come up with your own & be sneaky!

02 Jul 2013

Presentation Dump: End of 2012 & First half of 2013 - POPI, JavaScript, Open Source & .NET 4.5 Async

It has been an entire year since my last presentation dump, so following that tradition – here are some of the talks I gave in the last year that were not immediately available:

Protection of Personal Information Bill (POPI)

Description

A short presentation that focuses on the proposed POPI law, how it impacts businesses, technology, IT depts & the cloud. It was based on a draft so some aspects may have changed.

Thoughts

This was a tough talk for me, because the law isn’t something I spend much time focusing on. I spent a lot of time reading the bill & analysis for it and really was impressed how approachable all of it is. This is definitely a law we need to be aware of, but for most companies (who do things correctly now) it will mean either no changes or a slight update to some documents.

Open Source Licensing

Description

This talk focuses on what open source licensing is, how it should be applied inside & outside companies. It also looks at how Open Source != Free.

Thoughts

This talk looks at how open source licensing works & more importantly how it applies to company projects. Once again a bit of legal focus but very valuable info in it!

JavaScript Toolkit

Description

This presentation provides a quick glance at a number of tools that make development with JavaScript easy, quick & bug free. Loads of tools & ideas in it :)

Thoughts

The JavaScript toolkit talk looks at a LOT of tools and libraries for JavaScript & as I do a lot of this day to day – it was easy to get ready & a lot of fun to present.

 

How to give a great presentation

Description

This slide deck was used to give students an overview on how to give a great presentation, especially a technical presentation. It covers aspects like purposeful movement, technology, slide creation etc....

Thoughts

This is a talk I gave to some students about how to do a great talk & it pulls a lot of my own learnings & learnings from experts like Hanselman into it.

.NET 4.5 Async

Description

A look at some of the complexities of .NET 4.5 Async.

Thoughts

This is the newest talk here and it covers the async keyword. The core focus was not on the simple scenarios but rather on diving into the more complex scenarios and areas of pain that can occur with this new keyword.
