In this post, I’m going to cover why you should use source control, what its advantages are, and why, if you’re already using source control, you should consider Git over the other options out there. I’m not going to cover how you actually use Git or how to set it up, as that deserves a whole separate post of its own. The first step in source control is understanding what it’s for and how it helps you do your job more smoothly.
Why Use Source Control?
A history of your changes
When you’re writing code, nothing hurts more than trying to fix some bugs and finding that the more you change, the more bugs you introduce. In those situations, you wish you could rewind time and see the code you had two hours ago, but that version’s gone from your computer and all you’re left with is a complete mess. Source control keeps a history of your changes so that you can go back to any earlier version, which is one of the reasons it’s such an attractive option.
Working with others
Let’s look at another scenario: you’re working on a project with a friend and you both need to work on the same file at the same time. It’s all well and good to work on a shared file over a network, but once you’ve both started editing it, neither of you can safely save, because whoever saves last overwrites the other person’s changes. Source control records each person’s changes separately and merges them together instead.
Backups of your work
Finally, everyone knows you should be making backups of your work, but backing up work in progress can be a nightmare. What you really want is to back up only once you reach a point where the software actually works. Until then, what you’re backing up might be half-baked and not really ready for permanent storage.
However, with source control, you choose when you want to back up, and each backup is on a file-by-file level so that you can restore only the files you need. Another benefit is that each backup is incremental, so the size of the backups is minimised. If you only add one line of text to a 10MB file, the second backup in effect only needs to contain that one line and not the whole 10MB again.
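To make the "one line, not the whole file" idea concrete, here is a minimal sketch in Python using the standard difflib module. The function name changed_lines is my own, and real source control tools store their history in their own formats, so treat this purely as an illustration of the principle.

```python
import difflib
from itertools import islice


def changed_lines(old_lines, new_lines):
    """Return only the lines that differ between two versions of a file."""
    diff = difflib.unified_diff(old_lines, new_lines, lineterm="", n=0)
    # The first two lines of a unified diff are the '---'/'+++' file headers;
    # after that, '@@' lines mark positions and '+'/'-' lines carry the changes.
    return [line for line in islice(diff, 2, None) if line[0] in "+-"]


# A large file, and the same file with a single line added at the end.
version_1 = [f"line {i}" for i in range(1000)]
version_2 = version_1 + ["one more line"]

delta = changed_lines(version_1, version_2)
print(len(version_2), "lines in the file,", len(delta), "line(s) in the delta")
```

Only the delta needs to be stored alongside the first version in order to rebuild the second.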
Types of Source Control Available
There are many source control systems available, and each has its merits, but the most popular are Subversion (usually referred to as SVN), Mercurial (often abbreviated to Hg, the chemical symbol for mercury) and Git.
SVN
I have been a long-time fan of SVN, but since being introduced to Git I have become a complete convert. One of the biggest advantages I see Git having over SVN is that it is decentralised. SVN, by contrast, relies on a connection to a remote server (referred to as a repository in source control lingo), and all the users have to download from that repository to a local folder on their machine.
They can make all the changes they wish to that local copy and then must send their new content back to the server, where it gets merged in with everyone else’s work. This works really well, until the week when you find yourself offline and unable to connect to the main server. You can still get your code back onto the server once you’re reconnected, but your whole week of work has no intermediate versions that you can go back to if something goes wrong.
Git
A local repository
Git, on the other hand, is still able to synchronise with a remote repository in the same way that SVN does, but the copy of the content on your own machine is a repository in its own right. In other words, when you’ve made some significant changes to your code and it’s working fine, you can commit a version to your local repository, fixing that version in stone so that if you make a mistake in the next few hours, you can still go back to it even if you have no connection to the internet.
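As a rough picture of what "a repository in its own right" means, here is a toy in-memory repository in Python. The LocalRepo class and its method names are hypothetical, and this is not how Git is implemented internally. It only shows that committing and going back to a version need nothing but your own machine.

```python
import hashlib


class LocalRepo:
    """Toy in-memory repository; the class and method names are hypothetical."""

    def __init__(self):
        self.snapshots = {}  # commit id -> copy of the files at that commit
        self.log = []        # (commit id, message) pairs, oldest first

    def commit(self, files, message):
        """Store a snapshot of the files and return an id for it."""
        parent = self.log[-1][0] if self.log else ""
        payload = repr((parent, message, sorted(files.items()))).encode()
        commit_id = hashlib.sha1(payload).hexdigest()[:8]
        self.snapshots[commit_id] = dict(files)  # copy, so later edits cannot change history
        self.log.append((commit_id, message))
        return commit_id

    def checkout(self, commit_id):
        """Return the files exactly as they were when commit_id was made."""
        return dict(self.snapshots[commit_id])


repo = LocalRepo()
files = {"app.py": "print('it works')"}
good = repo.commit(files, "working version")

files["app.py"] = "print('it wor"  # a few hours later: a complete mess
files = repo.checkout(good)        # rewind to the good version, no network needed
```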
Synchronising with a remote repository allows you to share that code with others, but can be done at a later date when you have access to the internet. At that point, all your changes will get merged with those of others, even if they are in the same files you’ve been changing, so long as Git is able to work out how those changes fit together. Additionally, if you want to push that code to more than one remote repository because, say, you are working on a library that is used in several different packages, you can do so easily, whereas doing so in SVN would be very difficult. What’s more, you can see how far behind each remote repository is.
Branching your code
Another advantage of Git is that it allows you to create branches with ease and merge those branches back together again. All good version control software should allow this, but it’s just that bit easier in Git than it is elsewhere. When you’ve got a solid base of code and want to try something adventurous that carries a high risk of messing things up, creating a branch is a good idea.
Similarly, you may have a long list of todo items that you need to work on at the same time, switching between them before any of them is fully functional. Imagine working on one todo item, having to leave it until the end of the week and start on another. The first one might not compile properly whilst you work on the second, in which case you’re forced to get the first to a working state just so you can work on the second. Branches will save you here.
For example, say you’ve got a stable version 1.0 and you’re about to work on adding a help screen. You tell the repository to make an independent branch for the help screen, and you commit to that branch whilst you’re making help-related changes. But then your boss says “I want that offline mode done by the end of the day” and you haven’t even started on it.
The help code isn’t even compiling right now and you know it’s going to take another two hours to get it to a stable state, leaving you no time to implement the offline mode. No problem! Simply create another branch from the stable version 1.0 called offline-mode and work on the offline mode independently of the help in the other branch.
Both branches know they came from the stable 1.0, so when you finish offline mode, you can merge that code back into the main branch. The next day you finish the help screen and merge that back into the main branch as well. Git doesn’t care that you worked on all this code in an arbitrary order. It simply merges the code back, and you’re left with the main branch containing all the code you’ve written, with help and offline mode working alongside each other.
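The reason it matters that both branches "know they came from the stable 1.0" is that a merge compares each branch against that common starting point. Here is a simplified sketch in Python, assuming a project is just a dictionary of file names to contents. The function name and the file names are made up for illustration, and real Git merges line by line within files rather than whole files at a time.

```python
def three_way_merge(base, ours, theirs):
    """Merge two branches (dicts of file name -> content) that share `base`.

    Returns (merged files, names of conflicting files). Conflicting files
    are left out of the merged result for a human to sort out.
    """
    merged, conflicts = {}, []
    for name in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(name), ours.get(name), theirs.get(name)
        if o == t:      # both sides agree (or neither touched the file)
            result = o
        elif o == b:    # only their side changed it
            result = t
        elif t == b:    # only our side changed it
            result = o
        else:           # both changed it in different ways
            conflicts.append(name)
            continue
        if result is not None:  # None means the file was deleted
            merged[name] = result
    return merged, conflicts


stable = {"app.py": "version 1.0"}
help_branch = {**stable, "help.py": "help screen"}
offline_branch = {**stable, "offline.py": "offline mode"}

# Offline mode is finished first, then help the next day.
main, conflicts_1 = three_way_merge(stable, stable, offline_branch)
main, conflicts_2 = three_way_merge(stable, main, help_branch)
```

After both merges, main holds the original file plus the files from both branches, regardless of the order the work was done in.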
Where can I get remote storage for my code?
So, I’m sure by now you have a pretty good idea why you should be using Git to source control your code. But if your computer dies on you, how is a repository on your local machine any safer than not using source control at all? More to the point, how are you going to share the development work with others if you only have it on your computer? How are you going to synchronise your work with your colleagues or friends?
Your own remote server
Remote repositories allow you to drop your code on a server that’s somewhere else and therefore prevent you from losing your work as well as giving you and your colleagues a place to synchronise that’s not on your local machines. If you already own a server somewhere because you have a VPS or dedicated server that your company uses for providing services to clients or websites, then you can use this as a remote Git repository.
Simply install Git on that server and set up a folder somewhere for your project. Tell Git to use the folder to host a repository and then push your local code up to the server’s repository. If someone else has pushed since you last synchronised, the server will refuse your push until you have pulled their changes down. Git then does its best to mix all the code together in an intelligent way on your machine, but if conflicts exist because you’ve both made different changes to the same bit of code, it still relies on you, the person who was last to push, to unravel what should be stored and what should be lost.
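One way to picture why the last person to push ends up doing the unravelling: a remote repository normally accepts a push only if it simply extends the history the remote already has. The sketch below models a history as a plain list of commit names, which is a big simplification, since real Git history is a graph. The function name and commit names are invented for the example.

```python
def try_push(remote_history, local_history):
    """Return the new remote history, or raise if the push would lose work.

    Histories are modelled as plain lists of commit names, oldest first.
    """
    if local_history[:len(remote_history)] != remote_history:
        raise ValueError("push rejected: pull and merge the remote changes first")
    return list(local_history)


remote = ["v1.0"]
alice = ["v1.0", "alice-fix"]
bob = ["v1.0", "bob-feature"]

remote = try_push(remote, alice)  # fine: Alice only adds to what the remote has

try:
    try_push(remote, bob)         # Bob has never seen "alice-fix"
    bob_rejected = False
except ValueError:
    bob_rejected = True

# Bob pulls Alice's work, merges it locally, and only then pushes again.
bob = ["v1.0", "alice-fix", "bob-feature", "merge"]
remote = try_push(remote, bob)
```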
GitHub
For a long time, GitHub has been a service for hosting your code in a remote Git repository and collaborating with others. Once you’ve signed up to the website for free, you can begin setting up repositories on their servers. However, GitHub’s central tenet is that code is shared in the community and so, at the time of writing, whilst you can host unlimited code on their servers, you can only do this for free if you plan to make your code available to anyone and everyone in a public repository.
Effectively, your code is open source and any number of developers can join your team to help you improve the code if you let them. Additionally, anyone can make their own separate copy (known as a fork) under their own account, without your permission, and make another package with your code as a backbone. Open source software is a fantastic way for innovation to come through, but it can also be the end of your opportunity to make money out of something if the source code is available online and anyone can just download and use it without your permission.
GitHub does have private repositories as well, but they are limited to paying customers and can work out quite expensive for small businesses just starting out.
BitBucket
BitBucket is a very similar service to GitHub and allows you to store your code in the same manner. The websites themselves are very similar, but BitBucket recognises that small companies and individuals don’t have the money to pay monthly subscriptions to host small amounts of code, so they take a different approach to GitHub.
When you sign up for a free account on BitBucket, you are granted any number of private repositories of any size, so long as five developers or fewer collaborate on them. This is ideal for startup companies where the number of collaborators will be small. Once the company is more established and you want to continue working on projects with six or more collaborators, a payment is required.
In Summary
This has been a long blog and so I felt a summary was necessary to pull out the most important points:
- If you work with others on code, need to be able to back up your code, or want to be able to go back to old code you’ve written after disastrous bug fixing or feature implementation, then source control is for you.
- Whilst other source control packages are on the market, Git offers a very comprehensive set of tools and is superior at both providing a local repository and branching, which are essential elements of a developer’s workflow.
- If you want to share with others and maintain a permanent and robust backup of your code, you should use a remote repository. This can be either on your own server, or on one of the many online services such as GitHub or BitBucket, depending on your requirements.