It is important to identify and distinguish versions of your research data files consistently. This ensures that a clear audit trail exists for tracking the development of a data file and identifying earlier versions when needed. When establishing a strategy for version control you can use the following tips:
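One way to keep version identifiers consistent is to generate file names with a small script rather than typing them by hand. The sketch below is only an illustration: the naming convention (`<stem>_v<NN><suffix>`, for example `survey_v03.csv`) and the function name `next_version_name` are assumptions made for this example, not a prescribed standard.

```python
import re
from pathlib import Path


def next_version_name(folder: Path, stem: str, suffix: str = ".csv") -> str:
    """Return the file name for the next version of `stem` in `folder`.

    Assumes the hypothetical convention <stem>_v<NN><suffix>,
    e.g. survey_v01.csv, survey_v02.csv, ...
    """
    pattern = re.compile(rf"^{re.escape(stem)}_v(\d+){re.escape(suffix)}$")

    # Collect the version numbers of all files that follow the convention.
    versions = []
    for path in folder.iterdir():
        match = pattern.match(path.name)
        if match:
            versions.append(int(match.group(1)))

    # Start at version 1 when no earlier version exists.
    next_version = max(versions, default=0) + 1
    return f"{stem}_v{next_version:02d}{suffix}"
```

Calling `next_version_name(Path("data/processed"), "survey")` in a folder that already holds `survey_v01.csv` and `survey_v02.csv` would return `survey_v03.csv`, so earlier versions are never overwritten and remain easy to identify.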
This part mainly draws on Jiménez (2017).
Keeping everything under version control throughout your research is a very good idea. Indispensable as version control platforms are, however, you may not want to use them for your raw research data: multi-gigabyte files are likely too large for these platforms anyway. Your processed research data is a different matter entirely.
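If raw data files are too large for a version control platform, one possible workaround is to keep a small text manifest of their checksums under version control instead, so that any later change to the raw files can still be detected. The following is a minimal sketch of that idea, not a required workflow: the function names `sha256_of` and `build_manifest` and the manifest layout (checksum, two spaces, relative path) are assumptions chosen for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Compute the SHA-256 checksum of a file, reading it in chunks
    so that multi-gigabyte files do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(raw_dir: Path) -> str:
    """Return one '<checksum>  <relative path>' line per file in raw_dir."""
    lines = []
    for path in sorted(raw_dir.rglob("*")):
        if path.is_file():
            relative = path.relative_to(raw_dir).as_posix()
            lines.append(f"{sha256_of(path)}  {relative}\n")
    return "".join(lines)
```

The returned text could be written to a file such as `raw_manifest.txt` (a hypothetical name) and committed alongside your processed data and code, while the raw files themselves stay in regular storage.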
Version control needn’t be restricted to your research data. If you also develop custom scripts or software in your research, make sure you clearly document your code and your project, and put them on a version control platform just as you would your research data. Mind you, organising your code on these platforms does not necessarily mean putting it on display for the entire world to see: you control the access rights, so you decide who gets to see your code and who doesn’t.
Putting your code on a version control platform makes a lot of sense, and not only for the obvious benefit of having several versions to roll back to if need be. According to Fogel (2005), the longer a project is run in a closed manner, the harder it is to open it later. Even so, we understand that many people may not be inclined to disclose their source code from day one; it is, after all, a work in progress. At some stage, however, you may feel more comfortable releasing your code. This, too, is simple if you have your files on, say, GitHub: change the access settings of your repository, and you’re done. Open source in a matter of seconds.
Opening code and exposing the software development life cycle publicly:
Use version control software such as Git or Subversion (TortoiseSVN is a Windows client for Subversion), together with a hosting platform such as GitHub.
If you want to know all there is to know about Git and GitHub, then why not have a look at our introduction to Git & GitHub?