I am a Dropbox User and I Don't Get Git
Hey Ernie, I’ve looked at a few Git tutorials and they are all full of jargon: commits, hashes, and whatnot. I’ve been using Dropbox (and, occasionally, OneDrive and Google Drive at work) for years and I don’t seem to get what the case is for Git. Can you help me?
Sure. I know exactly where you are coming from. I’ll explain how things work in Dropbox, and how Git differs.
Dropbox is a Repository for your Files in the Cloud. Git is Not.
Dropbox may be understood as a folder in the cloud to which you add files, and those files are replicated to every device connected to your Dropbox account. For example, you may save README.md to your Dropbox folder on your MacBook laptop, and a few seconds later, this file will magically appear on your ThinkPad laptop. If there was already a README.md on your ThinkPad laptop, that version would be replaced with the most recent one. There’s no reason to panic because all the overwritten files are kept on the Dropbox server.
In this sense, you can imagine Dropbox as a brain that lives in the cloud and that not only stores your files but also keeps track of their changes (including deletions). In Git, the “brain” is distributed and exists on every device that is used to work with the same folder or repository.
What do you exactly mean by “brain” Ernie? I don’t think this metaphor is helpful.
See, in Dropbox, if you are not connected to the Internet and you accidentally delete or overwrite a file, you won’t be able to recover the deleted file (or its previous version), because you are disconnected from the brain, which, in Dropbox’s case, lives in the cloud. In Git, instead, the brain is on every device to which you have downloaded (cloned, in Git’s jargon) a repository. This means that you have the history of all file changes, as well as the ability to track any further changes you make, even while disconnected from the Internet.
In short, you can imagine it as though you were running the Dropbox server on both your MacBook and ThinkPad laptops. If you happen to use a Git “server” like GitHub or Bitbucket, it doesn’t have a fundamentally different brain from the one that runs on your personal devices.
I find your assertion shocking Ernie. It can’t be true that when you synchronise a folder in Git, you get all the historical changes (including deletions!) to files in that folder.
It is called a repository in Git, and you don’t synchronise a repository, you just make a copy; in Git’s jargon, you clone a repository. Take, for example, the Terraform repository on GitHub. Terraform is one of the most widely used tools for defining Infrastructure as Code across cloud platforms. You get a copy of the repository down to, say, your MacBook laptop like this:
    git clone https://github.com/hashicorp/terraform.git
This command will create a folder called
terraform, in which you’ll find
all of the project’s files.
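If you would rather try a scaled-down version of the same experiment without downloading all of Terraform, here is a minimal sketch. It assumes Git is installed; the folder names (`origin-repo`, `my-copy`), the file `notes.txt`, and the identity are made up for illustration. It builds a throwaway repository with two commits and then clones it, so you can see that the copy carries the whole history and not just the latest files.

```shell
set -eu
work=$(mktemp -d)                 # scratch area, away from your real projects
cd "$work"

# Build a tiny repository with two commits to stand in for a real project.
git init -q origin-repo
cd origin-repo
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"
echo "first draft" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
echo "second draft" >> notes.txt
git commit -q -am "Expand notes"
cd ..

# Cloning copies the files and the full history.
git clone -q origin-repo my-copy
cd my-copy
git --no-pager log --oneline
commit_count=$(git rev-list --count HEAD)
echo "commits available locally: $commit_count"
```

Both commits show up in the clone even though nothing was fetched over the network.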
I’ve run the above command and indeed the project’s files are there, but surely, if a developer deleted a file, I would need to log onto GitHub.com to recover said file. Same as with Dropbox, isn’t it?
No need to log onto GitHub.com. The full change history is right there on your laptop. You use the command below to show all the changes to the repository, along with the descriptions that the developers wrote for said changes.
    git log --oneline

    dc71cd449 (HEAD -> main, origin/main, origin/HEAD) Merge pull request #28684 from hashicorp/radditude/org-read-error
    8161bc3ab backend/remote: clearer error when org read fails
    70fed23be lang/funcs: File hashing functions stream data from disk
    42e098583 command: use -lock=false consistently in -help output
    ed121321c website: Revamp the "terraform state mv" page
    ea089d06f website: Revamp the "terraform state rm" page
    0aa0e00fd website: Backend docs link to new .gitignore anchor
    874f1abb2 cli+website: -ignore-remote-version docs and other cleanup
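As for the deleted file you were worried about, one way to get it back without touching GitHub.com might look like the sketch below. It assumes a reasonably recent Git (the `git restore` command appeared in version 2.23) and uses a made-up repository and file name. One commit deletes `notes.txt`, and the file is then restored from the commit just before it, entirely from local history.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"

echo "keep me" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# A later commit deletes the file.
git rm -q notes.txt
git commit -q -m "Remove notes"

# List the commits that deleted files, then restore the file
# from the commit just before the deletion (HEAD~1).
git --no-pager log --oneline --diff-filter=D --name-only
git restore --source=HEAD~1 -- notes.txt
recovered=$(cat notes.txt)
echo "recovered: $recovered"
```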
Is this being fetched from the Internet?
No. The full history of changes is on your laptop now. The deltas are
tracked in a special folder called .git, at the top of the project. If
you want to see further details about each change, you can remove the
But I can’t tell which file has changed Ernie. Are the files represented by those weird hexadecimal numbers at the beginning of each line?
No. This takes us to the next topic.
Dropbox is a file-wise system. Git is Not.
The unit of change in Dropbox is the file itself. A file is first created, then modified (possibly multiple times), and then deleted. When you think about the “current version” of a file in Dropbox, you are thinking about the most recent change that you made on any of the devices attached to your Dropbox account. If you mess up, you go to Dropbox’s website and ask to see previous versions of the file so that you can recover it.
Dropbox assumes that a change to a file means a new version of it. This is why, as soon as you hit save, say in Microsoft Word, you’ll see the Dropbox icon displaying an animation in reaction to this action. Git is not like this. Git does not care about files you add, modify or delete unless you want Git to be aware.
So, when I save a file, it is not automatically in “Git”…?
No, it is not. Your new or modified file is in what is called your “working directory”, but as far as Git is concerned, such a file does not exist in any version yet. Moreover, revisions are not tracked at the file level, but at the commit level. This means that what you have is not quite the version of a file, but the version of the commit in which the file was changed.
Too philosophical; say that I have changed the README.md file from the Terraform project by adding “Hello World” to the end of it, and that I have also created a new file called NEW.md, as follows:
    $ echo "Hello World" >> README.md
    $ touch NEW.md
How do I “save” the changes to Git so that they are not lost? By the way, this happens automatically in Dropbox, so I’m already losing my patience here.
Well. First you need to make Git aware of those files…
Losing my patience. Is Git so dumb that it doesn’t know that I have changed a couple of files?
Ok. Fair enough. Git can notice that you have made changes since the
last commit indeed. You can use the
git status command for this purpose.
    $ git status

    On branch main
    Your branch is up-to-date with 'origin/main'.

    Changes not staged for commit:
      (use "git add <file>..." to update what will be committed)
      (use "git restore <file>..." to discard changes in working directory)
            modified:   README.md

    Untracked files:
      (use "git add <file>..." to include in what will be committed)
            NEW.md

    no changes added to commit (use "git add" and/or "git commit -a")
As you can see, Git tells you that you have modified an existing file and that you have also added a new one.
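If that output feels too chatty, `git status --short` prints the same information as one line per file. The sketch below reproduces the situation in a throwaway repository (the repository, its contents, and the identity are made up for illustration) and captures the short status.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"
echo "# Demo" > README.md
git add README.md
git commit -q -m "Add README"

# The same two edits as in the conversation.
echo "Hello World" >> README.md
touch NEW.md

# One line per file: ' M' means modified but not staged, '??' means untracked.
short_status=$(git status --short)
echo "$short_status"
```

The two-character code at the start of each line is the part to read: the modified README.md and the untracked NEW.md get different codes because Git already knows about one of them and not the other.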
Ok. How do I save the files to Git now?
As I said before, it is not that you “save” individual files. Instead, you select all the files that you want to be in the next commit. The process of selecting files for a commit is known as staging those files. This actually means moving said files to a “staging area”.
Which folder is the staging area?
It is not a folder, but a property of the files under consideration; you simply have to add the relevant file using
git add <file> like this:
    $ git add README.md
    $ git add NEW.md
If you now type
git status again, you’ll notice that these two files
are now ready to be committed using the
git commit -m <MESSAGE> command.
    $ git status

    On branch main
    Your branch is up-to-date with 'origin/main'.

    Changes to be committed:
      (use "git restore --staged <file>..." to unstage)
            new file:   NEW.md
            modified:   README.md
You can also notice that you can remove files that you’ve added to the
staging area using
git restore --staged <file>.
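To see staging and unstaging side by side, here is a small sketch in a throwaway repository (names and identity are placeholders, and `git restore` assumes Git 2.23 or later). It stages both files, takes NEW.md back out, and lists what is staged at each point. Note that unstaging does not delete the file from disk.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"
echo "# Demo" > README.md
git add README.md
git commit -q -m "Add README"

echo "Hello World" >> README.md
touch NEW.md

# Select both files for the next commit.
git add README.md NEW.md
staged_before=$(git diff --staged --name-only)

# Change of heart: take NEW.md back out of the next commit.
git restore --staged NEW.md
staged_after=$(git diff --staged --name-only)

echo "staged before:"; echo "$staged_before"
echo "staged after:";  echo "$staged_after"
ls NEW.md                                    # still on disk, just no longer staged
```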
Oh, wait a second, I don’t want to mess up with the Terraform project.
You won’t. Your changes are local until you push them.
Ok, but you are going too fast. Slow down. Easy, easy, easy, Ernie. So far I got that I use a combination of the git add and git restore --staged commands to select and remove files, respectively, that I want to include in a revision, which you nerds call a commit.
Now, before that, how can I see the difference between the original version of README.md and the one I have just modified by adding Hello World?
You use the
git diff <file> command, but if the file is in the
staging area, you need to add the
--staged flag, like this:
    $ git diff --staged README.md
    ...

    +Hello World
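The reason for the extra flag is that plain `git diff` only shows changes that are not staged yet, so once a file has been added it goes quiet. A minimal sketch of that behaviour, in a throwaway repository with placeholder names:

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"
echo "# Demo" > README.md
git add README.md
git commit -q -m "Add README"

echo "Hello World" >> README.md

unstaged_before=$(git diff --name-only)        # README.md: changed but not staged
git add README.md
unstaged_after=$(git diff --name-only)         # empty: nothing left unstaged
staged_after=$(git diff --staged --name-only)  # README.md: now in the staging area

# Pull the added line out of the staged diff.
added_line=$(git diff --staged README.md | grep '^+Hello')
echo "$added_line"
```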
Ok. I can see that I’ve added Hello World to README.md. Now, let’s save the whole damn thing. What do I do to achieve that?
As I said before, your files are now in the staging area and what you ‘save’ is a bundle of changes, rather than each specific file. Said bundle is a commit; you do that as follows:
    $ git commit -m 'My saved files'

    [main 83ad90b3b] My saved files
     2 files changed, 1 insertion(+)
     create mode 100644 NEW.md
The -m flag supplies a short message describing the commit, by the way. It wasn’t that hard, was it?
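If you want to check that the commit really is one bundle rather than two separate file versions, you could inspect it along these lines. This is a sketch in a throwaway repository with placeholder names, not the Terraform project itself: a single commit, with a single hash, records both files.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"
echo "# Demo" > README.md
git add README.md
git commit -q -m "Add README"

echo "Hello World" >> README.md
touch NEW.md
git add README.md NEW.md
git commit -q -m "My saved files"

# Summary of the latest commit: one hash, one message, two files.
git --no-pager show --stat --oneline HEAD

# The files that differ between the previous commit and this one.
files_in_commit=$(git diff --name-only HEAD~1 HEAD)
echo "$files_in_commit"
```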
It is not too bad, but my files aren’t yet on GitHub, are they?
No; Git is a distributed system; it is as though you were running GitHub locally. Remember the story about the brain? You got the brain running on your laptop. But, coming back to your comment, you are right, the GitHub in the cloud, visible to everyone, is not aware of your changes yet.
And how do I make it aware? You know that in Dropbox, all it takes is saving a file…
Yes, in the case of the Terraform project we are using as an example, because it was downloaded from an existing ‘origin’, in principle, if you had the required permissions, all you would need to do is as follows:
    $ git push -u origin main
This would be the case for your own projects, or the ones to which you have been given write access. If no remote repository had been configured yet, you would need to add one using something such as:
    $ git remote add origin https://github.com/hashicorp/terraform.git
Naturally, you would also need to register your credentials before all of this, but that’s a topic for another time. Hopefully, you now understand the key differences in workflow between Dropbox and Git.
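You can rehearse the whole push without credentials, and without bothering the Terraform maintainers, by letting a bare repository on your own disk play the role of the server. The sketch below is only a stand-in for GitHub: `server.git`, `laptop`, and the identity are hypothetical names, and the branch is explicitly named main to match the example above.

```shell
set -eu
work=$(mktemp -d)
cd "$work"

# A bare repository (history only, no working files) acts as the "server".
git init -q --bare server.git

git init -q laptop
cd laptop
git symbolic-ref HEAD refs/heads/main        # call the first branch "main" whatever Git's default is
git config user.name "Example User"          # placeholder identity
git config user.email "user@example.com"
echo "# Demo" > README.md
git add README.md
git commit -q -m "Add README"

# Tell Git where the remote lives, then publish the local commits to it.
git remote add origin "$work/server.git"
git push -q -u origin main

local_head=$(git rev-parse HEAD)
remote_head=$(git ls-remote origin refs/heads/main | cut -f1)
echo "local:  $local_head"
echo "remote: $remote_head"
```

Once the two hashes match, the “server” knows about your commit. Until the push, it did not.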
I’m not yet convinced that Git is superior to Dropbox, but it is the thing that you people seem to use these days, so I have to figure it out sooner or later.
This wasn’t meant to be a Git tutorial per se, but a discussion of the key ideas that Git introduces, which are hard to grasp if your background is a traditional file synchronization application such as Dropbox.
The next topic to learn would be how to handle conflicts when the same user pushes from different laptops. Merging and branching, which people believe are what make Git unique, should be the last topics to learn, at least for someone coming from a Dropbox (rather than, say, SVN) background.