Why we need git
Source control gives you the ability to jump back in time, whatever kind of file you're working with.
Primitive source control: keep multiple versions so you can go backwards in time, e.g. using a different file name to record each version.
Directed acyclic graph
what does this mean to normal folks ?
commit
A snapshot in time of what your code looks like. Then you make some changes and you create another commit, another snapshot in time of your project.
The arrow represents those changes, the diff, as it were, between your old state and your new state
Think of the undo stack in a Word document. If you sent the document off to a friend, they would inherit that document and that undo stack, and then they would continue working, adding their own changes on the end.
I've got my version of the undo stack in my Word document, and my friend has got their version, which includes the first part of mine and then the stuff they've added on the end.
With Subversion, Git or Mercurial, you can merge those things back together, creating a merge commit, which takes all of the things that I did and all of the things my friend did and joins them together.
Sum up (1)
- commit : the snapshot in time of your work
- diff : moving from one snapshot to another snapshot. It's a set of changes
- branch
- merge
- repository
subversion and git difference (1)
Subversion stores the diffs; Git stores the commits. With Subversion, if I want to know some past state, I need to go back to the beginning and then replay all those diffs.
Git uses some really clever stuff under the covers that allows it to compress that space right down. (See the advanced version.)
Cloning a Repo
when do we interact with Git
There are two main ways that you're likely to want to start interacting with Git, one of them is you've already got a project on your local machine and you want to turn that into a Git repository, so you can start tracking the history, and the other is somebody else has already started a Git repository, it's stored somewhere remotely, and you wanna get it down onto your machine so that you can work with it.
Git repository
It does that through a hidden .git directory. The entire history of that repository is stored within it, so in actual fact, you could just zip up your Git repository and email it to somebody.
More commonly, you'll be using a central store for your Git repositories, for example, GitHub, or GitLab, or BitBucket.
In fact, all of the machinery exists within Git, for you to use remote repositories, that are just stored on a server that you have SSH access to.
In order for pull requests to work, you first have to do something called forking the repo. That means you take an entire copy of that repo and replicate it in your own personal user space. You obviously have write access to this copy.
so now you can work on it locally, having cloned it, push your changes back up to GitHub to your fork of the repository, and if you want to share them with the original repository, you can do what's called a pull request,
git command
git | folder | content |
---|---|---|
git --version | cd | ls |
git clone url (make sure the URL starts with https) | mkdir | ls -la |
git log | rm -rf folder-name | |
git remote -v | | |
git init (create a new Git repository) | | |
git status | | |
git add / git add . | | |
git commit | | |
sum up
- fork
- Cloning:copying the entire repository from the cloud (GitHub, BitBucket, GitLab) down onto your local computer. In other words, the act of moving that data is called cloning.
- Cloning a Repo:getting hold of an existing Git repository.
- Git repository:a directory that tracks its history
- GitHub: a big store for these Git repositories.
Creating a Repo
You can turn pretty much any folder, any directory on your machine into a Git repository. And by doing that, Git will create a hidden .git directory within that directory.
It uses that directory for all of its metadata, and its object storage to maintain the history as you move forward.
One convention you can layer on top is to start a repository with a README and a license. It's good hygiene for every repository to include them.
ex
git config --list:lists all of the configuration options that are currently set.
git config --local --list:shows the local configuration.
git config --global --list:shows the global configuration.
git config user.name <new name>
git config user.email <new email>
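A minimal sketch of the steps above, run in a throwaway directory. The directory comes from mktemp, and the identity values (Example User, user@example.com) are placeholders of mine, not from the course.

```shell
# Create a throwaway project directory (assumes mktemp is available).
project="$(mktemp -d)"
cd "$project"

# Turn it into a Git repository; this creates the hidden .git directory.
git init -q
ls -a

# Set an identity for this repository only (placeholder values).
git config user.name "Example User"
git config user.email "user@example.com"

# Show the repository-level settings, then read one value back.
git config --local --list
configured_name="$(git config user.name)"
echo "$configured_name"
```

Without --global, git config writes to the repository's own configuration, so these values apply only inside this one repo.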
Creating a Remote
You'll take that repo that you created yourself, and discover how to push it to a remote like GitHub so that you can share it with other people.
A Git repository is just a directory that has a hidden .git directory within it.
Within Git, it has some transport protocols built into it, that allow two repositories, to synchronize with each other. We use this with the concept of a local copy, or a clone, and a remote.
Copying from the remote to your local copy, which is what happened when you cloned it in the first place, is called pulling.
command
- git remote -v :shows you all of the remotes that are currently set up
- git remote add origin url
- git push --set-upstream origin master:the --set-upstream argument ensures that my local branch will track the remote master branch.
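The commands above can be tried without a GitHub account. This is only a sketch: it assumes a local bare repository (remote.git, a name I made up) standing in for the hosted remote, plus placeholder identity values.

```shell
work="$(mktemp -d)"

# A bare repository stands in for the remote (hypothetical path).
git init -q --bare "$work/remote.git"

# A local repository with one commit on master.
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master   # use master whatever the local default is
git config user.name "Example User"
git config user.email "user@example.com"
echo "# Demo" > README.md
git add README.md
git commit -q -m "Add README"

# Register the remote, push, and make master track origin/master.
git remote add origin "$work/remote.git"
git push -q --set-upstream origin master

git remote -v
upstream="$(git rev-parse --abbrev-ref 'master@{upstream}')"
echo "$upstream"
```

With a hosted remote, the only difference should be that the URL passed to git remote add is an https or SSH address instead of a local path.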
Committing Changes
Remember that a Commit represents the state of the tree, or the directory, at a particular point in time.
Each commit usually has one or two parents (the very first commit in a repository has none), and the transition from parent to child is often represented by a diff.
A diff is the difference between the parent and the child: what are the changes that I made when I created that commit?
git diff
: see the actual changes that you made.
git log -p
: shows the DIFF between each of the Commits,
The reason the status message isn't showing you a new directory is that Git only tracks files. It doesn't understand directories.
It knows quite a lot about files, including things like their path and their contents. But the only way it knows of a directory structure is from the path of each of the files it stores.
Therefore, if you want to keep an otherwise empty directory in Git, it can't actually be empty. There's yet another convention where you put a hidden file in there, often called .keep or .gitkeep.
ex : touch tutorial/.keep
// stage the files inside the tutorial directory
git add tutorial/*
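A small sketch of the empty-directory behaviour, in a throwaway repository. It names the .keep file explicitly when staging, because whether a shell expands tutorial/* to include hidden files varies between shells.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q

# An empty directory is invisible to Git: status reports nothing.
mkdir tutorial
status_empty="$(git status --porcelain)"

# Add a placeholder file and stage it; now there is something to track.
touch tutorial/.keep
git add tutorial/.keep
status_keep="$(git status --porcelain)"
echo "$status_keep"
```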
The Staging Area
What does GIT add actually do?
Git has three overarching conceptual areas.
- working directory : You can think of this as what's currently checked out, what's in the directory that I'm working in ? As you navigate through the different commits in the repository, checking each of them out, the contents of the working directory updates to represent that commit. You do your work in the working directory. You add it and then you commit it.
- staging area : you can think of the staging area, also known as the index, as building your next commit. When I type git commit, what exactly will go in as the contents of that commit? You stage changes in the staging area using git add, and when you go ahead and actually commit, they get saved into the repository. You've then created a commit, a point in history that you can reference and access later on. And if you check out one of the commits inside the repository, that updates the working directory to match that commit.
- repo : the repository, where the commits that make up the history are stored.
The staging area is actually incredibly useful. It allows you to do things like staging just a few lines of a file.
I might have changed 80 lines in a file, but I just want to do a commit that only has ten of them in. Because of the staging area, you can do exactly that.
ex
git diff shows you those two line changes that you made
Type git add . to add that change to the staging area and git status now shows that you've staged the modifications in the book_ideas file.
Typing git diff now shows nothing, because there's no difference between the working directory and the staging area.
If you want to see the difference between the staging area and the repo, i.e. the most recent commit, type git diff --staged
To unstage that change, type git reset HEAD books/book_ideas.md. If you then check the status, you're back to having a modified file that is not staged for a commit.
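The round trip between working directory, staging area and repository can be sketched like this. The file path matches the one used above; the file contents and identity values are placeholders of mine.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# One committed version of the file.
mkdir books
printf 'idea one\n' > books/book_ideas.md
git add .
git commit -q -m "Add book ideas"

# Modify it: the change lives only in the working directory.
printf 'idea two\n' >> books/book_ideas.md
git diff --stat

# Stage it: plain diff is now empty, diff --staged shows the change.
git add .
unstaged_after_add="$(git diff)"
staged_after_add="$(git diff --staged --name-only)"

# Unstage it: the change is back to being modified but not staged.
git reset -q HEAD books/book_ideas.md
staged_after_reset="$(git diff --staged --name-only)"
unstaged_after_reset="$(git diff --name-only)"
```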
Use interactive adding with git add -i. This launches you into a session that allows you to curate which things you want to stage.
At the top you'll see a list of all of the files that have been modified and whether or not they've been staged.
Below that you'll see a menu of the commands you can use. The highlighted character in each one shows you the key that triggers that option.
- Typing s will show you the status
- The update menu allows you to stage files. It shows you a list of all of the possible files and you can select each one of them.
- The patch command allows you to select subsections of each of the files. This command will split the files that you've selected into hunks. It presents you with your first hunk and asks you what you want to do with it. Typing ? expands the list of all of the options available.
This hunk contains both of the changes and you actually wanna split this so that you can commit each of the lines separately.
Type s for it to split the hunk. It'll then present you with the first newly created split hunk. Type y for yes to stage this hunk.
Type d to see the diff of the file that you've staged.
sum up
git diff shows you the diff for the unstaged changes.
git diff --staged shows you the diff for the staged changes.
When a file is reported as renamed, it means that the path of that file has changed. Git doesn't understand moves because all it knows about files are their paths.
git checkout -- .
That resets the working directory back to the state of the index.
git log -p
shows the diff of you deleting that file.
Ignoring Files
Why would you want to ignore files? ex:confidential information (API keys, passwords);build artifacts;files generated by the operating system
Use environment variables for all of the secrets needed by the application. Locally, these are stored in a file which is ignored by Git, so that when we commit, it stores everything apart from the secrets themselves.
Your build process creates a load of metadata that it only needs for the linker. You don't need to track that because it can be regenerated the next time the build runs.
macOS creates the .DS_Store file, and there's a whole host of similar files across all the different operating systems.
.gitignore
This is a simple plain text file.
And within it, you list a load of patterns for the paths that you want git to completely ignore.
*/*.html
This pattern contains a slash, so it is matched relative to the directory holding the .gitignore, and * does not match a slash. It therefore ignores HTML files that sit one directory below the top level, not those in the top directory itself.
*.html
ignore all html files irrespective of the directory.
!/*.html
The leading ! means don't ignore files matching this pattern, even if they were caught by a previous ignore rule. The leading slash anchors the pattern to the beginning of the path, i.e. don't match it in every single directory, only in the root directory.
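One way to check how these patterns behave is git check-ignore, which exits with status 0 when a path is ignored. The sketch below uses made-up file names and a small helper function (is_ignored) of my own.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
mkdir -p docs/deep
touch index.html docs/page.html docs/deep/page.html

# Helper (hypothetical name): prints yes if Git would ignore the path.
is_ignored() { git check-ignore -q "$1" && echo yes || echo no; }

# Ignore every HTML file, then re-include the ones in the root directory.
printf '%s\n' '*.html' '!/*.html' > .gitignore
root_a="$(is_ignored index.html)"
nested_a="$(is_ignored docs/page.html)"

# Pattern with a slash in the middle: only one directory level down.
printf '%s\n' '*/*.html' > .gitignore
root_b="$(is_ignored index.html)"
nested_b="$(is_ignored docs/page.html)"
deep_b="$(is_ignored docs/deep/page.html)"

echo "$root_a $nested_a $root_b $nested_b $deep_b"
```

Running git check-ignore -v on a path additionally reports which line of which ignore file matched, which helps when rules interact.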
ex1
Type git config --global core.excludesfile to find out the location of your global gitignore.
ex2
Go to github.com/github/gitignore and find the global gitignore file that is suitable for your operating system.
Viewing History
git log
git log gives you all kinds of different views onto the history in the repository.
git log -<number>
: how many commits you want to see, e.g. git log -3
git log --oneline
: a shorter summary of the log. It shows you just the short hash and the first line of each commit message.
git log --decorate
: adds branch names if you don't already have them. (This can also show history that includes tag information.)
git log --graph
: gives you an ASCII representation of the branches within the graph.
git log --oneline --graph --decorate --all
: shows an ASCII representation of the entire repository, including branches, remote branches and merges.
git log -p
: shows the diffs between consecutive commits: a summary for each commit followed by the familiar diff representation with lines added and lines removed.
short log command
this is really useful for generating change logs
git shortlog
: shows you the first line of each commit message, grouped by author.
git shortlog -4
: once again you can use -4 to limit it to just the most recent commits.
git shortlog origin/master..HEAD
: shows all of the commits that have occurred between origin/master and where we are now, i.e. HEAD. You can use any branch names, tag names or indeed hashes in these commit refs.
What about if you want to search for something? You can do that using yet again Git log.
You can filter by author, you can filter by commit message, you can filter by the content of the commit itself and you can look at the history of just one file.
Let's take a look at each of those in turn.
Using git log to search for something
git log --author="s103071049"
: to see a log of just the commits that a particular author wrote.
git log --grep="fix"
: to search for commits whose message contains the keyword "fix".
git log -- <file name>
: to see the log for just one particular file. The -- doesn't actually do anything here; all it means is that you're telling Git there are no more options, and the thing after it is the name of a file.
git log -- README.md
: shows a list of all of the commits that pertain to the README file.
git log -S"search term" -p
: to search the contents of the commit itself, i.e. the change set, use the -S argument.
git log -S"Fortran"
: finds the commits whose changes add or remove the term Fortran. This shows you the commits, but it's not easy to tell that they include that term, so rerun it with -p on the end.
### ex
First up is to find all commits that check off an item on the to-do list; you need to do that by searching the contents of the commit itself. Next up is to search for a commit message that includes the term streaming.
To search for all the changes that affect a particular path, that's where you need to use the --.
This time I want to search the contents of the commit for the term Windows 10. Without --all, the log will only search the ancestors of your current commit. Since this commit is on another branch, a remote branch, it's unable to find it.
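The three kinds of search can be sketched on a tiny repository with two commits. The commit messages and file contents are invented for the illustration.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "TODO" > README.md
git add README.md
git commit -q -m "Add README"
echo "Learn Fortran" >> README.md
git commit -q -am "fix: extend reading list"

# Search by commit message.
by_message="$(git log --format=%s --grep="fix")"

# Search by the content of the change itself.
by_content="$(git log --format=%s -S"Fortran")"

# History of a single path; count the matching commits.
path_count="$(git log --format=%h -- README.md | wc -l | tr -d ' ')"

echo "$by_message | $by_content | $path_count"
```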
Branching
What actually is a commit in Git?
First up, a commit represents a particular state of the tree, the tree being the directory itself. What do those files look like? What's the contents of them?
Another aspect of a commit is some metadata that goes along with it.
Finally, a reference to its parent. A commit usually has either one or two parents (the very first commit has none). So as part of that commit, you wrap in a reference to its parent.
All of this stuff gets wrapped up inside the Git repository and given a unique reference in the form of a hash using SHA1.
This hash allows you to refer to a particular commit within your repository.
Every single one is unique, so you can easily identify a particular commit using its hash.
A branch is just a label associated with a particular commit.
A branch, in fact, is implemented as a file containing a SHA1 hash. Then as you create more commits, that label gets moved forward, updating as you create new commits on the branch.
By default, when you create a Git repository, Git creates a branch for you and calls it the master branch.
git branch myBranch
: create a branch. All this does is create a new file in .git/refs/heads called myBranch that contains the SHA-1 hash of the current commit.
git log --oneline -1
: you can confirm that by taking a look here: it's c3788, and that is indeed the same hash.
git branch
: see the list of all of the branches
git checkout myBranch
: switch to that branch
git branch -d myBranch
: delete that branch that you were working on
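To see that a branch really is just a file holding a hash, compare the file with what git rev-parse reports. This sketch assumes Git's default on-disk ref storage, where a newly created branch is a plain file under .git/refs/heads.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "# Demo" > README.md
git add README.md
git commit -q -m "Add README"

# Creating a branch writes the current commit hash into a new file.
git branch myBranch
head_hash="$(git rev-parse HEAD)"
branch_file="$(cat .git/refs/heads/myBranch)"

echo "$head_hash"
echo "$branch_file"
```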
Now the repo that you're using is a clone of a remote repo that, way back when, you forked. Because of the special relationship between the local repo and the remote repo, your local repo understands that there are branches in the remote repo.
If you want to start working on a remote branch, you first have to check out a new local branch that is set to track the remote branch.
What this means is when you create commits on your local branch, Git knows that it should push it up to a particular remote branch.
And similarly, if new work appears on the remote branch it knows which of your local branches it should be putting those commits on the end of.
git branch --all
: to see a list of all the branches, including remote ones
git checkout --track origin/clickbait
: to check out a new local branch that will track that remote branch.
git status will show you that you're on branch clickbait and you're up to date with origin/clickbait. Since you're tracking it, it knows relative to the remote branch whether you're ahead or behind.
And git log --oneline --graph will show you where you are on the graph, confirming that you are indeed at the same point as the remote clickbait branch.
Now actually there is a shortcut to do this checkout. Had I used git checkout clickbait right at the beginning, Git would have automatically checked out a new local branch that was tracking the origin/clickbait branch.
ex - delete a branch whose deletion would lose a commit
You saw earlier how to delete a branch, but Git's clever: it won't let you delete a branch if doing so would lose a commit. Losing a commit just means you no longer have a reference that will allow you to access that commit, i.e. a branch that references it.
Number one, you need to go ahead and checkout master.
Number two, create a new branch called myBranch and check it out
Number three, make a change to the readme file and commit it
Number four, switch back to master and try to delete myBranch.
solution
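Use git checkout -b myBranch. That -b is shorthand for creating a new branch and then immediately checking it out.
git branch -D myBranch
: git branch -d refuses because the commit on myBranch hasn't been merged; the capital -D forces the deletion.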
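The four steps of the exercise, sketched in a throwaway repository. The file contents and identity values are placeholders.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # use master whatever the local default is
git config user.name "Example User"
git config user.email "user@example.com"
echo "# Demo" > README.md
git add README.md
git commit -q -m "Add README"

# Create and check out myBranch, then commit a change on it.
git checkout -q -b myBranch
echo "change" >> README.md
git commit -q -am "Update README on myBranch"

# Back on master, the safe delete is refused: the commit is unmerged.
git checkout -q master
if git branch -d myBranch 2>/dev/null; then
  safe_delete="deleted"
else
  safe_delete="refused"
fi

# Force the deletion; the commit is now unreachable from any branch.
git branch -D myBranch
remaining="$(git branch --format='%(refname:short)')"
echo "$safe_delete, remaining: $remaining"
```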
Merging
why would you use branching and what would you use it for?
What are the most popular conventions associated with branching?
In fact, what we do here at Razeware is to treat the master branch as what is deployed in production.
- red branch is master:the master branch is tracking origin, i.e. it's tracking the remote branch. This is what is in production. All it consists of is merge commits. Those merges come from a branch called development.
- development branch : difficult to track. You can follow it back along the purple line briefly, and then it continues down the pink line. Our development branch is what is deployed to the staging system, which in our world, the web, is the staging server.
When somebody wants to come along and create a new feature we branch off development, make some commits to create the feature and then at the end we merge that branch back to development.
Because the commit graph is well defined in Git, Git can use a three-way merge, which makes merging a lot less error prone.
At the point I want to merge the yellow branch back into the blue branch Git will perform a three way merge.
It looks at three commits. It looks at the two commits right at the end and it looks at the common ancestor i.e. the point at which one branch left the other one.
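The common ancestor that the three-way merge relies on can be asked for directly with git merge-base. A sketch with an invented branch name (feature) and empty commits to keep it short:

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # use master whatever the local default is
git config user.name "Example User"
git config user.email "user@example.com"

# The commit both branches will share.
git commit -q --allow-empty -m "Base commit"
base="$(git rev-parse HEAD)"

# One commit on a feature branch, one more on master.
git checkout -q -b feature
git commit -q --allow-empty -m "Work on feature"
git checkout -q master
git commit -q --allow-empty -m "Work on master"

# The point at which one branch left the other.
ancestor="$(git merge-base master feature)"
echo "$ancestor"
```

The two branch tips plus this ancestor are the three commits the merge looks at.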
ex 1
First switch to the clickbait branch by typing git checkout clickbait.
Take a look at the commits that you want to merge into master by using git log. Then switch back to the master branch using git checkout master.
You use the merge command to perform the merge with git merge clickbait. You specify the branch that you want to merge into the branch that you are currently on.
Take a look at the log with git log --oneline --graph --decorate. You'll see the graph now merges that clickbait branch into the master branch, right at the top, with the merge commit that you just created.
You can also see that the file that didn't used to exist on the master branch, the clickbait ideas file in the articles directory, now exists by running cat on it.
Finally, delete the branch using git branch -d clickbait
fast forward merge
You saw before that you've got two branches and you merge them together it creates a new commit. Now that's not always necessary.
The important part here is the master has not had any new commits on it.
I want to merge mega feature back into master. And what I can do, instead of creating a new commit, is actually to take the master label and just move it to the end of the mega feature branch.
That's the equivalent of merging those two branches together. And when you do that, that is called a fast forward merge.
All you're doing is moving where that label goes but it doesn't leave you with a merge commit.
ex
To demonstrate the fast forward merge, you're first going to create a new branch off master. So make sure you're on master.
Then run git checkout -b readme_improvements to create and check out a new readme_improvements branch.
Here you can see that master is the second line down and readme_improvements sits directly above it.
git merge readme_improvements
If you now run git merge readme_improvements, you'll notice it says fast forward, because it was able to do a fast forward merge. Jumping back into the log, you'll see that all it did was move master from the second line up to the first line, to the same point as readme_improvements.
You didn't create a new merge commit. That's a fast forward merge.
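A sketch of the fast forward case in a throwaway repository, using the same branch name as above. The file contents are placeholders; the check at the end confirms that master and the branch point at the very same commit afterwards.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # use master whatever the local default is
git config user.name "Example User"
git config user.email "user@example.com"
echo "# Demo" > README.md
git add README.md
git commit -q -m "Add README"

# Commit on a new branch while master stays where it is.
git checkout -q -b readme_improvements
echo "More detail" >> README.md
git commit -q -am "Improve README"

# Merging back only moves the master label forward.
git checkout -q master
git merge readme_improvements

master_tip="$(git rev-parse master)"
branch_tip="$(git rev-parse readme_improvements)"
parent_count="$(git log -1 --format=%P | wc -w | tr -d ' ')"
echo "parents of the tip commit: $parent_count"
```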
ex
I want you to create something that could be fast forwarded, but choose not to fast forward it.
HINT:
- Number one, jump back to the readme_improvements branch. It's already there; you've not deleted it. You're perfectly welcome to carry on working on it.
- Number two, add a signature at the end of the readme file.
- Number three, commit the change.
- Number four, merge that into master but don't fast forward it. Then confirm that you've done that by taking a look at the log.
- Once you've done that, delete the branch at the end.
SOL :
Run the merge again, this time adding --no-ff to tell it that you don't want a fast forward merge. This will jump you straight into a text editor so that you can provide the commit message.
You'll see a new merge commit has indeed been created even though it wasn't strictly necessary.
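The same situation with --no-ff, as a sketch. One assumption of mine: the merge message is passed with -m so the script doesn't stop in a text editor, which differs from the interactive flow described above.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # use master whatever the local default is
git config user.name "Example User"
git config user.email "user@example.com"
echo "# Demo" > README.md
git add README.md
git commit -q -m "Add README"

# A branch that could be fast forwarded into master.
git checkout -q -b readme_improvements
echo "Signed, Example User" >> README.md
git commit -q -am "Add signature"

# Force a merge commit; -m avoids opening the editor.
git checkout -q master
git merge -q --no-ff -m "Merge readme_improvements" readme_improvements

# A merge commit has two parents.
parent_count="$(git log -1 --format=%P | wc -w | tr -d ' ')"
git log --oneline --graph --decorate
git branch -d readme_improvements
```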
Syncing with a Remote
In fact, because Git keeps the entire history of the repo in its own .git directory, there is no requirement for you to use a remote or server.
However, if you're working with somebody else it's highly recommended that you use a remote rather than sending around patches or repositories.
Even if you're not working with somebody else I would always use a remote for data integrity reasons. If my hard drive blows up I've always got that remote copy on GitHub or elsewhere I can then pull down on my new computer.
There are two fundamental processes to working with remotes, helpfully labeled pushing and pulling.
You push your changes to the remote repository and you pull changes from the remote to your local copy.
ex 1 - pushing
If you've cloned your repo, then pushing will probably just work.
Pushing doesn't always just work, though, because you can have more than one remote within your repo.
ex2 - push not work
before I get a chance to push my commit back to the remote, somebody else comes along and creates a new commit, the red one.
That new commit on the end there means that I can't push my commit, where would it go?
My local branch tracks the remote branch. But where would this purple commit go? I mean it could go here, so I'd put an arrow on the end there and put my new purple commit in there.
But then out of the end of this blue one I've now got a purple one and a red one, well that doesn't really make any sense. I don't want to create a new branch. My branch should continue.
How do you deal with this situation?
You fetch that red commit from the remote repo into your local repo.
Now you've got your master branch, and down here is the remote origin/master branch.
You need to update your master branch so that it goes off the end of the origin/master branch.
To do that, you can do a merge. You merge origin/master into your master, creating a new commit which you can then push back up to the remote repo.
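The rejected push and its fix can be reproduced locally. This is a sketch under several assumptions of mine: a bare repository (origin.git) plays the remote, a second clone (theirs) plays the other person, and the merge message is passed with -m to avoid the editor.

```shell
work="$(mktemp -d)"
git init -q --bare "$work/origin.git"
git --git-dir="$work/origin.git" symbolic-ref HEAD refs/heads/master

# My repository: first commit, pushed to the remote.
git init -q "$work/mine"
cd "$work/mine"
git symbolic-ref HEAD refs/heads/master
git config user.name "Me"
git config user.email "me@example.com"
echo one > notes.txt
git add notes.txt
git commit -q -m "First commit"
git remote add origin "$work/origin.git"
git push -q origin master

# Somebody else clones, commits and pushes before I do.
git clone -q "$work/origin.git" "$work/theirs"
cd "$work/theirs"
git config user.name "Them"
git config user.email "them@example.com"
echo theirs > theirs.txt
git add theirs.txt
git commit -q -m "Their commit"
git push -q origin master

# My new commit can't be pushed: the remote has moved on.
cd "$work/mine"
echo mine > mine.txt
git add mine.txt
git commit -q -m "My commit"
if git push -q origin master 2>/dev/null; then
  first_push="accepted"
else
  first_push="rejected"
fi

# Fetch their commit, merge origin/master into master, push again.
git fetch -q origin
git merge -q -m "Merge origin/master" origin/master
git push -q origin master
remote_tip="$(git --git-dir="$work/origin.git" rev-parse master)"
local_tip="$(git rev-parse master)"
echo "$first_push"
```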
ex1
pushing the changes that you've made on your master branch back up to the origin.
git remote -v will list all the remotes that are associated with the current repo
git branch -vv will list the branches and the remote branches that they're tracking, so here you can see master is tracking origin/master.
git branch -vv --all will show you all branches, including remote branches.
Type git log --oneline to see a list of all of the commits. You can see how far behind origin/master is from master.
To push your changes up to the origin, up to the GitHub repo, type git push origin master. It might well ask you for your user name and password, or it may well have remembered them.
pulling
Pulling is actually a combination of fetching and merging. If your branch tracks the remote branch, then pulling will do all of that automatically for you.
But there are situations, as you'll see right now, where that won't work.
You can have more than one remote associated with your repo. By default, and quite often, you'll only have one, and that will be origin.
However, if multiple people are working on the same thing and you want to grab somebody's commits before they get pushed to origin, you can add their fork as another remote.
Each time you add a new remote, it just adds extra remote branches to your existing repo.
ex 1
In this example you can see how to add an extra remote to your local repo and then merge in extra changes that came from that other remote.
Add a new remote to my local repo with git remote add, then the name that I'm going to refer to this remote as, which is iwantmyrealname, and then the URL for that repo, which is https://github.com/iwantmyrealname/rwTODOs.
Now if I do git remote -v you'll see I've got new remotes in there, which is excellent.
However, if I run git log --oneline --decorate --all --graph you'll see that there's no change; I can still only see my local and my origin. That's because, while I've added the remote, I've not requested to fetch any of its commits yet. You do that with git fetch iwantmyrealname, which is the name of the remote again.
That's gone and pulled down two new branches, and if I run that log command again I can see them: there's iwantmyrealname/master about halfway down the screen, and towards the bottom is iwantmyrealname/clickbait. You can see that it is the same as origin/clickbait but with an extra couple of commits on the end.
I want to merge those commits into my own local repository.
git remote show gives you some details about a particular remote. origin and iwantmyrealname are the two remotes that I've got, and you can see some details there.
Now I'm gonna try and check out my local clickbait branch.
Ah, I've deleted it, so I need to recreate it with git checkout --track origin/clickbait. That's now created the new local clickbait branch that tracks the origin/clickbait branch.
Now I want to merge those commits from iwantmyrealname's clickbait branch into my local clickbait branch, with git merge iwantmyrealname/clickbait.
Checking the log again, you'll see that this was a fast forward merge that moved my local clickbait branch on a couple of commits, to the tip of the iwantmyrealname/clickbait branch.
Now I'm going to merge that into master. I switch back to master with git checkout master, and then do git merge clickbait, which opens a text editor so that I can provide the merge commit message.
You'll see that my local master branch now has the local clickbait branch merged into it which, because of the merge we did before, includes those commits from iwantmyrealname's clickbait branch.
Now you may have noticed when you were looking through the logs that there are actually some additional commits on the master branch on the iwantmyrealname remote. In addition to the ones that you've already merged in on the clickbait branch.
ex 2
You've now got two divergent master branches, and your challenge is to merge the commits from the iwantmyrealname remote into your master branch and then push that back up to your origin.
SOL
First I want to make sure I've got the most up to date commits from the iwantmyrealname remote with git fetch iwantmyrealname.
I'm already on master, and I can directly merge a remote branch into my local branch with git merge iwantmyrealname/master.
Taking a look at the log, you can see that the local master branch has moved forward a commit because it's merged in that remote branch from iwantmyrealname.
I can go ahead and push that up to my origin with git push.
Pull Requests
Fundamentally, a pull request forms some kind of review process around a merge, and that's it. It's just a merge with a load of other stuff bundled on top of it.
You can merge between branches, and you can also merge between one fork and another.
If you want to submit your own patches to open-source software hosted on GitHub, you'll find that the workflow involves you forking the repository, making your changes, committing them to your own fork, and then creating a pull request from your fork back to the original repo. This pull request is just a merge of your code back into the origin.
You can also use a pull request to merge from one branch to another within the same repo.
A pull request forms a forum for discussing changes, for adding continuous integration, testing, and code review.
Although they were invented by GitHub, they are available on other code hosting solutions
ex 1
Ensure that you're on master with git checkout master. Then create a new branch and check it out with git checkout -b status_update.
You're gonna create a couple of commits on this branch.
Push this branch up to origin, up to the remote, with git push --set-upstream origin status_update. That tells Git you're going to use the origin remote and create a new branch on it called status_update.
You can confirm that with git branch --all -vv, and you'll see that you're on the status_update branch and that it is tracking origin/status_update.
Because you just pushed up a new branch, it thinks you might want a pull request, but more usually, you might head off into the pull requests tab and click create pull request.
In here, you get to choose what's the base branch and what's the branch you're merging into it.
Note that just because you've created a pull request doesn't mean you can't continue to add work to it. You just keep working on your branch locally and pushing those changes up, and the pull request will get updated appropriately.
Head back to the terminal and check out master. I want to make sure that I've got that new merge commit on my local repo.
First of all, I need to do fetch to make sure that I've got all of the latest information from the remotes.
Then if I do git status, you'll see that my branch is behind origin/master by three commits and can be fast forwarded.
To see exactly what that means, I'll use the standard log command, git log --oneline --graph --decorate --all, and you'll see that master is a little bit behind origin/master.
If I do git pull, that will fast forward my local master branch to origin/master, which you can confirm by checking the log again.
I can now delete my local version of the status update branch with git branch -d status_update.
However, if I check the log again, you'll see that the remote version of status_update still exists. That's because I've not told the local repo it should check whether any branches have been deleted on the remote.
You can do that with git remote prune, and you'll see that that branch has now disappeared.
git remote prune origin
ex 2
in addition to pull requests between branches in the same repo, you can also form a pull request from a fork into the original.
Create the pull request, add a nice description
learning from raywenderlich