Setup git
on Anvil
Many times you’ll use an application like GitHub Desktop to interact with GitHub.
However, occasionally you’ll need to work with GitHub through the terminal. Anvil is one of those cases.
The first time steps you need to follow in order to get your Github account connected to the repo include:
-
Make a Github account using the same email as you used for Access ("Making an Account on Github")
-
Setup your SSH keys and connect them to your Github account ("Setting up SSH Keys")
-
Cloning your teams repository ("Cloning the Repository")
You only need to do the steps above once, then you can then go on to learn how to use the git
CLI to make changes with Anvil. There are additional steps below to do advanced branching, if you are interested.
0. Working with git
on Anvil
To open a terminal session, start by launching Jupyter Lab from ondemand.anvil.rcac.purdue.edu.
After you’ve connected to Jupyter Lab, click the blue plus sign in the upper-left corner. This will list the different launchers that are available to you.
Scroll to the bottom of the "Launcher" page, and under the "Other" category you should see the option for a "Terminal".
If this is your first time working with a terminal, some helpful commands are included below:
-
ls
- this will list the contents of the directory that you are currently in. -
cd
- this will change locations to a new directory based on what is passed to it. -
mkdir
- this will make a new directory that can store files.
We’ll use these commands to setup our GitHub repository.
1. Making an Account on Github
If you don’t already have a Github account that uses the same .edu email you used to sign up for Access, you should make one now.
2. Setting up ssh
Keys
In order for Github to authorize us to clone the repository to Anvil, we’ll need to setup ssh
("Secure SHell") keys.
ssh
uses public key cryptography to securely send messages. Two keys are generated; one public, one private. The public key is shared with others in order for messages to be encrypted (so that only you can read the message). The private key is private (you should not share it with others) and is used to decrypt messages that were signed with your public key.
To begin, open a terminal (iTerm on Mac, CMD Prompt on Windows, etc) on your local computer, and enter the following:
#mac or linux
ssh-keygen -a 100 -t ed25519 -f ~/.ssh/id_ed25519 -C "My Anvil Key"
#windows
ssh-keygen -t ed25519
ssh
requires specific security permissions in order to operate; run these commands to update the permissions to the ssh
keys we created (Mac/Linux only):
chmod 700 ~/.ssh
chmod 644 ~/.ssh/authorized_keys
chmod 644 ~/.ssh/known_hosts
chmod 644 ~/.ssh/config
chmod 600 ~/.ssh/id_ed25519
chmod 644 ~/.ssh/id_ed25519.pub
We want to share our public key with Anvil, so we use cat
to output the contents of the .pub file we just made (.pub as in public; the private key has the same name in this folder but with no .pub extension).
#mac or linux
cat ~/.ssh/id_ed25519.pub
#windows
type C:\Users\username\.ssh\id_ed25519.pub
Now go to Anvil and login like usual: ondemand.anvil.rcac.purdue.edu/pun/sys/dashboard
Click the "Clusters" drop down menu, and select Anvil Shell Access.
Run these commands in the shell, copying the contents of your .pub file we got a few steps ago into the authorized_keys file. We also need to set the security permissions as well:
mkdir ~/.ssh
vim ~/.ssh/authorized_keys
#paste the contents of your keys to a newline, and press
#Ctrl+c, and type ":wq" to save and quit.
chmod 700 ~/.ssh
chmod 644 ~/.ssh/authorized_keys
Head to the settings page for Github Keys.
Make sure you are logged in to the Github account that has the same .edu email address as the one you signed up with Access for. |
Here you can click the green New SSH Key button in the top right corner.
-
Call it anything you like, maybe My Anvil Key.
-
Make sure the key type is set to Authentication Key.
-
Once again, using the contents of the public key that we copied already to Anvil, copy the public key here in the Key field.
Our ssh
keys are now set up!
If you interested in learning how to use ssh locally or connect to Anvil via ssh , you can follow our more in depth instructions at the ssh page.
|
3. Cloning the Repository
Each team will be setup with a GitHub repository at the start of the project.
There is an example repository that I have setup for your reference.
On each repository, you’ll see a green <> Code
button in the center-right of the screen. If you click this button, you’ll see different options listed under Clone
.
We’ll use these options when we clone the repository in Anvil.
Now that we have all the pieces, we can start the steps below:
-
In your Anvil terminal that we launched above run the command
cd /anvil/projects/tdm/corporate
and then runls
.-
This will move you to the "/anvil/projects/tdm/corporate" directory and list all the contents.
-
-
Find the name of your team’s project and run
cd <project_name>/project/<access_id>
.-
Replace the
<project_name>
portion with the name of your team’s directory and the<access_id>
with your ACCESS ID.
-
-
Git is already installed on Anvil by default. We can see all of the commands available to use by running the
git --help
command. -
Now we are ready to clone the repository. First, navigate back to the website for your team’s GitHub repository and click the green
<> Code
button. Under theClone
options selectssh
and copy the text provided. -
Navigate back to the terminal on Anvil and run the command
git clone <copied_text>
. Replacing the<copied_text>
with thessh
link that you copied from the GitHub website.-
This will clone the repository into your directory on Anvil.
-
GitHub Structure
The Anvil directories are a shared environment. This is good, because it means that you can easily share code and data with others on your team.
However, this can be challenging when working with GitHub because of the way it handles different branches.
The Data Mine team provides some guidelines for how to structure your project environment below:
-
Have a single data folder that everyone works from.
Don’t copy your data into other folders. This can create confusion and be hard to track.
For example, if I have a set of data in
david_working
and there is another set inkevin_working
it can be hard to ensure that they are aligned. -
In the team’s directory on Anvil The Data Mine team will create a GitHub directory for each user.
-
The directories will be located in
Projects
and will be named for your ACCESS ID. -
This is where each student will work on their code and contribute to the main repository.
-
The directory will be setup for you, but you’ll setup the repository as part of the steps below.
-
-
The
main
branch of the GitHub repository should be updated once the code is working and ready for other members of the team to use.-
The Data Mine team should be able to check your repository and see branches for code in development as well as finalized code in the
Main
branch.
-
Working in GitHub
Now that we have cloned the repository, we are ready to create our first branch and push the change.
It should be noted that there are different techniques for working with GitHub. This is how The Data Mine plans to work for the projects, but it isn’t the only workflow.
-
In your directory run the
ls
command to see the name of the new repository that you just cloned. -
Run
cd <repo_name>
. Replacing the<repo_name>
with the name of your cloned repository. -
In your cloned repository run the command
git checkout -b "first_branch_<your_name>"
. This will create a new branch for you to work in.-
You should see a message that says something like
Switched to branch <branch_name>
. -
Branches are your own development area. When working on new code changes they should be in a branch. Once the code is finished up it can be merged into the
main
branch to become part of the core project.
-
-
Each repository should be created with a
README
file. These files are ongoing documentation for how to interact with the code in the repository. In this case we are going to make a change to theREADME
file and merge it into themain
branch. -
To add your name to the
README
file, follow the steps below:-
Run the command
vi README.md
. -
Hit
i
on your keyboard to enter insert mode. -
Using the arrow keys and enter, navigate to a new line and type your first and last name.
-
Once your name is typed, hit the
escape
key to exit interactive mode. -
Finally, type
:wq
and thenenter
to save the changes and exit.
-
-
Now that we have made a change in our branch, we can push it to make it public to others.
-
Run
git add .
in the terminal to stage all your changes. -
Run
git commit -m "Adding my name."
. -
Run
git push
to push the changes.If you get an error that looks like:
fatal: The current branch tdm_test has no upstream branch. To push the current branch and set the remote as upstream, use git push --set-upstream origin tdm_test
Copy the last line into your terminal and run it.
For example, in the error above I would copy
git push --set-upstream origin tdm_test
and run the command.
-
-
For the last set of steps, we can navigate back to the website for our GitHub repository.
-
On the website just under the name of the repository we can see a
branches
term with the number of branches listed next to it. -
If we click on
branches
, we can see the different branches that are active for the repository. Including one with the same name that we created above. We can also see a button on the right-side that saysNew Pull Request
. Click that button for the branch that you created. -
At the bottom of this screen, we can see the changes that we made in our files. We can also add comments regarding the code changes at the top of the request. Add a few comments about the code you changed and why you changed it and then click
Create Pull Request
.-
Many times, you’ll hear pull requests referred to as a
PR
. -
It’s good to add a bit of detail in your PR comments so that others can easily know what the PR contains.
-
-
This will bring up the final screen which is your PR. If everything looks good, you can click the
Merge pull request
button at the bottom of the screen.-
It’s a good idea to have other team members or your TA review your code changes.
-
You can you use the comments settings or the PR settings to add potential reviewers or notes.
-
Merging the pull request will make the code part of the
main
branch, which is the core of the code repository.Sometimes you will see that the branch has conflicts. This means that there is other code that has been added to the repository that is different from what you are adding.
It can be helpful to review the GitHub documentation on merge conflicts for help.
-
-
After your branch is merged into
main
it will automatically be included as part of the core files for the repository.-
In this case you should see your name appear at the top of the repository.
-
Using GitHub for the Project
You did it! You’ve now cloned a repo, made a branch, and merged your first change.
Now how do we use this going forward?
-
Create a branch for the things that you are working on.
-
Once your code is ready, review the changes with a teammate and then merge your changes into
main
. -
Other people can also work on the same branch if you are collaborating with a teammate.
-
The goal at the end of the year is to have all your code and documentation in the
main
branch of the repository.