First Notes from a UNIX, Git and R Workshop

Though I had expressed in my first notebook, in late 2018 at nerve.fancy.chained, that I wanted to "learn terminal" (I obviously didn't know what "terminal" was...), and though I had used ample Excel over the summer of 2018--2019 working in Sue Schenk's lab, I didn't actually do any "proper" programming until relatively recently.

I only really went to the workshop for R, but I am so glad I did. I guess it was the push-start I needed. After this, I talked to D. about Git and we ended up starting to make a Minecraft mod together. The rest is history, I suppose. I started using Julia in late September, 2019, after using Java with D. for a month or so. Those were great times. It was getting warmer, after August, and D. and I were still at university together. We would come home and work on the mod, sometimes with a drink of Coke, and stay doing so till it was dark and we were hungry. During that time, I did a lot of programming in Bash. I created my scripts repository, which hosted all of these very buggy programmes written in Bash. It was in this playground where I also learned a little about other languages: namely Perl, Ruby, Rust, and Python.

Transcribed here are my initial, very shorthand notes from a workshop I went to at V.U.W. on UNIX, Git and R.


UNIX, Git, and R workshop: Day 1

Bash (Bourne Again Shell)

Bash is programmable!

Relative vs. absolute paths

Note: spaces in path names need to be "escaped" using a backslash:

cd Victoria\ University/
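Quoting the path is an alternative to the backslash. A minimal sketch, assuming a throwaway directory made with mktemp (workdir and the other variable names are only for illustration):

```shell
# Work in a throwaway directory so nothing real is touched (illustrative only).
workdir=$(mktemp -d)
mkdir "$workdir/Victoria University"

# Escape the space with a backslash...
escaped_dir=$(cd "$workdir"/Victoria\ University/ && basename "$PWD")

# ...or wrap the whole path in quotes; both name the same directory.
quoted_dir=$(cd "$workdir/Victoria University/" && basename "$PWD")

echo "$escaped_dir"
echo "$quoted_dir"
```

Without the backslash or the quotes, the shell would split the name at the space and hand cd two separate arguments.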

E.g.

  • sort -n lengths.txt | head -n 1: sorts lengths.txt numerically and then prints the first line of the sorted result
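To see the pipe at work, here is a small sketch with made-up contents for lengths.txt, written to a throwaway directory (the numbers and the variable names are assumptions for illustration):

```shell
# Hypothetical contents for lengths.txt, kept in a throwaway directory.
workdir=$(mktemp -d)
printf '%s\n' 30 9 107 21 > "$workdir/lengths.txt"

# sort -n orders the lines numerically; head -n 1 keeps only the first of them.
shortest=$(sort -n "$workdir/lengths.txt" | head -n 1)
echo "$shortest"
```

Without -n, sort compares the lines as text, so 107 would come out first instead of 9.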

Note: After day 1 of the tutorial, I actually went home and changed my bash prompt. I put the following code in my .bashrc:

# get current branch in git repo
function parse_git_branch() {
	BRANCH=$(git branch 2> /dev/null | sed -e '/^[^*]/d' -e 's/* \(.*\)/\1/')
	if [ ! "${BRANCH}" == "" ]
	then
		STAT=$(parse_git_dirty)
		echo "[${BRANCH}${STAT}]"
	else
		echo ""
	fi
}

# get current status of git repo
function parse_git_dirty {
	status=$(git status 2>&1 | tee)
	dirty=$(echo -n "${status}" 2> /dev/null | grep "modified:" &> /dev/null; echo "$?")
	untracked=$(echo -n "${status}" 2> /dev/null | grep "Untracked files" &> /dev/null; echo "$?")
	ahead=$(echo -n "${status}" 2> /dev/null | grep "Your branch is ahead of" &> /dev/null; echo "$?")
	newfile=$(echo -n "${status}" 2> /dev/null | grep "new file:" &> /dev/null; echo "$?")
	renamed=$(echo -n "${status}" 2> /dev/null | grep "renamed:" &> /dev/null; echo "$?")
	deleted=$(echo -n "${status}" 2> /dev/null | grep "deleted:" &> /dev/null; echo "$?")
	bits=''
	if [ "${renamed}" == "0" ]; then
		bits=">${bits}"
	fi
	if [ "${ahead}" == "0" ]; then
		bits="*${bits}"
	fi
	if [ "${newfile}" == "0" ]; then
		bits="+${bits}"
	fi
	if [ "${untracked}" == "0" ]; then
		bits="?${bits}"
	fi
	if [ "${deleted}" == "0" ]; then
		bits="x${bits}"
	fi
	if [ "${dirty}" == "0" ]; then
		bits="!${bits}"
	fi
	if [ ! "${bits}" == "" ]; then
		echo " ${bits}"
	else
		echo ""
	fi
}

# make prompt pretty
PS1="\n\[\033[0;31m\]\342\224\214\342\224\200\$()[\[\033[1;38;5;2m\]\u\[\033[0;1m\]@\[\033[1;33m\]\h: \[\033[1;34m\]\W\[\033[1;33m\]\[\033[0;31m\]]\[\033[0;32m\] \[\033[1;33m\]\`parse_git_branch\`\[\033[0;31m\]\n\[\033[0;31m\]\342\224\224\342\224\200\342\224\200\342\225\274 \[\033[0;1m\]\$\[\033[0;38m\] "
export PS1

Also in this time I decided on a colour scheme for my terminal. Choose something nice to look at that makes you comfortable.

Loops

cd ~/Desktop/Git_Unix_R_Workshop/Day_1/data-shell/creatures
for filename in *.dat
do
    head -n 2 "$filename" | tail -n 1
done
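The loop prints the second line of every .dat file: head -n 2 keeps the first two lines, and tail -n 1 keeps the last of those. A self-contained sketch, with made-up file names and contents standing in for the workshop data:

```shell
# Made-up .dat files in a throwaway directory (not the workshop data).
workdir=$(mktemp -d)
printf 'line 1 of a\nline 2 of a\nline 3 of a\n' > "$workdir/a.dat"
printf 'line 1 of b\nline 2 of b\nline 3 of b\n' > "$workdir/b.dat"

# Collect the second line of each file; the quotes protect names with spaces.
second_lines=$(
    cd "$workdir" &&
    for filename in *.dat
    do
        head -n 2 "$filename" | tail -n 1
    done
)
echo "$second_lines"
```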

"Don't name your variables cheesecake; you'll have no idea what they're doing when you come back to it in three weeks' time."

Note that the following code excerpts are NOT equivalent:

for datafile in *.pdb; do ls *.pdb; done
for datafile in *.pdb; do ls "$datafile"; done
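The difference shows up as soon as more than one file matches: the first loop never uses its variable and lists every matching file on each pass, while the second lists one file per pass. A sketch with two made-up .pdb files:

```shell
# Two empty, made-up files in a throwaway directory.
workdir=$(mktemp -d)
touch "$workdir/first.pdb" "$workdir/second.pdb"

# First loop: ignores $datafile and lists every .pdb file on each pass.
every_pass=$(cd "$workdir" && for datafile in *.pdb; do ls *.pdb; done)

# Second loop: lists only the file held in $datafile on each pass.
one_per_pass=$(cd "$workdir" && for datafile in *.pdb; do ls "$datafile"; done)

echo "$every_pass" | wc -l     # four lines: two files, listed twice
echo "$one_per_pass" | wc -l   # two lines: each file listed once
```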

git

Version control system: tracks changes to files over time

Git has become popular for version control, and it is scalable! Originally developed by the Linux kernel guy.

Note: since this workshop, I have learned that the Linux guy (whose name is Linus) has two tools named after him: the Linux operating system (similar-sounding to his name), and git, because he is one (self-proclaimed; I am not insulting him)!

(Android = Linux kernel! So yes, you have heard of Linux).

A kernel is the part of the operating system that talks to the device

We first need to configure our git environment, for our terminal to know our git credentials. We run

git config --global user.name "<name>"

If collaborating, be careful of differing operating systems; line endings can cause merge issues.

Now we want to say

git init

to initialise the repository.

Running ls -la will give us a long listing and include hidden files within the directory (files beginning with .). Some files need to be there but aren't useful to us (only to the computer), so they often remain hidden.

We can run

git status

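to see the current state of the repository: which files are modified, staged, or untracked (git log is what lists the commits).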

Note: You can actually create a git repo anywhere you have write access.

rm -rf .git

Will remove any trace of the git repository, recursively and forcefully.

Tracking changes

If you are using the terminal-based text editor nano, Ctrl + O = write Out = save.

git add <filename>

The previous command starts tracking the file. It tells git that you are "staging" that file. The staging area is where you tell git which changes to include in the next commit.

So what we do:

  1. We write some changes to our code
  2. git log
  3. git diff
  4. git commit -m "commit message (something helpful to read later)"
  5. git log
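The steps above can be sketched end to end in a throwaway repository. The file name, commit message, and user details are placeholders. The git add step is an assumption: it is not in the list, but a new file has to be staged before git commit will record it.

```shell
# A throwaway repository, so no real project is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Placeholder identity, set for this repository only (no --global).
git config user.name "Example Name"
git config user.email "name@example.com"

# 1. Write some changes (a made-up file).
echo "first draft" > notes.txt

# Stage the file so that git starts tracking it (assumed step, see above).
git add notes.txt

# Review what is staged, then commit it with a helpful message.
git diff --staged
git commit -q -m "Add first draft of notes"

# Each commit is identified by a hash; --oneline shows it in shortened form.
git log --oneline
commit_count=$(git rev-list --count HEAD)
```

The short hash printed by git log --oneline is what you would pass to commands such as git show to look at one particular commit.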

But what if you have lots of these changes? This is where hash codes are useful: every commit has one, and it can be used to refer to that exact commit.

Recall that touch creates an empty file. nano .gitignore creates a hidden file (hidden because its name begins with .).
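A sketch of what .gitignore is for, assuming a throwaway repository and a made-up rule that ignores every .log file:

```shell
# A throwaway repository with two empty, made-up files.
repo=$(mktemp -d)
cd "$repo"
git init -q
touch results.log notes.txt      # touch creates empty files

# Hypothetical rule: ignore every file ending in .log
printf '*.log\n' > .gitignore

ls       # a plain listing does not show .gitignore...
ls -a    # ...but -a includes hidden files

# git status --porcelain lists untracked files; the ignored one is absent.
untracked=$(git status --porcelain)
echo "$untracked"
```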

An important note: git ≠ GitHub.

These are some notes on changing my bash prompt, and other things, after day 2... To change your prompt:

sudo nano /etc/bashrc # or use your favourite text editor

This is actually wrong, in hindsight. You change the $HOME/.bashrc file... Then type

export PS1="<desired prompt>"

To list colour codes in their respective colours, I ran this loop:

for colour in {1..255} # this is a sequence of integers from 1 to 255 inclusive
    do echo -en "\033[38;5;${colour}m38;5;${colour}\n"
done | column -x

Let me attempt to explain this. echo prints its argument. The -n option for echo tells the command not to print the trailing newline character. The -e option tells echo to interpret backslash escapes within the argument. In our case, the escape is \033 (the ESC character), which in turn tells the terminal that whatever follows, between [ and m, is a formatting code rather than text to print. In our case, we get that

echo -en "\033[<formatting code>m<text (*)>"

The part between [ and m is our text formatting code, which tells whatever follows after m and before the closing " what colour to be. Finally, the text marked (*) writes out the colour code itself.

We have this embedded in a loop for all numbers in 1–255.

The \n at the end of echo creates a new line.

Note: I now realise that the -n is redundant when we are adding \n anyway...
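A tidier variant, as a sketch: printf interprets \033 itself, so neither -e nor -n is needed, and a trailing \033[0m resets the formatting so the colour does not leak into later output. The function name colour_line is made up for this example.

```shell
# Print one 256-colour foreground code in its own colour, then reset formatting.
# $1 is the colour number.
colour_line() {
    printf '\033[38;5;%dm38;5;%d\033[0m\n' "$1" "$1"
}

# Show a single code as a quick check.
colour_line 2
```

To rebuild the full table, call colour_line "$colour" inside the original for loop in place of the echo.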

I also discovered the command

tput

for colours. I'm not sure how this works with bold, but I have the following:

for colour in {1..255}
    do echo -en "$(tput setaf ${colour})\$(tput setaf ${colour})\n"
done | column -x
echo
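On the question of bold: tput has a bold capability, and an sgr0 capability that resets all attributes, so the two can be combined with setaf. A sketch, assuming a terminal whose terminfo entry supports these; the function name bold_colour is made up, and errors are silenced so that it falls back to plain text elsewhere.

```shell
# Print text in bold in the given colour, then reset all attributes.
# $1 = colour number, $2 = text to print
bold_colour() {
    printf '%s%s%s%s\n' \
        "$(tput bold 2>/dev/null)" \
        "$(tput setaf "$1" 2>/dev/null)" \
        "$2" \
        "$(tput sgr0 2>/dev/null)"
}

bold_colour 2 "bold and coloured, if the terminal supports it"
```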

Git continued (with Wes Harrell, now)

GitHub allows you to share your changes with other people.

Note: these are some notes I made about the night before, and changing my prompt

I had a bit of trouble last night when trying to change .bashrc to include PS1 by

nano .bashrc

and restarting the terminal; it would only update on typing

exec bash

So then I went into

/etc/bashrc
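One likely explanation, assuming a terminal that starts login shells (as macOS Terminal does by default): a Bash login shell reads ~/.bash_profile rather than ~/.bashrc, so changes to ~/.bashrc only appear once something sources it. A common fix is to source ~/.bashrc from ~/.bash_profile, rather than editing /etc/bashrc. The sketch below tries this in a throwaway home directory; demo_home and the demo prompt are made up.

```shell
# A throwaway home directory, so the real dotfiles are not touched.
demo_home=$(mktemp -d)

# A stand-in ~/.bashrc that sets the prompt.
printf 'export PS1="demo> "\n' > "$demo_home/.bashrc"

# Lines for ~/.bash_profile: if ~/.bashrc exists, read it too.
cat > "$demo_home/.bash_profile" <<'EOF'
if [ -f "$HOME/.bashrc" ]; then
    . "$HOME/.bashrc"
fi
EOF

# Imitate what a login shell does at start-up: read ~/.bash_profile.
login_prompt=$(HOME="$demo_home"; . "$HOME/.bash_profile"; printf '%s' "$PS1")
echo "$login_prompt"
```

In a real setup only the three lines between the EOF markers would go into ~/.bash_profile.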
