
Hi all! Today I thought I’d write about something related to being a Teaching Assistant (TA). For anyone pursuing a PhD, TA-ing is a necessary but sometimes cumbersome task that comes with the job. For anyone who doesn’t know what pursuing a PhD is like, think of it as a kind of apprenticeship. We (PhD candidates) choose a professor as our mentor, work under them, and eventually make a meaningful contribution to our field, at which point we are awarded a doctorate.

While PhD candidates are measured on their contribution to the field, most candidates get a stipend during this time, which helps us stay afloat financially. The stipend comes either from working as a Research Assistant, where we are paid for helping with research, or from working as a Teaching Assistant, where we help with teaching and grading. As a Computer Science PhD candidate, most of the grading I’ve done involves coding assignments and projects. So, to make my life easier, I wrote some bash scripts to automate the process a bit, and I thought I’d share them.

To cut it short, if you are a TA who needs to grade programming assignments or projects, this should help you out. I have provided bash scripts to:

  1. Download code from github-classroom
  2. Check late submissions (or other repo information)
  3. Crawl through all repositories/folders and run code

Script 1: cloning GitHub classroom repositories

First, this script clones all the students’ repositories from GitHub Classroom. If you don’t use GitHub Classroom, this script won’t matter much. I have declared an array called names; fill it with the GitHub IDs of all the students. Each ID must be in "" and entries must be separated by a space. Also fill in the GitHub source link in the variable github. To run a bash script in a terminal, type:

#I am assuming this code will be stored in a file called gitclone.sh
$ chmod +x gitclone.sh
$ ./gitclone.sh

#!/bin/bash

declare -i count=1
# github IDs for all students
declare -a names=()
#the github source link, eg:- https://XXX@github.com/Repo-Name
github=""

echo " clone started"
for gitsource in "${names[@]}"
do
    echo "------------------------------------------"
    echo "($count) : $gitsource"
    count=$((count+1))
    git clone "$github-$gitsource.git"
done
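
As a rough illustration of how the two variables combine into a clone URL, here is a small sketch. The organization my-org, the assignment prefix lab06, the student IDs alice and bob, and the helper name clone_url are all made up for the example; substitute the values from your own classroom. It only prints the URLs, so it is safe to run before cloning anything.

```shell
#!/bin/bash
# Hypothetical values for illustration only; replace with your own.
github="https://github.com/my-org/lab06"
declare -a names=("alice" "bob")

# Build the URL the same way the clone loop does:
# source link, a dash, the student's GitHub ID, then ".git".
clone_url() {
    local prefix=$1 student=$2
    echo "$prefix-$student.git"
}

for student in "${names[@]}"
do
    echo "would clone: $(clone_url "$github" "$student")"
done
```

If the printed URLs match what GitHub shows for a student's repository, the same values should work in the clone script.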

Script 2: checking submission info

This script extracts useful submission information from each student’s repository. Again, you will have to fill in the arrays namesTag and names. Here namesTag holds each student’s GitHub ID (which is part of the folder name), and names holds the email ID associated with that student, in the same order. I keep both because a student’s GitHub ID is often very different from their real name, so having the two together helped a lot. For each student, the script reports:

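  1. The matching author names that have commits in the repo, with their commit counts
  2. The email address associated with them
  3. The hash of the final commit (stored in sshKey and labelled ssh in the output; it is a commit hash, not an SSH key)
  4. The time difference, in hours, between the first and final commit
  5. The time of the last commit made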

To run this script, use chmod +x scriptName.sh followed by ./scriptName.sh as in Script 1 above.

#!/bin/bash

declare -i count=1
# github user IDs, each entry in the list should be in ""
declare -a namesTag=()
# university email IDs (I am assuming each student will have
# some email ID they use associated with the github account)
declare -a names=()

# -- initializing variables
declare -i h=3600
declare -i difference=0
declare -i timeV=0
declare -i firstTime=0
declare -i lastTime=0
IFS=$'\n' #makes new line the only separator

# change the course name accordingly
# eg CS-XXX
courseName='course-1'

for ((i=0;i<${#names[@]};++i))
do
    gitsource="${names[i]}"
    foldersource="${namesTag[i]}"
    # cd into the student's folder, e.g. CS-XXX-studentName; skip the student if it is missing
    cd "$courseName-$foldersource/" || { echo "($count) : $foldersource - folder not found"; count=$((count+1)); continue; }
    # hash of the most recent commit by this author
    sshKey="$(git log --author="$gitsource" --pretty=format:%H | sed -n 1p)"
    # Unix timestamps of the author's first and last commits
    firstTime="$(git log --author="$gitsource" --reverse --pretty=format:%at | sed -n 1p)"
    lastTime="$(git log --author="$gitsource" --pretty=format:%at | sed -n 1p)"
    lastCommitTime="$(git log --author="$gitsource" --pretty=format:%aD | sed -n 1p)"
    emailAdd="$(git log --author="$gitsource" --pretty=format:%ae | sed -n 1p)"
    nocommits="$(git shortlog --author="$gitsource" -s -n)"
    # whole hours between the first and last commit
    difference=$((lastTime-firstTime))
    timeV=$((difference/h))
    cd ..
    echo "($count) :  $nocommits - $foldersource| $emailAdd | ssh: $sshKey | commit-diff(h): $timeV | last-commit: $lastCommitTime"
    count=$((count+1))
done
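
The script above prints the time of the last commit but leaves the comparison against the deadline to you. A minimal sketch of that last step, assuming you supply the deadline as a Unix timestamp, could look like this. The function name hours_late and both timestamps are hypothetical; lastTime stands in for the value the script reads with --pretty=format:%at.

```shell
#!/bin/bash
# hours_late LAST_COMMIT_EPOCH DEADLINE_EPOCH
# Prints 0 for an on-time submission, otherwise the number of
# hours past the deadline, rounded up to the next whole hour.
hours_late() {
    local last=$1 deadline=$2
    if (( last <= deadline )); then
        echo 0
    else
        echo $(( (last - deadline + 3599) / 3600 ))
    fi
}

# Hypothetical values for illustration only.
deadline=1709337540   # the deadline as a Unix timestamp
lastTime=1709344741   # last commit time, as returned by %at
echo "hours late: $(hours_late "$lastTime" "$deadline")"
```

Rounding up means a submission that is one second late counts as one hour late; change the arithmetic if your late policy is more forgiving.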

Script 3: running the code

The final grading script lets you run each student’s code, as many times as you need, without too much trouble. Again, make sure you fill in the folder names of all the students in namesTag. The code below focuses on running C code, hence the make calls; I am assuming each submission has a Makefile whose default target builds and runs the code and whose clean target removes the build output. If you are grading code in a different language, change the make lines to the commands you need. For each student, the script asks whether the code should be run, and after each run it asks whether to re-run the same code or continue to the next student. This is very helpful if you are testing code whose behavior changes between runs (for example, because it uses random numbers).

To run this script, use chmod +x scriptName.sh followed by ./scriptName.sh as in Scripts 1 and 2 above.

#!/bin/bash

declare -i count=1
# git IDs of students
declare -a namesTag=()
declare -i lastTime=0
IFS=$'\n' #makes new line the only separator
echo "started"
projectName="lab06" #or anything you are grading

for ((i=0;i<${#namesTag[@]};++i))
do
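    foldersource="${namesTag[i]}"
    echo "$count : start run for $foldersource (y/n) ?"
    read -r value
    if [ "$value" == "y" ]
    then
        echo "running ..."
        # enter the student's folder; skip the student if it is missing so make never runs in the wrong place
        cd "$projectName-$foldersource/" || { echo "folder not found, skipping"; count=$((count+1)); continue; }
        make
        echo "----- re-run for $foldersource (y-yes/n-next):"
        read -r input
        # quote $input so that an empty answer does not break the test
        while [ "$input" == "y" ]
        do
            echo "re-running..."
            make
            echo "----- re-run for $foldersource (y-yes/n-next):"
            read -r input
        done
        echo "running finished"
        make clean
        cd ..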
    	echo "--------------------------------------------------------------------------------"
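    else
        # any answer other than "y" skips this student instead of ending the whole run
        echo "skipping $foldersource"
    fi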
    echo "moving on..."
    count=$((count+1))
done
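
If you grade in more than one language, one option is to keep the language-specific commands in a single lookup function instead of editing every make line. The sketch below is only an illustration: the function name run_command_for and the file names main.py and Main are assumptions, not part of the script above, and the function only prints the command rather than running it.

```shell
#!/bin/bash
# Print the command used to run a submission in the given language.
# The file names main.py and Main are hypothetical; adjust them to
# whatever your assignment asks students to submit.
run_command_for() {
    case "$1" in
        c)      echo "make" ;;
        python) echo "python3 main.py" ;;
        java)   echo "java Main" ;;
        *)      return 1 ;;   # unknown language: print nothing, report failure
    esac
}

for lang in c python java
do
    echo "$lang -> $(run_command_for "$lang")"
done
```

In the grading loop you would then replace each make call with the command for the language you are grading.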

So there you have it, three bash scripts that have made my TA work considerably easier. If you are a TA like me, I hope they help you out too.

So until next time,
Cheers!
