What is Jetstream
Steps to do during the course/workshop preparation
Steps to do right before course/workshop
Steps to do after course/workshop
Useful notes during the workshop
What is Jetstream?
If you are not familiar with Jetstream, it is an NSF-funded cloud computing resource that provides access to preconfigured virtual machines (VMs) with root access. Using VMs helps professionals without a computer science background transition to the command line, to installing software, and to running analyses in a Linux environment like the one found on most HPC clusters. For more on Jetstream, go here.
Jetstream to teach courses/workshops
NCGAS has been using Jetstream to teach some of our workshops like R for Biologists (link to workshop information here), HPC Onboarding for Biologists (link to workshop information here), PonyLinux: a fun game to learn basic bash commands (link to the code here), and Mining the Sequence Read Archive (SRA) (link to the workshop information here). The advantage of providing Jetstream VMs for these workshops is that we can install all the programs and packages prior to the workshop. This ensures:
- that all the participants are working in similar environments,
- that it is easier for us to help with issues during the workshop,
- that less time is spent installing programs during the workshop. Let’s face it, how many of us actually install all the packages before a workshop, even when we were asked to? :D
This post walks through the steps we take to configure a Jetstream VM for our workshops.
Note: The steps here apply if you plan to spin up VMs for all the students in your course/workshop and give them a username and password to log in to the VMs. There is also another option where you simply add all the students to your XSEDE educational allocation; this is explained at the end of this post.
Note: Jetstream is NOT HIPAA aligned, which means no identifiable data should be used on Jetstream!
Steps to do during the course/workshop preparation
- Becoming familiar with the Jetstream resource – Here is another blog post on getting started on Jetstream (link to blog here). In summary:
- Get an XSEDE account, link here
- Request an educational allocation on XSEDE, link here. This requires submitting several documents, including an abstract, a list of PIs and their CVs, a resource justification, and a syllabus. Be prepared to spend some time writing this up. More information is available in the following link as well, click here.
- Spin up a base image you would like to build on with the software for the course/workshop. Jetstream has a list of pre-configured VMs available that already have a few programs installed. If you would like to build on one of these instances, spin up one of these VMs and install the remaining programs on it. Before you begin installing these programs, here are a couple of guidelines to remember (link to these guidelines here – HIGHLY RECOMMEND YOU READ THIS NOW, trust me, it will save you lots of time later!). Make a note of some of the common mistakes we have made:
- The programs have to be installed in /usr/bin or /opt. This is important to remember because the next step is to request that this VM be imaged (more on this in the next step).
- Test your installations thoroughly to make sure all the programs are installed and in your path.
- We also recommend adding a username for your students, as well as an admin user. This way, the students can log in to the VMs using their own username and password, and your team can use the admin user to help debug any issues during the course/workshop. We have a blog post on this, along with a script available here.
- Request the VM to be imaged. This step creates a snapshot of the VM and makes it available either publicly or to specific XSEDE users as necessary. When you later spin up this image, you won’t have to install the programs again; it’s ready to go!
- However, once you request this, a ticket is sent to the Jetstream team, who in the process of imaging will delete /home, /mnt, /tmp and /root, and rewrite the files in /etc, /root and /var/log. The complete list is available in the link here; we highly recommend you read it.
- Additionally, the VM you are requesting to be imaged has to be smaller than a medium VM. If you have a VM that is larger than a medium, we highly recommend sending a ticket to the Jetstream team for help.
- Test the newly generated image again. We highly recommend spinning up the newly imaged VM and testing all the programs, the student/admin usernames, and the passwords to make sure everything is working as it should. If something needs to be fixed, updated, or deleted, do this and request the VM to be imaged again. This is an iterative process, so give yourself enough time to do it.
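The checks above can be partly scripted. Here is a minimal sketch, assuming a POSIX shell on the VM; the function names check_tools and check_homes, and the tool and account names in the example, are hypothetical placeholders and not part of any Jetstream tooling.

```shell
# check_tools NAME...
# Prints "MISSING: name" for every command that is not found on PATH.
# Succeeds only if every listed command was found.
check_tools() (
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "MISSING: $tool"
            missing=1
        fi
    done
    [ "$missing" -eq 0 ]
)

# check_homes BASE USER...
# Prints "NO HOME: user" for every account without a directory under BASE.
# Succeeds only if every listed account has one.
check_homes() (
    base=$1
    shift
    missing=0
    for user in "$@"; do
        if [ ! -d "$base/$user" ]; then
            echo "NO HOME: $user"
            missing=1
        fi
    done
    [ "$missing" -eq 0 ]
)

# Example with hypothetical tool and account names:
#   check_tools R samtools && check_homes /home student admin
```

This only confirms that the commands resolve and that the home directories exist. It does not replace actually running each program or logging in as each user with the password.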
Steps to do right before the course/workshop
Note: Skip the step “Spin up n number of the VMs” if the plan is to have the students spin up their VMs from their own XSEDE accounts. Replace this step with the section below (link to below section here).
- Spin up n VMs, where n is the number of VMs to start. This depends on your course/workshop.
- For instance, for our R for Biologists workshop, the material doesn’t require much computing (memory or storage), so we spin up small VMs and have three workshop participants share one VM. Each participant has their own username and password, so they can resume their work where they left off. In other cases, such as the Mining SRA workshop, we spin up medium instances with only one user per VM, because the workflow requires a lot more memory and storage. You can determine this when you are testing the image during preparation.
- How many VMs should I spin up? This depends on the workshop. For instance, if you have 15 participants and each participant needs their own VM, spin up 17 VMs: 15 for the participants, and 2 extra for testing and as replacements in case there is trouble with one of the 15.
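The arithmetic above can be written as a small helper. This is only a sketch: the function name vm_count is hypothetical, and the choice of 2 spare VMs in the shared-VM case is an assumption carried over from the 15-participant example.

```shell
# vm_count PARTICIPANTS PER_VM SPARES
# Prints ceil(PARTICIPANTS / PER_VM) + SPARES using integer arithmetic.
vm_count() {
    echo $(( ($1 + $2 - 1) / $2 + $3 ))
}

vm_count 15 1 2   # 15 participants, one per VM, 2 spares: prints 17
vm_count 15 3 2   # 15 participants, three per VM, 2 spares: prints 7
```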
- Some more last-minute testing. Here are some quick tests we do:
- Make sure the VMs have the user and admin accounts. Log in to the VM and look in /home to see if the other accounts are listed. Also, try to log in to the VM as the user to make sure the password is correct. If we have 100 VMs, then, of course, we don’t do this for all 100, just a random sample :D.
- It’s quite possible that one or more participants will type an incorrect password several times in a row and ban themselves from the VM. If this happens, here is how to unban them:
- How do you know if they are banned? They will likely see an error similar to “connection refused”.
- Log in to their VM using Jetstream Atmosphere or with your XSEDE username (that is, as the person who spun up the VMs). Use the unban utility (more information is available here). From the web shell, do the following:
#will show you all currently banned IPs.
unban -l
#Find your IP in the list and do:
unban -i your_ip_address
- In some cases, if it’s a really short workshop (around 4 hours), we just stop the fail2ban client altogether. If you are running a long course/workshop that lasts weeks or an entire semester, do NOT stop fail2ban.
- How to log in to the VM. In the first few minutes of the workshop, spend some time explaining how the students can log in to the VM. This includes:
- downloading PuTTY for Windows users, and showing how to use the terminal on a Mac
- the command to ssh into the system
- providing the students with the IP address and password, and also the command to change their password
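As an illustration of the login instructions above, here is a small sketch that prints the two commands a student would type. The function name login_help, the username student1, and the IP address 203.0.113.10 (a reserved documentation address) are hypothetical; substitute the values for your own VMs.

```shell
# login_help USER IP
# Prints the commands a student would type; nothing is executed here.
login_help() {
    printf 'ssh %s@%s\n' "$1" "$2"   # log in to the VM from a terminal
    printf 'passwd\n'                # once logged in, change the password
}

login_help student1 203.0.113.10
```

Printing one such pair per student gives you a quick handout for the start of the workshop. PuTTY users enter the same username and IP address in the PuTTY window instead of typing the ssh command.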
Steps to do after the course/workshop
- After the course/workshop, delete all the VMs. If you do not delete them, they will continue running and using service units (SUs), which are limited per allocation. If you run out, you will need to request more!
- Once the VMs are deleted, the data cannot be retrieved. Make sure to point this out to the students so they have a chance to back up any scripts/data they worked on during the course/workshop.
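One way a student could back up their work before the VMs are deleted is sketched below. It assumes their files live in a single directory; the function name backup_dir, the paths, the username, and the IP address are all hypothetical.

```shell
# backup_dir DIR ARCHIVE
# Packs DIR into a gzip-compressed tar archive at ARCHIVE.
backup_dir() {
    tar -czf "$2" -C "$(dirname "$1")" "$(basename "$1")"
}

# Example with hypothetical paths, run on the VM:
#   backup_dir "$HOME/workshop" "$HOME/workshop_backup.tar.gz"
# Then, from the student's own computer (hypothetical user and IP):
#   scp student1@203.0.113.10:workshop_backup.tar.gz .
```

Bundling everything into one archive means a single scp transfer, which suits small amounts of data.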
Some useful notes for the students during the workshop
- Using screen – Use the command screen to start a new terminal session in the background for time-intensive steps. This is useful when you are teaching a course where a command has to keep running after class ends. Running a command in screen makes sure it continues to run in the background, even after the student logs out of the VM. To start a screen, run the command
screen
In the screen, run your command. Once it is running successfully, detach from the screen by pressing Ctrl+A, then Ctrl+D (on a Mac this is still the Control key, not Command).
To go back to the screen, run the commands
screen -ls #lists all the screen information
screen -r <screen name listed>
- The command to move data in and out of the VM is shown below; it works well for small files. For larger files, we suggest using Globus, and here is a link to a post with more information (link).
scp <source of file to move> <destination of file to move to>
Adding students to the XSEDE allocation
As mentioned earlier, there is the option to add the students to the XSEDE allocation so that they can spin up the workshop VM in their own Jetstream accounts. This replaces the step “Spin up n number of the VMs” and has to be done before and during the workshop.
There are certain advantages to this:
- the students can log in to the VM through their Atmosphere account, using the Web Desktop and Web Shell, without having to learn how to ssh into the system.
- if they use the Web Desktop and Web Shell from Jetstream Atmosphere, they will likely not get banned from the VM, so no time is wasted dealing with fail2ban.
If you decide to go this route:
- Make sure to have an admin account set up on the VM to help the students troubleshoot any errors. With an admin account in place, you can log in and help a student without having to be physically present or exchange passwords.
- Require all the students to have an XSEDE username. Some time will have to be spent making sure all the students are added to your XSEDE allocation and can access Jetstream Atmosphere. After being added, it sometimes takes a few hours before students can access Jetstream and spin up their VMs, so depending on your schedule this may have to be done before the workshop.
If you need help with any of these steps, contact us at help@ncgas.org. We are happy to walk you through this tutorial as necessary.