Why use volumes
Volumes are like remote hard drives that you attach to virtual machines (VMs) to save input data, outputs, and important results.
A few scenarios where this option is especially useful and saves you a lot of time:
- I have run into this problem several times: I spin up a large VM that provides 30GB of memory and 60GB of disk space. The VM has enough memory to run my analysis, but I have close to 60GB of data or more to run through it, leaving no space to save my output files. Instead of killing this VM and spinning up a larger one, there is another option: spin up a volume. It works like a hard drive where you can save your input and output files while the VM runs the analysis.
- Using a Jetstream VM to share data between collaborators. No analysis needs to be done, just data sharing. Spin up a small VM, attach a volume holding the data to be shared, and give collaborators access to the VM.
The advantages of using a volume are:
- Setting up volumes does not use up your allocation (your allocated SUs).
- Each user can get 10 volumes with up to 500GB of total storage, and can request more space by emailing the Jetstream team, without having to alter the allocation request.
- You can spin up a volume after starting your analysis on the VM and attach it then.
Note: A volume can be attached to only ONE VM at a time.
Spin up a volume and attach to VM
Follow these steps:
- Make sure the VM to which you would like to attach the volume is active.
- Spin up a volume (above figure). MAKE SURE that the volume is from the same provider as the VM. In the image below, the purple boxes show the service provider information. There are two providers, IU and TACC; you cannot connect a volume at IU to a VM provided by TACC, or vice versa. Also, notice that the volume status says “Unattached” in the above diagram.
- Click on the volume and select “Attach” in the right corner of the screen. A pop-up screen will list all the active VMs in your account, and you can select one. Now wait for the status to change to “Attached”.
Accessing the Volume and moving data here
- Using the Web Desktop, open “File Manager”, then select “Filesystems”. There should be a “vol_*” entry, as shown below. Note: I write * instead of “b” in vol_b because volumes are sometimes named vol_c instead. In this case, the volume is named “vol_b”.
- In the Web Shell, or after you ssh into the VM:
If you are not sure what your volume name is, list the files; there should be a directory named vol_*, as shown in the image below.
- You can move data from your VM to this volume using bash commands such as mv or cp.
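The listing and copying steps above can be sketched in the shell as follows. This is only a sketch, assuming volumes appear as vol_* directories at the top of the filesystem (for example /vol_b); VOLUME_ROOT, list_volumes, and copy_to_volume are hypothetical names introduced for illustration.

```shell
# Assumption: attached volumes are mounted at the top of the filesystem
# as /vol_b, /vol_c, and so on. VOLUME_ROOT is a hypothetical override.
VOLUME_ROOT="${VOLUME_ROOT:-/}"

# Print every vol_* directory under VOLUME_ROOT (prints nothing if none exist).
list_volumes() {
    for dir in "${VOLUME_ROOT%/}"/vol_*; do
        if [ -d "$dir" ]; then
            echo "$dir"
        fi
    done
}

# Copy a file or directory onto a volume, keeping the original on the VM disk.
# Arguments: the source path, then the volume directory.
copy_to_volume() {
    cp -r "$1" "$2"/
}

list_volumes
```

On a VM whose volume is vol_b, `copy_to_volume ~/results /vol_b` would leave a copy at /vol_b/results. Replacing `cp -r` with `mv` moves the data instead, which frees space on the VM disk.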
Data transfer to the volume directly
- Make sure the volume is attached to the VM.
- Use any data transfer command, such as rsync, scp, or ftp, to move data from a cluster or laptop directly to the volume. For more information on these commands, go here.
- There is always Globus. If you are new to Globus, follow this blog post first.
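Before moving on to Globus, the rsync route mentioned above can be sketched as follows. This is a minimal sketch, not a tested recipe for Jetstream: sync_to is a hypothetical helper name, and the username, VM address, and /vol_b path in the usage note are placeholders you would replace with your own.

```shell
# Copy a file or directory to a destination with rsync.
# The destination can be a local path or a remote one written as user@host:/path/.
# -a keeps permissions and timestamps, -v lists files as they transfer,
# -P shows progress and keeps partial files so an interrupted run can resume.
sync_to() {
    rsync -avP "$1" "$2"
}
```

From a laptop or cluster, a call might look like `sync_to ~/project/raw_data your_username@your_vm_ip:/vol_b/`, which would place the data at /vol_b/raw_data on the volume. The scp equivalent would be `scp -r ~/project/raw_data your_username@your_vm_ip:/vol_b/`.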
Steps to set up Globus and transfer files on a Jetstream VM:
1. Sign in to Globus using your XSEDE account login information.
2. Select the “Endpoints” tab on the left, vertical banner.
3. At the top right of the screen, select “Create a personal endpoint”.
4. Right-click on the link that reads “Globus Connect Personal for Linux” and copy the link location.
5. This step is a little tricky. You want to download globusconnectpersonal onto your VM from the Web Shell using the wget command. Open your VM’s Web Shell and notice the pink note at the bottom of the screen about copying and pasting. Press CTRL + ALT + SHIFT to open your Web Shell’s clipboard, paste the copied link location for globusconnectpersonal, then close the clipboard with another CTRL + ALT + SHIFT.
6. On the command line, type wget, add a space, then right-click. Your copied location should paste to the command line. See the example below:
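A minimal sketch of the download, assuming the link you copied in step 4 is the standard Globus Connect Personal download location for Linux. If your copied link differs, use yours. GCP_URL and download_globus are hypothetical names introduced for illustration.

```shell
# Assumption: this is the link copied in step 4; replace it with your own if it differs.
GCP_URL="https://downloads.globus.org/globus-connect-personal/linux/stable/globusconnectpersonal-latest.tgz"

# Hypothetical helper: download the installer into the current directory.
download_globus() {
    wget "$GCP_URL"
}

# The saved file name should match the one the tar command in step 7 expects.
echo "Will download: $(basename "$GCP_URL")"
```

Calling `download_globus` is the same as typing wget, a space, and the pasted link on the command line.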
7. Unpack the .tgz file.
tar zxvf globusconnectpersonal-latest.tgz
8. Navigate into the Globus directory and set up your VM’s endpoint. Follow the instructions that appear on the command line. You should only have to set this up one time for your VM.
Note: You may have trouble opening the Globus link provided on the command line. Highlight the link (don’t try to copy, just highlight), open the Web Shell clipboard by typing CTRL + ALT + SHIFT, then highlight the link from the clipboard and open it in a new tab. Proceed with the setup instructions on the link. You’ll have to use the clipboard again to copy and paste the authorization code into the command line.
9. Once Globus is set up, the path to your volume must be added to the Globus configuration path file.
This opens the nano file editor; you can use your text editor of choice. Type the following line in the text editor, changing it to include the correct volume name.
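A minimal sketch of this step, under two assumptions: that the configuration path file is the usual Globus Connect Personal location, ~/.globusonline/lta/config-paths, and that the volume is mounted at /vol_b/. Each line in that file has the form path, sharing flag, read-write flag. CONFIG_PATHS, VOLUME_PATH, and add_volume_path are hypothetical names used only here.

```shell
# Assumption: Globus Connect Personal reads its accessible paths from this file.
CONFIG_PATHS="${CONFIG_PATHS:-$HOME/.globusonline/lta/config-paths}"
# Assumption: the volume is mounted at /vol_b/; change this to match your volume name.
VOLUME_PATH="${VOLUME_PATH:-/vol_b/}"

# To edit the file by hand instead: nano "$CONFIG_PATHS"

# Append the volume to the file without opening an editor.
# Line format: <path>,<sharing flag>,<read-write flag>
# "0,1" means sharing is off and the path is writable.
add_volume_path() {
    line="$VOLUME_PATH,0,1"
    mkdir -p "$(dirname "$CONFIG_PATHS")"
    touch "$CONFIG_PATHS"
    # Append only if the exact line is not already present.
    grep -qxF "$line" "$CONFIG_PATHS" || echo "$line" >> "$CONFIG_PATHS"
}
```

Running `add_volume_path` once adds a line such as /vol_b/,0,1; running it again leaves the file unchanged. If you type the line in nano instead, the save and exit keys are listed next.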
Save the file in nano – Ctrl and O (at the same time)
Exit text editor in nano – Ctrl and X (together)
More information about configuring accessible directories in Globus can be found here.
10. Once your Globus setup and path configuration are complete, you’ll need to start the connection before you can transfer files. When you are finished with your Globus transfer, you can stop the connection. If the connection is disrupted or dies unexpectedly, you can restart it by stopping and starting again. Type the following commands into your command line.
./globusconnectpersonal -start &
Note: The & runs the command in the background.
./globusconnectpersonal -stop
11. Navigate back to your Globus profile online and select “Endpoints” on the left, vertical banner, then select “Administered By You”, and you should see your VM endpoint lit up in green with the name you assigned it during the setup process.
12. Navigate to the “File Manager” on the left, vertical banner. Notice the screen is split. One side is used to navigate your source space and the other is for your destination space.
In one of the “Collection” bars, select your source space. If your source space is your personal computer, you will have to set up another “personal” endpoint on your computer in the same way you set up a personal endpoint on your VM. If your source space is on an HPC system, it is likely that a Globus endpoint is already established. In this case, the Globus endpoint is not a “personal” endpoint since you are not the owner. You will have to find the name of your HPC account’s endpoint and type it into the Collections bar. A commonly used endpoint at Indiana University is “IURT – Slate”. This connects Slate accounts and Slate-Project accounts. Other IU HPC account endpoints can be searched in the Collections bar by typing “IURT”. You may want to reference IU’s Knowledge Base for more information.
To connect to your VM’s endpoint, double-click in the other Collections bar and select your endpoint under the “Your Collections” tab. Note that your endpoint’s icon should appear green. If it does not appear, or appears but is not green, your endpoint’s connection was likely disrupted and you will have to restart the connection.
13. Now enter the source file path and the destination path into the “Path” bars. When you’re ready to transfer, click “Start”. You will receive a notification when your transfer has successfully completed.
Note: Sometimes the transfer fails unexpectedly. If this happens, it could be that you’ve run out of storage space at your destination. There is also a “Help” tab on the left, vertical banner where you can submit a ticket.
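Since a full destination is one possible cause of a failed transfer, it may help to compare sizes before starting. Here is a minimal sketch using the standard df and du commands; free_kb and size_kb are hypothetical helper names, and /vol_b in the usage note is an assumed volume path.

```shell
# Print the free space, in kilobytes, on the filesystem holding the given path.
# -P forces one line per filesystem so the fourth column is always "Available".
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Print the total size, in kilobytes, of a file or directory.
size_kb() {
    du -sk "$1" | awk '{ print $1 }'
}

echo "Free space in the current directory: $(free_kb .) KB"
```

Run `size_kb` on the machine that holds the source data and `free_kb /vol_b` on the VM. The transfer can only succeed if the first number is smaller than the second.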
Deleting VM and Volume
First, make sure to back up data (input files, outputs) on the VM/volume to your computer/data archive. Once you delete the volume and VM, data cannot be recovered.
- Next, detach the volume from the VM.
- Then delete the VM and the volume.
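Because deleted data cannot be recovered, it may be worth confirming that the backup matches the original before deleting anything. The sketch below is one possible check, not a Jetstream feature: dir_checksum is a hypothetical helper that prints a single checksum for a whole directory, built from the standard find, cksum, and sort commands.

```shell
# Print one checksum summarizing every file under a directory.
# Two directories with the same file names and contents print the same value.
dir_checksum() {
    (
        cd "$1" || exit 1
        # cksum prints "<crc> <size> <name>" for each file. Sorting by name makes
        # the result independent of the order in which find visits the files.
        find . -type f -exec cksum {} + | LC_ALL=C sort -k 3 | cksum
    )
}
```

Run `dir_checksum` on the volume directory on the VM (for example /vol_b, an assumed path) and on the backup copy on your computer. If the two outputs match, the backup has the same file names and contents, and the volume can be detached and deleted.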
If you have any questions, send us an email at firstname.lastname@example.org