What is Ansible and why are we talking about it
As NCGAS has been working more and more in cloud computing (all of our great browsers, galaxies, etc!), it has become a bear to make sure everything is up to date with security patches. Hand updating a dozen virtual machines is a pain! So, we learned Ansible. It was really easy to work with, so we figured we’d share it with you!
It is important to keep your virtual machines up to date, especially research machines! Several vulnerabilities have been found in recent months, and Jetstream takes its security seriously. If something is a threat to the system, they may give you only 24 hours’ warning before blocking your site from serving pages! To avoid that, we want to make updates as easy as possible, especially if you have multiple virtual machines.
The endpoint of this whole post is the ability to do one command:
ansible-playbook -u <xsede name> -s update_js.yml
and have all your machines up to date. You could even set this to run regularly using a cron job or scheduled event – but that is a topic for another post ^_^.
A good resource for further info
Just upfront – this is a great thing to learn if you are going to deal with multiple VMs. I would recommend this tutorial:
https://serversforhackers.com/c/an-ansible-tutorial
Installing Ansible
Install Ansible – Ubuntu
sudo apt-add-repository -y ppa:ansible/ansible
sudo apt-get update
sudo apt-get install -y ansible
Install Ansible – Mac
Ansible uses Python, and fortunately Python is already installed on modern versions of OSX.
Install Xcode
sudo easy_install pip
sudo pip install ansible --quiet
Then, if you would like to update Ansible later, just do:
sudo pip install ansible --upgrade
Install Ansible – Windows
You will need a bash shell. If you already have a bash shell through an installed virtual machine or Cygwin on your computer, follow the commands under “Install Ansible – Ubuntu” in that shell. If you don’t have a bash shell readily available, turn on the option for “Windows Subsystem for Linux”.
First, figure out which Windows version and build you have. Look under System Information -> Version.
- If you are using Windows 10 and haven’t updated to build >= 16215, follow this link to set up the beta version of the “Windows Subsystem for Linux” under developer mode and install Ansible.
- If you are using Windows 10 build >= 16215, you can install “Windows Subsystem for Linux” as an application directly from the Microsoft Store; here is a link to walk you through the steps. Once you have the bash terminal installed, follow the commands in this link, under the “Installing Ansible” section.
Make a hosts list
Once you install Ansible, go to /etc/ansible. You will see a file called hosts – I’d move that to host.orig, and then make your own hosts file. As a note, you can also make a hosts file anywhere on your computer and point the command to it using the “-i” option.
The host file should look something like this:
[ubuntu]
149.165.168.x
149.165.168.x
149.165.169.x
149.165.156.x
149.165.169.x
[centos]
129.114.104.x
You can split them into whatever groups you want. I have the Ubuntu machines split into Drupal and non-Drupal so I can update that software as needed without mucking with the other machines. But for updating Jetstream instances, these two categories are sufficient. If you don’t have machines in one of the groups (usually CentOS), just leave that group out!
NOTE: the x’s should be real numbers – the IP address or even a hostname (gmod.ncgas.indiana.edu or whatnot). I just blocked the listing of my real machines.
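If you would rather not touch /etc/ansible at all, here is a minimal sketch of the “-i” route. The file location and the 192.0.2.x addresses are placeholders I made up for illustration (they are reserved documentation addresses, not real machines), so swap in your own. The --list-hosts option only prints what Ansible would target; it does not connect to anything.

```shell
#!/bin/sh
# Hypothetical location for a personal hosts file; any writable path works with -i.
INVENTORY="${TMPDIR:-/tmp}/ansible_hosts_example"

# Placeholder addresses (reserved for documentation), grouped like the example above.
cat > "$INVENTORY" <<'EOF'
[ubuntu]
192.0.2.10
192.0.2.11

[centos]
192.0.2.20
EOF

# Count the host entries: drop group headers, then count non-empty lines.
HOST_COUNT=$(grep -v '^\[' "$INVENTORY" | grep -c '[^[:space:]]')
echo "inventory has $HOST_COUNT hosts"

# If Ansible is installed, ask it which hosts it sees in this file.
if command -v ansible >/dev/null 2>&1; then
    ansible all -i "$INVENTORY" --list-hosts
fi
```

Replacing all with ubuntu or centos in the last command should list just that group, which is a quick way to check that the groups are set up the way you meant.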
Check connection
To test your installation and host list, let’s do a VERY simple first Ansible command: ping the systems! You can rerun it any time you want to confirm that they are all online.
To do this, use the following command:
ansible all -u <xsede name> -m ping
e.g.
ansible all -u ssanders -m ping
or
ansible all -u ssanders -m ping -i /home/users/hosts #if you saved your host file elsewhere on your computer
all – lets you run the command against all the machines in the host file, regardless of the group they are in (both ubuntu and centos in this case).
-u – defines a username. The default is the name you have on your computer, which is rarely the same as the one on your XSEDE account.
-m – module, in this case the ping module, which checks that Ansible can connect to and log in to each machine (it is not a network-level ICMP ping).
Results:
129.114.104.x | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
149.165.156.x | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
149.165.169.x | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
149.165.168.x | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
149.165.168.x | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
149.165.169.x | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
129.114.104.x | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
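With more than a handful of machines, it can be easier to count the answers than to read them. This is a rough sketch that assumes you saved the ping output to a file (for example by adding > ping.log to the command). The sample log and the expected host count below are made up for illustration.

```shell
#!/bin/sh
# Stand-in for a saved ping run; in real use this would be your own ping.log.
PING_LOG=$(mktemp)
cat > "$PING_LOG" <<'EOF'
192.0.2.10 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
192.0.2.20 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
EOF

# Hypothetical number of machines listed in the hosts file.
EXPECTED=3

# Each host that answered contributes exactly one "| SUCCESS =>" line.
ANSWERED=$(grep -cF ' | SUCCESS => ' "$PING_LOG")
MISSING=$((EXPECTED - ANSWERED))
echo "$ANSWERED of $EXPECTED hosts answered, $MISSING to check"
```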
If you get a permission denied error:
- Check to make sure you have your XSEDE name as the -u value.
- If you see the line "msg": "Failed to connect to the host via ssh: Host key verification failed.\r\n" in the permission denied error, make sure the private key that matches your Jetstream public key is saved in ~/.ssh/ on your computer.
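If the key is in the right place and you are still locked out, file permissions are worth a look, since OpenSSH refuses to use a private key that other users can read. The helper below is a hypothetical sketch (check_private_keys is a name I made up, not part of Ansible or OpenSSH). It looks for files whose first line mentions a private key and flags any that group or others can access.

```shell
#!/bin/sh
# Hypothetical helper: report private keys in a directory that are too widely accessible.
check_private_keys() {
    dir="$1"
    rc=0
    if [ ! -d "$dir" ]; then
        echo "no such directory: $dir"
        return 1
    fi
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        # Private key files begin with a "-----BEGIN ... PRIVATE KEY-----" line.
        if head -n 1 "$f" | grep -q 'PRIVATE KEY'; then
            # find prints the path only if a group or other permission bit is set.
            loose=$(find "$f" \( -perm -040 -o -perm -020 -o -perm -010 \
                                 -o -perm -004 -o -perm -002 -o -perm -001 \))
            if [ -n "$loose" ]; then
                echo "too open: $f (try: chmod 600 $f)"
                rc=1
            else
                echo "ok: $f"
            fi
        fi
    done
    return $rc
}

check_private_keys "$HOME/.ssh" || echo "see the messages above"
```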
Simple updates
You can now send commands to your machines as needed. For instance, if you want to update all the ubuntu machines, you can use the following command:
ansible ubuntu -u <xsede name> -s -m apt -a 'update_cache=true'
or, in case you saved your hosts file elsewhere (not in /etc/ansible/):
ansible ubuntu -u <xsede name> -s -m apt -a 'update_cache=true' -i "file path to hosts file"
Some of these parameters look similar to what you’ve seen before, but the module is now more complicated. Let’s talk about that.
-s – run as sudo. (Newer Ansible releases use -b, short for --become, for the same purpose.)
-m is still designating the module you want to use, in this case apt for ubuntu’s apt-get functions.
-a is for arguments. Ping doesn’t need any arguments, but you can do a lot with apt, so you need to define what you want it to do. In this case, we want all the machines to have the cache updated (same as apt-get update). Note that update_cache only refreshes the package lists; to also install the upgrades, the apt module accepts an upgrade argument (for example upgrade=dist). In Ansible you define what you want the end state to be, not what you want it to do. This is a great aspect: it allows you to run the same command over and over, and it won’t change anything if the state is already what it should be (this property is called idempotence).
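To make the “end state” idea concrete, here is a toy sketch in plain shell. It is not how Ansible is implemented. The ensure_line function and the example line are made up for illustration. It describes a state (“this line is in this file”) and only acts when that state is missing, reporting changed or ok the way Ansible does.

```shell
#!/bin/sh
# Hypothetical helper: make sure a line is present in a file, acting only when needed.
ensure_line() {
    line="$1"
    file="$2"
    # -x matches whole lines, -F treats the text literally, -q stays quiet.
    if [ -f "$file" ] && grep -qxF -- "$line" "$file"; then
        echo "ok"
    else
        printf '%s\n' "$line" >> "$file"
        echo "changed"
    fi
}

DEMO_FILE=$(mktemp)
FIRST=$(ensure_line "PermitRootLogin no" "$DEMO_FILE")
SECOND=$(ensure_line "PermitRootLogin no" "$DEMO_FILE")
echo "first run: $FIRST, second run: $SECOND"
```

The first call adds the line, and the second finds it already there and leaves the file alone, so running it ten more times is harmless. Ansible modules aim for the same behavior, which is why rerunning an update command against an already updated machine is safe.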
For CentOS (which uses yum instead of apt-get) you can do this:
ansible centos -u <xsede name> -s -m yum -a "name=* state=latest"
This is very similar, but the yum module takes different input: it wants the name of the package you want to update (you can specify just one if you want, or * for all of them) and the state you want the package to be in, with “latest” being the most up to date. It will do what it needs to make that happen for you!
Playbooks
If you have only ubuntu or centos machines – nothing more is needed. But if you want to have one command to run updates to both of these sets, we can write a “playbook”. A playbook is simply a set of tasks to run. I’m only going to give you a very simple one, but see the tutorial at the top for a much more in-depth description of all the cool things you can do!
The playbook we use is called update_js.yml, and it looks like this (the --- is part of the file!):
---
- hosts: ubuntu
  tasks:
    - name: apt-get update
      apt: update_cache=true
- hosts: centos
  tasks:
    - name: yum update
      yum:
        name: '*'
        state: latest
It is a file that lives in the /etc/ansible directory (same place as the hosts file) and simply breaks the two commands above into a slightly different format, with one addition: each step has a name. These names indicate how far along the run is and, if anything fails, where it failed.
By the way, if you want to know the formats for anything else, googling “X in ansible” will usually bring it up – there are a LOT of great examples and it is a large community with lots of documentation.
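Since YAML is picky about indentation, it can be worth checking a playbook before pointing it at real machines. The sketch below writes a scratch playbook shaped like update_js.yml plus a placeholder hosts file into a temporary directory (the directory and the 192.0.2.x addresses are made up for illustration). If Ansible is installed, it then runs two checks that do not connect to any machine: --syntax-check and --list-hosts.

```shell
#!/bin/sh
# Scratch directory so nothing in /etc/ansible is touched.
WORKDIR=$(mktemp -d)

# Placeholder inventory (reserved documentation addresses, not real machines).
cat > "$WORKDIR/hosts" <<'EOF'
[ubuntu]
192.0.2.10

[centos]
192.0.2.20
EOF

# A scratch playbook with one play per group, as in update_js.yml.
cat > "$WORKDIR/update_js.yml" <<'EOF'
---
- hosts: ubuntu
  tasks:
    - name: apt-get update
      apt: update_cache=true

- hosts: centos
  tasks:
    - name: yum update
      yum:
        name: '*'
        state: latest
EOF

# Each play starts with "- hosts:" at the left margin.
PLAY_COUNT=$(grep -c '^- hosts:' "$WORKDIR/update_js.yml")
echo "playbook defines $PLAY_COUNT plays"

if command -v ansible-playbook >/dev/null 2>&1; then
    # Parse the playbook without running any task.
    ansible-playbook -i "$WORKDIR/hosts" --syntax-check "$WORKDIR/update_js.yml"
    # Show which hosts each play would target.
    ansible-playbook -i "$WORKDIR/hosts" --list-hosts "$WORKDIR/update_js.yml"
fi
```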
To run this playbook, the command is simply:
ansible-playbook -u <xsede name> -s update_js.yml
You will see the following:
PLAY [ubuntu] **********************************************************
TASK [Gathering Facts] *********************************************************
ok: [149.165.168.x]
ok: [149.165.169.x]
ok: [149.165.156.x]
ok: [149.165.168.x]
ok: [129.114.104.x]
ok: [149.165.169.x]
TASK [apt-get update] **********************************************************
changed: [149.165.169.x]
changed: [149.165.168.x]
changed: [149.165.156.x]
changed: [149.165.168.x]
changed: [129.114.104.x]
changed: [149.165.169.x]
PLAY [centos] ******************************************************************
TASK [Gathering Facts] *********************************************************
ok: [129.114.104.x]
TASK [yum update] **************************************************************
ok: [129.114.104.x]
PLAY RECAP *********************************************************************
129.114.104.x : ok=2 changed=1 unreachable=0 failed=0
129.114.104.x : ok=2 changed=0 unreachable=0 failed=0
149.165.156.x : ok=2 changed=1 unreachable=0 failed=0
149.165.168.x : ok=2 changed=1 unreachable=0 failed=0
149.165.168.x : ok=2 changed=1 unreachable=0 failed=0
149.165.169.x : ok=2 changed=1 unreachable=0 failed=0
149.165.169.x : ok=2 changed=1 unreachable=0 failed=0
Playbooks do have the advantage of a better explanation on which machines updated, changed, failed, etc. You can see where the names come in – they are the headers for each task now!
Other things you can do with Ansible
There are lots of other uses: you can download the same modules to all your browsers, or update to a newer version of a piece of software on all the virtual machines. The options are basically limitless. If there is something you’d like to do on several machines – let us know, we can help make it happen!