Difference between revisions of "Intervention report 2017-01-12"

(Created page with "# Description Login was broken on all machines of the space and ps1auth was complaining a lot about other stuff. Took over someone that was trying to correct it but had to le...")
 
 
(3 intermediate revisions by one other user not shown)
Line 34: Line 45:
  
  
[[Category:Intervention report]][[Category: Systems group]]
+
[[Category:Intervention report]][[Category: Systems Group]]

Latest revision as of 21:42, 13 October 2017

Description

Login was broken on all machines in the space, and ps1auth was reporting a lot of other errors. I took over from someone who had been trying to fix it but had to leave.

Started at: 2017-01-12T20:00:00

Intervention by

- bjonnh

Problem

- Saw some library problems on the ps1auth VM and tried to update the affected libraries. That is when the clusterfuck started.

Resolution

Ended at: 2017-01-12T22:00:00

- Tried a full update of Arch on ps1auth.

- Only some of the packages got through.

- Had to disable GPG signature checking of packages for that, using sed to modify pacman.conf, because nothing else was working.

- Did the update

- Rebooted bob, which resolved the initial problem of machines not logging in, but I had to do it twice.

- Had to reconfigure the network because the interface name changed.

- As Python was updated, had to recreate the virtualenv of ps1auth.

- Had to update some packages (beautifulsoup4, kombu and django-ldap something) because they failed with the new Python.

- Rebooted the machine to be sure everything was OK.

- Got home.
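
The exact sed command used on pacman.conf was not recorded. As a rough sketch of that step, assuming the change was to set `SigLevel` to `Never` (an assumption, not something this report confirms), the same edit could be expressed in Python as follows. `set_siglevel` is a hypothetical helper name, and the sample config is illustrative only.

```python
import re

# Matches any active (uncommented) SigLevel line, whatever its current value.
# Commented lines such as "#SigLevel = ..." are left alone.
_SIGLEVEL_RE = re.compile(r"^[ \t]*SigLevel[ \t]*=.*$", re.MULTILINE)


def set_siglevel(conf_text: str, level: str) -> str:
    """Return conf_text with every active SigLevel line set to `level`.

    Assumption: a blunt, sed-style replacement of all SigLevel lines
    (global and per-repository), which may not match the original edit.
    """
    return _SIGLEVEL_RE.sub(lambda _m: f"SigLevel = {level}", conf_text)


if __name__ == "__main__":
    # Illustrative pacman.conf fragment, not the real ps1auth config.
    sample = (
        "[options]\n"
        "Architecture = auto\n"
        "SigLevel    = Required DatabaseOptional\n"
        "#SigLevel = Optional TrustAll\n"
        "\n"
        "[core]\n"
        "Include = /etc/pacman.d/mirrorlist\n"
    )
    print(set_siglevel(sample, "Never"))
```

Calling the same helper with `"Required DatabaseOptional"` once the update is done would put signature checking back, assuming that was the value before the change.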

How to avoid the problem in the future

- Well, I had to learn how things worked on the fly. We plan on moving the infrastructure to more maintainable systems (Arch was a big problem here, as it had not been updated for a long time).

- Have emergency clone VMs to restore from quickly (for example, if someone has to leave before finishing an upgrade or maintenance)?
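
Since the long gap between Arch updates was a big part of the problem, one possible safeguard is to check how stale a machine is before touching it. The sketch below assumes pacman's log (usually `/var/log/pacman.log`) marks each full upgrade with a `[PACMAN] starting full system upgrade` line. The function names are hypothetical, and the sample log lines are made up for illustration.

```python
import re
from datetime import datetime
from typing import Optional

# Accepts both "[2017-01-12 20:15]" and "[2019-10-24T10:00:00+0200]" style
# timestamps. Seconds and timezone offset, when present, are ignored.
_UPGRADE_RE = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2})[ T](\d{2}:\d{2})[^\]]*\] "
    r"\[PACMAN\] starting full system upgrade",
    re.MULTILINE,
)


def last_full_upgrade(log_text: str) -> Optional[datetime]:
    """Return the timestamp of the last full upgrade found in the log, if any."""
    matches = _UPGRADE_RE.findall(log_text)
    if not matches:
        return None
    date, time = matches[-1]
    return datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M")


def days_since_upgrade(log_text: str, now: datetime) -> Optional[int]:
    """Return whole days between the last full upgrade and `now`."""
    last = last_full_upgrade(log_text)
    return None if last is None else (now - last).days


if __name__ == "__main__":
    # Made-up log excerpt, not taken from ps1auth.
    sample_log = (
        "[2016-03-01 10:00] [PACMAN] starting full system upgrade\n"
        "[2016-03-01 10:05] [ALPM] upgraded python (3.5.1-1 -> 3.5.1-2)\n"
        "[2017-01-12 20:15] [PACMAN] starting full system upgrade\n"
    )
    print(days_since_upgrade(sample_log, datetime(2017, 1, 14, 20, 15)))
```

A check like this could run from cron and warn when the number of days gets large, so that upgrades happen in small steps rather than all at once during an outage.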