Difference between revisions of "Intervention report 2017-01-12"
Revision as of 17:17, 12 January 2017
Description
Login was broken on all machines of the space and ps1auth was complaining a lot about other stuff. Took over from someone who was trying to fix it but had to leave.
Started at: 2017-01-12T20:00:00
Intervention by
- bjonnh
Problem
- Saw some library problems on the ps1auth VM. Tried to update them. That is when the clusterfuck started.
Resolution
Ended at: 2017-01-12T22:00:00
- Did a full update of Arch on ps1auth.
- Only some of it went through.
- Had to disable GPG signature checking of packages, using sed to modify pacman.conf, because nothing else was working.
- Did the update.
- Rebooted bob, which resolved the initial problem of machines not logging in, but I had to do that twice.
- Had to reconfigure the network because the interface name changed.
- As Python was updated, had to recreate the virtualenv of ps1auth.
- Had to update some packages (beautifulsoup4, kombu and django-ldap something) because they failed with the new Python.
- Rebooted the machine to be sure everything was OK.
- Got home.
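The exact sed command used on pacman.conf was not recorded. Below is a rough sketch of an equivalent edit, assuming signature checking was turned off by setting `SigLevel = Never` in the `[options]` section (that value is an assumption, the report only says signing was disabled). The function name `disable_signature_checks` is hypothetical, and the sketch does not touch per-repository `SigLevel` lines, which would still override the global one.

```python
import re


def disable_signature_checks(conf_text: str) -> str:
    """Return pacman.conf text with SigLevel set to Never in [options].

    Assumption: "disabling GPG signing" meant SigLevel = Never.
    Only the [options] section is changed; other sections are left alone.
    """
    out = []
    section = None      # name of the section currently being read
    replaced = False    # True once the new SigLevel line has been written

    for line in conf_text.splitlines():
        stripped = line.strip()
        header = re.fullmatch(r"\[(.+)\]", stripped)
        if header:
            # Leaving [options] without having seen SigLevel: add it now.
            if section == "options" and not replaced:
                out.append("SigLevel = Never")
                replaced = True
            section = header.group(1)
        elif section == "options" and re.match(r"SigLevel\s*=", stripped):
            # Swap the active SigLevel line (LocalFileSigLevel does not match).
            if not replaced:
                out.append("SigLevel = Never")
                replaced = True
            continue
        out.append(line)

    # [options] was the last section and had no SigLevel line.
    if section == "options" and not replaced:
        out.append("SigLevel = Never")

    return "\n".join(out) + "\n"
```

The function only transforms text, so it can be tried on a copy of the file first. Writing the result back to /etc/pacman.conf needs root, and the original value should be restored once the update has gone through.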
How to avoid the problem in the future
- Well, I had to learn how things worked on the fly. We plan on moving the infrastructure to more maintainable systems (Arch was a big problem here, as it had not been updated for a long time).
- Keep emergency clone VMs so we can restore quickly (for example if someone has to leave before finishing an upgrade or maintenance)?