Minutes from Wheel Meeting held 23rd of May 2020
Wheel Meeting Agenda - Saturday 2020-05-23 14:00
- VENUE: https://meetings.ucc.asn.au/
- https://meetings.ucc.asn.au/b/bob-yrk-uy6
Meeting opened 14:08
Attendance
Present
- [BOB]
- [NTU]
- [333]
- [THA]
- [MPT]
- [TPG]
- [CFE]
- [MTL]
- Wings (Guest)
Late
- [TEC], not too late though. Just 30 minutes
Absent
- Everyone else
Schedule next meeting
- Schedule/delegate reminders of next meeting
- 20 June is immediately after exams
- Let's do 27 June, same time and place
- https://meetings.ucc.asn.au/b/bob-yrk-uy6
- Curate agenda.next
- Much bikeshedding about how to automate meeting reminders
ACTION:
- [TPG] commits to hacking something up
ACTION:
- [TPG] to look after next meeting reminder
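One possible starting point for the automated reminder, as a minimal sketch only. `meeting_reminder` is a hypothetical helper, not an existing UCC script. The date, time and URL are the ones agreed above. It assumes a date(1) that accepts `-d` with an ISO date (e.g. GNU coreutils).

```shell
#!/bin/sh
# Sketch only: print a reminder line for the next wheel meeting.
# meeting_reminder is a hypothetical helper, not an existing UCC script.
# Requires a date(1) that accepts -d with an ISO date (e.g. GNU coreutils).
meeting_reminder() {
    meeting_date="$1"   # e.g. 2020-06-27
    today="$2"          # passed in so the function is easy to test
    meeting_s=$(date -u -d "$meeting_date" +%s)
    today_s=$(date -u -d "$today" +%s)
    days_left=$(( (meeting_s - today_s) / 86400 ))
    printf 'Wheel meeting %s 14:00 (in %d days): %s\n' \
        "$meeting_date" "$days_left" "https://meetings.ucc.asn.au/b/bob-yrk-uy6"
}

# Example: meeting_reminder 2020-06-27 "$(date +%F)"
```

The output could be piped to a mailer from cron a week and a day beforehand. Which list or channel it goes to is left open here.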
Standing items
Visibly reinduct new (and old?) members with the "Wheel Group Ethical Guidelines"
- Examining an Ethical Guideline, e.g. asking:
- What's an example situation?
- [NTU]: Forensics scenario? Disc space management?
- [BOB]: Coming across user images in a set of recovered webcam images (from recovered filesystem)
- Discussed "forgetting" private information that has been accessed
Status check: Regular updates, monitoring
- e.g. Debian oldstable 9 "stretch" -> Debian stable 10 "buster"
- discord-irc.ucc.asn.au could use a rebuild
ACTION:
- [333] to do in-place upgrade
- murasoi
ACTION:
- [MPT] to do dist-upgrade
- Pay attention to firewalling (iptables vs nftables) and logging
- molmol
- [NTU] was hoping for benchmarking before and after, will help
- No immediate volunteers
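On the iptables vs nftables point for the stretch -> buster upgrades: buster's iptables defaults to the nftables backend, and `iptables --version` reports which backend is in use. A small sketch for recording this before and after an upgrade follows. `fw_backend` is a hypothetical helper name. Older iptables releases are assumed to print no backend suffix, which shows up here as `unknown`.

```shell
#!/bin/sh
# Sketch only: classify the backend reported by `iptables --version`.
# fw_backend is a hypothetical helper; it only inspects the string it is given.
fw_backend() {
    case "$1" in
        *"(nf_tables)"*) echo nft ;;      # iptables-nft (buster default)
        *"(legacy)"*)    echo legacy ;;   # iptables-legacy
        *)               echo unknown ;;  # no suffix, e.g. older releases
    esac
}

# On a live host (iptables must be installed):
#   fw_backend "$(iptables --version)"
```

If existing rules misbehave under the nft backend, Debian documents switching back with `update-alternatives --set iptables /usr/sbin/iptables-legacy`. Whether that is wanted on murasoi is a decision for whoever does the upgrade.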
Mission Control (ocsinventory, uccmonitor)
- See https://wiki.ucc.asn.au/MissionControl for an overview of ocsinventory and uccmonitor
- Add molmol and/or get a NFS latency-under-load benchmark
- [DAA]: thought I'd asked before but can someone take me out of the root crontab MAILTO on murasoi
- Getting errors from the script that updates the rancid backup so perhaps it is directly defined there, I forget
- murasoi:/etc/cron.hourly/11rancid errors if mussel is down
- lard is responding to pings from murasoi but rancid cannot connect
- abe is not responding to pings and rancid is also complaining
- Who's watching hostperson?
- Would other uccmonitor notifications work better, or as well?
Active Directory logins
- [MTL]: has been playing with the UCC grafana setup to add Active Directory logins.
- UCC's AD doesn't return a mail attribute for LDAP queries.
- [MTL] is going to investigate an alternative
- Will set up a SSO system for UCC which can give the right information to grafana on signup
- [MTL] will also set up a healthchecks.io UCC install on his VM
- Once this is working, will look at migrating to UCC
Status check: Backups
- [NTU] We should have done that offsite file-restore demo
- Discussion about backup solutions
- zzdailybackup going to ???somewhere??? in the ether
- rsync.net
- backblaze
- [NTU] can we get a small monthly budget from committee for this sort of thing?
- An ongoing budget for offsite cloud services of the order of $20 per month
- Full disks
- here comes murasoi:/var - live lvextend(8) demo?
- Dead disks
- As previously mentioned, mollitz can only take 2TB disks, so there's no capacity for expansion
- Encourage members to undertake housekeeping of their home directories
- Consensus that wheel wants the club to purchase 2x 7200RPM 12TB disks, with an approximate budget of $650 each
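For the murasoi:/var full-disk item above, a dry-run sketch of what a live lvextend(8) demo might look like. The function only prints the commands and runs nothing. `grow_lv_plan` is a hypothetical helper, and `/dev/vg0/var` in the example is a made-up volume path, not murasoi's actual layout.

```shell
#!/bin/sh
# Sketch only: print (do not run) the steps to grow an LV and its filesystem.
# grow_lv_plan is a hypothetical helper; the LV path and size are caller-supplied.
grow_lv_plan() {
    lv_path="$1"   # e.g. /dev/vg0/var (made-up example path)
    extra="$2"     # e.g. 5G
    echo "vgs"                                            # check free space in the VG first
    echo "lvextend --resizefs --size +$extra $lv_path"    # grow LV and filesystem together
    echo "df -h"                                          # confirm the new size
}

# Example: grow_lv_plan /dev/vg0/var 5G
```

`--resizefs` asks lvextend to resize the filesystem in the same step, which is what makes the demo "live". It only works if the volume group still has free extents.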
Status check: Password/Key rotations
- https://en.wikipedia.org/wiki/Pro_re_nata (as needed)
Expiring wheel keys in UCCPass
- [MPT] Editing files keeps failing when keys expire
- Fixing:
- `cd /home/wheel/bin/uccpass/keys/wheel`
- `gpg2 --fingerprint | grep -C2 <first 4 of fingerprint of failing key>`
- `mv <failing>.gpg <failing>.gpg.expired`
- `uccpass reload`
- Keys of long-standing wheel members are starting to expire in quick succession
- Regen keys as required
- `gpg --gen-key`
- `gpg --export -a "John Hodge (UCC Wheel Group)" > uccpass.pub`
- `cp uccpass.pub /home/wheel/bin/uccpass/keys/wheel/tpg.gpg`
- `uccpass reload`
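Since keys are expiring in quick succession, it may help to spot them before edits start failing. The sketch below filters gpg's machine-readable listing for primary keys that expire before a cutoff. `expiring_keys` is a hypothetical helper. It assumes gpg2's `--with-colons` output, where field 5 is the key ID and field 7 is the expiry in seconds since the epoch.

```shell
#!/bin/sh
# Sketch only: read `gpg2 --list-keys --with-colons` on stdin and print
# "KEYID EXPIRY" for each primary key that expires at or before the cutoff.
# expiring_keys is a hypothetical helper, not part of uccpass.
expiring_keys() {
    cutoff="$1"   # seconds since the epoch
    awk -F: -v cutoff="$cutoff" '
        # pub records only; an empty field 7 means the key never expires
        $1 == "pub" && $7 != "" && $7 + 0 <= cutoff + 0 { print $5, $7 }
    '
}

# Example (GNU date): keys expiring within the next 60 days
#   gpg2 --list-keys --with-colons | expiring_keys "$(date -d '+60 days' +%s)"
```

This reads whatever keyring gpg2 is pointed at. How that maps onto the `.gpg` files under the uccpass keys directory is not covered here.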
...then New wheel members, additions, nominations
- Welcome to wheel!
- Read /home/wheel/docs/WelcomeToWheel
New Matters
Power Outage
- Short power outage Thursday 2020-05-21T23:46
- Team effort on remote powerup
- 4G link came back, 5 minutes sooner than the fibre uplink?
- mooneye didn't fully boot
- Pinged, connection refused on services, port 22, 25, ...
- Long fsck? then failed to mount a filesystem?
- Serial console not available?
- Power cycled with ipmitool -> success!
- Later: `systemctl restart mailman`
- motsugo, molmol NFS servers not exporting properly - accidental DNS dependence?
- Cluster hosts running, but most VMs, containers down: no nas-vmstore from molmol
- Could not login to WebUI because of...
- Samson AD server VM running, but needed: `systemctl restart samba-ad-dc.service`
- Some VMs not set to "autostart"
- Some VMs set for "autostart" needed manual start: needed nas-vmstore?
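On the autostart point, a sketch for auditing which VMs have start-at-boot enabled. It assumes the cluster is Proxmox VE (inferred from the ceph upgrade link in these minutes), where `qm config <vmid>` prints `onboot: 1` when autostart is set. `onboot_enabled` is a hypothetical helper.

```shell
#!/bin/sh
# Sketch only: succeed if the `qm config <vmid>` output on stdin has autostart set.
# onboot_enabled is a hypothetical helper; assumes Proxmox VE's qm tool.
onboot_enabled() {
    grep -q '^onboot: 1$'
}

# On a cluster host, list VMs that will NOT start at boot:
#   for id in $(qm list | awk 'NR > 1 { print $1 }'); do
#       qm config "$id" | onboot_enabled || echo "VM $id: onboot not set"
#   done
# Enable it, with a start order and delay, via:
#   qm set <vmid> --onboot 1 --startup order=2,up=60
```

A startup order and delay might also help VMs that depend on nas-vmstore, though that does nothing if the NFS export itself is not up. Containers use `pct` rather than `qm`.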
Fixing UCC DNS
- This will be an issue with the UWA network/firewall changes
- The main issue is that mooneye didn't boot cleanly
- Need to break out DNS between authoritative and resolvers
- Need to go through UCC hosts to ensure they are all configured similarly, and also to use our local resolvers
- Need to tease out things that use ucc.gu.uwa.edu.au and move things away to ucc.asn.au; use ucc.asn.au in configs
- Avoid split horizon
- Move to stuff we control
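A first pass at teasing out ucc.gu.uwa.edu.au usage could be a recursive search of each host's configuration. A minimal sketch follows. `find_old_domain` is a hypothetical helper, and `/etc` as the default search root is an assumption. It relies on grep's `-r` and `-I` options, which GNU and BSD grep provide.

```shell
#!/bin/sh
# Sketch only: list text files under a directory that mention the old domain.
# find_old_domain is a hypothetical helper; the default root /etc is an assumption.
find_old_domain() {
    # -r recurse, -I skip binary files, -l names only, -F literal string
    grep -rIlF 'ucc.gu.uwa.edu.au' "${1:-/etc}" 2>/dev/null | sort
}

# Example: find_old_domain /etc
```

Permission errors are silenced, so run it as root for a complete answer. References in places such as DHCP leases or application databases would need separate checks.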
ACTION:
- [333] to look into getting SoL (Serial over LAN) working for Mooneye's IPMI
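The power cycle used during the outage and the SoL action above both go through ipmitool, so a small sketch of the command lines involved. `ipmi_cmd` is a hypothetical helper that only prints the command. The BMC hostname and username in the example are placeholders, not mooneye's real management details.

```shell
#!/bin/sh
# Sketch only: print the ipmitool invocation for a given BMC and subcommand.
# ipmi_cmd is a hypothetical helper; host and user are placeholders.
ipmi_cmd() {
    bmc="$1"; user="$2"; shift 2
    # -I lanplus: IPMI v2.0 over LAN (needed for SoL); -a: prompt for password
    echo "ipmitool -I lanplus -H $bmc -U $user -a $*"
}

# Examples:
#   ipmi_cmd bmc.example admin chassis power status
#   ipmi_cmd bmc.example admin chassis power cycle
#   ipmi_cmd bmc.example admin sol info
#   ipmi_cmd bmc.example admin sol activate     # exit the session with ~.
```

SoL also needs serial console redirection enabled in the BIOS/BMC and a matching serial console on the host side. Which serial port and speed apply depends on the hardware.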
ACTION:
- [MPT] and [333] to integrate new Magikarp/Mudkip disks into ceph and upgrade ceph
- https://pve.proxmox.com/wiki/Ceph_Luminous_to_Nautilus
Matters arising previously
Network Changes
- Update on network nightmares with UCC/UWA DNS
- [MPT] has an email saying the change is being delayed until June
Network Security
- Current security of remote management was brought up in 2020-04-18_wheel.txt
- Still need to decide whether we need to:
- Lock down the 192.168.2.0 subnet further (ie. no access from motsugo)
- e.g. some management controllers are positively ancient, and are unpatched
- Need to look into applying stricter firewalling between UCC's LAN + the management network
- [333] moves to defer to next meeting
Annual account locking
- If it has not happened by now, it's overdue
- Are the rejoining member/password reset/new member account procedures going OK?
- Pausing until clubroom is accessible?
- What's the latest we can leave it and still get 2020 renewals in?
- Also need to tidy deleted/obsolete accounts?
- [BOB]: People will balk at paying new account fees early in 2021 if they are left too late in 2020
- [MPT]: It's probably best if account locking happens after Semester 1 exams
- Address next meeting
PREVIOUS ACTION: Purchase of 2x 1TB SSDs
- [333][THA][TEC]: Buy 2x 1TB SSDs for magikarp+mudkip Ceph?
- Passed by committee on 2019-10-04.txt
- Done 2020-05-12 [NTU]
- Installed 2020-05-22 [MPT]
mooneye, mailfish, UCC SOE
- [MTL] gives update of status
- [MTL] requests permission to move mail off mooneye
- General consensus that's okay
4G Backup Link
- [MPT]: 4G backup link maintenance
- Signal issues, external antenna now hooked up
- Comes back up automatically on power loss
- Verified by fire on Saturday morning
- Need to document and automate configuration (Ansible SOE)
Meeting closed 16:29
Current Action Items
Wheel Group Ethical Guidelines
- Add example situations regarding ethics to the Guidelines
- Or when inducting new/returning wheel members
[MPT] + [333]
- Integrate new Magikarp/Mudkip disks into ceph and upgrade ceph
- https://pve.proxmox.com/wiki/Ceph_Luminous_to_Nautilus
[MPT]
- To do dist-upgrade (murasoi)
[333]
- To do in-place upgrade (discord-irc.ucc.asn.au)
- To look into getting SoL (Serial over LAN) up for Mooneye's IPMI
[TPG]
- To look into setting up automated meeting reminders
[committee] + [wheel]
- Emails about Account Locking to be sent by the end of the month
- Should have a date for clubroom reopening by then
- Account Locking to happen after Semester 1 exams