Archive for the ‘Outages’ Category

Unexpected Mail and Xen Server Downtimes

Saturday, February 13th, 2010

At around 8am we noticed (different) hardware issues with our SMTP server wiggis as well as with one of our Xen hosting servers. This also affected the IDL license server. The problems have been temporarily solved, and since approx. 10:30am both systems have been working again as before. Expect maintenance downtimes for those services within the next week, though.

Maintenance downtime of a group server

Friday, February 5th, 2010


On Monday morning, February 8th, 2010, we have to replace a broken system hard disk on one of the group fileservers. This will cause a downtime between 07:00 and approximately 07:30.

This will result in a short service interruption for all group drives!

To protect you from losing or corrupting any of your files, it is best to close all open files on the group directories.

eGroupWare update

Tuesday, December 8th, 2009

Tomorrow, Wednesday Dec 09, starting at 07:30, we will upgrade our eGroupWare collaboration software (https://groupware.phys.ethz.ch/) to a new revision. The service will be down for about 30 min. The update addresses several known issues and should restore the SyncML functionality.

Migration Home Directories

Thursday, October 22nd, 2009


To get more free space for the home directories we need to move to a bigger fileserver.
This will be done on Thursday, October 29, between 18:00 and 21:00.

During this time the home directories (winhome, machome, unixhome), the mail services and some websites will not be available.

To protect you from losing or corrupting any of your files, we strongly recommend you close all open files on the home directories.

Since we have switched to generic names for our services, the home directories will still be accessible the same way as before, so you don't have to change anything.

Update 21:05: The migration is finished and everything should work again! In case of problems, please contact the ISG Helpdesk (3 26 68).

Hardware failure – again

Tuesday, October 6th, 2009

Today at 09:50 a crucial server died in our HIT server room. It took us about 20 min to move the affected services to other machines, during which time most of our machines weren't usable. We're sorry for the inconvenience. We're working hard to get rid of this fault-prone hardware.

Serious power outage

Wednesday, September 23rd, 2009

On Wed Sep 23 around 01:00 a major power outage hit Höngg and Affoltern. Our servers on the Hönggerberg campus switched to battery power, but since power was gone for more than 3 h, the batteries eventually drained and all servers switched off. Around 07:15 we began rebooting our infrastructure servers, but it wasn't until noon that the most important services (home, web, mail) were back up. Please bear with us while we still iron out the last remaining issues.

Unexpected downtime of authentication and printer servers

Monday, September 7th, 2009

A security update on one of our clusters failed unexpectedly and caused downtime of our authentication and printer servers, which required manual intervention. All services should be back up by now. We apologize for the inconvenience. Update 18:00: The thin clients work again, too.

Roundcube webmail upgrade

Monday, September 7th, 2009

Tomorrow, Tue September 8, starting at 07:30, we will upgrade our Roundcube installation. Expected downtime: about 15 min.

Update Tue 07:34: upgrade done.

Service interruption

Friday, August 14th, 2009

Today at 07:04 a software upgrade of a core router performed by Informatikdienste caused an avalanche of errors in our server cluster, leading to an outage of printing and some other non-critical services. It took us till around 08:00 to get the systems back to normal. We apologize for any inconvenience.

plimpy DoS

Tuesday, July 21st, 2009

Today at 16:41 a process on our terminal server plimpy freaked out and consumed all memory and swap, bringing all other processes to a complete halt. We had to reboot the machine. We're currently looking into ways to prevent this problem from happening again. Sorry for the inconvenience.