(Chiming in because storage and OS nerding.) Given the use case (safety > perf), perhaps the route-server storage would be best matched by something like a sync-mounted file system on RAID1 with no cache enabled, or read cache only with forced write-through? A pair of SLC-type SSDs with ext3/4 or XFS mounted 'sync' atop a write-through RAID1 should still be entirely awesome and performant for the job of booting an OS, loading route-server binaries, and perhaps even soaking up a few syslog outputs, while being as 'safe' as the hardware allows. If that sounds tempting, tell me where to FedEx the SSD media & caddy adapters :) -Tk
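(A minimal sketch of what that safety-first layout might look like on Linux. The device names, array name, and mount point are placeholders, and whether a given SSD or controller actually honors the cache toggle varies by hardware, so verify on the real box.)

```shell
# Mirror the two SSDs with Linux md (software RAID1, no controller cache involved)
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1

# Disable each drive's volatile write cache so acknowledged writes are durable
# (some SSDs ignore this; check afterwards with 'hdparm -W /dev/sda')
hdparm -W0 /dev/sda
hdparm -W0 /dev/sdb

# ext4 on the mirror
mkfs.ext4 /dev/md0

# /etc/fstab entry: the 'sync' option makes every write synchronous
# /dev/md0  /srv/routeserver  ext4  sync,noatime  0  2
```

The 'sync' mount trades write throughput for safety, which would hurt on a busy database but is fine for booting an OS, loading binaries, and light syslog traffic.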
On Mar 27, 2015, at 1:26 AM, Doug McIntyre <merlyn@IPHOUSE.NET> wrote:
On Thu, Mar 26, 2015 at 06:24:03PM -0500, Brady Kittel wrote: As an alternate option I have several HP dl380 g7's that are maybe a year old and still under warranty I could provide. They have licensed ILO on them as well and I could provide some hot spare drives for them.
I think with the current PE R320 and PE R805, that we are probably set, but thank you for your offer. The R320 is a pretty new box, hmm, pulling the serial # and going to Dell, it looks like it is only 16 months old. Jeff mentioned they had some more coming out of cycle, and that people were looking to get a hot spare at the UG meeting?
I think the specific problem I was seeing with the old hardware was due to a very specific configuration. The PERC6/i battery-backed RAID controller periodically (every 2-3 months or so) runs a deep drain-and-charge cycle on the battery to make sure it can hold its intended charge for the write-back log. During this cycle it switches the write-back log off and then on again, and sometimes the OS glitches on the sudden change in disk latency. My guess is that, given the age of the batteries, they had dropped below a certain threshold that exacerbated the write-log switchover (we have been running on the same hardware for quite some time without issues), but not so far below it that the controller would alert the BMC that the RAID battery was bad.
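(For anyone who hits this on LSI-based PERC controllers: the battery learn cycle can usually be inspected, and even triggered during a planned maintenance window instead of firing at random, with the MegaCli utility. The binary name and path vary by install (MegaCli, MegaCli64, /opt/MegaRAID/...), so treat this as a sketch rather than the exact invocation.)

```shell
# Show BBU state, relative charge, and whether a learn cycle is currently active
MegaCli64 -AdpBbuCmd -GetBbuStatus -aALL

# Start a learn cycle manually, e.g. during a maintenance window,
# so the write-back -> write-through flip happens when you expect it
MegaCli64 -AdpBbuCmd -BbuLearn -aALL
```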
Neither of these new systems has a battery-backed RAID setup, which really isn't needed in this use case anyway, and both are newer hardware, so I think they would serve us well.
I have both in-house with the OS installed; I just have to copy the configs back over, which I can get done pretty quickly since they aren't extensive, and haul them down to 511. I'll have to schedule some sort of switchout time. How much lead time would be required for a graceful changeout of the hardware?
-- Doug McIntyre <merlyn@iphouse.net> ~.~ ipHouse ~.~ Network Engineer/Provisioning/Jack of all Trades