Force10: A Breakdown of My Break Downs

A lot of people see me on Twitter trashing Force10 and wonder what my issues are with the company that Dell saw enough value in to purchase and start promoting as a perfect replacement for ABC (Anything But Cisco) shops. At Cisco Live this year I had the opportunity to explain some of the major ones to a few people, and once I started walking through the multitude of cases and issues, they understood. What they didn’t understand is why Dell/Force10 isn’t trying to fix the problems or even attempting to shut me up, and frankly neither do I. I’ll break this into three areas: features that didn’t or don’t exist, issues that keep coming back, and misconceptions about Force10 in general. This is all geared specifically at the S50n and the S50v in a fully functional L3 edge role, which is exactly what was sold to my school… you can judge if they delivered or not.

First off, missing features or newly introduced features that shouldn’t have been an afterthought.

Loopguard: When I started in October of 2010, loops in our dorms were constantly bringing down the switches. My first thought was "Why haven’t we implemented loopguard?" The answer was that Force10 didn’t have it. It was added in a release since then, but what person saw that feature and said "We can wait on that one, it’s not a common problem"? That’s a hint that maybe these switches aren’t meant for the role they were sold into.

IPv6 L3 Support: C’mon guys, really? The choices offered up to me if I wanted to run dual stacked were to flatten my network or have static routes.

VLAN Autostate: Not a common request, but a feature that should be easy enough to implement. This would also help with a huge bug that we are still dealing with.

Outbound Interface ACLs: Again, not common, but why not support it? If you’re going to compare yourself to the big boys, you should have all the features.

Now to the bugs, this is where it gets fun.

OSPF Memory Leaks: This is a biggie. My switches will seemingly randomly stop routing traffic, drop management interfaces, then reboot a few minutes later. The only thing that would show up in the log was a block allocation failure in the RPM. For the longest time we had a case open with Force10 on it and got no explanation, just a request to upgrade. I have not gotten an official bug ID from Force10, but I did get a very brief explanation. When the switch received an LSA it would allocate some memory, just as any other process would. Once it finished processing the update, it wouldn’t release the memory, eventually filling up all free RAM and crashing the switch. We have experienced this issue in two versions of code.

Remember the VLAN autostate feature I want? That would help here. We run a layer 3 edge, and when all the ports in a VLAN shut off, so does the VLAN, creating an LSA flood. The other fix was to upgrade; the issue was addressed in the newest release.
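To make the leak easier to picture, here is a minimal Python sketch of the behavior as it was described to me. It is not Force10’s code; the `BlockPool` class, the function names, and the pool size are all made up for illustration. The only thing taken from the explanation is the pattern: allocate a block per LSA, never give it back, and eventually hit a block allocation failure.

```python
class BlockPool:
    """Fixed-size pool standing in for the switch's free RAM (hypothetical)."""

    def __init__(self, total_blocks):
        self.free_blocks = total_blocks

    def allocate(self):
        if self.free_blocks == 0:
            # Stand-in for the "block allocation failure" seen in the log.
            raise MemoryError("block allocation failure")
        self.free_blocks -= 1

    def release(self):
        self.free_blocks += 1


def process_lsa(pool, leaky):
    """Handle one LSA: grab a block, do the work, and (maybe) release it."""
    pool.allocate()
    # ... parse the LSA and update the link-state database here ...
    if not leaky:
        pool.release()  # correct behavior: hand the block back when done


def lsas_until_crash(total_blocks, lsa_count, leaky):
    """Return how many LSAs were processed before allocation failed,
    or None if the pool survived all of them."""
    pool = BlockPool(total_blocks)
    for processed in range(lsa_count):
        try:
            process_lsa(pool, leaky)
        except MemoryError:
            return processed
    return None


if __name__ == "__main__":
    print("leaky handler failed after:", lsas_until_crash(1000, 5000, leaky=True))
    print("fixed handler failed after:", lsas_until_crash(1000, 5000, leaky=False))
```

With the leaky handler, the pool is exhausted once the number of LSAs received equals the number of free blocks, no matter how slowly they arrive. That is why an LSA flood from VLANs going up and down only speeds up a crash that was coming anyway.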

Upgrade issues: When running L3 services on an S50v or S50n, the memory utilization is not properly calculated by the upgrade process. This means that unless you strip the switch down to a basic config, you can’t upgrade. The fix I need in the newer code version to alleviate the reboot issues can only be had by bringing the switch down, wiping the config, running the upgrade, then rebuilding the switch. Fantastic.

Linecard Resets (C300): A PoE linecard will seemingly randomly drop and come back a few minutes later. How much power is being used has no bearing on it. Still hoping for a resolution on this, but they keep closing my case and asking me to monitor.

PoE Compatibility Issues: Random issues with PoE devices (not just Cisco in case you’re wondering). We get LLDP and/or trunk mismatch errors and then the devices are powered down, but the same device will operate fine if running on another power source. I’ve gotten word that my team may have resolved the issue and I’ll update if that’s the case.

Finally, misconceptions you may hear from your sales rep or Dell.

Force10 is the cheapest 10Gb switch on the market: Sort of. Until you add the stacking modules, stacking cables, 10Gb modules, licensing, and support. Once all that is thrown in, they cost the same as everything else.

Support is superb because they’re small: Small before Dell bought them, yes, but not great. Absolutely atrocious since the acquisition. Several tickets I opened have all been closed without anything being verified. A snippet of responses:

I have confirmed with engineering and unfortunately there is no workaround for memory leaks but to either reload or upgrade the stack. Please let me know if you have any other questions/concerns.

Case closed before I was given a chance to respond.

Ryan after going through the logs it was saying that the card was unresponsive so a hard reset was sent to the card. This can happen for a various reasons unfortunately logs did not have a specific cause. Please monitor this and let us know if it happens again.

Case closed before I was given a chance to respond.

What is the error in relation to with multicast? Reloading the system may fix the symptom but I need to make sure this doesn't happen again.

Case closed before I was given a chance to respond.

All issues are closed immediately and tagged with the resolution of "This is a known issue and a resolution will be issued."

We recently started flattening our network under the assumption that Force10 will never fix these problems. I have put a new policy in place on our campus: Force10 will no longer be purchased. I can’t afford to waste time working on these switches. A new vendor hasn’t been chosen yet, but I can’t wait to get something in that doesn’t give me headaches. This is all I could put together on a Saturday morning after Cisco Live; I’ll add a second post if I can remember the rest.
