SCADA and MTBH - Whizistic's Lair

SCADA and MTBH [Jun. 10th, 2006|08:44 am]
From a systems design book written in 1995:

Operations can no longer afford the luxury of dedicated stand-alone systems. Hardware and software of the modern SCADA system must be incorporated into the enterprise-wide, management information systems strategy to maximize the benefits.

Translating from PHB speak: Run your SCADA system over your enterprise network, rather than have an independent out-of-band means of communication.

From a systems design document from BART, written slightly more recently:

The station computer itself is divided into two systems. First is the Non-Vital Station Computer (NVSC). ("Vital" in the context of railroad safety can be interpreted to mean that the function is critical to the safety of the system.) The NVSC is a fast, flexible computer. The hardware is reliable enough to meet performance-driven Mean Time Between Failure (MTBF) and Mean Time Between System Shutdown (MTBSSD) goals but not safety-related Mean Time Between Hazard (MTBH) goals. These goals are defined precisely below. Second is the Vital Station Computer (VSC), which is slower but reliable enough to meet safety-related MTBH goals.

Now, I realize I'm comparing apples to oranges here. The first quote is referring to SCADA controls for power distribution, designed to keep the power flowing, while the second quote is talking about SCADA controls for people-moving and keeping people-movers from becoming airborne or hitting each other. Clearly the risks involved in a failure of SCADA controls in these two examples, be it via computer virus or lightning strike, are not equivalent. Still, the thought of running what we term "safety sensitive" systems over your typical enterprise network[1] makes every one of our controllers' blood run cold. And yet, that's where things are moving to. Some SCADA salesman: "Yeah, the PLC just has an RJ-45 jack on it, plug it in and it'll get a DHCP address..."

I need a bigger cluebat.

[1] Well, maybe your enterprise network can meet an MTBH goal of no hardware faults in 10^9 hours. Mine certainly can't.
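
For scale, here is a quick back-of-the-envelope conversion of that 10^9-hour figure into years. It is only a sketch: the 365.25-day year and the helper name `hours_to_years` are assumptions made for illustration, not anything taken from the BART document.

```python
# Assumption: a Julian year of 365.25 days, good enough for a rough conversion.
HOURS_PER_YEAR = 24 * 365.25


def hours_to_years(hours: float) -> float:
    """Convert a duration in hours to years (hypothetical helper)."""
    return hours / HOURS_PER_YEAR


mtbh_goal_hours = 1e9  # the 10^9-hour MTBH goal from the footnote above
print(f"{hours_to_years(mtbh_goal_hours):,.0f} years between hazards")
```

That works out to roughly 114,000 years, which is why the goal is a statement about failure probability rather than something anyone expects to observe directly.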

From: johnreen
2006-06-10 08:00 pm (UTC)
In this regard, my favorite revelation about the A380 is that it uses "normal" Ethernet and CAT5 to transmit commands from the flight controls to the (hydraulics controlling the) control surfaces... over—you guessed it—UDP.

Well, I guess if it's good enough for DNS...

From: whizistic
2006-06-10 08:23 pm (UTC)
heh. Damn engineers and their KISS principle.

I guess I can understand why. Cat5 is less susceptible to crosstalk than what they used before, presumably. And as long as UDP carries a session at the application level, I can understand skipping all that TCP overhead, as long as there still is a conversation involved; i.e.

controller->controls : hey! flaps 30!
controls->controller : I heard flaps 30, that right?
controller->controls : goddamn right! flaps 30!
controls->controller : doing flaps 30 now!
controls->controller : flaps 30 done!
controller->controls : go away! err, kthxbye!
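
A minimal Python sketch of that kind of conversation, run over two UDP sockets on the loopback interface. Everything here is hypothetical: the message names (`CMD`, `READBACK`, `CONFIRM`, `DONE`) and the helper functions are invented for illustration and have nothing to do with what the A380 actually runs. The point is only that the acknowledgements live in the application, not in the transport.

```python
import socket


def make_endpoint() -> socket.socket:
    """Create a UDP socket bound to an ephemeral loopback port."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.bind(("127.0.0.1", 0))
    s.settimeout(1.0)  # a lost datagram raises a timeout instead of hanging
    return s


def exchange(sender: socket.socket, receiver: socket.socket, text: str) -> str:
    """Send one datagram and return the text the receiver actually got."""
    sender.sendto(text.encode(), receiver.getsockname())
    data, _ = receiver.recvfrom(1024)
    return data.decode()


def run_handshake(command: str) -> list[str]:
    """Command, readback, confirm, done: each step is one datagram."""
    controller, controls = make_endpoint(), make_endpoint()
    log = []
    try:
        heard = exchange(controller, controls, f"CMD {command}")
        log.append(heard)
        # The controls echo back what they heard, minus the "CMD " prefix.
        readback = exchange(controls, controller, f"READBACK {heard[4:]}")
        log.append(readback)
        # The controller confirms only if the readback matches the request.
        if readback != f"READBACK {command}":
            raise RuntimeError("readback mismatch; command not confirmed")
        log.append(exchange(controller, controls, f"CONFIRM {command}"))
        log.append(exchange(controls, controller, f"DONE {command}"))
    finally:
        controller.close()
        controls.close()
    return log


print(run_handshake("flaps 30"))
```

There is no retry logic here; a dropped datagram simply surfaces as a `socket.timeout`, and deciding what to do next is exactly the part TCP would otherwise have handled.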

Unrelatedly, do you really want everyone on their cellphones in a plane, even if safety-speaking the risk is negligible (i.e. the plane has been hardened)? Because if I were on a flight and forced to listen to some guy's phone call to his insurance company regarding the emergency room visit to have his cockring dremeled off, I'd demand a refund.

From: jgp
2006-06-10 08:58 pm (UTC)
Ack. In more ways than one.
From: johnreen
2006-06-10 10:50 pm (UTC)
cat5 is less susceptible to crosstalk than what they used before, presumably.

Presumably... but I've never had much faith in Airbus to "get it right." In fact, reading that Wikipedia page made me even more amused/afraid:
The microchips control the A380's cabin-pressurization system; Mangan has stated that the combination of TTTech's microprocessor and a new architecture of valves could cause the A380 to undergo rapid decompression. This sudden drop in cabin pressure could cause the flight crew to lose consciousness and pose a major hurdle to safe flight.
The A380 was initially planned to do away with thrust reversers as it has more than enough braking capacity. The FAA disagreed and Airbus elected to fit the 2 inboard engines with them.

I can understand the use of UDP over TCP, but... I'm just surprised they didn't use a more... realtime, domain-specific protocol. Also, despite the domain-specific nature of the use of UDP, what happens when you have:

controller->controls : hey! flaps 30!
controller->controls : Hey... lazy asses... FLAPS 30!
controls->controller : I heard flaps 30, that right?
controller->controls : goddamn right! flaps 30!
controls->controller : I heard flaps 30, that right?
controller->controls : WTF?!?! flaps 30 now or we all die!
controls->controller : I heard flaps 30, that right?
controller->controls : Oh, Jesus Christ... I'll just come down there and do it myself!!
controls->controller : Hello? Hellooooo???! Did you still want flaps 30?!!


And given Airbus' history, I'm not real confident that they can get the "corner cases," as we euphemistically say, correct.
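
One common way to keep impatient retransmits like the ones above from confusing the receiver is to tag every command with a sequence number and make handling idempotent. The sketch below is a hypothetical illustration of that idea; the `Controls` class and the `ACK` message format are invented here and say nothing about how Airbus handles it.

```python
class Controls:
    """Hypothetical receiver that tolerates retransmitted commands."""

    def __init__(self) -> None:
        self.replies: dict[int, str] = {}  # sequence number -> reply already sent
        self.executed: list[str] = []      # commands actually carried out

    def handle(self, seq: int, command: str) -> str:
        # A repeated sequence number means the sender retried:
        # resend the same reply, but do not act on the command again.
        if seq in self.replies:
            return self.replies[seq]
        self.executed.append(command)
        reply = f"ACK {seq} {command}"
        self.replies[seq] = reply
        return reply


controls = Controls()
first = controls.handle(1, "flaps 30")
retry = controls.handle(1, "flaps 30")  # the "Hey... lazy asses" retransmit
print(first, retry, controls.executed)
```

The reply cache grows without bound in this sketch; a real system would need to expire old sequence numbers, which is one more corner case to get right.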

Unrelatedly, do you really want everyone on their cellphones in a plane, even if safety-speaking the risk is negligible (i.e. the plane has been hardened)?

I'm fine with the current restrictions (especially for the reason you mention), but I don't think they're as much a threat to safety as everyone says, for the following reasons:

  • Flights are conducted every day with electromagnetic interference, from laptop computers and cellphones, without incident. People always tell stories of forgetting to turn off their cell phones. And do you really think most technology-illiterate adults know how to "turn the transmit function" off on their cellphones, to say nothing of turning off Bluetooth and Wi-Fi on their laptops?

  • The restriction against cellphone use in the skies comes from... the FCC, not the FAA. The FAA couldn't care less, as long as it doesn't impact flight safety.

  • The Mythbusters had an episode about this, and were unable to make a glass cockpit show any interesting deviations.

Having said that, I probably wouldn't want to have everyone on their cellphones in hard IFR conditions while trying to execute an instrument approach... and some pilots have reported (with Airbuses, especially) that sometimes, the plane's flight controls "wig out" when there's electromagnetic interference... hence the phrase "If it's not Boeing, I'm not going!"

From: whizistic
2006-06-10 11:20 pm (UTC)
Oh yeah, great article there. :) Airbus has always seemed a bit wonky to me, and reading that A320 report lends credence to what I've been thinking re: government/industry collusion. My favorite line in the Wikipedia article:

The NSS has enough inbuilt robustness to do away with onboard backup paper documents.

Well that's nice. So long as you don't *actually* do away with the onboard backup paper documents. I'd hope there's a rule about that.

The comments about switching from copper to aluminum scare me. I don't recall my electrical physics enough to say why, but the specs for cat5 performance on copper conductors exist, while specs for aluminum conductors don't exist [1].

I'm sure it was really a cost-cutting move.

Re: cellphones, I suppose the only thing preventing the skies from turning into a cacophony of noise is (1) the oppressive price per minute for skyphones and (2) the laziness of the FCC.

[1] A search on Google for aluminum cat5 shows only cables which are shielded. The conductors are still copper. I'm sure Airbus is engineering cable specs from scratch anyway.

P.S. Edge case testing? what's that? :)
P.P.S. Boing like totally rulz lols !!1!
From: johnreen
2006-06-11 12:26 am (UTC)
(2) the laziness of the FCC.

Hey now... it TAKES A LOT OF WORK to sort all those frequencies out, man!!

I will say this: if I ever lose my radio, I won't hesitate a bit to call up Palo Alto Tower from the cockpit, and get a clearance to land via cellphone... it uhh... wouldn't be the first time.

From: jgp
2006-06-10 08:58 pm (UTC)
What do you mean, your network can't meet an uptime of 114,000 years? Disappointing.

From: whizistic
2006-06-10 09:37 pm (UTC)
Actually, I can; you can't get hardware faults if the network doesn't exist. :)

Well, the rule actually says no hardware faults which cause a Hazard (i.e. possibility of train derailment or collision), which can be avoided through redundancy, but then you get into those more systemic network issues like broadcast storms and routing loops, which apparently don't count. Gotta love the feds and their wording from 1970s relayville.