Friday, December 08, 2006

Nothing wrong with agents

This post is mostly in reaction to a post from Thomas Ptacek on the Matasano Blog, one of my favorites. Tom in turn says his was in reaction to a post from Alan Shimel, who is replying to an article by Ray Wizbowski. That gives you some idea where in the food chain I am. But it doesn't get interesting for me until Tom's post, so that's where I start.

I'm not trying to feed any kind of Matasano/BigFix war but hey, they started it. (For the seriously humor-impaired, I'm kidding. I've known most of the Matasano guys casually for years, and Amrit has worked with Tom, and so on. We're just disagreeing with each other using technical points, which is how it is supposed to work.)

If you're just running across this post via some other blog, I'm the QA Manager at BigFix. We are a vendor of agent-based systems management software. Though, I'm sure our lawyers would like me to point out that I'm providing my own opinions here, and I'm not a company spokesperson. Anyway, probably because of where I work and what I know, I'm a fan of the agent-based approach, and naturally I think we do a fine job, our stuff is secure, and we can do it all. So if you're thinking "Ryan is biased", well duh.

Onto the bloggery. I make a few comments of little substance on Tom's blog entry, and he emails me to politely suggest that if I want to disagree, I should quit making snide jabs, and get to the point-by-point.

Here's the premise I start from:
  • You have a large number of machines (an "enterprise")
  • You wish to have mass control over them (you want "management")
  • The software that comes with the OS is insufficient for this purpose (you're going to buy some "software")
In other words, let's assume that the built-ins like WU, UP2DATE, YAST, ports, Software Update, and so on are not going to cut it. Point being, you have decided you have to add something on, and not having extra software isn't an option. If you disagree with this, then you probably have fewer than a few thousand machines, and the rest of this will be quite boring.

Tom's assertion is that agent-based software is bad, m'kay? and you should avoid it. To be completely fair, I'm seriously summarizing and putting words in his mouth. But take a look at the title of the post I'm responding to, "Matasano Security Recommendation #001: Avoid Agents", and this slide, which says "Enterprise Management Applications - Threat or Menace?", and you sense a theme. Yes, Tom is quite fair in the details, and will tell you he can only make claims about stuff he has tried, which does not yet include BigFix.

I understand good storytelling, yet I'm getting covered by these blanket statements. So hopefully it is understandable if I feel it necessary to respond.

So, you need some enterprise management software. Your basic choices are agents and scanners, because I've already ruled out any kind of one-by-one method as impractical by the time you get to a certain size. I'm of the opinion that you can only get so far with pure scanners. For example, they can only determine state; they can't change it. If a scanner can change things, then what you have is a scanner-driven part-time agent. Yes, such products push an agent onto the box long enough to do their business, and then get off again. And yes, there is value to only having the agent on there the absolute minimum amount of time, so I don't want to totally dismiss that benefit.

Let's check Tom... OK, he's not advocating scanners. In fact, if I'm not reading too much into it, he's not even saying you shouldn't run agents at all, he just wants fewer. But wait, are we talking per machine, or what?
  1. Minimizing the number of machines that run agent software.

Do you mean that some machines shouldn't run any agents? Then how are you going to manage them? Nothing wrong with hand-maintaining a small number of critical machines, of course, but I don't think that is what is being suggested. So this might be basic choice number 1: Are you at more risk by not having management of your machines, or by having an agent, even if it is a "bad" one? I still have to go with agent. Simple math will get you there. Count all the various threats out there, and only a small handful of them have been aimed at agents.

  2. Minimizing the number of different agents supported in the enterprise as a whole.
I think this point is far more central to Tom's message. And I don't disagree with him. Again, no huge surprise, since BigFix replaces a number of other agents. See next point.

Endpoint agents are programs that run silently in the background, usually as Windows Services or Unix daemons, which communicate back to a central management system. Well known examples include:

  • Systems Management (BMC Patrol, CA Unicenter, Microsoft MOM)

  • Antivirus (McAfee, Symantec)

  • Patch Management (Novell ZenWorks, SDS, BigFix)

  • Data Leakage Prevention

Good definition, I agree. But not on the categories, not for BigFix. We do systems management, AV management (we manage something like 6 or more AV vendors' code and signature updates), AV & antispyware engines (OEM'd), patch, software distribution, power management, inventory, etc. We do NOT do the HIDS functions à la Blink and Determina. That would be an example of someone else's software we would manage.

So it's incorrect to stick BigFix just in patch. It's a common mistake; patch is all we used to emphasize up until a few years ago. And, hey, it's not Tom's job to make sure our marketing is properly conveyed. But I make a big deal out of it precisely because BigFix is exactly the kind of thing he's calling for to help reduce the number of agents running around.
Agent-based architectures are a severe security risk.
So now Tom makes one of these leaps I object to. He's drawing mass conclusions based on (significant) experience actually looking at a bunch of agent systems. But you can't make a factual statement about all N software products by looking at N-M of them, if M is greater than zero. You can only state generalities.

He gives specific classes of examples. While I still owe the world a BigFix architecture document (I know you're all anxiously waiting), let me give some short previews as responses.
Listening Network Services on Agents
You can disable the BigFix notification protocol, and go full polling if you want. We are client pull. Even with the listener in the default listening state, the protocol is simple. It's just a 12-16 byte (payload) UDP packet. It suggests to the agent that there is something upstream that it should check for.
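The "hint, then pull" pattern described above can be sketched roughly as follows. This is a minimal illustration, not BigFix's implementation: the post gives only the payload size (12-16 bytes) and does not describe the packet layout, so the check here looks at length alone, and the names `is_plausible_hint`, `wait_for_hint`, `agent_cycle`, and `poll_upstream` are all made up for the example. The point it tries to show is that the datagram carries no command; it only shortens the wait before the agent's next pull.

```python
import socket
import time

# Assumption: only the 12-16 byte payload size comes from the post.
MIN_HINT_BYTES = 12
MAX_HINT_BYTES = 16


def is_plausible_hint(datagram: bytes) -> bool:
    """Accept only datagrams whose size matches the range the post mentions."""
    return MIN_HINT_BYTES <= len(datagram) <= MAX_HINT_BYTES


def wait_for_hint(sock: socket.socket, timeout: float) -> bool:
    """Wait up to `timeout` seconds for a plausible hint.

    Datagrams of the wrong size are ignored and the wait continues.
    Returns True if a hint arrived, False if the timer ran out.
    """
    deadline = time.monotonic() + timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        sock.settimeout(remaining)
        try:
            datagram, _sender = sock.recvfrom(2048)
        except socket.timeout:
            return False
        if is_plausible_hint(datagram):
            return True


def agent_cycle(sock: socket.socket, poll_upstream, timeout: float) -> bool:
    """One loop iteration: pull after a hint, or after the timer expires.

    `poll_upstream` is a hypothetical callable standing in for the agent's
    normal client-pull request. Nothing in the datagram is executed or parsed.
    """
    hinted = wait_for_hint(sock, timeout)
    poll_upstream()
    return hinted
```

With the listener disabled, the same loop degenerates to pure polling: every cycle simply times out and pulls.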

Listening Network Services on Management Servers

OK, got me there. We've discovered that either the agent or the servers need to have something listening on the network, as a general design principle. Are you suggesting that people go without management at all again? I'm pretty sure any alternative to an agent will want a listener, too.

Client of Agent Service on Management Server
That's our default, but it's not necessary if you don't want it. We use the agent on the server to do software upgrades on the server. But you can do it manually if you choose. I, for one, expect a software distribution system to be self-upgrading. But here, you're implying that the server is security-critical, i.e. crack the server, and you have the agents. BigFix doesn't work that way. All the security is in the signing keys.

Confidentiality and Integrity of Agent/Server Protocol
Ah, this is where we're especially awesome. Everything the agent pulls down has been signed by the private key of an administrator, and is verified by the agent before it will save it or look at it. We use OpenSSL and zlib libs, and of course track vulnerabilities in those, and re-release when they re-release.
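The ordering described above (check the signature over the raw bytes first, and only then decompress or parse) can be sketched like this. Big assumption, stated loudly: Python's standard library has no public-key signature primitive, so this example uses an HMAC purely as a stand-in for the signature check. The system described in the post verifies a signature made with an administrator's private key, which is a different and stronger arrangement. The names `sign`, `accept`, and `UnsignedContent` are invented for the illustration.

```python
import hashlib
import hmac
import zlib


class UnsignedContent(Exception):
    """Raised when content fails the signature check and is discarded."""


def sign(key: bytes, blob: bytes) -> bytes:
    # Stand-in only: an HMAC, NOT the public-key signature the post describes.
    return hmac.new(key, blob, hashlib.sha256).digest()


def accept(key: bytes, blob: bytes, signature: bytes) -> bytes:
    """Verify the still-compressed bytes, then decompress.

    The decompressor never sees data that failed verification, so a bug in
    the parsing stage is not reachable with unsigned input.
    """
    if not hmac.compare_digest(sign(key, blob), signature):
        raise UnsignedContent("signature check failed; content discarded")
    return zlib.decompress(blob)
```

The design choice being illustrated is that the attack surface exposed to unsigned input shrinks to the verification code itself.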
Web Application on Management Server
I'm suspicious that we're talking about different animals here, but we have an optional Web Reports component that can be run on the server or on its own server. And it can do SSL if you like. It will be there if you do an install taking all the defaults. And again, getting the server for BigFix doesn't get you the agents.
Javascript on Browser Client of Management Server
This is what makes me think we might be talking different animals. We don't have a web-based management interface. Or rather, to be completely up-front, we use the IE libs in our MFC app which is our Console, and everything is run in restricted zones or comes from signed content.
Listening Network Services for Management Clients on Management Server
Isn't this one a dupe?
Middleware Frameworks and RPC
I think, from having listened to your Black Hat talk, this is referring to complicated protocols between the agents and server. We use a subset of HTTP, and move files around.

Client of Management Server Service on Agent

What? Can't parse.

Display Logic for Agent-Sourced Data on Management Client

Ah, we could potentially suffer from this class of problem if we have bugs there. You got one.

Confidentiality and Integrity of Client/Server Protocol

Isn't this a dupe? If not, which Client and Server are we talking about, if not the agent and server? The management console? Ours speaks the minimal HTTP and TDS (the MS SQL Server protocol; well, the Sybase protocol, but now I'm just being pedantic because I used to work at Sybase).

Databases

Yep, got one of those. You can't compromise our agents if you get the database.

Agents tend to be installed en-masse. Attacks that offer uniform compromise of all installed agents provide attackers with thousands of hijacked machines.

Yep. True of any central management system, if you find a flaw that allows control of the endpoints. How is this particularly the fault of agents?

Even in the absence of an exploit that compromises agent software directly, it is impractical to ensure the security of thousands of endpoints. But every machine running an agent must be secured if the management components are to be shielded from attacks.

Ah, you assume that only agents can attack the server? Not so for BigFix. Unless the customer has done some extra firewalling, anyone with IP connectivity can talk to the server. Attack away.

In a majority of surveyed agent-based systems, compromise of a single management server allows code execution on every agent, exposing the enterprise to a single point of failure.

For our system, let's call this "stolen keys". Yes, if you steal some keys (and the passphrase), you can act as the owner of those keys. That's why we have key revocation. We've got a whole PKI built-in, it works quite well. Something can always be stolen, spoofed, or impersonated. We went with what we felt has the best security, and has attestation to boot. Our financials customers love the audit trail.
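As a rough sketch of why revocation plus an audit trail limits the damage from stolen keys: once a key is revoked, nothing signed with it is honored, and every decision leaves a record. This is an illustration under assumptions, not BigFix's PKI. The `KeyRegistry` class, its fields, and the key identifiers are hypothetical, and the sketch assumes the signature itself has already been verified elsewhere.

```python
from dataclasses import dataclass, field


@dataclass
class KeyRegistry:
    """Hypothetical record of which signing keys are trusted or revoked."""
    trusted: set = field(default_factory=set)
    revoked: set = field(default_factory=set)
    audit_log: list = field(default_factory=list)

    def revoke(self, key_id: str) -> None:
        """Mark a key as revoked and record that it happened."""
        self.revoked.add(key_id)
        self.audit_log.append(("revoke", key_id, None))

    def may_act(self, key_id: str, action: str) -> bool:
        """Allow an action only for a trusted, unrevoked key; log either way."""
        allowed = key_id in self.trusted and key_id not in self.revoked
        self.audit_log.append(("allow" if allowed else "deny", key_id, action))
        return allowed
```

For example, a registry that trusts a key `"admin-1"` allows its actions until `revoke("admin-1")` is called, after which the same request is denied and both outcomes appear in `audit_log`.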

This class of problem is true of any central management system. Steal the important authentication thingy, and you control the endpoints. Why is this particularly an agent problem? Do you prefer some sort of scanner thing that gives the admin creds to every IP it hits? Are you proposing no central management again?

Agent implementations are often substantially homogenous, even across operating systems, enabling uniformly effective attacks against desktops, Windows servers, and Unix servers.

We prefer to think of it as uniform management, but guilty as charged. So yes, if you steal some keys, we have a cross-platform language you can use to command the agents with. Admittedly, for other central management systems, you would have to craft your commands in a number of different shells.

Workstations of management operators are high-value IT targets, and compromised agents can inject poisonous data to exploit a myriad of clientside and XSS-style attacks to hijack their machines.

This is a potentially viable technique if we have bugs in that area. But like I said, you needn't be an agent to attack there, go for it. One of the points from your Black Hat talk was that apps that weren't Internet-facing didn't have to survive those attacks, and were weaker for it (my wording.) So far, our customers with Internet-facing servers and relays where attackers could try feeding bogus data haven't fallen over. Maybe we're just enjoying some obscurity.

[Section on the kinds of things Matasano has found elsewhere.]

No doubt about it, Matasano is good at what they do. I'm looking to have more outside auditing Real Soon Now. I have no illusions that we'll have a 100% flawless clean bill of health when we put it in front of someone of Tom's caliber. What I AM confident about is that we will do far better than the others Tom talks about (but can't name, because they don't have their patches out yet.)

First off, my programmers can beat up your programmers. Second, our architecture is designed to eliminate huge swaths of problems. That thing where we have everything that hits the agents be signed? Right. It means you can't throw attacks at the agents unless they are signed. You have to find flaws in OpenSSL or zlib to try attacks before that stage. While not perfect, we use those libraries for a reason. Third, when something is found, we get our patches out in a timely manner. The last big thing we had? 3 days. And since our system does software distribution and patch management, you could be fully patched about 10 minutes after that, or as soon as your change management allows.

[Tom's mitigating factors]
If I did point-by-point here, a lot of it would be redundant. Hopefully, some summarizing will suffice.
  • I fully disagree with removing the most important assets from management.
  • You don't need to segregate classes of managed machines if it's not important to "be an agent".
  • My protocols are as simple as can be. They just move files around. The files are all signed, though, and that's going to cause the attackers some trouble.
  • Suggesting "use SSL" alone is a boondoggle. The key point, easy to miss, is that Tom is suggesting that agents sign reports. While that has value, and BigFix will likely offer that as an option in the future, it shouldn't be key to the system surviving.
  • Use third-party auditing? Actually, I agree with you there, and I will be doing more. But is that recommendation a huge surprise, given Tom's job? ;)
[Full-snark mode, Tom's conclusions]
Agent-based architectures are incredibly convenient and can be a significant cost-saver for IT operations teams.
You forgot: And if you don't have one, or even some other central management system with the exact same class of problems, then you are in FAR, FAR worse shape than having a few agent holes to deal with.
In all circumstances, enterprises should seek to minimize the number of agent installations within their enterprise.

Indeed. And BigFix sales people are standing by.

In all circumstances, enterprises should seek to minimize the number of different agent-based vendors their enterprises must support.

Still right with you there.

Agent-based software should be treated as a high-risk target for attacks. Agent software warrants intensive security testing and analysis and rigorous access control.
Treat us that way if you like, we won't hold it against you. And then we will replace all the vendors who didn't hold up to scrutiny. After all, Tom's talking about our competition.

8 comments:

Thomas said...

I have no problem with the argument that concedes my points ("if you have to deal with agents, minimize the number of vendors") and then tries to establish a particular vendor as "the most secure, most viable" choice.

Competition is good, and competition on security is sorely lacking in this space.

Steve said...

Ryan,

Thanks for taking the time and effort necessary to explain things to this level of detail. Very clear.

Steve Larsen

Amrit said...

Very nice write-up

Alan Shimel said...

Ryan- nice write up. Two things though. First of all, other agents are bad, but yours is Ok because it is superior? Is that at least partially what you are saying? Secondly, how does the whole agent thing deal with unmanaged devices? This is really the heart of my position, that it is not one size fits all. You need both agent and agentless options, depending on what you are trying to do.

Ryan Russell said...

Alan,

I tend to think BigFix's agent is better in terms of security. It was a specific goal from the beginning, and I think we've done a good job. I haven't done any competitive analysis of everyone else's stuff. To the degree I assume we are more secure, I think a lot of that comes from what Tom says.

But even though I take Tom's word for it that most of the agents are "bad" (meaning, they have easy-for-Tom-to-find vulnerabilities), I wouldn't go without a central management system. If you don't have one, you get owned now. If you have a vulnerable one, you maybe get owned later, and then you can (hopefully) fix it. So if I were the IT guy again, I would take a BigFix competitor over nothing.

I prefer a primarily agent-based system.

Going with the highest-level categories, your other choice is something scanner-based. A scanning solution has the inherent advantage that it should find unmanaged devices, and alert you to them. But that's the only advantage I see to scanner-based.

The advantages I see for agent-based are better information, better ability to change and fix the endpoint, ability to act when the machine is off-network, ability to get out through the endpoint firewall, it's always on when the endpoint is on, and is generally faster and scales better.

So in our case, we can have our agents do distributed scanning (we have licensed nmap) to find unmanaged machines and devices that we can't load a traditional agent on (printers, routers, switches, etc.). We are considering some future enhancements such as having the agents act as local information gathering points for their network segment, and the possibility of using the agents to manage other devices that can't have an agent themselves.

So we are an agent-scanner hybrid, heavy on the agent.

Scanner-based solutions, by contrast, will tend to put an agent on an endpoint long enough to cause some change, even if it's something as simple as a short script, or calling the system APIs remotely to twiddle the registry or stop and start services. So they are scanner-agent hybrids, maybe heavy on the scanning.

I figure most of the vendors must be a hybrid of some sort.

I see you like NAC setups. NAC still needs decision support, whether it's agent or scanner. I have some doubts about how well a scanner works with NAC; my NAC integration guys tell me that you have 500ms to give the Cisco software an answer. Maybe that's just for the way we do it.

We integrate with NAC to be the health-check piece. We've gotten some good feedback from customers and Cisco people on how well we work.

Anonymous said...

I think you need to look at the market space and rethink who you identify as competition. There are several up-and-coming vendors, including Elemental Security, Looking Glass Systems, and Secure Elements. Looking Glass was a system built to do these things on purpose and has been penetrating the federal space very successfully. Secure Elements is another one, with perhaps more focus on policy.

BigFix is gaining traction for sure... but their marketing is Mickey Mouse.

Ryan Russell said...

Anonymous:

I'm not the best one to ask about who our competition is. That's a better question for sales. I personally am vaguely aware of Elemental Security because of Farmer and Van Rossum.

If the gist of the question is why I didn't list them in the competitors list, that's because it's Tom's list.

As to our marketing, I like it. But I have no idea how effective it is.
