cURLing Your Way To Redfish Idiosyncrasies
2024-03-19
Many modern data centers actually have two separate networks. Most people only ever interact with the primary, "in-band" network. This is the network that servers use to talk with one another and share information. But there's a second network, the "out-of-band" (OOB) network. This network is used exclusively for managing the data center itself.
One of the ways the OOB network is used to manage a data center is to talk with servers' BMCs (baseboard management controllers). BMCs are little computers that sit on the back of servers and can be used to run diagnostics, collect telemetry, and perform various administrative functions. The primary method of interaction with BMCs for most people is a protocol called IPMI. IPMI is...not great. It's essentially a custom network protocol, which means that if you want to write software using it, you've got a lot of work ahead of you.
In 2014 we got something "better": Redfish. Redfish has two primary improvements over IPMI:
- Redfish is a REST API that can be used with any HTTP client.
- All the data sent to and from the BMC is formatted in nice, easily readable JSON.
I've been working on automating some of our server management at work, and I was excited by the prospect of using something different from IPMI. While it's far from perfect, Redfish seemed like a step up. What's more, since everything is just standard-ish REST, I could use curl to test out and explore various bits of the API before writing any real code.
Redfish Examples
Let's say we have a BMC sitting on our OOB network with the IP address 10.91.148.139. We can query the "redfish root" by doing a simple curl on that host with the path /redfish/v1:
I've elided some of the data here, but as we can see, there's a lot of useful stuff. But if we want to look at any of it, we're going to need to establish an actual session. This can be done by doing a POST to the Sessions link we received in the above response, i.e. "creating a session" in REST parlance. Let's log in as an administrator:
The headers we received with the response (output to headers.txt) included an X-Auth-Token header. We can use this token in all future requests to act as the administrator. To anyone with a passing familiarity with REST, this should all feel very familiar. This is all very standard, which is great! For example, using one of the links we got in our first response, we can now get information regarding all of the accounts on the BMC:
If we wanted to, we could query the individual accounts returned here. By now, you should be noticing a pattern: Redfish responses typically include links to further information. Also, as one might expect, if we wanted to create a whole new account, we could do it using a properly formulated POST to the /redfish/v1/AccountService/Accounts path. Standard REST, it's a wonderful thing.
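For anyone who would rather script this flow than type curl commands, here is a minimal Python sketch of the same three steps: read the Sessions link from the root document, POST credentials to it, then send the token on later requests. It only builds the requests, and it makes a few assumptions that are not from the commands above: the helper names are made up for illustration, the Sessions link is assumed to live at Links.Sessions["@odata.id"] in the root document, and the login body is assumed to use the UserName and Password fields.

```python
import json
import urllib.request

BASE = "https://10.91.148.139"  # BMC address used in the examples above


def sessions_path(root_doc):
    """Return the Sessions link from a parsed /redfish/v1 response.

    Assumption: the link sits at Links.Sessions["@odata.id"].
    """
    return root_doc["Links"]["Sessions"]["@odata.id"]


def login_request(base, path, username, password):
    """Build the POST that "creates a session" (assumed field names)."""
    body = json.dumps({"UserName": username, "Password": password}).encode()
    return urllib.request.Request(
        base + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def authed_get(base, path, token):
    """Build a GET that carries the X-Auth-Token from the login response."""
    return urllib.request.Request(
        base + path,
        headers={"X-Auth-Token": token},
        method="GET",
    )
```

Each request can then be sent with urllib.request.urlopen, and the token read from the X-Auth-Token header of the login response. Keeping request construction separate from sending makes the pieces easy to check without a BMC on hand.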
Idiosyncrasy 1: Failed Logins
So far, all the commands I've been running have been on a Quanta server. However, when I started working with a Dell server, I started to notice some...unexpected deviations. For instance, one of the forms of automation I've been creating is ensuring that certain accounts are present on the BMC. Since I'm doing this as part of a bootstrapping process, I actually don't know if I'm able to log in and do any of the introspection I've shown above. Absent the ability to actually log in, the easiest way to determine if an account exists is to attempt to log in as that account. If I get back a 401 response, it's safe to assume that the account doesn't exist on the BMC yet. At least, that's what I would expect from having worked with REST before. On the Quanta machine, this is exactly what happens:
Let's try the same thing on a Dell machine (note that for Dell, the Sessions path is different, which is very much permitted by the Redfish spec).
This is very different from what we got with the Quanta server and is less than ideal for a number of reasons. First of all, we only get back a 400 Bad Request. This error code is pretty generic and could mean a lot of different things (as Dell's documentation says). In addition, instead of getting back a JSON response, we've gotten back a pretty generic HTML page. In short, there's no way to know for sure that the problem here is a bad username and password. I couldn't really find a good solution to this. Since I've actually run my automation code and confirmed it works for valid credentials, I figured it was safe to assume that if I get a 400 when I'm doing auth on a Dell machine, it's because of auth problems, not because of, say, a malformed request. If anyone from Dell is reading this, I would love to chat about how we can make this better. I'm pretty sure just having a JSON response with a well-defined error code would do the trick.
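The workaround above boils down to a small decision rule, which can be sketched in Python as follows. The function name, the return labels, and the assume_400_is_auth flag are all hypothetical; the flag simply encodes the judgment call described above (treat a 400 during login on a Dell machine as bad credentials), and the 200/201 success codes are an assumption about what a successful session creation returns.

```python
def classify_login(status, assume_400_is_auth=False):
    """Map the HTTP status of a login attempt to a coarse outcome.

    assume_400_is_auth: set for BMCs (Dell, in the text above) that answer
    bad credentials with a generic 400 instead of a 401.
    """
    if status in (200, 201):  # assumed success codes for session creation
        return "ok"
    if status == 401:
        return "bad_credentials"
    if status == 400 and assume_400_is_auth:
        # Heuristic only: a 400 could also mean a malformed request.
        return "bad_credentials"
    return "unknown"
```

Keeping the vendor quirk behind an explicit flag means the guess stays visible in the calling code rather than being buried in a status check.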
Idiosyncrasy 2: Account Creation
The next step in my automation journey was to actually create accounts that were missing. On Quanta, this works just the way you think it would. You curl the accounts path with a POST:
We could even remove the account with a DELETE on the returned id:
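As a rough Python counterpart to the Quanta-style POST and DELETE just described, the sketch below builds both requests. The helper names are invented for illustration, and the body fields (UserName, Password, RoleId) are an assumption about the account schema rather than a copy of the original commands; the accounts path is the one mentioned earlier.

```python
import json
import urllib.request

ACCOUNTS_PATH = "/redfish/v1/AccountService/Accounts"  # path from the text


def create_account_request(base, token, username, password, role_id):
    """Build a POST that creates a new account (assumed body fields)."""
    body = json.dumps(
        {"UserName": username, "Password": password, "RoleId": role_id}
    ).encode()
    return urllib.request.Request(
        base + ACCOUNTS_PATH,
        data=body,
        headers={"Content-Type": "application/json", "X-Auth-Token": token},
        method="POST",
    )


def delete_account_request(base, token, account_path):
    """Build a DELETE for the account path returned by the create call."""
    return urllib.request.Request(
        base + account_path,
        headers={"X-Auth-Token": token},
        method="DELETE",
    )
```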
But with Dell, things are completely different. The accounts path doesn't support a POST or DELETE; it only supports a PATCH. What? How are we supposed to create or delete accounts? After some digging, it turns out that all the accounts a Dell BMC can have are already created; they're just all disabled, have their role set to None, and have blank credentials. So to "create" an account, you update the account like so:
To "delete" the account you actually need to make two separate requests, one to disable the account and the other to clear out its credentials (you'll get an error if you try to do both in the same request):
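To make the Dell slot model concrete, here is a hedged Python sketch of the PATCH bodies involved. Everything specific in it is an assumption for illustration: the function names, the property names (UserName, Password, RoleId, Enabled), the exact split of fields between the two "delete" requests, and the idea that the caller already knows the URL of a free account slot.

```python
import json
import urllib.request


def dell_create_payload(username, password, role_id):
    """Body that "creates" an account by filling in an empty slot."""
    return {
        "UserName": username,
        "Password": password,
        "RoleId": role_id,
        "Enabled": True,
    }


def dell_delete_payloads():
    """Two bodies, sent as two separate PATCH requests, in this order."""
    return [
        {"Enabled": False, "RoleId": "None"},  # 1) disable the account
        {"UserName": ""},                      # 2) clear its credentials
    ]


def patch_request(slot_url, token, payload):
    """Build a PATCH against one account slot (slot_url is caller-supplied)."""
    return urllib.request.Request(
        slot_url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json", "X-Auth-Token": token},
        method="PATCH",
    )
```

Returning the two delete bodies as an ordered list keeps the "never combine them" rule in one place, so calling code can just loop over them.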
This is...super weird. I'm sure it's due to some implementation details on the Dell BMC and perhaps related to backwards compatibility with Dell's proprietary management, iDRAC. But still, I'm sure there was something they could've done to paper over this. This is hardly what I'd call proper REST.
Can we do better?
When all is said and done, I'm not sure how much of a win Redfish is.
The ability to have fairly standard responses over HTTP with wide language support is indeed better than what we had before.
Furthermore, all of the curl commands I've shown above were super helpful in exploring the way different vendors implement Redfish.
I was able to take all of my learnings and easily translate them into the actual code.
Doing something like this would be much harder if the Redfish spec used something other than HTTP.
Still, as we've seen above, vendor-specific idiosyncrasies are prevalent, and that's really what I'd like to see solved. I was just trying to set up user accounts, something that doesn't feel like it needs to be different for each vendor. The whole thing is much worse if you want to do something more complicated in a vendor-agnostic way (e.g. set BIOS boot order). Ultimately, given all the special cases, at work we ended up just sticking with IPMI for our automation (we just "shell out" to IPMI from Go code).
You could try to lock down the spec even more, but REST itself has a lot of wiggle room. One could argue the last thing we need is another standard, but I can't help but wonder if we could do better. Perhaps if we had a more detailed spec and used a true IDL (e.g. Protobuf), we could actually get closer to something like a generalized BMC Client that works across vendors.
Appendix
In a lot of the examples I elided some output and some input for the sake of clarity. One thing I needed to include with all of the curl commands above when I was first exploring was the --insecure flag. This is because the default SSL certs that most BMCs come with are self-signed and have really small key lengths. Furthermore, when I first started testing with the Dell BMCs, my first curl returned the following error:
Apparently the Dell BMCs default to using Diffie-Hellman in the key exchange and this algorithm is super weak. You can get around this by just telling curl not to use Diffie-Hellman when doing key exchange with the --cipher flag. So most of my curl commands (when I was initially testing) looked like this: curl -s --cipher 'DEFAULT:!DH' --insecure ...
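If you are exploring from Python instead of curl, a roughly equivalent setup can be sketched with the standard ssl module. The function name is hypothetical, and this mirrors the two curl options above rather than recommending them: it turns off certificate verification and reuses the same cipher string, so it is only suitable for poking at a BMC on an isolated management network.

```python
import ssl


def bmc_tls_context():
    """TLS context mirroring `--insecure --cipher 'DEFAULT:!DH'`."""
    ctx = ssl.create_default_context()
    # Equivalent of --insecure: accept self-signed certs, skip hostname checks.
    # check_hostname must be disabled before verify_mode can be CERT_NONE.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    # Same OpenSSL cipher string as the curl command: drop DH key exchange.
    ctx.set_ciphers("DEFAULT:!DH")
    return ctx
```

The resulting context can be passed to urllib.request.urlopen through its context argument.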
Some of the resources I found super helpful while doing development were: