cURLing Your Way To Redfish Idiosyncrasies

2024-03-19

Many modern data centers actually have two separate networks. Most people only ever interact with the primary, "in-band" network. This is the network that servers use to talk with one another and share information. But there's a second network, the "out-of-band" (OOB) network. This network is used exclusively for managing the data center itself.

One of the ways the OOB network is used to manage a data center is to talk with servers' BMCs (baseboard management controllers). BMCs are little computers that sit on the back of servers and can be used to run diagnostics, collect telemetry, and perform various administrative functions. The primary method of interaction with BMCs for most people is a protocol called IPMI. IPMI is...not great. It's essentially a custom network protocol, which means that if you want to write software using it, you've got a lot of work ahead of you.

In 2014 we got something "better": Redfish. Redfish has two primary improvements over IPMI:

  1. Redfish is a REST API that can be used with any HTTP client.
  2. All the data sent to and from the BMC is formatted in nice, easily readable JSON.

I've been working on automating some of our server management at work, and I was excited by the prospect of using something different from IPMI. While it's far from perfect, Redfish seemed like a step up. What's more, since everything is just standard-ish REST, I could use curl to test out and explore various bits of the API before writing any real code.

Redfish Examples

Let's say we have a BMC sitting on our OOB network with the IP address 10.91.148.139. We can query the "redfish root" by doing a simple curl on that host with the path /redfish/v1:

kurtis@prodserver1:~$ curl -s "https://10.91.148.139/redfish/v1" | jq
{
  "AccountService": {
    "@odata.id": "/redfish/v1/AccountService"
  },
  "Chassis": {
    "@odata.id": "/redfish/v1/Chassis"
  },
  "EventService": {
    "@odata.id": "/redfish/v1/EventService"
  },
  "Links": {
    "Sessions": {
      "@odata.id": "/redfish/v1/SessionService/Sessions"
    }
  },
  "Managers": {
    "@odata.id": "/redfish/v1/Managers"
  },
  "Name": "Root Service",
  "RedfishVersion": "1.1.0",
  "Registries": {
    "@odata.id": "/redfish/v1/Registries"
  },
  "SessionService": {
    "@odata.id": "/redfish/v1/SessionService"
  },
  "Systems": {
    "@odata.id": "/redfish/v1/Systems"
  },
  "Tasks": {
    "@odata.id": "/redfish/v1/TaskService"
  },
  "UpdateService": {
    "@odata.id": "/redfish/v1/UpdateService"
  }
}

I've elided some of the data here, but as we can see, there's a lot of useful stuff. If we want to look at any of it, though, we're going to need to establish an actual session. This can be done by doing a POST to the Sessions link we received in the above response, i.e. "creating a session" in REST parlance. Let's log in as an administrator:

kurtis@prodserver1:~$ curl -s -X POST -H "Content-Type: application/json" "https://10.91.148.139/redfish/v1/SessionService/Sessions" -D headers.txt -d '{"UserName": "admin", "Password":"supersecret"}' | jq
{
  "Description": "Session for user admin",
  "Id": "4d6137cac0b5a5df73a571f89a2237a2",
  "Name": "admin Session",
  "UserName": "admin"
}
kurtis@prodserver1:~$ cat headers.txt
HTTP/1.1 201 Created
Location: /redfish/v1/SessionService/Sessions/4d6137cac0b5a5df73a571f89a2237a2
X-Auth-Token: 3HJtq2E=
Content-Type: application/json; charset=UTF-8
Content-Length: 593
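
For scripting the same login, here is a minimal Python sketch using only the standard library. The function names (sessions_path, build_login_request, session_from_headers) are hypothetical helpers made up for illustration, and the sample values are copied from the transcript above. It looks up the Sessions path from the root response rather than hard-coding it, and it only builds the request; nothing is sent.

```python
import json
import urllib.request


def sessions_path(service_root):
    """Return the Sessions collection path advertised by the service root."""
    return service_root["Links"]["Sessions"]["@odata.id"]


def build_login_request(base_url, service_root, username, password):
    """Build (but do not send) the POST that creates a session."""
    body = json.dumps({"UserName": username, "Password": password}).encode("utf-8")
    return urllib.request.Request(
        base_url + sessions_path(service_root),
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def session_from_headers(headers):
    """Pull the auth token and the session URI out of the response headers."""
    return headers["X-Auth-Token"], headers["Location"]


# Sample data copied from the transcript above.
root = {"Links": {"Sessions": {"@odata.id": "/redfish/v1/SessionService/Sessions"}}}
request = build_login_request("https://10.91.148.139", root, "admin", "supersecret")
token, location = session_from_headers({
    "Location": "/redfish/v1/SessionService/Sessions/4d6137cac0b5a5df73a571f89a2237a2",
    "X-Auth-Token": "3HJtq2E=",
})
print(request.get_method(), request.full_url)
print(token, location)
```

To actually send the request, one option is urllib.request.urlopen(request, context=...) and then passing the response's headers attribute to session_from_headers; the TLS context a BMC needs is a separate concern.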

The headers we received with the response (output to headers.txt) included an X-Auth-Token header. We can use this token in all future requests to act as the administrator. To anyone with a passing familiarity with REST, this should all look very familiar. It's all very standard, which is great! For example, using one of the links we got in our first response, we can now get information regarding all of the accounts on the BMC:

kurtis@prodserver1:~$ curl -s -H "X-Auth-Token: 3HJtq2E=" "https://10.91.148.139/redfish/v1/AccountService/Accounts" | jq
{
  "Members": [
    {
      "@odata.id": "/redfish/v1/AccountService/Accounts/1"
    },
    {
      "@odata.id": "/redfish/v1/AccountService/Accounts/2"
    }
  ],
  "Members@odata.count": 2,
  "Name": "Accounts Collection"
}

If we wanted to, we could query the individual accounts returned here. By now, you should be noticing a pattern. Redfish responses typically include links to further information. Also, as one might expect, if we wanted to create a whole new account we could do it using a properly formulated POST to the /redfish/v1/AccountService/Accounts path. Standard REST, it's a wonderful thing.
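
The link-following pattern can be sketched in a few lines of Python. This is an illustration only: odata_links is a hypothetical helper, and the sample document is the accounts collection shown above. It walks a decoded JSON response and collects every "@odata.id" value it finds, which works on the root response as well as on collections.

```python
def odata_links(resource):
    """Recursively collect every "@odata.id" value in a decoded Redfish resource."""
    links = []
    if isinstance(resource, dict):
        for key, value in resource.items():
            if key == "@odata.id" and isinstance(value, str):
                links.append(value)
            else:
                links.extend(odata_links(value))
    elif isinstance(resource, list):
        for item in resource:
            links.extend(odata_links(item))
    return links


# The accounts collection from the transcript above.
accounts = {
    "Members": [
        {"@odata.id": "/redfish/v1/AccountService/Accounts/1"},
        {"@odata.id": "/redfish/v1/AccountService/Accounts/2"},
    ],
    "Members@odata.count": 2,
    "Name": "Accounts Collection",
}

for path in odata_links(accounts):
    print(path)
```

Each returned path can then be fetched with the same X-Auth-Token header to drill further into the tree.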

Idiosyncrasy 1: Failed Logins

So far, all the commands I've been running have been on a Quanta server. However, when I started working with a Dell server, I started to notice some...unexpected deviations. For instance, one of the forms of automation I've been creating is ensuring that certain accounts are present on the BMC. Since I'm doing this as part of a bootstrapping process, I actually don't know if I'm able to log in and do any of the introspection I've shown above. Absent the ability to actually log in, the easiest way to determine if an account exists is to attempt to log in as that account. If I get back a 401 response, it's safe to assume that the account doesn't exist on the BMC yet. At least, that's what I would expect from having worked with REST before. On the Quanta machine, this is exactly what happens:

kurtis@prodserver1:~$ curl -s -X POST -H "Content-Type: application/json" "https://10.91.148.139/redfish/v1/SessionService/Sessions" -D headers.txt -d '{"UserName": "admin", "Password":"badpassword"}' | jq
{
  "error": {
    "@Message.ExtendedInfo": [
      {
        "@odata.type": "#Message.v1_0_4.Message",
        "Message": "While attempting to establish a connection to /redfish/v1/SessionService/Sessions, the service was denied access.",
        "MessageArgs": [
          "/redfish/v1/SessionService/Sessions"
        ],
        "MessageId": "Security.1.0.AccessDenied",
        "Resolution": "Attempt to ensure that the URI is correct and that the service has the appropriate credentials.",
        "Severity": "Critical"
      }
    ],
    "code": "Security.1.0.AccessDenied",
    "message": "While attempting to establish a connection to /redfish/v1/SessionService/Sessions, the service was denied access."
  }
}
kurtis@prodserver1:~$ cat headers.txt
HTTP/1.1 401 Unauthorized
Content-Type: application/json; charset=UTF-8
Content-Length: 593

Let's try the same thing on a Dell machine (note that for Dell, the Sessions path is different, which is very much permitted by the Redfish spec).

kurtis@prodserver1:~$ curl -s -X POST -H "Content-Type: application/json" "https://10.91.148.140/redfish/v1/Sessions" -D headers.txt -d '{"UserName": "admin", "Password":"badpassword"}' 
<!DOCTYPE html>
<head>
    <title>Bad Request</title>
    <link rel="shortcut icon" href="data:image/x-icon;," type="image/x-icon">
</head>
<body>
<h2>Access Error: 400 -- Bad Request</h2>
<pre></pre>
</body>
</html>
kurtis@prodserver1:~$ cat headers.txt
HTTP/1.1 400 Bad Request
Content-Type: text/html
Content-Length: 227

This is very different from what we got with the Quanta server and is less than ideal for a number of reasons. First of all, we only get back a 400 Bad Request. This error code is pretty generic and could mean a lot of different things (as Dell's documentation says). In addition, instead of getting back a JSON response, we've gotten back a pretty generic HTML page. In short, there's no way to know for sure that the problem here is a bad username and password. I couldn't really find a good solution to this. Since I've actually run my automation code and confirmed it works for valid credentials, I figured it was safe to assume that if I get a 400 when I'm doing auth on a Dell machine, it's because of auth problems, not because of, say, a malformed request. If anyone from Dell is reading this, I would love to chat about how we can make this better. I'm pretty sure just having a JSON response with a well-defined error code would do the trick.
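
As a rough sketch of that workaround (not the actual automation code), the decision could look like the Python below. The vendor labels "quanta" and "dell", the function names, and the return values are all made up for illustration. The Dell branch encodes the assumption described above: a 400 during login is treated as bad credentials.

```python
import json


def redfish_error_code(content_type, body):
    """Return error.code from a JSON error body, or None for non-JSON bodies."""
    if not content_type.startswith("application/json"):
        return None
    try:
        return json.loads(body)["error"]["code"]
    except (ValueError, KeyError, TypeError):
        return None


def classify_login(vendor, status):
    """Map a session-creation status to "ok", "bad_credentials", or "unknown"."""
    if status == 201:
        return "ok"
    if status == 401:
        return "bad_credentials"
    if vendor == "dell" and status == 400:
        # ASSUMPTION: the request shape is already known to be valid, so a
        # 400 from a Dell BMC during login is read as an auth failure.
        return "bad_credentials"
    return "unknown"


quanta_body = '{"error": {"code": "Security.1.0.AccessDenied"}}'
dell_body = "<h2>Access Error: 400 -- Bad Request</h2>"

print(classify_login("quanta", 401),
      redfish_error_code("application/json; charset=UTF-8", quanta_body))
print(classify_login("dell", 400), redfish_error_code("text/html", dell_body))
```

Keeping the vendor special case in one small function means it can be deleted in one place if Dell ever returns a proper 401 with a JSON body.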

Idiosyncrasy 2: Account Creation

The next step in my automation journey was to actually create accounts that were missing. On Quanta, this works just the way you think it would. You curl the accounts path with a POST:

kurtis@prodserver1:~$ curl -s -X POST -H "Content-Type: application/json" -H "X-Auth-Token: 3HJtq2E=" "https://10.91.148.139/redfish/v1/AccountService/Accounts" -d '{"UserName": "test", "Password":"testpass", "RoleId": "Administrator"}' | jq
{
  "@odata.context": "/redfish/v1/$metadataAccountService/Members/Accounts",
  "@odata.etag": "W/\"1590361768\"",
  "@odata.id": "/redfish/v1/AccountService/Accounts",
  "@odata.type": "#ManagerAccount.v1_0_3.ManagerAccount",
  "Description": "",
  "Enabled": true,
  "Id": "3",
  "Links": {
    "Role": {
      "@odata.id": "/redfish/v1/AccountService/Roles/Administrator"
    }
  },
  "Locked": false,
  "Name": "Account3",
  "RoleId": "Administrator",
  "UserName": "test"
}

We could even remove the account with a DELETE on the returned id:

kurtis@prodserver1:~$ curl -s -X DELETE -H "X-Auth-Token: 3HJtq2E=" "https://10.91.148.139/redfish/v1/AccountService/Accounts/3"

But with Dell, things are completely different. The accounts path doesn't support POST or DELETE; it only supports PATCH. What? How are we supposed to create or delete accounts? After some digging, it turns out that all the accounts a Dell BMC can have are already created; they're just all disabled, have their role set to None, and have blank credentials. So to "create" an account, you update the account like so:

prodadmin@prodserver1:~$ curl -s -X PATCH -H "X-Auth-Token: 1455555=" "https://10.91.148.140/redfish/v1/Managers/iDRAC.Embedded.1/Accounts/3" -H "Content-Type: application/json" -d '{"UserName":"test","Password":"testpass","RoleId":"Administrator","Enabled":true}' 

To "delete" the account you actually need to make two separate requests, one to disable the account and the other to clear out its credentials (you'll get an error if you try to do both in the same request):

prodadmin@prodserver1:~$ curl -s -X PATCH -H "X-Auth-Token: 1455555=" "https://10.91.148.140/redfish/v1/Managers/iDRAC.Embedded.1/Accounts/3" -H "Content-Type: application/json" -d '{"RoleId":"None","Enabled":false}'
prodadmin@prodserver1:~$ curl -s -X PATCH -H "X-Auth-Token: 1455555=" "https://10.91.148.140/redfish/v1/Managers/iDRAC.Embedded.1/Accounts/3" -H "Content-Type: application/json" -d '{"UserName":""}'

This is...super weird. I'm sure it's due to some implementation details on the Dell BMC and perhaps related to backwards compatibility with Dell's proprietary management, iDRAC. But still, I'm sure there was something they could've done to paper over this. This is hardly what I'd call proper REST.
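
For what it's worth, the Dell "slot" model is easy enough to wrap. The Python below is a sketch under assumptions: the function names are hypothetical, the sample slot list is made-up data rather than real iDRAC output, and it assumes each account resource exposes Id, UserName, and Enabled fields. The PATCH bodies are the ones from the curl commands above.

```python
def find_free_slot(accounts):
    """Return the Id of the first unused account slot, or None if all are taken."""
    for account in accounts:
        if not account.get("UserName") and not account.get("Enabled", False):
            return account["Id"]
    return None


def create_patch(username, password, role="Administrator"):
    """Body of the single PATCH that "creates" an account in a free slot."""
    return {"UserName": username, "Password": password, "RoleId": role, "Enabled": True}


def delete_patches():
    """Bodies of the two PATCHes that "delete" an account, in the order to send them."""
    return [
        {"RoleId": "None", "Enabled": False},  # first: disable and drop the role
        {"UserName": ""},                      # second: clear the username
    ]


# Made-up sample of already-fetched account resources (not real iDRAC output).
slots = [
    {"Id": "2", "UserName": "root", "Enabled": True},
    {"Id": "3", "UserName": "", "Enabled": False},
    {"Id": "4", "UserName": "", "Enabled": False},
]

print(find_free_slot(slots))
print(create_patch("test", "testpass"))
print(delete_patches())
```

Returning the delete bodies as an ordered list keeps the two-request requirement visible to whoever calls it, instead of hiding it in a single call that looks atomic.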

Can we do better?

When all is said and done, I'm not sure how much of a win Redfish is. The ability to have fairly standard responses over HTTP with wide language support is indeed better than what we had before. Furthermore, all of the curl I've shown above was super helpful in exploring the way different vendors implement Redfish. I was able to take all of my learnings and easily translate them into the actual code. Doing something like this would be much harder if the Redfish spec used something other than HTTP.

Still, as we've seen above, vendor-specific idiosyncrasies are prevalent, and that's really what I'd like to see solved. I was just trying to set up user accounts, something that doesn't feel like it needs to be different for each vendor. The whole thing is much worse if you want to do something more complicated in a vendor-agnostic way (e.g. set BIOS boot order). Ultimately, given all the special cases, at work we ended up just sticking with IPMI for our automation (we just "shell out" to IPMI from Go code).

You could try to lock down the spec even more, but REST itself has a lot of wiggle room. One could argue the last thing we need is another standard, but I can't help but wonder if we could do better. Perhaps if we had a more detailed spec and used a true IDL (e.g. Protobuf), we could actually get closer to something like a generalized BMC Client that works across vendors.

Appendix

In a lot of the examples I elided some output and some input for the sake of clarity. One thing I needed to include with all of the curl commands above when I was first exploring was the --insecure flag. This is because the default SSL certs that most BMCs come with are self-signed and have really small key lengths. Furthermore, when I first started testing with the Dell BMCs, my first curl returned the following error:

curl: (35) error:141A318A:SSL routines:tls_process_ske_dhe:dh key too small

Apparently the Dell BMCs default to using Diffie-Hellman in the key exchange, with a key small enough that curl's TLS library rejects it as too weak. You can get around this by telling curl not to use Diffie-Hellman for the key exchange via the --cipher flag. So most of my curl commands (when I was initially testing) looked like curl -s --cipher 'DEFAULT:!DH' --insecure ....

Some of the resources I found super helpful while doing development were: