Converting Active Directory SIDs to names using LDAP + Rust

CyberSift recently needed to convert Active Directory SIDs into user-friendly names. SIDs crop up whenever you look at security-related Windows event logs, such as event 4627(S): Group membership information, and look like this: S-1-5-21-1377283216-344919071-3415362939-1104. This format is not very readable, and unfortunately Windows event logs do not always translate the SID into the more familiar username or group name.

To work around this, we planned to leverage LDAP, since Active Directory servers store a mapping of name to SID. A Rust-based LDAP client would query the AD Domain Controller and retrieve a list of SID-to-name mappings, which can then be used to translate the SIDs.
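As a rough illustration of what such a mapping could look like once it has been retrieved, here is a minimal sketch of an in-memory cache. The `SidCache` type and its method names are hypothetical and not taken from our code base; the sketch only assumes that SIDs and names are both held as strings.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory cache of SID -> friendly name pairs.
#[derive(Default)]
struct SidCache {
    names: HashMap<String, String>,
}

impl SidCache {
    /// Store (or overwrite) the name retrieved for a SID.
    fn insert(&mut self, sid: &str, name: &str) {
        self.names.insert(sid.to_owned(), name.to_owned());
    }

    /// Return the friendly name if it is known, otherwise fall back to the raw SID.
    fn resolve<'a>(&'a self, sid: &'a str) -> &'a str {
        self.names.get(sid).map(String::as_str).unwrap_or(sid)
    }
}
```

Falling back to the raw SID means an unknown entry still shows up in the output instead of being dropped.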

Our initial approach was a bit naive: simply use the ldap3 Rust crate, set a suitable filter to retrieve the required information, and cache it. The filter is very straightforward and follows standard LDAP filter notation. For example, in order to query for groups, we used:


Specifically in the ldap3 crate, you can use LDAP filters along with the “streaming_search_with” function. This function also allows you to specify which LDAP attributes you’d like to receive, which in our case would be “objectSid” and “name”, leading to the following call:

        vec!["objectSid", "name"]

The results are passed into a “SearchEntry” parser which, as the documentation states, will “Parse raw BER data and convert it into attribute map(s)”, producing a struct which looks like this:

pub struct SearchEntry {
    pub dn: String,
    pub attrs: HashMap<String, Vec<String>>,
    pub bin_attrs: HashMap<String, Vec<Vec<u8>>>,
}

Running all the above worked well, and the “attrs” HashMap did have the “name” attribute we asked for… but not the “objectSid”, leaving us perplexed as to what was going wrong.

Searching the Internet led us to a very interesting article:

How do I convert a SID between binary and string forms?

So it seems like the “objectSid” is not stored as a simple string on LDAP, but rather as a binary blob. We confirmed this by checking the “bin_attrs” hash map which is generated in SearchEntry, and (to our relief) finding the “objectSid” there. That still left us with the challenge of actually translating this blob into a string.
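The difference between the two maps can be sketched as follows. To keep the example compilable without the ldap3 crate, it uses a local stand-in struct (`Entry`, a hypothetical name) that carries only the two attribute maps from the struct quoted above; the helper function names are likewise made up for illustration.

```rust
use std::collections::HashMap;

/// Local stand-in holding the same two attribute maps as `SearchEntry`,
/// so that this sketch compiles without the ldap3 crate.
struct Entry {
    attrs: HashMap<String, Vec<String>>,
    bin_attrs: HashMap<String, Vec<Vec<u8>>>,
}

/// Return the first value of a string attribute, e.g. "name".
fn first_text<'a>(entry: &'a Entry, attr: &str) -> Option<&'a str> {
    entry.attrs.get(attr)?.first().map(String::as_str)
}

/// Return the first value of a binary attribute, e.g. "objectSid".
fn first_binary<'a>(entry: &'a Entry, attr: &str) -> Option<&'a [u8]> {
    entry.bin_attrs.get(attr)?.first().map(Vec::as_slice)
}
```

With helpers like these, `first_text(&entry, "name")` yields the name, while the SID has to be fetched with `first_binary(&entry, "objectSid")` and decoded separately.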

This blob is represented in Rust as Vec<u8>. Following the previously linked article, the function we ended up using to translate it into a string was this:

fn bytes_to_sid(binary: &[u8]) -> String {
    // Header: revision (1 byte), sub-authority count (1 byte), authority (6 bytes, big-endian).
    let version = binary[0];
    let length = binary[1] as usize;
    let authority = u64::from_be_bytes([0, 0, binary[2], binary[3], binary[4], binary[5], binary[6], binary[7]]);
    let mut string = format!("S-{}-{}", version, authority);
    // The rest of the blob holds `length` sub-authorities, each a 4-byte little-endian integer.
    let binary = &binary[8..];
    assert_eq!(binary.len(), 4 * length);
    for i in 0..length {
        let start = 4 * i;
        let end = start + 4;
        // On Rust editions before 2021 this needs `use std::convert::TryInto;`.
        let value = u32::from_le_bytes(binary[start..end].try_into().unwrap());
        string += &format!("-{}", value);
    }
    string
}

The most noteworthy line IMHO, and the one that initially stumped me, is the line in the function above that computes “authority”. If you refer to the handy table in the MS article, you’d see:

The “authority” line in the code corresponds to the third row of the table. Note how all the other rows are either a single byte or four bytes, both of which have convenient representations in Rust. For example:

4 bytes = (4 * 8 bits) = 32 bits, so the u32 data type is convenient here: it lets us call the “from_le_bytes” function to read the bytes in little-endian order, as per the table above.
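To make the byte order concrete, take 1104, the last component of the example SID from the introduction. In hexadecimal it is 0x0450, so in little-endian order it is stored as the bytes 50 04 00 00. A minimal sketch (the function name `sub_authority` is only for illustration):

```rust
/// Decode one 4-byte sub-authority as a little-endian u32.
fn sub_authority(bytes: [u8; 4]) -> u32 {
    // [0x50, 0x04, 0x00, 0x00] -> 0x0000_0450 = 1104
    u32::from_le_bytes(bytes)
}
```

Reading the same four bytes with “from_be_bytes” would instead give 0x50040000 (1342439424), which is why picking the right endianness matters here.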

But 6 bytes = (6 * 8 bits) = 48 bits… and as you probably expect, there is no u48 data type. The next largest integer type after u32 is u64. The answer is obvious now, but it did not occur to me until I read the solution presented here:

… simply create a u64 and pad the first two bytes with 0, which is essentially what we do:

u64::from_be_bytes([0, 0, binary[2], binary[3], binary[4], binary[5], binary[6], binary[7]]);

Note the first two “0”s in our array before we continue on reading in the rest of our data. Also note the use of “from_be_bytes” to get the number in big-endian format as per the table above.
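Putting the pieces together, here is a sketch of a variant that isolates the 48-bit padding trick in its own helper and returns None on malformed input instead of panicking. This is not the function we used above; the names `read_u48_be` and `sid_to_string` are made up for this example, and the byte layout is assumed to be exactly the one described in this post.

```rust
/// Read a 6-byte big-endian integer by padding it to 8 bytes with two leading zeros.
fn read_u48_be(bytes: [u8; 6]) -> u64 {
    let mut padded = [0u8; 8];
    padded[2..].copy_from_slice(&bytes);
    u64::from_be_bytes(padded)
}

/// Convert a binary SID to its string form. Returns None when the blob is
/// shorter than the 8-byte header or when its length disagrees with the
/// sub-authority count stored in the header.
fn sid_to_string(binary: &[u8]) -> Option<String> {
    let header = binary.get(..8)?;
    let rest = binary.get(8..)?;
    let revision = header[0];
    let count = header[1] as usize;
    if rest.len() != 4 * count {
        return None;
    }
    let authority = read_u48_be([
        header[2], header[3], header[4], header[5], header[6], header[7],
    ]);
    let mut sid = format!("S-{}-{}", revision, authority);
    // Each sub-authority is a 4-byte little-endian integer.
    for chunk in rest.chunks_exact(4) {
        let value = u32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]);
        sid.push_str(&format!("-{}", value));
    }
    Some(sid)
}
```

Returning an Option lets the caller skip a malformed “objectSid” value rather than abort the whole search, which may be preferable when the blobs come from a remote server.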