Making “certificate-transparency-go” tools more accessible

While researching the best way to implement the SSL certificate monitoring feature for our Tutela product, we ran across the excellent Certificate Transparency project. The project aims to “watch the watchers” by providing independent certificate logs that can be used to monitor Certificate Authorities. Unfortunately for us, the project's API endpoints do not return an easy-to-parse data feed. Instead, the logs return base64-encoded binary representations of the certificates they record.
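To give a feel for what that encoded data looks like, here is a minimal sketch using only the Go standard library. It assumes the get-entries JSON shape and the MerkleTreeLeaf header layout described in RFC 6962 (one version byte, one leaf-type byte, an 8-byte timestamp in milliseconds and a 2-byte entry type). The names getEntriesResponse, leafHeader, parseLeafHeader and sampleLeafInput are our own, chosen for illustration, and the sample entry is synthetic rather than taken from a real log:

```go
package main

import (
	"encoding/base64"
	"encoding/binary"
	"encoding/json"
	"errors"
	"fmt"
)

// getEntriesResponse mirrors the JSON shape of a get-entries reply.
type getEntriesResponse struct {
	Entries []struct {
		LeafInput string `json:"leaf_input"`
		ExtraData string `json:"extra_data"`
	} `json:"entries"`
}

// leafHeader holds the fixed-size fields at the start of a MerkleTreeLeaf.
type leafHeader struct {
	Version     uint8
	LeafType    uint8
	TimestampMs uint64
	EntryType   uint16 // 0 = x509_entry, 1 = precert_entry
}

// parseLeafHeader base64-decodes a leaf_input value and reads its first 12 bytes.
func parseLeafHeader(leafInput string) (leafHeader, error) {
	raw, err := base64.StdEncoding.DecodeString(leafInput)
	if err != nil {
		return leafHeader{}, err
	}
	if len(raw) < 12 {
		return leafHeader{}, errors.New("leaf_input too short")
	}
	return leafHeader{
		Version:     raw[0],
		LeafType:    raw[1],
		TimestampMs: binary.BigEndian.Uint64(raw[2:10]),
		EntryType:   binary.BigEndian.Uint16(raw[10:12]),
	}, nil
}

// sampleLeafInput builds a synthetic, header-only leaf (hypothetical data).
// Real entries carry the certificate bytes after these 12 bytes.
func sampleLeafInput() string {
	leaf := make([]byte, 12)
	binary.BigEndian.PutUint64(leaf[2:10], 1600000000000)
	binary.BigEndian.PutUint16(leaf[10:12], 1)
	return base64.StdEncoding.EncodeToString(leaf)
}

func main() {
	body := fmt.Sprintf(`{"entries":[{"leaf_input":"%s","extra_data":""}]}`, sampleLeafInput())

	var resp getEntriesResponse
	if err := json.Unmarshal([]byte(body), &resp); err != nil {
		fmt.Println("decode error:", err)
		return
	}
	for _, e := range resp.Entries {
		h, err := parseLeafHeader(e.LeafInput)
		if err != nil {
			fmt.Println("leaf error:", err)
			continue
		}
		fmt.Printf("version=%d leaf_type=%d timestamp_ms=%d entry_type=%d\n",
			h.Version, h.LeafType, h.TimestampMs, h.EntryType)
	}
}
```

Everything after those 12 bytes is the certificate (or pre-certificate) itself, which is where a proper parser earns its keep.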

We won’t go into detail on how to parse this data structure from scratch, because there is already some excellent material on the web covering that. Luckily for us, the official Google tools for this project are written in Go, the same language we use internally. The Google GitHub repository containing the code can be found here:

https://github.com/google/certificate-transparency

A quick look around that repository shows that Google showcases everything that’s possible with this project (rightly so, since that’s part of good documentation). However, what if you only need a small subset of the data returned, as was our use case? We needed to pare down the toolset provided by Google, so as a quick proof of concept we set out to see if we could write a Go utility with a single goal:

“Given a start and end index, output the Subject of a certificate (i.e. which domain the certificate was assigned to)”

The result is below:

package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
	"log"
	"net/http"

	ct "github.com/google/certificate-transparency-go"
	ctTls "github.com/google/certificate-transparency-go/tls"
	ctX509 "github.com/google/certificate-transparency-go/x509"
)

// CertData is a single entry returned by the get-entries endpoint.
type CertData struct {
	LeafInput string `json:"leaf_input"`
	ExtraData string `json:"extra_data"`
}

// CertLog is the top-level get-entries response.
type CertLog struct {
	Entries []CertData `json:"entries"`
}

// getCerts fetches the log entries from start to end (both inclusive).
func getCerts(start int64, end int64) (CertLog, error) {
	var results CertLog
	url := fmt.Sprintf("https://ct.googleapis.com/logs/argon2021/ct/v1/get-entries?start=%d&end=%d", start, end)
	resp, err := http.Get(url)
	if err != nil {
		return results, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return results, fmt.Errorf("unexpected status: %s", resp.Status)
	}
	err = json.NewDecoder(resp.Body).Decode(&results)
	return results, err
}

// printDomains prints every DNS name listed in the certificate.
func printDomains(cert *ctX509.Certificate) {
	for _, domain := range cert.DNSNames {
		fmt.Println(domain)
		fmt.Println("====================")
	}
}

func testSslCertMonitor(startIndex int64, endIndex int64) {
	total := 0
	for i := startIndex; i <= endIndex; i += 20 {
		// Request at most 20 entries, never going past endIndex.
		last := i + 19
		if last > endIndex {
			last = endIndex
		}
		certs, err := getCerts(i, last)
		if err != nil {
			log.Fatal(err)
		}
		for _, entry := range certs.Entries {
			total++
			// leaf_input is standard (padded) base64.
			leafBytes, err := base64.StdEncoding.DecodeString(entry.LeafInput)
			if err != nil {
				log.Println(err)
				continue
			}
			var leaf ct.MerkleTreeLeaf
			if _, err := ctTls.Unmarshal(leafBytes, &leaf); err != nil {
				log.Println(err)
				continue
			}
			switch eType := leaf.TimestampedEntry.EntryType; eType {
			case ct.X509LogEntryType:
				cert, err := ctX509.ParseCertificate(leaf.TimestampedEntry.X509Entry.Data)
				if cert == nil { // only skip when no certificate was returned at all
					log.Println(err)
					continue
				}
				printDomains(cert)
			case ct.PrecertLogEntryType:
				cert, err := ctX509.ParseTBSCertificate(leaf.TimestampedEntry.PrecertEntry.TBSCertificate)
				if cert == nil {
					log.Println(err)
					continue
				}
				printDomains(cert)
			default:
				// TODO(pavelkalinnikov): Section 4.6 of RFC6962 implies that unknown types
				// are not errors. We should revisit how we process this case.
				fmt.Printf("unknown entry type: %v\n", eType)
			}
		}
	}
	fmt.Println(total)
}

func main() {
	testSslCertMonitor(218772510, 218772520)
}
https://gist.github.com/dvas0004/c6037de1ef6bc66e6d52b3d562ad690c

Building and running the above prints every domain that had a certificate issued to it, starting from log index 218772510 and ending with index 218772520.

Notes:

  • In the import block we explicitly import the tls and x509 packages from the certificate-transparency-go repo under different names (ctTls and ctX509 respectively). This has to be done because packages with the same names also exist in Go's standard library (crypto/tls and crypto/x509); the certificate-transparency-go versions are slightly different to cater for differences in parsing. Aliasing them avoids a namespace collision, so this step is important.
  • The getCerts function is straightforward: it simply retrieves the raw certificate data from the JSON API, using the CertData and CertLog types defined just above it.
  • The main work happens in testSslCertMonitor. This has been distilled from Google’s code, but essentially you can see a loop which grabs 20 certificates at a time (the max allowed is about 31) and, depending on the type of entry returned, parses the binary data into the previously defined types before printing out the subject domain name.
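The batching described in the last note can be sketched on its own, without any network calls. This is a minimal illustration under our own assumptions: indexRange and batchRanges are hypothetical names that do not appear in Google's code, and the batch size of 20 is simply the value used in this post. Both ends of each range are inclusive, matching the start and end parameters of get-entries:

```go
package main

import "fmt"

// indexRange is an inclusive [Start, End] pair of log indices.
type indexRange struct {
	Start, End int64
}

// batchRanges splits the inclusive range [start, end] into consecutive
// chunks of at most size entries each. The final chunk is clamped to end.
func batchRanges(start, end, size int64) []indexRange {
	var out []indexRange
	if size <= 0 {
		return out
	}
	for i := start; i <= end; i += size {
		last := i + size - 1
		if last > end {
			last = end
		}
		out = append(out, indexRange{Start: i, End: last})
	}
	return out
}

func main() {
	// Print the query string each batch would use.
	for _, r := range batchRanges(218772510, 218772555, 20) {
		fmt.Printf("get-entries?start=%d&end=%d\n", r.Start, r.End)
	}
}
```

Clamping the last batch keeps the utility from fetching entries past the requested end index. A log may also return fewer entries than requested, so a more robust loop would advance by the number of entries actually received.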
[Image: sample output]

Hopefully this gist makes it easier to fit this excellent repo into your own code and projects 🙂