PyPi Transparency Client (Rust)
I’ve finally been able to hack my way through to a working Rust gRPC client (for PyPi Transparency).
It’s not very good: poorly structured, hacky etc. but it serves the purpose of giving me a foothold into Rust development so that I can evolve it as I learn the language and its practices.
There are several Rust crates (SDKs) for gRPC, but there’s no sanctioned SDK for Rust on grpc.io.
I chose stepancheg’s grpc-rust because it’s a pure Rust implementation (not built atop the C implementation).
PyPi Transparency
I’ve been noodling around with another Trillian personality.
It’s another in a theme that interests me: providing tamperproof logs for the packages in the popular package management registries.
The Golang team recently announced Go Module Mirror which is built atop Trillian. It seems to me that all the package registries (Go Modules, npm, Maven, NuGet etc.) would benefit from tamperproof logs hosted by a trusted 3rd-party.
As you may have guessed, PyPi Transparency is a log for PyPi packages. PyPi is comprehensive, definitive and trusted but, as with Go Module Mirror, it doesn’t hurt to provide a backup of some of its data. In the case of this solution, Trillian hosts a log of self-calculated SHA-256 hashes for Python packages that are added to it.
Run cAdvisor when using Docker Compose
cAdvisor has long been a favorite monitoring tool of mine. I’m using Docker Compose for local testing and have begun including a cAdvisor service in my docker-compose.yaml files.
cadvisor:
  restart: always
  image: google/cadvisor:${CADVISOR_VERSION}
  container_name: cadvisor
  # command:
  # - --prometheus_endpoint="/metrics" # Default
  volumes:
  - "/:/rootfs:ro"
  - "/var/run:/var/run:rw"
  - "/sys:/sys:ro"
  - "/var/snap/docker/current:/var/lib/docker:ro" #- "/var/lib/docker/:/var/lib/docker:ro"
  expose:
  - "8080"
  ports:
  - 8080:8080
I’d not realized until recently that cAdvisor also surfaces a Prometheus metrics endpoint and so, if you do follow this path and you’re also using Prometheus, don’t forget to add cAdvisor to your Prometheus targets.
Kubernetes Engine and Free Tier
Google Cloud Platform Free Tier appears (please verify this for yourself) to provide the ability to run a(n admittedly minuscule) Kubernetes cluster for free. So, why do this? It provides a definitive Kubernetes (Engine) experience on Google Cloud Platform that you may use for learning and testing.
With Kubernetes Engine, the master node(s) and the control plane are free.
Kubernetes (i.e. Compute Engine) nodes potentially incur charges including for the VM runtime and any attached storage, snapshots etc. However, charges for these resources can be partially covered by the Free Tier.
Cloud Functions Simple(st) HTTP Multi-host Proxy
Tweaked yesterday’s solution so that it will randomly select one from several hosts with which it’s configured.
package proxy

import (
	"log"
	"math/rand"
	"net/http"
	"net/url"
	"os"
	"strings"
	"time"
)

// robin configures Handler to forward each request to one of the hosts
// listed in PROXY_HOST, chosen at random. Endpoint, Handler and reverseproxy
// are defined elsewhere in the package (they are not shown here).
func robin() {
	hostsList := os.Getenv("PROXY_HOST")
	if hostsList == "" {
		log.Fatal("'PROXY_HOST' environment variable should contain comma-separated list of hosts")
	}

	// Comma-separated list of hosts
	hosts := strings.Split(hostsList, ",")
	urls := make([]*url.URL, len(hosts))
	for i, host := range hosts {
		origin := Endpoint{
			Host: host,
			Port: os.Getenv("PROXY_PORT"),
		}
		// Named u (not url) to avoid shadowing the net/url package
		u, err := origin.URL()
		if err != nil {
			log.Fatal(err)
		}
		urls[i] = u
	}

	s := rand.NewSource(time.Now().UnixNano())
	q := rand.New(s)

	Handler = func(w http.ResponseWriter, r *http.Request) {
		// Pick one of the URLs at random
		target := urls[q.Intn(len(urls))]
		log.Printf("[Handler] Forwarding: %s", target.String())
		// Forward to it
		reverseproxy(target, w, r)
	}
}
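The random-selection idea above can also be sketched as a self-contained program using the standard library’s httputil.NewSingleHostReverseProxy. This is not the post’s Endpoint or reverseproxy code (neither is shown): parseTargets and randomProxy are hypothetical names, and the https scheme, the example hosts and the port are assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
	"net/http"
	"net/http/httputil"
	"net/url"
	"strings"
	"sync"
	"time"
)

// parseTargets turns a comma-separated host list into URLs.
// The https scheme is an assumption made for this sketch.
func parseTargets(hostsList, port string) ([]*url.URL, error) {
	var targets []*url.URL
	for _, host := range strings.Split(hostsList, ",") {
		host = strings.TrimSpace(host)
		if host == "" {
			continue
		}
		u, err := url.Parse(fmt.Sprintf("https://%s:%s", host, port))
		if err != nil {
			return nil, err
		}
		targets = append(targets, u)
	}
	if len(targets) == 0 {
		return nil, errors.New("no hosts provided")
	}
	return targets, nil
}

// randomProxy forwards each request to one target picked at random.
// targets must be non-empty (parseTargets guarantees this).
func randomProxy(targets []*url.URL, rng *rand.Rand) http.HandlerFunc {
	proxies := make([]*httputil.ReverseProxy, len(targets))
	for i, t := range targets {
		proxies[i] = httputil.NewSingleHostReverseProxy(t)
	}
	var mu sync.Mutex // a *rand.Rand is not safe for concurrent use
	return func(w http.ResponseWriter, req *http.Request) {
		mu.Lock()
		i := rng.Intn(len(proxies))
		mu.Unlock()
		proxies[i].ServeHTTP(w, req)
	}
}

func main() {
	targets, err := parseTargets("a.example.com,b.example.com", "443")
	if err != nil {
		fmt.Println("configuration error:", err)
		return
	}
	rng := rand.New(rand.NewSource(time.Now().UnixNano()))
	mux := http.NewServeMux()
	mux.Handle("/", randomProxy(targets, rng))
	fmt.Println("targets configured:", len(targets))
	// http.ListenAndServe(":8080", mux) would start serving; omitted here.
}
```

Building one ReverseProxy per target up front keeps the per-request work down to picking an index.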
Deploying the multi-host version requires a minor tweak: the commas within the PROXY_HOST string must be escaped to disambiguate them for gcloud:
Cloud Functions Simple(st) HTTP Proxy
I’m investigating the use of LetsEncrypt for gRPC services. I found this straightforward post by Scott Devoid and am going to try this approach.
Before I can do that, I need to be able to publish services (make them Internet-accessible) and would like to try to continue to use GCP for free.
Some time ago, I wrote about using the excellent Microk8s on GCP. Using an f1-micro, I’m hoping (!) to stay within the Compute Engine free tier. I’ll also try to be diligent and delete the instance when it’s not needed. This gives me a runtime platform and I can expose services on the instance’s (Node)Ports but I’d prefer not to be billed for a simple proxy.
Visual Studio Code: gopls and YAML
The Go team is developing a Language Server Protocol (LSP) implementation called gopls. Visual Studio Code (and other editors) support LSP. Other languages (e.g. Python) have LSP implementations too. I’ve been using gopls for some time. It works (mostly) very well and replaces multiple, independent tools with two (gopls and delve).
My Visual Studio Code settings that include gopls are:
"go.autocompleteUnimportedPackages": true,
"go.useLanguageServer": true,
"[go]": {
    "editor.snippetSuggestions": "none",
    "editor.formatOnSave": true,
    "editor.codeActionsOnSave": {
        "source.organizeImports": true
    }
},
"gopls": {
    "usePlaceholders": true,
    "wantCompletionDocumentation": true,
},
"go.toolsEnvVars": {
},
"go.languageServerFlags": [
    "-rpc.trace",
    "serve",
    "--debug=localhost:6060",
],
"go.enableCodeLens": {
    "references": true,
    "runtest": true
},
One of the Google engineers working on gopls gave a comprehensive and interesting overview of the tool at GopherCon 2019.
pypi-transparency
The goal of pypi-transparency is very similar to the underlying motivation for the Golang team’s Checksum Database (also built with Trillian).
Even though PyPi provides hashes of the content of the packages it hosts, the developer must trust that PyPi’s data is consistent. One ambition with pypi-transparency is to provide a companion, tamperproof log of PyPi package files in order to provide a double-check of these hashes.
It is important to understand what this does (and does not) provide. There’s no validation of a package’s content. The only calculation is that, on first observation, a SHA-256 hash of the package’s content is computed and recorded. If the package is subsequently altered, it’s very probable that the hash will change and this provides a signal to the user that the package’s contents have changed. Because pypi-transparency uses a tamperproof log, it’s very difficult to update the recorded hash to reflect this change. Corollary: pypi-transparency will also record the hashes of packages that include malicious code.
Welcome
Now that I’ve (p)retired from Google, I’m starting this blog and will no longer post stories to Medium.
As I concluded my time at Google, I wrapped up work on a Trillian prototype. As it remains Google’s IP, I’m not permitted to discuss it here.
I’ve begun work on another Trillian prototype for Python package transparency, informally pypi-transparency.