Answering my own coding exercise using Go and sharing a
couple of things I learned along the way about JSON and
concurrency.
A real-world coding exercise
We have a one-hour open book coding exercise at Woolpert.
Each person interviewing for the team gets to answer it.
Because it’s open book, i.e., use Google, use Stack Overflow, use whatever, we feel it reflects real day-to-day programming.
The exercise is basically “Use a pre-determined API call to the GitHub API to download avatar photos of the top X
GitHub users.”
We ask people to get working code if possible, but if not, they should at least write pseudocode and explanatory notes.
The goals are:
- Get it running.
- Optional: make it concurrent.
- Optional: make it robust.
We’ve had people use bash, C#, Python, and JavaScript.
We find it a good way to quickly get a feel for how comfortable someone is getting data from an API, which these days is very much a real-world daily or weekly task.
My attempt
I sat in bed one night and thought I’d have a crack at it with a language that’s relatively new to me: Go.
Here’s what I came up with in 60 minutes.
main.go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"net/http"
	"os"
	"sync"
)

// Results holds the response of the GitHub search API query.
type Results struct {
	Items []Person
}

// Person is a GitHub user.
type Person struct {
	Login  string
	Avatar string `json:"avatar_url"`
}

func getPeople() []Person {
	url := "https://api.github.com/search/users?q=followers:%3E10000+sort:followers&per_page=50"
	res, err := http.Get(url)
	if err != nil {
		panic(err)
	}
	defer res.Body.Close()

	// Should be checking for a specific header here before trying to read
	// the body -- could have failed with overuse of the API.
	body, err := io.ReadAll(res.Body)
	if err != nil {
		panic(err)
	}

	// Unmarshal the response into our result struct.
	var result Results
	if err := json.Unmarshal(body, &result); err != nil {
		panic(err)
	}
	return result.Items
}

func saveAvatar(p Person, wg *sync.WaitGroup) (outFile string, err error) {
	// Signal the WaitGroup on every exit path, including errors;
	// otherwise main would block forever on a failed download.
	defer wg.Done()

	// Didn't check for rate-limit headers or 403s. Doh!
	outFile = fmt.Sprintf("%s.jpeg", p.Login)

	// Get the response bytes from the URL.
	response, err := http.Get(p.Avatar)
	if err != nil {
		return outFile, err
	}
	defer response.Body.Close()
	if response.StatusCode != http.StatusOK {
		return outFile, errors.New("received non-200 response code")
	}

	// Create an empty file.
	file, err := os.Create(outFile)
	if err != nil {
		return outFile, err
	}
	defer file.Close()

	// Write the bytes to the file.
	if _, err = io.Copy(file, response.Body); err != nil {
		return outFile, err
	}

	fmt.Printf("Downloaded %s\n", outFile)
	return outFile, nil
}

// Problem statement: download the avatars of the top 50 most popular GitHub
// users to local files. Do it concurrently if you have time, and watch out
// for X-RateLimit-Remaining or 403s.
func main() {
	// Run with: go run main.go
	people := getPeople()
	var wg sync.WaitGroup
	for _, p := range people {
		wg.Add(1)
		go saveAvatar(p, &wg)
	}
	wg.Wait()
}
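Both comments above admit the same gap: nothing inspects GitHub’s rate-limit headers before reading the body. Here’s a minimal sketch of the missing check, assuming it sits alongside the code above (the header names are GitHub’s documented X-RateLimit-* set; any retry policy is left as an exercise):

// checkRateLimit is a sketch of the check the comments above say is
// missing: bail out early on a 403 or when the remaining quota is zero.
func checkRateLimit(res *http.Response) error {
	if res.StatusCode == http.StatusForbidden {
		return fmt.Errorf("got 403; X-RateLimit-Remaining=%s",
			res.Header.Get("X-RateLimit-Remaining"))
	}
	if res.Header.Get("X-RateLimit-Remaining") == "0" {
		return fmt.Errorf("rate limit exhausted; resets at %s",
			res.Header.Get("X-RateLimit-Reset"))
	}
	return nil
}

getPeople and saveAvatar could call it right after each http.Get and return (or panic on) the error instead of blindly reading the body.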
Things I learned
net/http is nice
net/http in the Go standard library is nice. In Python I find myself always installing the requests library for sensible API work, so it’s good to have that baked in.
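One caveat I’d add in hindsight: http.Get uses http.DefaultClient, which has no timeout, so a hung server hangs the program. A minimal sketch of the fix, with a 10-second value that’s purely my own pick:

package main

import (
	"fmt"
	"net/http"
	"time"
)

func main() {
	// A client with a deadline; everything else behaves like http.Get.
	client := &http.Client{Timeout: 10 * time.Second}
	res, err := client.Get("https://api.github.com")
	if err != nil {
		panic(err)
	}
	defer res.Body.Close()
	fmt.Println(res.Status)
}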
JSON is nice but finicky
JSON handling in Go is nice. You just define a struct (or even use an empty interface). Optionally, you can name fields in the struct something more meaningful and map them to the source JSON naming with a struct tag, as with Avatar and avatar_url above.
One thing that tripped me up (lots of rapid Googling!) was how to extract just a subset of the JSON document. For example, each user in the Items array that I mapped to []Person has a lot of fields, but I only needed a couple. Turns out to be really simple, but not well documented, at least to someone under time pressure. Basically you just omit fields for data you don’t care about and encoding/json does the right thing.
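To make that concrete, here’s a minimal sketch; the JSON literal is a trimmed, made-up stand-in for the real search response, keeping only a few of its many per-user fields:

package main

import (
	"encoding/json"
	"fmt"
)

// Person declares only the two fields we care about; encoding/json
// silently skips everything else in the document.
type Person struct {
	Login  string
	Avatar string `json:"avatar_url"`
}

func main() {
	raw := []byte(`{"items": [{
		"login": "octocat",
		"id": 1,
		"type": "User",
		"avatar_url": "https://example.com/octocat.jpeg"
	}]}`)

	var result struct{ Items []Person }
	if err := json.Unmarshal(raw, &result); err != nil {
		panic(err)
	}
	fmt.Println(result.Items[0].Login, result.Items[0].Avatar)
	// Prints: octocat https://example.com/octocat.jpeg
}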
Concurrency is easy
Goroutines make the fan-out trivial, and in this case a sync.WaitGroup was all I needed for main to wait until the downloads finish. That’s nice, succinct code:
var wg sync.WaitGroup
for _, p := range people {
	wg.Add(1)
	go saveAvatar(p, &wg)
}
wg.Wait()
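The one wrinkle: this fires all 50 requests at once, which is exactly how you trip the rate limits the problem statement warns about. A buffered channel works as a semaphore to cap the fan-out. A sketch, assuming a hypothetical download(p Person) error helper that fetches and saves one avatar without touching the WaitGroup itself (unlike saveAvatar above), and with 5 as an arbitrary limit:

sem := make(chan struct{}, 5) // at most 5 downloads in flight
var wg sync.WaitGroup
for _, p := range people {
	wg.Add(1)
	go func(p Person) {
		defer wg.Done()
		sem <- struct{}{}        // acquire a slot
		defer func() { <-sem }() // release it on return
		if err := download(p); err != nil {
			fmt.Printf("failed %s: %v\n", p.Login, err)
		}
	}(p)
}
wg.Wait()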
And in bash for fun
Marc on my team responded to this snippet with a fun little implementation in bash. It totally works, which just goes to show that there are many ways to tackle a problem, and rarely just one right tool. Both Marc and I are big fans of jq for slicing and dicing JSON, and it makes short work of this task:
curl -s 'https://api.github.com/search/users?q=followers:%3E10000+sort:followers&per_page=50' |
jq -r -c '.items[] | {item: (.avatar_url + "," + .login)} | .item' |
while IFS=',' read -r url user; do
  echo "Saving download to $user.jpeg"
  curl -s "$url" > "$user.jpeg" &
done