AWS Developer Tools Blog

AWS SDK for Go 2.0 – Generated Marshalers

The AWS SDK for Go 2.0 now includes generated marshalers for the restjson and restxml protocols. Generated marshalers help with the performance problems and related issues that customers had been reporting against the SDK.

To better understand what was causing the performance hit, we used Go’s benchmark tooling, which pointed to reflection as the main bottleneck. The reflect package was consuming a large share of both memory and CPU time, as shown below.

Roughly 50% of the time was spent in the JSON marshaler, which relies heavily on the reflect package. To improve both memory and CPU performance, we implemented generated marshalers. The idea is to bypass packages that hurt performance, such as reflect, and set values directly in the query, header, or body of requests.
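To make the idea concrete, here is a minimal sketch that contrasts a reflection-based encoder with the kind of direct, field-by-field code a generator could emit. It is not the SDK's generated code: the `listJobsInput` shape and the `marshalReflect` and `marshalDirect` functions are hypothetical names invented for illustration, and only the JSON body case is shown.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"strconv"
	"testing"
)

// listJobsInput is a hypothetical request shape used only for this sketch;
// it is not a type from the SDK.
type listJobsInput struct {
	PipelineID string `json:"PipelineId"`
	Ascending  bool   `json:"Ascending"`
	PageSize   int64  `json:"PageSize"`
}

// marshalReflect encodes the shape with encoding/json, which discovers the
// struct fields through the reflect package at run time.
func marshalReflect(in *listJobsInput) ([]byte, error) {
	return json.Marshal(in)
}

// marshalDirect is the style of code a generator could emit: each field is
// appended to the body directly, with no reflection.
// Assumption: strconv.AppendQuote matches JSON string escaping only for
// plain printable ASCII; a real encoder needs full JSON escaping.
func marshalDirect(in *listJobsInput) []byte {
	b := make([]byte, 0, 64)
	b = append(b, `{"PipelineId":`...)
	b = strconv.AppendQuote(b, in.PipelineID)
	b = append(b, `,"Ascending":`...)
	b = strconv.AppendBool(b, in.Ascending)
	b = append(b, `,"PageSize":`...)
	b = strconv.AppendInt(b, in.PageSize, 10)
	b = append(b, '}')
	return b
}

// sink keeps the compiler from discarding the benchmarked result.
var sink []byte

func main() {
	in := &listJobsInput{PipelineID: "pipeline-1", Ascending: true, PageSize: 50}

	want, err := marshalReflect(in)
	if err != nil {
		panic(err)
	}
	fmt.Println("same body:", bytes.Equal(want, marshalDirect(in)))

	// testing.Benchmark runs a benchmark function outside of "go test".
	reflectRes := testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			out, err := marshalReflect(in)
			if err != nil {
				b.Fatal(err)
			}
			sink = out
		}
	})
	directRes := testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			sink = marshalDirect(in)
		}
	})

	fmt.Printf("reflect: %d ns/op, %d allocs/op\n", reflectRes.NsPerOp(), reflectRes.AllocsPerOp())
	fmt.Printf("direct:  %d ns/op, %d allocs/op\n", directRes.NsPerOp(), directRes.AllocsPerOp())
}
```

The numbers this prints depend on the machine and Go version, so they are not comparable to the tables in this post. The point is only that the direct version does its work without consulting type information at run time.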

The following benchmark data was gathered from restjson and restxml benchmark tests on GitHub.

REST JSON Benchmarks

benchmark                                                   old ns/op     new ns/op     delta
BenchmarkRESTJSONBuild_Complex_ETCCreateJob-4               91645         36968         -59.66%
BenchmarkRESTJSONBuild_Simple_ETCListJobsByPipeline-4       8323          5722          -31.25%
BenchmarkRESTJSONRequest_Complex_CFCreateJob-4              274958        221579        -19.41%
BenchmarkRESTJSONRequest_Simple_ETCListJobsByPipeline-4     147774        140943        -4.62%

benchmark                                                   old allocs     new allocs     delta
BenchmarkRESTJSONBuild_Complex_ETCCreateJob-4               334            366            +9.58%
BenchmarkRESTJSONBuild_Simple_ETCListJobsByPipeline-4       73             59             -19.18%
BenchmarkRESTJSONRequest_Complex_CFCreateJob-4              679            711            +4.71%
BenchmarkRESTJSONRequest_Simple_ETCListJobsByPipeline-4     251            237            -5.58%

The restjson protocol shows large speed gains: in the Build_Complex_ETCCreateJob benchmark, time per operation dropped by 59.66%. However, the gains in memory allocation were far smaller than expected, and the two complex benchmarks made more allocations than before (+9.58% and +4.71%).
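For readers less familiar with this table format, the delta column is the relative change from the old value to the new one, so a negative number means less time or fewer allocations. A small sketch of that arithmetic, using two rows from the tables above (the `deltaPercent` helper is a name chosen here for illustration):

```go
package main

import "fmt"

// deltaPercent returns the relative change from oldVal to newVal as a
// percentage. Negative means the new value is smaller.
func deltaPercent(oldVal, newVal float64) float64 {
	return (newVal - oldVal) / oldVal * 100
}

func main() {
	// Values copied from the BenchmarkRESTJSONBuild_Complex_ETCCreateJob rows.
	fmt.Printf("ns/op:  %+.2f%%\n", deltaPercent(91645, 36968)) // -59.66%
	fmt.Printf("allocs: %+.2f%%\n", deltaPercent(334, 366))     // +9.58%
}
```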

REST XML Benchmarks

benchmark                                                   old ns/op     new ns/op     delta
BenchmarkRESTXMLBuild_Complex_CFCreateDistro-4              212835        63765         -70.04%
BenchmarkRESTXMLBuild_Simple_CFDeleteDistro-4               8942          6893          -22.91%
BenchmarkRESTXMLBuild_REST_S3HeadObject-4                   17222         7194          -58.23%
BenchmarkRESTXMLBuild_XML_S3PutObjectAcl-4                  36723         14958         -59.27%
BenchmarkRESTXMLRequest_Complex_CFCreateDistro-4            416695        231318        -44.49%
BenchmarkRESTXMLRequest_Simple_CFDeleteDistro-4             143133        137391        -4.01%
BenchmarkRESTXMLRequest_REST_S3HeadObject-4                 182617        187526        +2.69%
BenchmarkRESTXMLRequest_XML_S3PutObjectAcl-4                212515        174650        -17.82%

benchmark                                                   old allocs     new allocs     delta
BenchmarkRESTXMLBuild_Complex_CFCreateDistro-4              1341           439            -67.26%
BenchmarkRESTXMLBuild_Simple_CFDeleteDistro-4               70             50             -28.57%
BenchmarkRESTXMLBuild_REST_S3HeadObject-4                   143            65             -54.55%
BenchmarkRESTXMLBuild_XML_S3PutObjectAcl-4                  260            122            -53.08%
BenchmarkRESTXMLRequest_Complex_CFCreateDistro-4            1627           723            -55.56%
BenchmarkRESTXMLRequest_Simple_CFDeleteDistro-4             237            217            -8.44%
BenchmarkRESTXMLRequest_REST_S3HeadObject-4                 452            374            -17.26%
BenchmarkRESTXMLRequest_XML_S3PutObjectAcl-4                476            338            -28.99%

The restxml protocol improved greatly in both speed and memory. The RESTXMLBuild_Complex_CFCreateDistro benchmark took about 70% less time and made about 67% fewer allocations.

Overall, both protocols show large speed improvements for complex shapes but only small improvements for simple shapes. A few outliers show minor speed or memory regressions, such as RESTXMLRequest_REST_S3HeadObject (+2.69% ns/op). However, there are more optimizations we can do to potentially eliminate them.

Try out the developer preview of the AWS SDK for Go 2.0 here, and let us know what you think in the comments below!