REST + JSON's biggest strength is human readability. But for internal microservice traffic — where nobody reads the bytes and latency and throughput matter — the cost of JSON parsing, a text format, and HTTP/1.1 head-of-line blocking shows up. gRPC, open-sourced by Google in 2015, removes these costs via protobuf's binary wire format, HTTP/2 multiplexed streams, and code generation. This guide covers how gRPC actually works, its four RPC modes, where it beats and loses to REST, and why you still can't call it directly from a browser.
The Big Picture
        .proto file (schema)
                │
                │ protoc + plugin
                ▼
client stub            server skeleton
(Java/Go/Py/…)         (Java/Go/Py/…)
      │                       ▲
      │ method call           │ method impl
      ▼                       │
┌─────────────────────────────┐
│ gRPC runtime │
│ ┌─────────────────────────┐ │
│ │ protobuf encode/decode │ │
│ ├─────────────────────────┤ │
│ │ HTTP/2 frames │ │ ← multiplexed streams
│ ├─────────────────────────┤ │
│ │ TCP + TLS │ │
│ └─────────────────────────┘ │
└─────────────────────────────┘
Key idea: the developer defines one .proto → stubs are generated for
each target language → calling a stub method looks like "a regular
function" but is actually an RPC.
.proto and Code Generation
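// user.proto
syntax = "proto3";
package myapp;
// Required by protoc-gen-go; this import path is a placeholder.
option go_package = "example.com/myapp/pb";
service UserService {
  rpc GetUser (GetUserRequest) returns (User);
  rpc ListUsers (ListUsersRequest) returns (stream User); // server stream
  rpc UpdateProfile (stream ProfilePatch) returns (User); // client stream
  rpc Chat (stream Message) returns (stream Message); // bidi
}
message GetUserRequest { int64 id = 1; }
message User { int64 id = 1; string name = 2; string email = 3; }
// Referenced by the service above; the fields below are illustrative only.
message ListUsersRequest { int32 page_size = 1; }
message ProfilePatch { string name = 1; string email = 2; }
message Message { string text = 1; }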
// Compile
protoc --go_out=. --go-grpc_out=. user.proto
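// Python: run protoc through the grpcio-tools package, which bundles the gRPC plugin
python -m grpc_tools.protoc -I. --python_out=. --grpc_python_out=. user.proto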
// Use the Go client
client := pb.NewUserServiceClient(conn)
user, err := client.GetUser(ctx, &pb.GetUserRequest{Id: 42})
// → Looks like a normal call, but internally: protobuf encode + HTTP/2 call
Protobuf — Why It Beats JSON
JSON:
{"id":42,"name":"jade","email":"x@y.com"}
→ 41 bytes, expensive to parse (string → number, key matching)
Protobuf (wire format):
08 2a 12 04 6a 61 64 65 1a 07 78 40 79 2e 63 6f 6d
│ │ │ │ jade │ x@y.com
│ │ │ │ field 3, length-delimited (7 bytes)
│ │ field 2, length-delimited (4 bytes)
│ varint(42) = id value
field 1, varint (tag = 1<<3 | 0)
17 bytes total, less than half the size of the JSON.
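The byte layout above can be reproduced by hand. The following is a minimal sketch that uses only the Go standard library, not the real protobuf runtime; `EncodeUser`, `appendTag`, and `jsonUser` are hypothetical helpers written for this illustration, and the encoder skips the proto3 rule that default-valued fields are omitted.

```go
package main

import (
	"encoding/binary"
	"encoding/json"
	"fmt"
)

// Wire types from the protobuf encoding rules.
const (
	wireVarint = 0 // int32, int64, bool, enum
	wireLen    = 2 // string, bytes, nested messages
)

// appendTag writes a field key: (field number << 3) | wire type.
func appendTag(buf []byte, field, wireType uint64) []byte {
	return binary.AppendUvarint(buf, (field<<3)|wireType)
}

// EncodeUser hand-encodes User{id, name, email} with field numbers 1, 2, 3.
// Simplification: real proto3 encoders skip fields that hold default values.
func EncodeUser(id int64, name, email string) []byte {
	var buf []byte
	buf = appendTag(buf, 1, wireVarint)
	buf = binary.AppendUvarint(buf, uint64(id))
	for i, s := range []string{name, email} {
		buf = appendTag(buf, uint64(i+2), wireLen)
		buf = binary.AppendUvarint(buf, uint64(len(s)))
		buf = append(buf, s...)
	}
	return buf
}

// jsonUser mirrors the same message for the size comparison.
type jsonUser struct {
	ID    int64  `json:"id"`
	Name  string `json:"name"`
	Email string `json:"email"`
}

func main() {
	pb := EncodeUser(42, "jade", "x@y.com")
	js, err := json.Marshal(jsonUser{ID: 42, Name: "jade", Email: "x@y.com"})
	if err != nil {
		panic(err)
	}
	fmt.Printf("protobuf: % x (%d bytes)\n", pb, len(pb))
	fmt.Printf("json:     %s (%d bytes)\n", js, len(js))
}
```

Running it prints the same 17 protobuf bytes shown in the breakdown, next to the 41-byte JSON document.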
Pros:
- Compact (less network)
- Fast (direct field-number mapping, no key matching)
- Schema enforced (generated types catch misspelled or mistyped fields early)
Cons:
- Not human-readable — need grpcurl / proto reflection for debugging
- Useless without the schema — old binary logs are hard to parse
HTTP/2 — gRPC's Second Foundation
HTTP/1.1:
One TCP connection = one in-flight request (head-of-line blocking)
100 requests = 100 connections or 100 sequential trips
HTTP/2:
Multiplexed streams over one TCP connection — 100 concurrent requests
Binary framing — no text parsing
Header compression (HPACK) — repeated header costs ↓
Server push (rarely used)
gRPC mapping:
1 RPC = 1 HTTP/2 stream
request = HEADERS frame + DATA frames
response = HEADERS frame + DATA frames + trailers (a closing HEADERS frame that carries grpc-status)
→ Thousands of concurrent RPCs over a single connection. Connection
setup (TLS handshake) happens just once.
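Inside those DATA frames, gRPC adds one more thin layer: every serialized message is prefixed with a 1-byte compression flag and a 4-byte big-endian length, so the receiver knows where one message ends and the next begins. A minimal sketch of that framing, using only the standard library (`FrameMessage` and `ReadMessage` are hypothetical names, and the payloads are placeholder strings rather than real protobuf):

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// headerLen is the gRPC message prefix: 1 flag byte + 4 length bytes.
const headerLen = 5

// FrameMessage wraps one serialized message in the length-prefixed form
// that gRPC places inside HTTP/2 DATA frames.
func FrameMessage(payload []byte, compressed bool) []byte {
	frame := make([]byte, headerLen, headerLen+len(payload))
	if compressed {
		frame[0] = 1
	}
	binary.BigEndian.PutUint32(frame[1:], uint32(len(payload)))
	return append(frame, payload...)
}

// ReadMessage pops one message off the front of buf and returns the rest.
func ReadMessage(buf []byte) (payload, rest []byte, err error) {
	if len(buf) < headerLen {
		return nil, nil, errors.New("short message header")
	}
	n := int(binary.BigEndian.Uint32(buf[1:headerLen]))
	if len(buf)-headerLen < n {
		return nil, nil, errors.New("truncated message body")
	}
	return buf[headerLen : headerLen+n], buf[headerLen+n:], nil
}

func main() {
	// A streaming response body is just frames back to back.
	body := append(FrameMessage([]byte("user-1"), false),
		FrameMessage([]byte("user-2"), false)...)
	for len(body) > 0 {
		msg, rest, err := ReadMessage(body)
		if err != nil {
			panic(err)
		}
		fmt.Printf("message: %s\n", msg)
		body = rest
	}
}
```

This is why a single stream can carry one message (unary) or many (streaming) without any change to the HTTP/2 layer.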
cf. REST + HTTP/1.1: a round trip per request, mitigated by keep-alive
but no multiplexing. REST over HTTP/2 inherits some of the benefit.
The Four RPC Modes
# 1. Unary — most common, equivalent to a single REST call
rpc GetUser (Req) returns (Resp);
client → 1 request, server → 1 response.
# 2. Server Streaming
rpc ListUsers (Req) returns (stream User);
client → 1 request, server → N responses (one stream).
Use: large result sets, progress updates, server-side push.
# 3. Client Streaming
rpc Upload (stream Chunk) returns (UploadResult);
client → N requests (one stream), server → 1 response (at the end).
Use: file uploads, ingesting sensor data.
# 4. Bidirectional Streaming
rpc Chat (stream Msg) returns (stream Msg);
client ↔ server: each side sends and receives independently over the same HTTP/2 stream.
Use: chat, real-time games, collaborative editing.
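To make the four shapes concrete, here is a rough sketch of what a handler for each mode looks like. It deliberately does not use the grpc-go API: plain callbacks and channels stand in for the generated stream objects, and `User`, `GetUser`, `ListUsers`, `Upload`, and `Chat` are hand-written stand-ins for illustration only.

```go
package main

import "fmt"

// User is a hand-written stand-in for the generated message type.
type User struct {
	ID   int64
	Name string
}

// GetUser is unary: one request in, one response out.
func GetUser(id int64) User {
	return User{ID: id, Name: fmt.Sprintf("user-%d", id)}
}

// ListUsers is server streaming: one request, then send is called once
// per response until the handler returns.
func ListUsers(limit int, send func(User) error) error {
	for id := 1; id <= limit; id++ {
		if err := send(GetUser(int64(id))); err != nil {
			return err // the client went away or the stream broke
		}
	}
	return nil
}

// Upload is client streaming: drain the inbound stream, then answer once.
func Upload(chunks <-chan []byte) (totalBytes int) {
	for c := range chunks {
		totalBytes += len(c)
	}
	return totalBytes
}

// Chat is bidirectional: replies are written while requests still arrive.
func Chat(in <-chan string, out chan<- string) {
	defer close(out)
	for msg := range in {
		out <- "echo: " + msg
	}
}

func main() {
	fmt.Println(GetUser(42).Name)

	_ = ListUsers(3, func(u User) error {
		fmt.Println("got", u.Name)
		return nil
	})

	chunks := make(chan []byte, 2)
	chunks <- []byte("hello ")
	chunks <- []byte("world")
	close(chunks)
	fmt.Println("uploaded bytes:", Upload(chunks))

	in, out := make(chan string), make(chan string)
	go Chat(in, out)
	in <- "hi"
	fmt.Println(<-out)
	close(in)
}
```

The only thing that changes between modes is which side may send more than once; the transport underneath is the same stream.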
→ REST over plain HTTP needs SSE / WebSocket / long-poll as separate
mechanisms for streaming. gRPC handles all four modes in one
framework.
Deadlines · Cancellation · Metadata
# Deadline (timeout)
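ctx, cancel := context.WithTimeout(ctx, 200*time.Millisecond)
defer cancel() // release the timer even when the call returns early
client.GetUser(ctx, ...)
→ The call is cancelled automatically if no response arrives within 200ms.
→ Propagation: if that server passes its request context to another gRPC
call, the remaining deadline travels with it → cascading timeouts are
handled naturally.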
# Cancellation
Client cancels → the server stream gets a cancel signal too.
→ Halt unnecessary work immediately (long queries, big responses).
# Metadata (equivalent to headers)
md := metadata.Pairs("authorization", "Bearer …", "trace-id", "abc")
ctx := metadata.NewOutgoingContext(ctx, md)
→ Same role as REST headers, key/value pairs.
# Status codes
gRPC has its own status codes (17: OK plus 16 error codes), distinct from HTTP statuses.
OK / CANCELLED / DEADLINE_EXCEEDED / NOT_FOUND / PERMISSION_DENIED /
RESOURCE_EXHAUSTED / UNAVAILABLE / INTERNAL …
vs REST — When to Use What
| Axis | REST + JSON | gRPC + protobuf |
|---|---|---|
| Payload size | Large (text) | Small (binary) |
| Parse cost | High | Low |
| Schema enforcement | OpenAPI, separate | .proto is the schema |
| Streaming | SSE / WS, separate | Native, 4 modes |
| Multiplexing | Needs HTTP/2 | Built-in |
| Human readable | Yes (curl works) | No (grpcurl etc.) |
| Browser support | Native fetch | No (needs gRPC-Web) |
| Cache-friendly | HTTP cache works | No (POST-only) |
| External exposure | Standard (familiar) | Rare (usually via gateway) |
Browser Limits — gRPC-Web
Browser fetch / XHR can't reach some HTTP/2 features (trailer headers,
raw frame control). So you can't call pure gRPC directly.
Solution: gRPC-Web
- Browser ↔ proxy (Envoy / grpc-web-proxy) speaks HTTP/1.1 or a
restricted HTTP/2 subset
- proxy ↔ backend speaks real gRPC
- Some streaming modes (client / bidi) are unsupported or hacky
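The trailer problem is worth seeing in bytes. Since browser code cannot read HTTP/2 trailers, gRPC-Web moves them into the response body as a final length-prefixed frame whose first byte has the high bit set. A minimal sketch of how a client might split such a body, standard library only (`Frame`, `frame`, and `SplitBody` are hypothetical names, and the message payload is a placeholder string):

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// trailerFlag is the high bit of the first frame byte; gRPC-Web sets it
// on the frame that carries trailers instead of a message.
const trailerFlag = 0x80

// Frame is one length-prefixed unit from a gRPC-Web response body.
type Frame struct {
	IsTrailer bool
	Payload   []byte
}

// frame builds a 5-byte header (flag + big-endian length) plus payload.
func frame(flag byte, payload []byte) []byte {
	hdr := []byte{flag, 0, 0, 0, 0}
	binary.BigEndian.PutUint32(hdr[1:], uint32(len(payload)))
	return append(hdr, payload...)
}

// SplitBody walks a response body and separates messages from trailers.
func SplitBody(body []byte) ([]Frame, error) {
	var frames []Frame
	for len(body) > 0 {
		if len(body) < 5 {
			return nil, errors.New("truncated frame header")
		}
		n := int(binary.BigEndian.Uint32(body[1:5]))
		if len(body)-5 < n {
			return nil, errors.New("truncated frame payload")
		}
		frames = append(frames, Frame{
			IsTrailer: body[0]&trailerFlag != 0,
			Payload:   body[5 : 5+n],
		})
		body = body[5+n:]
	}
	return frames, nil
}

func main() {
	// What a proxy might hand to the browser: one message, then trailers.
	body := append(frame(0x00, []byte("<protobuf bytes>")),
		frame(trailerFlag, []byte("grpc-status: 0\r\n"))...)
	frames, err := SplitBody(body)
	if err != nil {
		panic(err)
	}
	for _, f := range frames {
		fmt.Printf("trailer=%v payload=%q\n", f.IsTrailer, f.Payload)
	}
}
```

Everything the browser needs now arrives as ordinary body bytes, which is exactly what fetch / XHR can deliver.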
→ So public-facing APIs are usually REST or GraphQL, and gRPC is
common only for internal service-to-service traffic.
→ Connect-RPC / Twirp are alternative designs that work directly from
browsers (HTTP/1.1 + JSON too).
Common Pitfalls
- Schema breaking changes — protobuf field numbers are forever. Never reuse them. Mark deletions as reserved. Type changes also break compatibility (int32 → int64 OK, int32 → string not OK).
- HTTP status vs gRPC status confusion — a successful gRPC transport is HTTP 200; the actual OK / NOT_FOUND comes back as a gRPC status code in the trailer. Monitoring has to look at both.
- Load balancer compatibility — gRPC's long-lived, multiplexed HTTP/2 connections don't play well with L4 load balancers (every RPC on a connection sticks to one server → unbalanced load). Use L7 balancing (Envoy, nginx 1.13.10+) or client-side load balancing (xDS).
- No deadline — gRPC sets no deadline by default, so if the client doesn't set one, a stuck call can hang indefinitely. The standard pattern is to require deadlines on every RPC.
- Simplistic error model — the 17 gRPC status codes aren't enough for domain errors. Attach structured errors via google.rpc.Status's details (Any).
- Auth for external exposure — gRPC supports Bearer tokens via metadata. But IAM / OAuth2 integration is custom code. See oauth2-explained.
- Generated-code build burden — every .proto change requires regenerating stubs in every language. Manage with monorepo + buf or similar.
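The load balancer pitfall is easy to underestimate, so here is a toy simulation of it. This is a simplified model, not how any particular balancer is implemented: `BalanceL4` pins each connection to one backend, `BalanceL7` routes each RPC separately in round-robin order, and the traffic numbers are made up.

```go
package main

import "fmt"

// BalanceL4 models a connection-level balancer: a backend is picked once
// per connection, and every RPC on that connection follows it.
func BalanceL4(rpcsPerConn []int, backends int) []int {
	load := make([]int, backends)
	for conn, rpcs := range rpcsPerConn {
		load[conn%backends] += rpcs
	}
	return load
}

// BalanceL7 models a request-level balancer: each RPC (one HTTP/2 stream)
// is routed on its own, round-robin across backends.
func BalanceL7(rpcsPerConn []int, backends int) []int {
	load := make([]int, backends)
	next := 0
	for _, rpcs := range rpcsPerConn {
		for i := 0; i < rpcs; i++ {
			load[next%backends]++
			next++
		}
	}
	return load
}

func main() {
	// Three client connections; one busy service sends most of the traffic.
	traffic := []int{900, 50, 50}
	fmt.Println("L4:", BalanceL4(traffic, 3)) // L4: [900 50 50]
	fmt.Println("L7:", BalanceL7(traffic, 3)) // L7: [334 333 333]
}
```

With HTTP/1.1 the two models behave similarly because connections are short-lived or carry one request at a time; with gRPC, a single chatty client can saturate one backend while the others sit idle.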
Wrap-up
gRPC's strength in one line: strong schemas + binary wire + HTTP/2 multiplexing + streaming + multi-language code generation. When the conditions that reward those strengths stack up (many internal services, polyglot teams, high throughput), gRPC is usually a clear win over REST.
Conversely, for external public APIs, direct browser callers, and curl-heavy debugging, REST + JSON still wins. The practical pattern: gRPC inside, an edge gateway that translates to REST/GraphQL for the outside world. Cache, rate-limit, and other HTTP-based concerns (cors-explained, rate-limiting-strategies) live at that gateway.
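One concrete job of that gateway is translating gRPC status codes into HTTP statuses for outside callers. The sketch below shows one possible table; it follows a correspondence commonly seen in gRPC-to-HTTP gateways, but the exact choices are an assumption to adapt, and `Code` and `HTTPStatus` are hand-written here rather than taken from a gRPC library.

```go
package main

import (
	"fmt"
	"net/http"
)

// Code mirrors the numeric gRPC status codes (0 through 16).
type Code int

const (
	OK Code = iota
	Cancelled
	Unknown
	InvalidArgument
	DeadlineExceeded
	NotFound
	AlreadyExists
	PermissionDenied
	ResourceExhausted
	FailedPrecondition
	Aborted
	OutOfRange
	Unimplemented
	Internal
	Unavailable
	DataLoss
	Unauthenticated
)

// HTTPStatus picks the HTTP status a gateway could return for a gRPC code.
func HTTPStatus(c Code) int {
	switch c {
	case OK:
		return http.StatusOK
	case InvalidArgument, FailedPrecondition, OutOfRange:
		return http.StatusBadRequest
	case Unauthenticated:
		return http.StatusUnauthorized
	case PermissionDenied:
		return http.StatusForbidden
	case NotFound:
		return http.StatusNotFound
	case AlreadyExists, Aborted:
		return http.StatusConflict
	case ResourceExhausted:
		return http.StatusTooManyRequests
	case Cancelled:
		return 499 // non-standard "client closed request"
	case Unimplemented:
		return http.StatusNotImplemented
	case Unavailable:
		return http.StatusServiceUnavailable
	case DeadlineExceeded:
		return http.StatusGatewayTimeout
	default:
		// Unknown, Internal, DataLoss, and anything unrecognised.
		return http.StatusInternalServerError
	}
}

func main() {
	for _, c := range []Code{OK, NotFound, ResourceExhausted, Unavailable, Internal} {
		fmt.Printf("gRPC code %2d → HTTP %d\n", c, HTTPStatus(c))
	}
}
```

Keeping this table in one place at the edge means internal services can speak in gRPC codes only, while HTTP caches, rate limiters, and browser clients see the statuses they expect.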