Quickstart¶
1. Start the Server¶
# Default (localhost:5555)
maru-server
# With custom host/port
maru-server --host 0.0.0.0 --port 5556
# With debug logging
maru-server --log-level DEBUG
2. Run a Client Example¶
maru-servermust be running before proceeding. See step 1 above.
Zero-Copy Store & Retrieve¶
from maru import MaruConfig, MaruHandler
config = MaruConfig(
server_url="tcp://localhost:5555",
pool_size=1024 * 1024 * 100, # 100MB
)
with MaruHandler(config) as handler:
data = b"A" * (1024 * 1024) # 1MB KV chunk
# 1. Allocate a page in CXL shared memory
handle = handler.alloc(size=len(data))
# 2. Write directly to CXL memory (mmap — no intermediate buffer)
handle.buf[:] = data
# 3. Register the key — only metadata (key → region, offset) is sent
handler.store(key=42, handle=handle)
# Retrieve: returns a memoryview pointing into CXL memory
result = handler.retrieve(key=42)
assert result is not None
assert bytes(result.view[:5]) == b"AAAAA"
The three steps — alloc(), write to handle.buf, store(handle=) — make Maru’s control/data plane separation explicit:
Control plane:
alloc()requests a memory region from the Resource Manager;store()registers the key’s location metadata (~tens of bytes) with MaruServer.Data plane:
handle.buf[:]reads/writes directly on mmap’d CXL memory. No server involved.
Full runnable example:
examples/basic/single_instance.py
Cross-Instance Sharing¶
This is the core of Maru — two independent processes sharing KV cache through CXL shared memory with zero data copy. Metadata travels, data doesn’t.
flowchart LR
P[Producer] -->|"alloc + write"| CXL[(CXL Memory)]
P -->|"store(key)"| S[MaruServer]
C[Consumer] -->|"retrieve(key)"| S
S -.->|"location"| C
C -->|"read"| CXL
Terminal 1 — Producer: allocate, write, register metadata:
from maru import MaruConfig, MaruHandler
config = MaruConfig(
server_url="tcp://localhost:5555",
instance_id="producer",
pool_size=1024 * 1024 * 100,
)
with MaruHandler(config) as handler:
for i, key in enumerate([1001, 1002, 1003]):
data = bytes([ord("A") + i]) * (1024 * 1024)
handle = handler.alloc(size=len(data))
handle.buf[:] = data
handler.store(key=key, handle=handle)
input("Press Enter to exit...")
Terminal 2 — Consumer: retrieve from CXL (zero copy):
from maru import MaruConfig, MaruHandler
config = MaruConfig(
server_url="tcp://localhost:5555",
instance_id="consumer",
pool_size=1024 * 1024 * 100,
)
with MaruHandler(config) as handler:
for key in [1001, 1002, 1003]:
result = handler.retrieve(key=key)
# result.view points directly into Producer's CXL region (mapped read-only).
# No data was copied — consumer reads the same physical memory.
assert result is not None
print(f"key={key}: {len(result.view)} bytes, char={chr(result.view[0])!r}")
Full runnable examples:
examples/basic/producer.py+examples/basic/consumer.pySingle-script version:
examples/basic/cross_instance.py
Next Steps¶
LMCache Integration — Use Maru as a shared KV cache backend for LMCache/vLLM
Architecture Overview — System architecture and component interactions
API Reference — Python API documentation
Configuration — Configuration options