XArith is a device-side C++ library for XCENA's Computational Memory (MX1) that provides vector computation primitives for the VPE (Vector Processing Engine). Used together with the MU library (mu/mu.hpp), it lets developers build high-performance MU kernels that run on the MX1's vector processing hardware.

Key Characteristics

Device-only library: Runs exclusively on the MX1, not on the host.
Thread-safe: Safe for concurrent access from multiple threads. For best performance, each thread should use its own VpeContext.
Synchronous operations: All vector operations block until completion. Asynchronous operations will be available in a future release.

API Reference

Context

Method	Description
`VpeContext(dimension, strategy)`	Create context with vector dimension and VPE strategy
`getDimension()`	Get vector dimension
`getAvailableBufferCount()`	Get number of currently available buffer slots
`getMaxBufferCount()`	Get maximum number of buffer slots

Buffer Management

Method	Description
`allocateBuffer()`	Allocate one SRAM buffer; returns its offset, or `INVALID_BUFFER` if allocation fails
`tryAllocateBuffers(count, outBuffers)`	Atomically allocate multiple buffers (all-or-nothing), returns `true` on success
`freeBuffer(offset)`	Free previously allocated buffer
`freeBuffers(count, buffers)`	Free multiple buffers in a single atomic operation
`load(srcDram, destBuffer)`	Load vector from DRAM to SRAM
`store(srcBuffer, destDram)`	Store vector from SRAM to DRAM
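
The all-or-nothing behavior of `tryAllocateBuffers` can be illustrated with a small host-side model. This is only a sketch under stated assumptions: `MockSlotPool`, the use of slot indices as offsets, and the sentinel value chosen for `INVALID_BUFFER` are hypothetical stand-ins, not the XArith implementation, and the model is single-threaded, so it does not reproduce the library's atomicity guarantees.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical host-side model of a fixed pool of SRAM buffer slots.
// Names mirror the table above, but nothing here is taken from XArith itself.
constexpr std::uint32_t INVALID_BUFFER = 0xFFFFFFFFu;  // assumed sentinel value

class MockSlotPool {
public:
    explicit MockSlotPool(std::size_t slotCount) : used_(slotCount, false) {}

    std::size_t getMaxBufferCount() const { return used_.size(); }

    std::size_t getAvailableBufferCount() const {
        std::size_t freeSlots = 0;
        for (bool inUse : used_) {
            if (!inUse) ++freeSlots;
        }
        return freeSlots;
    }

    // Returns the index of a free slot, or INVALID_BUFFER if the pool is full.
    std::uint32_t allocateBuffer() {
        for (std::size_t i = 0; i < used_.size(); ++i) {
            if (!used_[i]) {
                used_[i] = true;
                return static_cast<std::uint32_t>(i);
            }
        }
        return INVALID_BUFFER;
    }

    // All-or-nothing: either `count` slots are reserved and written to
    // outBuffers, or the pool is left unchanged and false is returned.
    bool tryAllocateBuffers(std::size_t count, std::uint32_t* outBuffers) {
        if (count > getAvailableBufferCount()) return false;
        for (std::size_t i = 0; i < count; ++i) {
            outBuffers[i] = allocateBuffer();
        }
        return true;
    }

    void freeBuffer(std::uint32_t offset) {
        if (offset < used_.size()) used_[offset] = false;
    }

    void freeBuffers(std::size_t count, const std::uint32_t* buffers) {
        for (std::size_t i = 0; i < count; ++i) freeBuffer(buffers[i]);
    }

private:
    std::vector<bool> used_;  // true when the slot is currently allocated
};
```

In this model, a kernel that needs several buffers at once (for example two inputs and one output) requests them together, so a failed request never leaves it holding a partial set that it would have to release by hand.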

Vector Operations

Method	Description
`dot(buf1, buf2)`	Compute dot product
`add(src1, src2, dest)`	Element-wise addition: dest = src1 + src2
`sub(src1, src2, dest)`	Element-wise subtraction: dest = src1 - src2
`mul(src1, src2, dest)`	Element-wise multiplication: dest = src1 * src2
`div(src1, src2, dest)`	Element-wise division: dest = src1 / src2
`square(src, dest)`	Element-wise square: dest = src * src
`bitwiseXor(src1, src2, dest)`	Element-wise XOR: dest = src1 ^ src2
`addReduce(src1, src2)`	Sum over all elements of src1 + src2
`subReduce(src1, src2)`	Sum over all elements of src1 - src2
`divReduce(src1, src2)`	Sum over all elements of src1 / src2
`equals(buf1, buf2)`	Check if vectors are equal
`isAllZero(buf)`	Check if all elements are zero
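
For checking device results against a host-side reference, the element-wise and reduction semantics listed above can be written in plain C++. This is a sketch, not XArith code: it assumes `float` elements and `std::vector` storage, neither of which is stated by the table, and the `ref` namespace and its functions are hypothetical helpers that operate on host vectors rather than SRAM buffer offsets.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace ref {

using Vec = std::vector<float>;  // assumed element type, for illustration only

// Element-wise addition: out[i] = a[i] + b[i].
inline Vec add(const Vec& a, const Vec& b) {
    assert(a.size() == b.size());
    Vec out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) out[i] = a[i] + b[i];
    return out;
}

// Dot product: sum of a[i] * b[i].
inline float dot(const Vec& a, const Vec& b) {
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) sum += a[i] * b[i];
    return sum;
}

// addReduce: sum of (a[i] + b[i]) over all elements.
inline float addReduce(const Vec& a, const Vec& b) {
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) sum += a[i] + b[i];
    return sum;
}

// subReduce: sum of (a[i] - b[i]) over all elements.
inline float subReduce(const Vec& a, const Vec& b) {
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) sum += a[i] - b[i];
    return sum;
}

// isAllZero: true when every element equals zero.
inline bool isAllZero(const Vec& v) {
    for (float x : v) {
        if (x != 0.0f) return false;
    }
    return true;
}

}  // namespace ref
```

The reductions differ from their element-wise counterparts only in that they collapse the per-element results into a single scalar instead of writing a destination vector.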

VpeIdStrategy Options

Each Sub contains two VPEs. The strategy determines how work is distributed across them.

Strategy	Role	Description
`ByThreadId`	Default	Distributes threads within the same MU across VPEs
`ByClusterId`	Alternative	Distributes MUs from different clusters across VPEs
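
One way to picture the two strategies is as a choice of which ID selects the VPE. The sketch below assumes a simple modulo mapping, which this document does not confirm; `MockVpeIdStrategy`, `kVpesPerSub`, and `pickVpe` are hypothetical names used only for illustration, and the real selection logic lives inside XArith.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of VPE selection; not the library's enum or logic.
enum class MockVpeIdStrategy { ByThreadId, ByClusterId };

// From the text above: each Sub contains two VPEs.
constexpr std::uint32_t kVpesPerSub = 2;

// Assumed mapping: the chosen ID, taken modulo the VPE count, picks the VPE.
// ByThreadId spreads threads of one MU across VPEs; ByClusterId spreads
// MUs from different clusters across VPEs.
constexpr std::uint32_t pickVpe(MockVpeIdStrategy strategy,
                                std::uint32_t threadId,
                                std::uint32_t clusterId) {
    return (strategy == MockVpeIdStrategy::ByThreadId ? threadId : clusterId) %
           kVpesPerSub;
}
```

Under this assumed mapping, `ByThreadId` balances load when a single MU runs several threads, while `ByClusterId` keeps all threads of one MU on the same VPE and balances across clusters instead.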