v1.4.7 (2026-05-28)

Download

New

pxl
  • PXL: Multi-chunk allocations are now served by a virtual-to-physical (v2p) scatter remap that preserves virtual contiguity: memAlloc(N × chunkSize) now succeeds against a fragmented daemon chunk pool by binding physically scattered chunks into a virtually contiguous host range.
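
The scatter remap idea can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not the PXL implementation: ChunkPool, VirtualRange, allocScattered, and translate are hypothetical names, and the real runtime presumably binds chunks at the mapping level rather than through a lookup table.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Toy model of a chunk pool (hypothetical; not the PXL daemon's data structure).
struct ChunkPool {
    std::size_t chunkSize;
    std::vector<bool> used;  // used[i] == true means physical chunk i is taken
};

// A virtually contiguous range backed by scattered physical chunks.
// v2p[k] is the physical chunk index backing virtual chunk k.
struct VirtualRange {
    std::size_t chunkSize;
    std::vector<std::size_t> v2p;
};

// Claim any `count` free chunks, contiguous or not. Returns nullopt (and
// claims nothing) when fewer than `count` chunks are free.
inline std::optional<VirtualRange> allocScattered(ChunkPool& pool, std::size_t count) {
    std::vector<std::size_t> picked;
    for (std::size_t i = 0; i < pool.used.size() && picked.size() < count; ++i) {
        if (!pool.used[i]) picked.push_back(i);
    }
    if (picked.size() < count) return std::nullopt;
    for (std::size_t idx : picked) pool.used[idx] = true;
    return VirtualRange{pool.chunkSize, picked};
}

// Translate an offset inside the virtual range to an offset in the physical pool.
inline std::uint64_t translate(const VirtualRange& r, std::uint64_t virtOffset) {
    const std::size_t vChunk = static_cast<std::size_t>(virtOffset / r.chunkSize);
    assert(vChunk < r.v2p.size());
    return static_cast<std::uint64_t>(r.v2p[vChunk]) * r.chunkSize + virtOffset % r.chunkSize;
}
```

In this model a request for three chunks succeeds even when no three adjacent chunks are free, because the caller only ever sees offsets in the virtual range.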

Changed

Fixed

Docker Image

  • Image Name: xcenadev/sdk:v1.4.7
  • Pull Command:

    docker pull xcenadev/sdk:v1.4.7
    

v1.4.6 (2026-05-15)

Download

Important Notes

  • Internal bandwidth improved by ~20% through internal tuning.
pxl
  • Context::stream() removed from the public API. Use the Stream that each Job owns internally, or allocate a user-managed Stream via pxl::createStream().
  • Concurrent Job count is now effectively bounded by the per-process Stream pool capacity (128 streams, shared with user-created Streams). createJob* returns nullptr / Result::Failure when the pool is exhausted.
  • Multi-stream usage with interleaved flush/execute may observe conservative over-sync (never under-sync) due to max-size-wins matching policy.
  • subFree() and subFree(numSub) invalidate all Maps owned by the Job because each Map snapshots its Job’s Sub assignment at construction time.
  • MemInfo_t struct size changed (deviceId field added). Recompilation required.
  • Reduced nop task turnaround time by ~30–70% on average vs. 1.4.5, with up to 84% improvement in some cases.
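
The Stream pool bound noted above can be pictured with a minimal model. This is a hypothetical sketch, not PXL code: StreamPool, Job, and tryCreateJob are illustrative stand-ins, and only the capacity of 128 and the fact that user-created Streams share the pool come from these notes.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical model of a fixed-capacity, per-process stream pool.
class StreamPool {
public:
    explicit StreamPool(std::size_t capacity) : inUse_(capacity, false) {}

    // Returns a stream id, or nullopt when every slot is taken.
    std::optional<std::size_t> acquire() {
        for (std::size_t id = 0; id < inUse_.size(); ++id) {
            if (!inUse_[id]) {
                inUse_[id] = true;
                return id;
            }
        }
        return std::nullopt;
    }

    // Hand a slot back so a later acquire() can reuse it.
    void release(std::size_t id) {
        assert(id < inUse_.size() && inUse_[id]);
        inUse_[id] = false;
    }

private:
    std::vector<bool> inUse_;
};

// Stand-in for a Job that owns one private stream.
struct Job {
    std::size_t streamId;
};

// Job creation fails (nullopt here) when the shared pool has no stream left.
inline std::optional<Job> tryCreateJob(StreamPool& pool) {
    auto id = pool.acquire();
    if (!id) return std::nullopt;
    return Job{*id};
}
```

With a capacity of 128 and one slot held by a user-created stream, the model admits 127 jobs before creation starts failing, which is why callers should check the result of every createJob* call.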

New

mxdriver_dev
  • mxdriver: DKMS support.
pxl
  • PXL: Added Map::cancel() — soft cancellation that drains in-flight batches, skips unissued ones, and parks the map in the new ExecuteStatus::Cancelled terminal state. The map is re-executable after cancel; argument caches are preserved.
  • PXL: Added Map::getProgress() returning a Progress_t snapshot (target / issued / done packet counts) for live and post-cancel progress inspection.
  • PXL: Added pxl/compat.hpp with deprecated aliases for the pre-3.0 API (pxl::runtime::, pxl::kernel::, pxl::memory::, pxl::PxlResult, pxl::memory::systemCommand). Scheduled for removal in a future release.
  • PXL: Added Device-less Launcher API — pxl::Launcher().execute<K>(N, args).run() for single-line kernel execution with automatic device detection.
  • PXL: Added pxl::allocateMemory / pxl::releaseMemory free functions for device memory allocation.
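
The cancel and progress semantics described for Map::cancel() and Map::getProgress() can be modelled as a small state machine. The sketch below is an assumption-laden toy, not the PXL API: ToyMap, Status, and Progress are hypothetical names that mirror the target / issued / done counters and the Cancelled terminal state mentioned above.

```cpp
#include <cassert>
#include <cstdint>

enum class Status { Idle, Running, Done, Cancelled };

// Mirrors the target / issued / done counters described in the notes.
struct Progress {
    std::uint64_t target, issued, done;
};

class ToyMap {
public:
    explicit ToyMap(std::uint64_t target) : progress_{target, 0, 0} {}

    // Start (or restart) execution from scratch.
    void execute() {
        progress_.issued = 0;
        progress_.done = 0;
        status_ = Status::Running;
    }

    // Issue one more packet if any remain and the map is still running.
    bool issueOne() {
        if (status_ != Status::Running || progress_.issued == progress_.target) return false;
        ++progress_.issued;
        return true;
    }

    // Complete one in-flight packet.
    void completeOne() {
        assert(progress_.done < progress_.issued);
        ++progress_.done;
        if (status_ == Status::Running && progress_.done == progress_.target) status_ = Status::Done;
    }

    // Soft cancel: let in-flight packets finish, never issue the rest.
    void cancel() {
        if (status_ != Status::Running) return;
        progress_.done = progress_.issued;  // stands in for draining in-flight work
        status_ = Status::Cancelled;
    }

    Progress progress() const { return progress_; }
    Status status() const { return status_; }

private:
    Progress progress_;
    Status status_ = Status::Idle;
};
```

After a cancel, issued equals done and both stay below target, so a progress snapshot shows how much work was skipped; calling execute() again restarts the toy map.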
xtop
  • xtop: Multi-device profiling — xtop run auto-detects every device on the node and captures one .xpti per device.
  • xtop: Live recording — xtop view --record FILE.xpti streams the live monitor into an .xpti for later replay.
  • xtop: Post-mortem workflow split — xtop convert (Perfetto), xtop export (CSV/raw), xtop open (browser) replace the bundled run-time export path.

Changed

mxdriver_dev
  • driver: Reduced per-call overhead of command submission ioctl in light workloads by avoiding unnecessary synchronous PCIe reads.
pxl
  • Context::createJob* now allocates a private default Stream per Job from the per-process Stream pool. Multi-Job workloads that previously shared one Context-level Stream are no longer serialised behind a single FIFO.
  • PXL: pxl::flushHostCache(ptr, size) now reduces the host-to-device auto-sync transfer performed by Map::execute() for the same pointer, avoiding a full-allocation flush/DMA when only a sub-region was flushed. Device-to-host output sync still covers the full allocation size for safety. No API signature change.
  • PXL: destroyJob() may now return Failure when it races a concurrent subFree*() on the same Job. Callers should retry after the competing teardown completes.
  • PXL: Map::execute may now block when the shared stream queue is full, instead of returning Failure immediately.
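
The retry advice for destroyJob() can be wrapped in a bounded retry helper. This is a generic sketch under assumptions: the Result enum below is a local stand-in, retryWithBackoff is a hypothetical helper, and the callable would wrap the real destroyJob() call, whose exact signature is not shown in these notes.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Local stand-in for the runtime's result type (assumption).
enum class Result { Success, Failure };

// Call `op` until it succeeds or `maxAttempts` is reached, sleeping `delay`
// between attempts so a competing teardown has time to finish.
inline Result retryWithBackoff(const std::function<Result()>& op, int maxAttempts,
                               std::chrono::milliseconds delay) {
    assert(maxAttempts > 0);
    for (int attempt = 0; attempt < maxAttempts; ++attempt) {
        if (op() == Result::Success) return Result::Success;
        if (attempt + 1 < maxAttempts) std::this_thread::sleep_for(delay);
    }
    return Result::Failure;
}
```

Bounding the attempts keeps a persistent failure from turning into an endless loop; the caller can then log the error and decide whether to leak or abort.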
xtop
  • xtop: xtop run --csv removed. Please use xtop export <.xpti> --csv [DIR].

Fixed

pxl
  • PXL: Optimized host-side performance of Map::execute() on repeated invocations.
  • PXL: Propagated output copyFromDevice failures from the host-side post-execute step.
  • PXL: Fixed teardown races that could fail or crash when subFree*() or destroyJob() overlapped with in-flight async Map or async sub-allocation work.
  • PXL: Fixed submission drops on shared default streams (queue overflow) and under transient internal task-pool pressure.
  • PXL: Fixed incorrect IO memory granularity applied to non-first devices in mixed CXL/non-CXL multi-device systems.
  • PXL: The coredump is now saved at first error detection instead of at job destruction, preserving the correct device state for debugging.
xtop
  • xtop: Devices whose firmware does not support profiling are now skipped, so multi-device runs no longer abort when one device runs a non-profiling firmware.
  • xtop: Stale device-event rows from prior runs are cleaned up so analysis isn’t polluted by leftover hardware-probe data.

Docker Image

  • Image Name: xcenadev/sdk:v1.4.6
  • Pull Command:

    docker pull xcenadev/sdk:v1.4.6
    

v1.4.5 (2026-04-06)

Download

New

  • PXL: Added pxl::memory API domain for standalone CXL memory operations (allocate, release).
  • PXL: Added support for dynamic CXL region rescan.

Changed

  • PXL: Added MAP_POPULATE to CXL DAX mmap to pre-fault pages at allocation time, avoiding latency spikes during copyToDevice.
  • PXL: Optimized HostFinalize.
  • SDK: Documentation: Simplified the “Verify Installation” documentation by using the validate_host.sh validation script guide.
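
The effect of MAP_POPULATE can be tried on any Linux host. The sketch below is a minimal illustration, not the PXL code path: it maps anonymous memory so that it runs without a CXL DAX device, and mapPrefaulted and unmap are hypothetical helper names. The real runtime would map the DAX device instead.

```cpp
#include <sys/mman.h>

#include <cassert>
#include <cstddef>

// Map `bytes` of anonymous memory and ask the kernel to pre-fault it
// (MAP_POPULATE is Linux-specific). Returns nullptr on failure.
inline void* mapPrefaulted(std::size_t bytes) {
    assert(bytes > 0);  // mmap rejects zero-length mappings
    void* p = mmap(nullptr, bytes, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_POPULATE, -1, 0);
    return p == MAP_FAILED ? nullptr : p;
}

// Release a mapping created by mapPrefaulted(). Returns true on success.
inline bool unmap(void* p, std::size_t bytes) {
    return munmap(p, bytes) == 0;
}
```

Because the pages are faulted in during the mmap call, the first write to the range no longer pays the page-fault cost, at the price of a slower allocation.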

Fixed

  • CXL Emulator: Fixed a PXL versioning issue on QEMU.
  • PXL: Fixed invalid PXL versioning information.
  • SDK: Documentation: Fixed invalid GitHub URL links.
  • xTop: Improved the Perfetto trace host/device timeline and the analysis map/task report.
  • xTop: Added multi-device TUI support (xtop view).
  • xTop: Integrated pxltop into xtop view.

Docker Image

  • Image Name: xcenadev/sdk:v1.4.5
  • Pull Command:

    docker pull xcenadev/sdk:v1.4.5
    

v1.4.4 (2026-03-16)

Download

New

  • PXL: Added version CLI (pxl --version)
    • Adds a command-line option to display the installed PXL version.
  • PXL: Latest XPTI profiling integration
    • Embeds host API tracing, device task/memory event capture, and Perfetto-compatible trace export directly.
  • PXL: Added copyToDeviceAsync for v2p updates
    • Reduces DeviceInit overhead by using asynchronous device copies during the v2p update stage.
  • CLI: Added fw-switch command to CLI guide
    • fw-switch command for switching between installed firmware versions on the device.

Changed

  • No items.

Fixed

  • xTop: Fixed profiling failure in xtop run
    • Fixes an issue that prevented profiling from working properly in xtop run.
  • Fixed FW version mismatch
    • Fixes an issue where the FW version did not match the latest release FW version (based on bsp v.1.0.4).

Important Notes

  • No items.

Docker Image

  • Image Name: xcenadev/sdk:v1.4.4
  • Pull Command:

    docker pull xcenadev/sdk:v1.4.4
    

v1.4.3 (2026-02-13)

Download

New

  • PXL: Adaptive CXL memory allocation
    • Enables automatic expansion into the I/O memory region to utilize the CXL memory area even on systems without a CXL DAX device (/dev/daxX.0).
  • PXL: Performance profiling & tracing infrastructure
    • Adds the ability to collect and analyze detailed performance data during application execution.
  • xTOP:
    • xTOP is a real-time profiler tool based on XPTI that provides interactive terminal monitoring (view mode) and application profiling with CSV export (run mode) for device performance analysis.
    • Adds on-demand analysis commands for .xpti profile data — per-task breakdown, Map-level summary, Perfetto trace export, and one-click Perfetto UI visualization.
  • sdk_release
    • Docs: Added memory test documentation and tutorial.
  • mu_lib
    • MU_LIB: Added Cache Prefetch API.

Changed

  • PXL: Optimized installation process
    • Reduces install time by removing unnecessary dependencies and simplifying submodule initialization.
  • PXL: Improved log output
    • Changes MU message prefix from [host] to [mu] and displays sub-indices linearly.
  • mxdriver: Improved DMA submission handling
    • Enhances timeout and interrupt handling during DMA command submission to improve system responsiveness.
  • mxdriver: Stronger concurrency control
    • Reinforces concurrency management during DMA job processing.
  • Emulator: Ubuntu 24.04-based rootfs added
    • Supports running the emulator on a newer Ubuntu environment.
  • sdk_release
    • CI: Added memory_test to example test suite.
    • SDK: Removed xkvstore and xmapreduce submodules from the SDK package.
    • SDK: Updated installation scripts to exclude xkvstore and xmapreduce modules.

Fixed

  • PXL: Fixed pxltrace installation failure
    • Resolves an issue where the pxltrace tool failed to install.
  • PXL: Memory leak fix after Map completion
    • Fixes a memory leak where statistics data was not properly cleaned up after Map operations.
  • PXL: Improved device initialization validation
    • Adds defensive checks for invalid memory properties to prevent abnormal termination.
  • PXL: Host cache flush bug at boundary addresses
    • Fixes an issue where Host cache flush did not work correctly at boundary addresses (this was addressed in v1.4.2).
  • mxdriver: Build fix for device_match_t type mismatch
    • Fixes a compilation error caused by a device_match_t type mismatch due to kernel version changes.
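
Boundary-address bugs in cache flushing typically come from rounding a byte range to cache lines incorrectly. The notes do not describe the actual root cause, so the following is only a generic sketch of the rounding itself; coveringLines and LineRange are hypothetical names and the 64-byte line size is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// First cache-line address and number of lines covering a byte range.
struct LineRange {
    std::uintptr_t first;
    std::size_t lineCount;
};

// Compute the cache lines that overlap [addr, addr + size).
// lineSize must be a power of two; 64 is an assumed default.
inline LineRange coveringLines(std::uintptr_t addr, std::size_t size, std::size_t lineSize = 64) {
    assert(lineSize != 0 && (lineSize & (lineSize - 1)) == 0);
    const std::uintptr_t mask = static_cast<std::uintptr_t>(lineSize) - 1;
    const std::uintptr_t first = addr & ~mask;
    if (size == 0) return {first, 0};
    const std::uintptr_t last = (addr + size - 1) & ~mask;  // line holding the final byte
    return {first, static_cast<std::size_t>((last - first) / lineSize) + 1};
}
```

An 8-byte range starting at address 60 straddles two lines, which a naive size / lineSize computation would round down to zero.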

Important Notes

  • We recommend using firmware compatible with SDK v1.4.3.
  • On systems without CXL support, adaptive memory allocation is automatically enabled to optimize performance.
  • The memory_test example requires MU compiler toolchain to build.
  • The memory_test runs with 192 GiB of memory and 3072 tasks by default (adjustable via CLI options).
  • The xkvstore and xmapreduce libraries are no longer included in the SDK.
  • Users who depend on these libraries should use alternative solutions or contact the development team.

Docker Image

  • Image Name: xcenadev/sdk:1.4.3
  • Pull Command:

    docker pull xcenadev/sdk:1.4.3
    

v1.4.2 (2025-12-11)

New

  • PXL:
    • Added ioMemAlloc() support for the CXL memory area in non-CXL host environments.
    • Added real-time progress display for Map (enable with the environment variable PXL_ENABLE_PROGRESS).
    • Added task tracing (CSV traces, Perfetto UI, summary; enable with the environment variable PXL_ENABLE_TRACE).
    • Added locality mode for Map.
    • Added Destroy API for Map.

Changed

  • PXL:
    • Updated MU Compiler path.
    • Improved internal error log messages.
    • Optimized launch overhead & task load balancing.
  • MXDriver
    • Improved to support installation regardless of the kernel version.

Fixed

  • PXL:
    • Fixed the pxltop UI not being shown until a key was pressed.
    • Fixed an abnormal hang when resetting a Sub.
    • Fixed host cache flush logic bug for boundary address case.
  • Example:
    • Fixed integer overflow in memory allocation calculation.
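
An overflow in an allocation-size calculation usually comes from multiplying a count by an element size in a type that is too narrow. The sketch below shows one common guard; it is not the fix that went into the example, and checkedMul and checkedMulSigned are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <optional>

// Multiply count * elemSize, returning nullopt instead of wrapping around.
inline std::optional<std::size_t> checkedMul(std::size_t count, std::size_t elemSize) {
    if (elemSize != 0 && count > std::numeric_limits<std::size_t>::max() / elemSize) {
        return std::nullopt;
    }
    return count * elemSize;
}

// Same check for signed inputs, which is where narrow arithmetic often sneaks in.
// Both arguments must be non-negative.
inline std::optional<std::size_t> checkedMulSigned(long long count, long long elemSize) {
    assert(count >= 0 && elemSize >= 0);
    return checkedMul(static_cast<std::size_t>(count), static_cast<std::size_t>(elemSize));
}
```

Widening both operands before the multiplication, and rejecting products that do not fit, keeps the size passed to the allocator equal to the size the caller intended.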

Important Notes

Docker Image

  • Image Name: xcenadev/sdk:1.4.2
  • Pull Command:

    docker pull xcenadev/sdk:1.4.2
    

v1.4.0 (2025-10-31)

New

  • PXL:
    • Added Executor statistics feature.
    • Implemented adaptive CXL memory enumeration.
    • Added version compatibility check between pxl and SDK framework.
    • Introduced Multi-Sub management functionality (alloc, free, reset).
    • Added Map setInput / setOutput support.
  • QEMU:
    • Added support for CXL 2.0 specification.
  • CLI:
    • Added new commands: fw-info and fw-update to xcena_cli.
  • DEBUGGER:
    • Added installation path option (-p)
  • MU_LIB:
    • Introduced a new function that retrieves execution count to enhance performance monitoring.

Changed

  • PXL:
    • PXL Error logs are now printed by default.
    • When Debug logs are enabled, packet push/pop events are printed by default.

Fixed

  • PXL: Improved fail recovery and hang handling
  • QEMU: Fixed invalid memfree issue
  • MU_LIB:
    • Resolved an intermittent issue where the cache flush operation might not complete as expected.
    • Corrected a problem that could cause atomic operations to fail under specific conditions.

Important Notes

Docker Image

  • Image Name: xcenadev/sdk:1.4.0
  • Pull Command:

    docker pull xcenadev/sdk:1.4.0
    

v1.3.0 (2025-08-28)

New

  • PXL:
    • Memory Allocation: Added support for calloc via pxl::runtime::Context::memCalloc(const size_t num, const size_t size).
    • Resource:
      • Added support for loading large MU ELF files (up to 256MB).
      • Added support for loading modules from image.
      • Added support for directly loading MU binary files into pxl::runtime::Job using pxl::runtime::Job::load(const char* path).
      • Added support for resource queries via pxl::runtime::Context::getAttribute(DeviceAttr attr, uint64_t& value).
  • MU Library:
    • Added support for large MU ELF files (up to 256MB)
    • Enabled ID reading functionality for individual Mu instances
    • Introduced NDArray support for enhanced data handling
    • Supported Global Read/Write operations and Cache Flush functionality

Changed

  • PXL:
    • Functions that previously returned bool now return PxlResult.
    • invalidL1(...), flushL1(...) -> invalidL2(...), flushL2(...)
    • Data structures:
      • pxl::runtime::Tensor<T> has been renamed to pxl::runtime::NDArray<T>.
        pxl::runtime::NDArray<T> is internally split into multiple mu::NDArray<T> objects for parallel execution in kernels.
    • Namespaces:
      • pxl::Device, pxl::Sub, pxl::Task -> pxl::device::Device, pxl::device::Sub, pxl::device::Task
      • pxl::Module, pxl::Function, pxl::createModule(...) -> pxl::kernel::Module, pxl::kernel::Function, pxl::kernel::createModule(...)
    • Resource queries:
      • Context::remainSub() -> Context::availableSubCount()
      • Context::remainMemorySize() -> Context::availableMemSize()
      • Context::remainIoMemorySize() -> Context::availableIoMemSize()
    • Map:
      • Map::setSuccessCallback() -> Map::setCompleteCallback()
    • Logger:
      • pxl::log::setLog(...) is removed.
  • MU Library:
    • atomicRead -> atomicLoad
    • Explicitly separated asynchronous functions for clarity and control.
  • Updated APIs are available in the API Reference.

Fixed

  • Documentation has been updated to align with the API changes.

Important Notes

  • This release contains many API changes. Please refer to the API Reference for details.

Docker Image

  • Image Name: xcenadev/sdk:1.3.0
  • Pull Command:

    docker pull xcenadev/sdk:1.3.0
    

v1.2.0 (2025-05-20)

New

  • PXL Rust: Added Rust binding for PXL, enabling Rust developers to leverage PXL’s capabilities with Rust language support
  • Fail Recovery: Added fail recovery mechanism to handle device failures
  • Jupyter: Added Jupyter notebook support for interactive development of C++ applications using PXL in a Docker + QEMU environment
  • xMapReduce: Added xMapReduce and xkvstore libraries to provide a mapreduce framework to leverage PXL’s capabilities
  • Documentation:
    • Added example documentation for PXL Rust bindings
    • Added API documentation for MU Rust Library
    • Added tutorial documentation for Jupyter notebook
    • Added documentation for xMapReduce framework usage

Changed

  • PXL:
    • pxl::runtime::Parallel is now unified into pxl::runtime::Map for consistency. pxl::runtime::Map can be used as a drop-in replacement for all uses of pxl::runtime::Parallel.
    • pxl::runtime::getNumDevice() is now removed. Use pxl::getNumDevice() instead.
    • pxl::runtime::syncToDevice() and pxl::runtime::syncFromDevice() are now removed. Use pxl::runtime::Context::syncToDevice() and pxl::runtime::Context::syncFromDevice() instead.
    • pxl::runtime::Context::copyToDevice() and pxl::runtime::Context::copyFromDevice() are now supported.
  • QEMU: Reduced CPU usage during idle time

Fixed

  • QEMU: Fixed hang issue during MMIO access in MSI interrupts
  • Documentation: Updated documentation to align with API changes

Important Notes

  • pxl::runtime::Parallel has been removed in this version. All code using pxl::runtime::Parallel must be updated to use the new unified pxl::runtime::Map.

Docker Image

  • Image Name: xcenadev/sdk:1.2.0
  • Pull Command:

    docker pull xcenadev/sdk:1.2.0
    

v1.1.0 (2025-04-04)

New

  • QEMU: Support multiple devices
  • PXL: Added logger and performance tracer
  • Tool: Added pxltop for real time resource monitoring
  • MU Compiler: Support Rust
  • MU Library: Support Rust
  • Document: Added community and troubleshooting page

Changed

  • PXL: Unified interface for map/parallel argument type

Fixed

  • PXL: Resolved stability issues during job executions
  • CLI: Fix bug in device number counting

Important Notes

Docker Image

  • Image Name: xcenadev/sdk:1.1.0
  • Pull Command:

    docker pull xcenadev/sdk:1.1.0
    

v1.0.1 (2025-01-10)

New

  • Initial release of SDK for FPGA environment
  • Example: Add KNN application
  • MU lib: Support atomic operation
  • CLI: Add MU logger
  • Docker: Support Dev container

Changed

  • QEMU: Apply XCENA vendor ID for lspci information
  • Example: Migrate build environment from Makefile to CMake

Fixed

  • QEMU: Fix assert for misaligned CXL memory access
  • PXL: Fix bugs in device library, memory allocator

Deprecated

Important Notes

Docker image

  • Image name : xcenadev/sdk:1.0.1
  • Pull command

      docker pull xcenadev/sdk:1.0.1
    

v1.0.0 (2024-12-15)

New

  • Initial release of SDK emulator environment

Changed

Fixed

Deprecated

Important Notes

Docker image

  • Image name : xcenadev/sdk:1.0.0
  • Pull command

      docker pull xcenadev/sdk:1.0.0