G2CPU version 1.6.2 is now available as a preview release. While this version does not introduce headline end-user features, it represents an important architectural step toward G2CPU 1.7.0. The focus is on advanced synchronization, deeper integration capabilities, and further reductions in runtime latency, which matter most for responsive and high-performance applications.
Direct Access to CUDA and OpenCL Pointers from LabVIEW
A major usability improvement in 1.6.2 is direct access to CUDA and OpenCL pointers via LabVIEW property nodes. This allows developers to retrieve native GPU pointers from G2CPU objects and pass them directly into custom kernels or third-party GPU libraries.
This significantly lowers the barrier to integrating bespoke GPU code while maintaining a clean, idiomatic LabVIEW workflow and avoiding unnecessary abstractions or data copies.
Advanced Synchronization with Event Stream APIs
G2CPU 1.6.2 introduces a new event stream API for both CUDA and OpenCL, enabling advanced synchronization scenarios. Developers can now explicitly coordinate GPU execution, memory transfers, and host-side logic using event-based mechanisms.
This is particularly valuable for complex pipelines and for applications that combine G2CPU functionality with custom GPU kernels. These capabilities also form an important foundation for the more advanced execution models planned in G2CPU 1.7.0.
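The G2CPU event stream API is exposed through LabVIEW and is not shown in this post, so the following is only a host-side analogy in Python. It assumes that `threading.Event` can stand in for a GPU event and that a dictionary can stand in for device memory. The function and worker names are hypothetical. The point it illustrates is that ordering comes from waiting on a specific event rather than from a global barrier.

```python
import threading


def run_pipeline(samples):
    """Order a 'transfer' and a 'compute' worker with events instead of a global barrier."""
    upload_done = threading.Event()  # stands in for an event recorded after a transfer
    kernel_done = threading.Event()  # stands in for an event recorded after a kernel
    device = {}                      # stands in for GPU memory

    def transfer_worker():
        device["input"] = list(samples)  # pretend host-to-device copy
        upload_done.set()                # "record" the event

    def compute_worker():
        upload_done.wait()               # do not touch the data before the copy is done
        device["output"] = [2.0 * x for x in device["input"]]
        kernel_done.set()

    # Start the consumer first to show that ordering comes from the events,
    # not from the order in which the work was launched.
    workers = [threading.Thread(target=compute_worker),
               threading.Thread(target=transfer_worker)]
    for worker in workers:
        worker.start()

    kernel_done.wait()                   # host logic waits only on what it depends on
    for worker in workers:
        worker.join()
    return device["output"]


print(run_pipeline([1.0, 2.0, 3.0]))
```

In a real GPU pipeline the same dependency structure would be expressed with the backend's own event objects. The analogy only captures the dependency graph, not the performance characteristics.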
Explicit Control Over the JIT Compiler Queue
G2CPU relies on Just-In-Time compilation to support multiple GPU backends efficiently. In version 1.6.2, developers gain the ability to explicitly flush the JIT compiler queue.
Flushing lets developers decide when queued JIT compilation work is completed, which is particularly useful when integrating custom kernels or when GPU compilation should finish before entering a steady-state execution phase.
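To make the idea of flushing before steady state concrete, here is a toy model of a deferred compile queue in Python. The `JitQueue` class and its methods are hypothetical and do not represent the G2CPU API. "Compilation" is simulated by calling a builder function. The sketch only shows why an explicit flush moves the compile cost out of the first steady-state call.

```python
class JitQueue:
    """Toy model of a deferred compile queue (hypothetical, not the G2CPU API)."""

    def __init__(self):
        self._pending = {}    # kernel name -> zero-argument builder, not yet run
        self._compiled = {}   # kernel name -> ready-to-call kernel

    def request(self, name, builder):
        # Queue the kernel instead of building it immediately.
        if name not in self._compiled:
            self._pending[name] = builder

    def flush(self):
        # Build everything that is queued and report how many kernels were built.
        built = 0
        for name in list(self._pending):
            self._compiled[name] = self._pending.pop(name)()
            built += 1
        return built

    def run(self, name, value):
        # Lazy path: without an earlier flush, the first call pays the build cost here.
        if name in self._pending:
            self.flush()
        return self._compiled[name](value)


queue = JitQueue()
queue.request("scale", lambda: (lambda x: 2 * x))   # builder simulates compilation
queue.request("offset", lambda: (lambda x: x + 1))

built = queue.flush()                                # warm-up: all builds happen here
outputs = [queue.run("scale", v) for v in (1, 2, 3)] # steady state: no builds left
print(built, outputs)
```

After the explicit `flush()`, every later `run()` call finds its kernel already built, which is the behavior a latency-sensitive loop would want.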
Significant Runtime Latency Reduction
Another major advancement in G2CPU 1.6.2 is a further reduction in runtime latency. Developers can now disable runtime debugging by setting the project variable:
G2CPU_DEBUG = FALSE
For high-throughput applications, overall throughput remains unchanged with our best-in-market Zero Copy technology. However, for applications that process small data packets or require fast responses, the impact is substantial:
- 3× lower latency compared to G2CPU 1.6.1
- 7× lower latency compared to G2CPU 1.6.0
These values were measured with both the G2CPU_MULTI_DEVICE = TRUE and G2CPU_DEBUG = FALSE project variables set.
These improvements apply across Windows, Linux, and LabVIEW Real-Time targets, reinforcing G2CPU’s unique position in GPU-accelerated LabVIEW environments.
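The two latency ratios quoted above can be combined to estimate how much of the gain was already present in 1.6.1. The short calculation below normalizes the 1.6.2 latency to one unit. It assumes the figures for 1.6.0 and 1.6.1 were measured under comparable conditions, which this post does not state explicitly.

```python
from fractions import Fraction

# Ratios quoted in this post, with the 1.6.2 latency normalized to one unit.
latency_162 = Fraction(1)
latency_161 = 3 * latency_162   # 1.6.2 has 3x lower latency than 1.6.1
latency_160 = 7 * latency_162   # 1.6.2 has 7x lower latency than 1.6.0

# Improvement implied between 1.6.0 and 1.6.1 (assumes comparable test setups).
step_160_to_161 = latency_160 / latency_161
step_161_to_162 = latency_161 / latency_162

print(f"1.6.0 -> 1.6.1: {float(step_160_to_161):.2f}x lower latency")
print(f"1.6.1 -> 1.6.2: {float(step_161_to_162):.2f}x lower latency")
```

Under that assumption, the step from 1.6.1 to 1.6.2 is the larger of the two.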
New Example: Efficient GPU-Based Indexing and Reordering
G2CPU 1.6.2 also introduces a new example demonstrating efficient GPU-based indexing and reordering using G2CPU lookup operations.
The example shows how to:
- Index into arrays
- Insert and delete elements
- Reorder data efficiently
- Perform many other lookup-based operations
This pattern is particularly relevant for signal processing pipelines, streaming applications, and data conditioning workloads.
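The operations listed above can all be expressed as a single gather through a lookup table. The sketch below shows the idea on the CPU in plain Python. The `gather` helper is hypothetical and is not a G2CPU function. On a GPU, each output element would typically be computed by its own thread.

```python
def gather(source, lookup):
    """Return a new list whose i-th element is source[lookup[i]]."""
    return [source[i] for i in lookup]


data = [10, 20, 30, 40, 50]

# Index: pick arbitrary elements.
picked = gather(data, [4, 0, 2])

# Reorder: a permutation used as the lookup table (here: reverse order).
reordered = gather(data, [4, 3, 2, 1, 0])

# Delete: a lookup table that skips index 1.
deleted = gather(data, [0, 2, 3, 4])

# Insert: append the new value to the source, then point the lookup at it.
extended = data + [99]                        # 99 lives at index 5
inserted = gather(extended, [0, 1, 5, 2, 3, 4])

print(picked, reordered, deleted, inserted)
```

Because indexing, deletion, insertion, and reordering all reduce to building a different lookup table, one gather routine can serve all four cases.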
Preparing for G2CPU 1.7.0
G2CPU 1.6.2 is best viewed as an enabling release. The new event stream APIs, pointer access mechanisms, JIT control, and latency improvements collectively prepare the ground for G2CPU 1.7.0, where deeper customization and more advanced GPU integration will become first-class capabilities.
Feedback from preview users is highly valuable and directly informs the direction of upcoming releases.







