arrow

Allows reading and writing the Apache Arrow inter-process communication (IPC) format, both from files and directly from stream buffers.

Maintainers: paleolimbot, pdet

Installing and Loading

INSTALL nanoarrow FROM community;
LOAD nanoarrow;

Example

-- Read from a file in Arrow IPC format
FROM 'arrow_file.arrow';
FROM 'arrow_file.arrows';
FROM read_arrow('arrow_file.arrow');

-- Write a file in Arrow IPC stream format
CREATE TABLE arrow_libraries AS SELECT 'nanoarrow' as name, '0.6' as version;
COPY arrow_libraries TO 'test.arrows' (FORMAT ARROWS, BATCH_SIZE 100);

-- Write to buffers: This returns IPC message BLOBs and indicates which one is the header.
FROM to_arrow_ipc((FROM arrow_libraries));

About nanoarrow

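The Arrow IPC library allows users to read and write data in the Arrow IPC stream format. This can be done either by reading and producing .arrow files, or by reading buffers directly through their pointers and sizes. Note that reading buffers is dangerous, because an incorrect pointer can crash the database system. This buffer-based process is temporary and will be deprecated in the future, since clients (for example, the Python DuckDB client) will have a function that extracts these buffers from an Arrow stream internally.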
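
The file-based path described above can be sketched as a write-then-read round trip. This is a minimal sketch that reuses only the statements already shown on this page (COPY with FORMAT ARROWS, and read_arrow). It assumes the extension installs from the community repository, that the working directory is writable, and that read_arrow accepts the .arrows file produced by COPY. The file name test.arrows is only illustrative.

```sql
-- Assumption: the extension is available from the community repository.
INSTALL nanoarrow FROM community;
LOAD nanoarrow;

-- Build a one-row table to export.
CREATE OR REPLACE TABLE arrow_libraries AS
    SELECT 'nanoarrow' AS name, '0.6' AS version;

-- Write the table as an Arrow IPC stream file.
COPY arrow_libraries TO 'test.arrows' (FORMAT ARROWS, BATCH_SIZE 100);

-- Read the file back; assuming the write succeeded, this should
-- return the row that was written above.
SELECT name, version FROM read_arrow('test.arrows');
```

Going through a file avoids handling raw pointers altogether, which is why it is the safer of the two routes the paragraph describes.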