Package: arrow 22.0.0

arrow: Integration to 'Apache' 'Arrow'
'Apache' 'Arrow' <https://arrow.apache.org/> is a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware. This package provides an interface to the 'Arrow C++' library.
Downloads:
- arrow_22.0.0.tar.gz (source)
- arrow_22.0.0.zip: r-4.6, r-4.5, r-4.4
- arrow_22.0.0.tgz: r-4.5-x86_64, r-4.5-arm64, r-4.4-x86_64, r-4.4-arm64
- arrow_22.0.0.tar.gz: r-4.6-arm64, r-4.6-x86_64, r-4.5-arm64, r-4.5-x86_64
Documentation: arrow.pdf | arrow.html | arrow/json (API) | NEWS
Install 'arrow' in R:

```r
install.packages('arrow', repos = c('https://staging.r-multiverse.org', 'https://cloud.r-project.org'))
```
Bug tracker: https://github.com/apache/arrow/issues
Pkgdown/docs site: https://arrow.apache.org
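After installing, the package's exported capability helpers (`arrow_info()` and the `arrow_with_*()` family listed under Exports below) report which optional features the local build includes. A minimal sketch of an interactive session; the printed values depend on how your binary was built:

```r
library(arrow)

arrow_info()          # versions, compiled-in capabilities, memory pool in use
arrow_with_parquet()  # TRUE if Parquet support was compiled in
arrow_with_s3()       # TRUE if the S3 filesystem is available
arrow_with_acero()    # TRUE if the Acero query engine (dplyr backend) is available
```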
System libraries:
- bzip2 – High-quality block-sorting file compressor library
- brotli – Library implementing Brotli encoder and decoder
- zlib – Compression library
- lz4 – Fast LZ compression algorithm library
- libzstd – Fast lossless compression algorithm
- curl – Easy-to-use client-side URL transfer library
- openssl – Secure Sockets Layer toolkit
- c++ – GNU Standard C++ Library v3
 
Last updated from: 5aeb5f217f (on apache-arrow-22.0.0). Checks: 8 OK, 4 NOTE, 1 FAIL. Indexed: no.
| Target | Result | Total time (s) | Artifact | 
|---|---|---|---|
| linux-devel-arm64 | OK | 1709 | |
| linux-devel-x86_64 | OK | 1750 | |
| source / vignettes | OK | 1364 | |
| linux-release-arm64 | OK | 1355 | |
| linux-release-x86_64 | OK | 1366 | |
| macos-release-arm64 | OK | 475 | |
| macos-release-x86_64 | OK | 542 | |
| macos-oldrel-arm64 | NOTE | 329 | |
| macos-oldrel-x86_64 | NOTE | 1020 | |
| windows-devel | NOTE | 1499 | |
| windows-release | OK | 753 | |
| windows-oldrel | NOTE | 765 | |
| wasm-release | FAIL | 127 | |
Exports: all_of, Array, arrow_array, arrow_available, arrow_info, arrow_table, arrow_with_acero, arrow_with_dataset, arrow_with_gcs, arrow_with_json, arrow_with_parquet, arrow_with_s3, arrow_with_substrait, as_arrow_array, as_arrow_table, as_chunked_array, as_data_type, as_record_batch, as_record_batch_reader, as_schema, binary, bool, boolean, buffer, Buffer, BufferOutputStream, BufferReader, call_function, cast_options, chunked_array, ChunkedArray, Codec, codec_is_available, CompressedInputStream, CompressedOutputStream, CompressionType, concat_arrays, concat_tables, contains, copy_files, cpu_count, create_package_with_all_dependencies, csv_convert_options, csv_parse_options, csv_read_options, csv_write_options, CsvConvertOptions, CsvFileFormat, CsvFragmentScanOptions, CsvParseOptions, CsvReadOptions, CsvTableReader, CsvWriteOptions, Dataset, dataset_factory, DatasetFactory, date32, date64, DateUnit, decimal, decimal128, decimal256, decimal32, decimal64, default_memory_pool, dictionary, DictionaryArray, DirectoryPartitioning, DirectoryPartitioningFactory, duration, ends_with, everything, Expression, ExtensionArray, ExtensionType, FeatherReader, field, Field, FileFormat, FileInfo, FileMode, FileOutputStream, FileSelector, FileSystem, FileSystemDataset, FileSystemDatasetFactory, FileType, fixed_size_binary, fixed_size_list_of, FixedSizeListArray, FixedSizeListType, flight_connect, flight_disconnect, flight_get, flight_path_exists, flight_put, float, float16, float32, float64, FragmentScanOptions, GcsFileSystem, gs_bucket, halffloat, hive_partition, HivePartitioning, HivePartitioningFactory, infer_schema, infer_type, InMemoryDataset, install_arrow, install_pyarrow, int16, int32, int64, int8, io_thread_count, IpcFileFormat, is_in, JoinType, JsonFileFormat, JsonFragmentScanOptions, JsonParseOptions, JsonReadOptions, JsonTableReader, large_binary, large_list_of, large_utf8, LargeListArray, last_col, list_compute_functions, list_flights, list_of, ListArray, load_flight_server, LocalFileSystem, map_batches, map_of, MapArray, MapType, match_arrow, matches, MemoryMappedFile, MessageReader, MessageType, MetadataVersion, mmap_create, mmap_open, new_extension_array, new_extension_type, null, NullEncodingBehavior, NullHandlingBehavior, num_range, one_of, open_csv_dataset, open_dataset, open_delim_dataset, open_tsv_dataset, ParquetArrowReaderProperties, ParquetFileFormat, ParquetFileReader, ParquetFileWriter, ParquetFragmentScanOptions, ParquetReaderProperties, ParquetVersionType, ParquetWriterProperties, Partitioning, QuantileInterpolation, RandomAccessFile, read_csv_arrow, read_csv2_arrow, read_delim_arrow, read_feather, read_ipc_file, read_ipc_stream, read_json_arrow, read_message, read_parquet, read_schema, read_tsv_arrow, ReadableFile, record_batch, RecordBatch, RecordBatchFileReader, RecordBatchFileWriter, RecordBatchReader, RecordBatchStreamReader, RecordBatchStreamWriter, register_extension_type, register_scalar_function, reregister_extension_type, RoundMode, s3_bucket, S3FileSystem, scalar, Scalar, Scanner, ScannerBuilder, schema, Schema, set_cpu_count, set_io_thread_count, show_exec_plan, starts_with, StatusCode, string, struct, StructArray, StructScalar, SubTreeFileSystem, Table, time32, time64, timestamp, TimestampParser, TimeUnit, to_arrow, to_duckdb, type, Type, uint16, uint32, uint64, uint8, unify_schemas, UnionDataset, unregister_extension_type, utf8, value_counts, vctrs_extension_array, vctrs_extension_type, write_csv_arrow, write_csv_dataset, write_dataset, write_delim_dataset, write_feather, write_ipc_file, write_ipc_stream, write_parquet, write_to_raw, write_tsv_dataset
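A few of the core exports above (`arrow_table`, `write_parquet`, `read_parquet`) cover the most common workflow: building an in-memory Arrow Table and round-tripping it through a Parquet file. A minimal sketch; the temp-file path is illustrative:

```r
library(arrow)

# Build an in-memory Arrow Table from R vectors.
tbl <- arrow_table(x = 1:3, y = c("a", "b", "c"))

# Round-trip through a Parquet file.
tmp <- tempfile(fileext = ".parquet")
write_parquet(tbl, tmp)
read_parquet(tmp)  # by default returns the data as a data frame
```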
Dependencies: assertthat, bit, bit64, cli, cpp11, glue, lifecycle, magrittr, purrr, R6, rlang, tidyselect, vctrs, withr
Readme and manuals
Help Manual
| Help page | Topics | 
|---|---|
| Functions available in Arrow dplyr queries | acero arrow-dplyr arrow-functions arrow-verbs | 
| Array Classes | Array DictionaryArray FixedSizeListArray LargeListArray ListArray MapArray StructArray | 
| ArrayData class | ArrayData | 
| Create an Arrow Array | arrow_array | 
| Report information on the package's capabilities | arrow_available arrow_info arrow_with_acero arrow_with_dataset arrow_with_gcs arrow_with_json arrow_with_parquet arrow_with_s3 arrow_with_substrait | 
| Create an Arrow Table | arrow_table | 
| Convert an object to an Arrow Array | as_arrow_array as_arrow_array.Array as_arrow_array.ChunkedArray as_arrow_array.Scalar | 
| Convert an object to an Arrow Table | as_arrow_table as_arrow_table.arrow_dplyr_query as_arrow_table.data.frame as_arrow_table.Dataset as_arrow_table.default as_arrow_table.RecordBatch as_arrow_table.RecordBatchReader as_arrow_table.Schema as_arrow_table.Table | 
| Convert an object to an Arrow ChunkedArray | as_chunked_array as_chunked_array.Array as_chunked_array.ChunkedArray | 
| Convert an object to an Arrow DataType | as_data_type as_data_type.DataType as_data_type.Field as_data_type.Schema | 
| Convert an object to an Arrow RecordBatch | as_record_batch as_record_batch.arrow_dplyr_query as_record_batch.data.frame as_record_batch.RecordBatch as_record_batch.Table | 
| Convert an object to an Arrow RecordBatchReader | as_record_batch_reader as_record_batch_reader.arrow_dplyr_query as_record_batch_reader.data.frame as_record_batch_reader.Dataset as_record_batch_reader.function as_record_batch_reader.RecordBatch as_record_batch_reader.RecordBatchReader as_record_batch_reader.Scanner as_record_batch_reader.Table | 
| Convert an object to an Arrow Schema | as_schema as_schema.Schema as_schema.StructType | 
| Create a Buffer | buffer | 
| Buffer class | Buffer | 
| Call an Arrow compute function | call_function | 
| Create a Chunked Array | chunked_array | 
| ChunkedArray class | ChunkedArray | 
| Compression Codec class | Codec | 
| Check whether a compression codec is available | codec_is_available | 
| Compressed stream classes | CompressedInputStream CompressedOutputStream compression | 
| Concatenate zero or more Arrays | c.Array concat_arrays | 
| Concatenate one or more Tables | concat_tables | 
| Copy files between FileSystems | copy_files | 
| Manage the global CPU thread pool in libarrow | cpu_count set_cpu_count | 
| Create a source bundle that includes all thirdparty dependencies | create_package_with_all_dependencies | 
| CSV Convert Options | csv_convert_options | 
| CSV Parsing Options | csv_parse_options | 
| CSV Reading Options | csv_read_options | 
| CSV Writing Options | csv_write_options | 
| CSV dataset file format | CsvFileFormat | 
| File reader options | CsvConvertOptions CsvParseOptions CsvReadOptions CsvWriteOptions JsonParseOptions JsonReadOptions TimestampParser | 
| Arrow CSV and JSON table reader classes | CsvTableReader JsonTableReader | 
| Create Arrow data types | binary bool boolean data-type date32 date64 decimal decimal128 decimal256 decimal32 decimal64 duration FixedSizeListType fixed_size_binary fixed_size_list_of float float16 float32 float64 halffloat int16 int32 int64 int8 large_binary large_list_of large_utf8 list_of MapType map_of null string struct time32 time64 timestamp uint16 uint32 uint64 uint8 utf8 | 
| Multi-file datasets | Dataset DatasetFactory FileSystemDataset FileSystemDatasetFactory InMemoryDataset UnionDataset | 
| Create a DatasetFactory | dataset_factory | 
| DataType class | DataType | 
| Create a dictionary type | dictionary | 
| class DictionaryType | DictionaryType | 
| Arrow expressions | Expression | 
| ExtensionArray class | ExtensionArray | 
| ExtensionType class | ExtensionType | 
| FeatherReader class | FeatherReader | 
| Create a Field | field | 
| Field class | Field | 
| Dataset file formats | FileFormat IpcFileFormat ParquetFileFormat | 
| FileSystem entry info | FileInfo | 
| File selector | FileSelector | 
| FileSystem classes | FileSystem GcsFileSystem LocalFileSystem S3FileSystem SubTreeFileSystem | 
| Format-specific write options | FileWriteOptions | 
| FixedWidthType class | FixedWidthType | 
| Connect to a Flight server | flight_connect | 
| Explicitly close a Flight client | flight_disconnect | 
| Get data from a Flight server | flight_get | 
| Send data to a Flight server | flight_put | 
| Format-specific scan options | CsvFragmentScanOptions FragmentScanOptions JsonFragmentScanOptions ParquetFragmentScanOptions | 
| Connect to a Google Cloud Storage (GCS) bucket | gs_bucket | 
| Construct Hive partitioning | hive_partition | 
| Extract a schema from an object | infer_schema | 
| Infer the arrow Array type from an R object | infer_type type | 
| InputStream classes | BufferReader InputStream MemoryMappedFile RandomAccessFile ReadableFile | 
| Install or upgrade the Arrow library | install_arrow | 
| Install pyarrow for use with reticulate | install_pyarrow | 
| Manage the global I/O thread pool in libarrow | io_thread_count set_io_thread_count | 
| JSON dataset file format | JsonFileFormat | 
| List available Arrow C++ compute functions | list_compute_functions | 
| See available resources on a Flight server | flight_path_exists list_flights | 
| Load a Python Flight server | load_flight_server | 
| Apply a function to a stream of RecordBatches | map_batches | 
| Value matching for Arrow objects | is_in match_arrow | 
| Message class | Message | 
| MessageReader class | MessageReader | 
| Create a new read/write memory mapped file of a given size | mmap_create | 
| Open a memory mapped file | mmap_open | 
| Extension types | new_extension_array new_extension_type register_extension_type reregister_extension_type unregister_extension_type | 
| Open a multi-file dataset | open_dataset | 
| Open a multi-file dataset of CSV or other delimiter-separated format | open_csv_dataset open_delim_dataset open_tsv_dataset | 
| OutputStream classes | BufferOutputStream FileOutputStream OutputStream | 
| ParquetArrowReaderProperties class | ParquetArrowReaderProperties | 
| ParquetFileReader class | ParquetFileReader | 
| ParquetFileWriter class | ParquetFileWriter | 
| ParquetReaderProperties class | ParquetReaderProperties | 
| ParquetWriterProperties class | ParquetWriterProperties | 
| Define Partitioning for a Dataset | DirectoryPartitioning DirectoryPartitioningFactory HivePartitioning HivePartitioningFactory Partitioning | 
| Read a CSV or other delimited file with Arrow | read_csv2_arrow read_csv_arrow read_delim_arrow read_tsv_arrow | 
| Read a Feather file (an Arrow IPC file) | read_feather read_ipc_file | 
| Read Arrow IPC stream format | read_ipc_stream | 
| Read a JSON file | read_json_arrow | 
| Read a Message from a stream | read_message | 
| Read a Parquet file | read_parquet | 
| Read a Schema from a stream | read_schema | 
| Create a RecordBatch | record_batch | 
| RecordBatch class | RecordBatch | 
| RecordBatchReader classes | RecordBatchFileReader RecordBatchReader RecordBatchStreamReader | 
| RecordBatchWriter classes | RecordBatchFileWriter RecordBatchStreamWriter RecordBatchWriter | 
| Register user-defined functions | register_scalar_function | 
| Connect to an AWS S3 bucket | s3_bucket | 
| Create an Arrow Scalar | scalar StructScalar | 
| Arrow scalars | Scalar | 
| Scan the contents of a dataset | Scanner ScannerBuilder | 
| Create a schema or extract one from an object | schema | 
| Schema class | Schema | 
| Show the details of an Arrow Execution Plan | show_exec_plan | 
| Table class | Table | 
| Create an Arrow object from a DuckDB connection | to_arrow | 
| Create a (virtual) DuckDB table from an Arrow object | to_duckdb | 
| Combine and harmonize schemas | unify_schemas | 
| 'table' for Arrow objects | value_counts | 
| Extension type for generic typed vectors | vctrs_extension_array vctrs_extension_type | 
| Write CSV file to disk | write_csv_arrow | 
| Write a dataset | write_dataset | 
| Write a dataset into partitioned flat files | write_csv_dataset write_delim_dataset write_tsv_dataset | 
| Write a Feather file (an Arrow IPC file) | write_feather write_ipc_file | 
| Write Arrow IPC stream format | write_ipc_stream | 
| Write Parquet file to disk | write_parquet | 
| Write Arrow data to a raw vector | write_to_raw | 
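The multi-file dataset entries above (`write_dataset`, `open_dataset`, and the dplyr verbs from "Functions available in Arrow dplyr queries") compose into a larger-than-memory workflow: write partitioned Parquet, then query it lazily. A minimal sketch; the directory path and the use of `mtcars` are illustrative:

```r
library(arrow)
library(dplyr)

# Write mtcars as a Parquet dataset partitioned by the cyl column.
dir <- tempfile()
write_dataset(mtcars, dir, partitioning = "cyl")

# Lazily open the dataset; filter/select are pushed down to Arrow,
# and nothing is read into R until collect().
open_dataset(dir) |>
  filter(mpg > 25) |>
  select(mpg, cyl, hp) |>
  collect()
```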
