Convert Arrow to Parquet

Trusted by over 10,000 users every month

Convert Arrow to Parquet online

With our online Arrow to Parquet converter, you can convert your files without downloading any software or writing code. Unlike other services, you can also make graphs from your converted data or perform analysis on it. Just use the navigation on the left-hand side.

Convert Arrow to Parquet online

Works with large Arrow files that have millions of rows

View your converted Parquet before downloading it

Arrow

Apache Arrow (.arrow) is a format that was designed for storing tabular data in memory (RAM). It was not designed for storing data as files on disk.

Arrow enforces a schema, which means that every value in a column must have the same type.

Arrow was designed so that data can be shared between different data analysis systems without needing to be serialised or deserialised.

Arrow is a binary data format, which means that it can be read efficiently by computers but cannot be read by people. The easiest way to view Arrow data is to convert it to CSV first.

It can be best to convert Arrow data to Parquet before saving it to disk. Arrow was not designed for storing data on disk, so it can be inefficient and slow to query.

Parquet

Apache Parquet (.parquet) is a format that was designed for storing tabular data on disk. Its design is based on the format described in Google's Dremel paper (Dremel later became BigQuery).

Parquet files store data in a binary format, which means that they can be efficiently read by computers but are difficult for people to read.

Parquet files have a schema, which means that every value in a column must have the same type. The schema makes Parquet files easier to analyse than CSV files and also helps them compress better, so they are smaller on disk.

How to convert Arrow to Parquet

  1. Upload your Arrow file
  2. Your Arrow file will be converted to Parquet
  3. Download your Parquet file
  4. Click the view button to view your file

How to convert Arrow to Parquet in Python

We can convert Arrow to Parquet in Python using pandas or DuckDB

How to convert Arrow to Parquet using pandas

First, we need to install pandas, along with pyarrow, which pandas uses to read Arrow files and write Parquet files

pip install pandas pyarrow

Then we can load the Arrow file into a dataframe

import pandas as pd
df = pd.read_feather('path/to/file.arrow')

Finally, we can export the dataframe to the Parquet format

df.to_parquet('path/to/file.parquet', index=False)

How to convert Arrow to Parquet using DuckDB

First, we need to install duckdb for Python

pip install duckdb

The following DuckDB query will read an Arrow file and output a Parquet file

import duckdb
duckdb.sql("""COPY (select * from 'path/to/file.arrow') TO 'path/to/file.parquet' (FORMAT 'parquet')""")

Supercharge your data exploration

Open CSV, Parquet, Arrow, JSON and TSV files straight from your desktop

Work straight from Google Drive

Open CSV, Parquet, Arrow, JSON and TSV files directly from Drive, Gmail and Classroom by installing the Google Workspace App