core.store
Parquet (and, in future, DuckDB) I/O primitives.
pandas is imported inside read_parquet() and write_parquet() so importing
taskclf.core.store does not eagerly load the full dataframe stack; callers
that only need other modules avoid that cost until parquet I/O runs.
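The deferred-import pattern described above can be sketched as follows. This is a minimal illustration, not the module's actual source: `read_parquet_lazy` is a hypothetical name, and the only library call assumed is pandas' public `pd.read_parquet`.

```python
import sys
from pathlib import Path


def read_parquet_lazy(path: Path):
    """Hypothetical stand-in: pandas is imported on first call, not at module import."""
    # Deferred import: this line runs only when parquet I/O is actually requested,
    # so importing the enclosing module stays cheap.
    import pandas as pd

    return pd.read_parquet(path)


# Defining the function above does not trigger the pandas import by itself;
# this flag only reports whether something else already loaded pandas.
pandas_loaded_at_import = "pandas" in sys.modules
```

The trade-off is that an absent or broken pandas installation surfaces on the first I/O call rather than at import time.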
taskclf.core.store
Parquet I/O primitives for persisting DataFrames.
write_parquet(df, path)
Write df to a parquet file at path atomically.
Writes to a temporary file in the same directory first, then
atomically replaces the target via os.replace(). This
prevents readers from ever seeing a partially written file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| df | DataFrame | DataFrame to persist. | required |
| path | Path | Destination file path (e.g. …) | required |
Returns:
| Type | Description |
|---|---|
| Path | The path that was written, for convenient chaining. |
Source code in src/taskclf/core/store.py
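The write-then-replace behaviour described for write_parquet() can be sketched with the standard library alone. This is an illustration under assumptions, not the code in store.py: `atomic_write` and its `write_fn` callback are hypothetical names, and cleanup on failure is a choice made here rather than something the documentation states.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable


def atomic_write(path: Path, write_fn: Callable[[Path], None]) -> Path:
    """Hypothetical helper: write to a temp file, then swap it into place."""
    path = Path(path)
    # The temp file lives in the target's directory because os.replace is
    # only atomic when source and destination share a filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    os.close(fd)  # write_fn reopens the file by path
    tmp = Path(tmp_name)
    try:
        write_fn(tmp)          # produce the full file contents
        os.replace(tmp, path)  # readers see the old file or the new one, never a partial one
    except BaseException:
        tmp.unlink(missing_ok=True)  # assumed cleanup: leave no stray temp file
        raise
    return path
```

With pandas available, a call such as `atomic_write(Path("out.parquet"), df.to_parquet)` would follow the same pattern, since `DataFrame.to_parquet` accepts a destination path as its first argument.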
read_parquet(path)
Read a parquet file into a DataFrame.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | Path | Path to an existing … | required |
Returns:
| Type | Description |
|---|---|
| DataFrame | The loaded DataFrame. |