
Commit 046dc85

Dataframe v2: reference docs (#7820)
Add a reference page for the dataframe APIs. It's still very barebones at this point because #7819 makes it very difficult to write snippets for this. But it is literally infinitely better than what's there right now: nothing.

- DNM: requires #7817
- Closes #7828
1 parent 3cc5370 commit 046dc85

3 files changed (+100 lines, -1 line)


docs/content/reference/dataframes.md

Lines changed: 40 additions & 1 deletion
@@ -3,4 +3,43 @@ title: Dataframes
order: 300
---

-Incoming.
+Rerun, at its core, is a database. As such, you can always get your data back in the form of tables (also known as dataframes, or records, or batches...).

This can be achieved in three different ways, depending on your needs:
* using the dataframe API, currently available in [Python](https://ref.rerun.io/docs/python/stable/common/dataframe/) and [Rust](https://docs.rs/rerun/latest/rerun/dataframe/index.html),
* using the [blueprint API](../concepts/blueprint) to configure a [dataframe view](types/views/dataframe_view) from code,
* or simply setting up a [dataframe view](types/views/dataframe_view) manually in the UI.

This page is meant as a reference to get you up and running with these different solutions as quickly as possible.
For an in-depth introduction to the dataframe API and the possible workflows it enables, check out [our Getting Started guide](../getting-started/data-out) or one of the accompanying [How-Tos](../howto/dataframe-api).

> We'll need an RRD file to query. Either use one of yours, or grab some of the example ones, e.g.:
> ```
> curl 'https://app.rerun.io/version/latest/examples/dna.rrd' -o - > /tmp/dna.rrd
> ```

### Using the dataframe API

The following snippet demonstrates how to query the first 10 rows in a Rerun recording:

snippet: reference/dataframe_query

Check out the API reference to learn more about all the ways that data can be searched and filtered; a short, hedged filtering sketch follows the links below:
* [🐍 Python API reference](https://ref.rerun.io/docs/python/stable/common/dataframe/)
* [🐍 Python example](https://github.com/rerun-io/rerun/blob/c00a9f649fd4463f91620e8e2eac11355b245ac5/examples/python/dataframe_query/dataframe_query.py)
* [🦀 Rust API reference](https://docs.rs/crate/rerun/latest)
* [🦀 Rust example](https://github.com/rerun-io/rerun/blob/c00a9f649fd4463f91620e8e2eac11355b245ac5/examples/rust/dataframe_query/src/main.rs)
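
For a quick taste of that filtering, here is a minimal sketch, not part of this commit: the entity path and index name are made up, and `filter_range_sequence` is an assumed helper name, so double-check both against the Python API reference linked above.

```python
import sys

import rerun as rr

recording = rr.dataframe.load_recording(sys.argv[1])

# Hypothetical: restrict the view to one entity subtree, indexed by a sequence timeline.
view = recording.view(index="frame_nr", contents="/world/points/**")

# Assumed helper name: keep only rows whose index falls within [0, 100].
view = view.filter_range_sequence(0, 100)

batches = view.select()
while (row := batches.read_next_batch()) is not None:
    print(row)
```
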
### Using the blueprint API to configure a dataframe view

TODO(cmc): incoming.

Check out the blueprint API reference to learn more about all the ways that data can be searched and filtered; a tentative sketch follows the link below:
* [🐍 Python blueprint API reference](https://ref.rerun.io/docs/python/latest/common/blueprint_apis/)
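
Until that section lands, here is a minimal, unofficial sketch of what configuring such a view from Python might look like. It assumes the blueprint API exposes a `DataframeView` view type (see the reference linked above); the application id and origin are illustrative only.

```python
import rerun as rr
import rerun.blueprint as rrb

rr.init("rerun_example_dataframe_blueprint", spawn=True)

# A blueprint consisting of a single dataframe view rooted at the recording root.
blueprint = rrb.Blueprint(
    rrb.DataframeView(origin="/"),
)

# Send the blueprint to the viewer started by `spawn=True`.
rr.send_blueprint(blueprint)
```
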
### Setting up a dataframe view manually in the UI

TODO(cmc): incoming.
Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
"""Query and display the first 10 rows of a recording."""

import sys

import rerun as rr

path_to_rrd = sys.argv[1]

recording = rr.dataframe.load_recording(path_to_rrd)
view = recording.view(index="log_time", contents="/**")
batches = view.select()

for _ in range(10):
    row = batches.read_next_batch()
    if row is None:
        break
    # Each row is a `RecordBatch`, which can be easily passed around across different data ecosystems.
    print(row)
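
As a possible follow-up to the snippet above (not part of this commit): assuming each batch is a standard pyarrow `RecordBatch`, as the comment suggests, it can be handed straight to other ecosystems, for example pandas (which must be installed separately).

```python
import sys

import rerun as rr

recording = rr.dataframe.load_recording(sys.argv[1])
view = recording.view(index="log_time", contents="/**")

batch = view.select().read_next_batch()
if batch is not None:
    # `to_pandas()` is standard pyarrow.
    print(batch.to_pandas().head())
```
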
Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
//! Query and display the first 10 rows of a recording.

#![allow(clippy::unwrap_used)]

use rerun::{
    dataframe::{QueryCache, QueryEngine, QueryExpression, SparseFillStrategy, Timeline},
    ChunkStore, ChunkStoreConfig, VersionPolicy,
};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let args = std::env::args().collect::<Vec<_>>();

    let path_to_rrd = &args[1];
    let timeline = Timeline::log_time();

    let stores = ChunkStore::from_rrd_filepath(
        &ChunkStoreConfig::DEFAULT,
        path_to_rrd,
        VersionPolicy::Warn,
    )?;
    let (_, store) = stores.first_key_value().unwrap();

    let query_cache = QueryCache::new(store);
    let query_engine = QueryEngine {
        store,
        cache: &query_cache,
    };

    let query = QueryExpression {
        filtered_index: Some(timeline),
        sparse_fill_strategy: SparseFillStrategy::LatestAtGlobal,
        ..Default::default()
    };

    let query_handle = query_engine.query(query.clone());
    for row in query_handle.batch_iter().take(10) {
        // Each row is a `RecordBatch`, which can be easily passed around across different data ecosystems.
        println!("{row}");
    }

    Ok(())
}
