Skip to content

Add DatabaseClient support #13

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Oct 31, 2023
Merged

Add DatabaseClient support #13

merged 13 commits into from
Oct 31, 2023

Conversation

wiltsecarpenter
Copy link
Contributor

@wiltsecarpenter wiltsecarpenter commented Oct 18, 2023

This PR adds support for the DatabaseClient API using the Observable database proxy service. This allows notebook pages to use databases that are defined and running via the observable-database-proxy command. The DatabaseClient function implemented here provides a subset of the functions found in the Observable Database Client specification,

  • db.query(SELECT statement, [params...]) - SQL query function style
  • db.sql`SELECT statement ${param}` - Tagged template literal style

Database configuration is read from the ~/.observablehq configuration file as defined by the observable-database-proxy command. The configuration includes the server address for the database proxy and a shared secret value used as an authorization header bearer token. The configuration for referenced databases is included in the generated page for preview and build of CLI projects.

Example

Source:

# Database test

```js
const db = DatabaseClient("dwh-local-proxy");
```

Database name is ${db.name}.

## query example
```js
Inputs.table(result)
```

```js
let result = await db.sql`SELECT * FROM fct_user_events LIMIT ${limit}`;
```

```js
let limit = 10;
```

## tagged template literal example

There are ${(await db.sql`SELECT COUNT(*) AS win64 FROM fct_user_events WHERE BROWSER_ID LIKE '%Win64%'`)[0].WIN64} Win64 user agent rows

Output:

image

Note

Using this feature requires un-submitted enhancements to the observable-notebook-proxy command.

Copy link
Member

@mbostock mbostock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat! Thanks for taking this on. 🙏

Let’s chat about the security model for this tomorrow. In particular, I’d like to explore whether we can continue to use static analysis to tighten this down, i.e., can we prevent a notebook from being able to successfully make a request to /_database/<name>/token if that notebook doesn’t reference that DatabaseClient statically?

One way would be to transpile references to DatabaseClient so that they include a secure token, e.g., transpile DatabaseClient("foo") to DatabaseClient("foo", "<initial-token>"). That initial token could be used to make requests to /_database/<name>/token, and subsequently to /query. Imported/dynamic code wouldn’t be able to guess the initial token, and therefore wouldn’t be able to initiate a database query.

@wiltsecarpenter wiltsecarpenter force-pushed the wiltse/database branch 2 times, most recently from 439f08b to 0efed9d Compare October 18, 2023 17:02
@mootari
Copy link
Member

mootari commented Oct 23, 2023

Imported/dynamic code wouldn’t be able to guess the initial token

What would prevent code from parsing the inlined token from the document source and initiating a separate connection?

@mbostock
Copy link
Member

What would prevent code from parsing the inlined token from the document source and initiating a separate connection?

Nothing, if you’re on a page that statically references a DatabaseClient. But the token wouldn’t be present in the source if the page doesn’t statically reference a DatabaseClient.

@mbostock
Copy link
Member

#46

@wiltsecarpenter
Copy link
Contributor Author

After initial discussion, I re-implemented the token passing strategy to include the tokens in the rendered HTML instead of making a separate request for them. This is implemented using a general cell "resolver" that can be enhanced to support other types of tokens, like Secrets, and to also support other methods of passing in configuration values like command line flags and environment variables.

@cinxmo
Copy link
Contributor

cinxmo commented Oct 24, 2023

currently yarn global add @observablehq/database-proxy doesn't work for our team because the @observablehq package points to our Github registry instead of the default NPM registry as part of the bi setup

this.#token = token;
}
async query(sql, params, {signal} = {}) {
const response = await fetch(`${this.#token.url}query`, {
Copy link
Contributor

@cinxmo cinxmo Oct 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this supposed to call the /query endpoint and if so is #token.url guaranteed to end with /?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the token.url is created within makeCLIResolver in resolver.ts and will end with a /.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should first normalize the url and then append "/query" to it to avoid this confusion. When I tested the current database proxy workflow, the url stored in ~/.observablehq doesn't end with a "/". I'm missing something here...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also what are the additional changes that need to be made to @observablehq/database-proxy?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, changed to use URL here too. Note that the token.url has already gone through normalization in the makeCLIResolver, it is dervied from the host and port values ~/.observablehq.

src/build.ts Outdated
root: sourceRoot,
path,
pages,
resolver: await makeCLIResolver()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest lifting this up, before this for loop, so we can share it across pages rather than creating separate resolvers for each.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

return `${Buffer.from(data).toString("base64") + "." + Buffer.from(hmac).toString("base64")}`;
}

export async function makeCLIResolver(): Promise<CellResolver> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we call this makeCellResolver instead of makeCLIResolver?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can name it that if we want to, but the point of "CLI" here is to differentiate it from what will be needed in the platform where we will need a "makePlatformResolver".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The “makePlatformResolver” wouldn’t live in this codebase, presumably?

Copy link
Member

@mbostock mbostock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems very close to me. My biggest questions are:

  • What changes are needed in the self-hosted database proxy? We should review those concurrently.
  • I’d like to have a <link rel=modulepreload> for database.js if a page references a DatabaseClient, so that the page loads faster (rather than waiting until the first time a DatabaseClient is referenced). That should be pretty easy to add.

@wiltsecarpenter
Copy link
Contributor Author

See observablehq/database-proxy#85 for changes to the database-proxy that enable the CLI preview to be used with it. More changes will likely be needed.

Copy link
Contributor

@cinxmo cinxmo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with one nit. I think to have <link rel=modulepreload> for database.js, we'd need to concat the output of parseResult.databases here: https://github.com/observablehq/cli/blob/4343c4f283eace660b5716f02a22b53d6b8fa783/src/render.ts#L65-L68. Or we can add that logic to getImportMap but that seems to be for npm packages as opposed to local modules

@wiltsecarpenter
Copy link
Contributor Author

I added the module preload for database.js

@wiltsecarpenter wiltsecarpenter merged commit 1bd179c into main Oct 31, 2023
@wiltsecarpenter wiltsecarpenter deleted the wiltse/database branch October 31, 2023 16:36
url: string;
}

const configFile = join(homedir(), ".observablehq");
Copy link
Member

@mbostock mbostock Oct 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want this config file to be scoped per-project rather than in the user’s home directory. Presumably you might want different databases per project.

url.protocol = db.ssl !== "disabled" ? "https:" : "http:";
url.host = db.host;
url.port = String(db.port);
url.toString();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line should be removed.

for (const item of diff) {
if (item.type === "add") {
for (const addItem of item.items) {
if (addItem.type === "cell" && "databases" in addItem) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we remove the "databases" in addItem check here, and instead leave it to the resolver to decide whether any resolution is necessary?

@mbostock mbostock mentioned this pull request Oct 31, 2023
@cinxmo cinxmo mentioned this pull request Nov 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants