# Getting started documentation #82

150 changes: 126 additions & 24 deletions README.md
@@ -1,63 +1,159 @@
# Data APIs for Apache Cassandra

Easy to use APIs for accessing data stored in Apache Cassandra.

These APIs can be used as a standalone server using either Docker or manually
running a server. They can also be embedded in existing applications using HTTP
routes.

Currently, this project provides GraphQL APIs. Other API types are possible in
the future.
## Getting Started

### Installation

```sh
docker pull datastaxlabs/cassandra-data-apis
docker run --rm -d -p 8080:8080 -e DATA_API_HOSTS=<cassandra_hosts_here> datastaxlabs/cassandra-data-apis
```
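
Once the container is running, you can smoke-test the endpoint (a sketch; it
assumes the default port `8080` and path `/graphql` listed under Settings, and
the `keyspaces` query is illustrative, so check the GraphQL documentation for
the actual schema):

```sh
# POST a minimal GraphQL query to the default endpoint
curl -s -X POST http://localhost:8080/graphql \
  -H "Content-Type: application/json" \
  -d '{"query":"{ keyspaces { name } }"}'
```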

You can also manually build the Docker image and/or the server using the
[instructions](#building) below.


### Using GraphQL

By default, a GraphQL endpoint is started. See the [GraphQL
documentation](/docs/graphql/README.md) to get started.

## Configuration

Configuration for Docker can be done using either environment variables, a
mounted configuration file, or both.

Additional settings can be passed as environment variables on the `docker run`
command:

```sh
docker run -e DATA_API_HOSTS=127.0.0.1 -e DATA_API_KEYSPACE=example ...
```

### Using a configuration file

To use a configuration file, create a file with the following contents:

```yaml
hosts:
  # Change to your cluster's hosts
  - 127.0.0.1
# keyspace: example
# username: cassandra
# password: cassandra

# See the "Settings" section for additional configuration

# Add your configuration here
```

Then start the container with:

```sh
docker run -p 8080:8080 -v "${PWD}/<your_config_file>.yaml:/root/config.yaml" datastaxlabs/cassandra-data-apis
```
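
A mounted configuration file and environment variables can also be combined; in
that case the environment variables take precedence (a sketch, reusing the
image and mount from above):

```sh
# Mount a config file, but override its keyspace setting from the environment
docker run -p 8080:8080 \
  -v "${PWD}/<your_config_file>.yaml:/root/config.yaml" \
  -e DATA_API_KEYSPACE=example \
  datastaxlabs/cassandra-data-apis
```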

### Settings

| Name | Type | Env. Variable | Description |
| --- | --- | --- | --- |
| hosts | strings | DATA_API_HOSTS | Hosts for connecting to the database |
| keyspace | string | DATA_API_KEYSPACE | Only allow access to a single keyspace |
| excluded-keyspaces | strings | DATA_API_EXCLUDED_KEYSPACES | Keyspaces to exclude from the endpoint |

| username | string | DATA_API_USERNAME | Database user to connect with |
| password | string | DATA_API_PASSWORD | Database user's password |
| operations | strings | DATA_API_OPERATIONS | A list of supported schema management operations. See below. (default `"TableCreate, KeyspaceCreate"`) |
| request-logging | bool | DATA_API_REQUEST_LOGGING | Enable request logging |
| schema-update-interval | duration | DATA_API_SCHEMA_UPDATE_INTERVAL | Interval used to update the GraphQL schema (default `10s`) |
| start-graphql | bool | DATA_API_START_GRAPHQL | Start the GraphQL endpoint (default `true`) |
| graphql-path | string | DATA_API_GRAPHQL_PATH | GraphQL endpoint path (default `"/graphql"`) |
| graphql-port | int | DATA_API_GRAPHQL_PORT | GraphQL endpoint port (default `8080`) |
| graphql-schema-path | string | DATA_API_GRAPHQL_SCHEMA_PATH | GraphQL schema management path (default `"/graphql-schema"`) |

> **Review comment:** Is it obvious how these keyspaces should be formatted in lists for environment variables, vs. config file? Same comment for "hosts".
>
> **Author reply:** Good suggestion. I've added some more explanation for that type (w/ examples) in the Settings section.

#### Configuration Types

The `strings` type expects a comma-delimited list (e.g. `127.0.0.1, 127.0.0.2,
127.0.0.3`) when using environment variables or a command flag, and an array
when using a configuration file.
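
Environment variable (using the Docker image from the Installation section):

```sh
docker run -p 8080:8080 -e DATA_API_HOSTS="127.0.0.1, 127.0.0.2, 127.0.0.3" datastaxlabs/cassandra-data-apis
```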

YAML:

```yaml
---
hosts:
  - "127.0.0.1"
  - "127.0.0.2"
  - "127.0.0.3"
```

JSON:
```json
{
"hosts": ["127.0.0.1", "127.0.0.2", "127.0.0.3"]
}
```
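
Command flag (a sketch; `run.exe` is the standalone server binary built in the
[Building](#building) section):

```sh
./run.exe --hosts "127.0.0.1, 127.0.0.2, 127.0.0.3"
```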

#### Schema Management Operations

| Operation | Allows |
| --- | --- |
| `TableCreate` | Creation of tables |
| `TableDrop` | Removal of tables |
| `TableAlterAdd` | Addition of table columns |
| `TableAlterDrop` | Removal of table columns |
| `KeyspaceCreate` | Creation of keyspaces |
| `KeyspaceDrop` | Removal of keyspaces |
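
For example, to additionally allow dropping tables (a sketch using the Docker
image):

```sh
docker run -p 8080:8080 \
  -e DATA_API_HOSTS=127.0.0.1 \
  -e DATA_API_OPERATIONS="TableCreate, TableDrop, KeyspaceCreate" \
  datastaxlabs/cassandra-data-apis
```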

## Building

This section is mostly for developers. Using the pre-built Docker image is
recommended.

### Building the Docker Image

```bash
cd <path_to_data-apis>/cassandra-data-apis
docker build -t cassandra-data-apis .
```

### Run locally with a single-node Cassandra cluster

```bash
cd <path_to_data-apis>/cassandra-data-apis
docker build -t cassandra-data-apis .

# On Linux (with a cluster started on the docker bridge: 172.17.0.1)
docker run -p 8080:8080 -e "DATA_API_HOSTS=172.17.0.1" cassandra-data-apis

# With a cluster bound to 0.0.0.0
docker run --network host -e "DATA_API_HOSTS=127.0.0.1" cassandra-data-apis

# On macOS (with a cluster bound to 0.0.0.0)
docker run -p 8080:8080 -e "DATA_API_HOSTS=host.docker.internal" cassandra-data-apis
```

These host values can also be used in the configuration file approach used in
the previous section.

### Build and run as a standalone webserver

If you want to run this module as a standalone webserver, use:

```bash
# Build and start the webserver
go build -o run.exe && ./run.exe --hosts 127.0.0.1 --keyspace store
```

Your settings can be persisted using a configuration file:

```yaml
hosts:
  - 127.0.0.1
# ...
```

To start the server using a configuration file, use:

```bash
./run.exe --config <your_config_file>.yaml
```

Settings can also be overridden using environment variables prefixed with
`DATA_API_`:

```bash
DATA_API_HOSTS=127.0.0.1 DATA_API_KEYSPACE=store ./run.exe --config <your_config_file>.yaml
```

Note: `--start-rest` is not currently implemented.

### Plug in the routes within your HTTP request router

#### Installation

```sh
go get github.com/datastax/cassandra-data-apis
```

#### Using the API

To add the routes to your existing HTTP request router, use:

```go
cfg := endpoint.NewEndpointConfig("your.first.contact.point", "your.second.contact.point")
// ...
```
57 changes: 46 additions & 11 deletions cmd/server.go
@@ -15,14 +15,15 @@ import (
	log2 "log"
	"net/http"
	"os"
	"strings"
)

const defaultGraphQLPath = "/graphql"
const defaultGraphQLSchemaPath = "/graphql-schema"
const defaultRESTPath = "/todo"

// Environment variables prefixed with "DATA_API_" can override settings e.g. "DATA_API_HOSTS"
const envVarPrefix = "data_api"

var cfgFile string
var logger log.Logger
@@ -41,6 +42,10 @@ var serverCmd = &cobra.Command{
		startGraphQL := viper.GetBool("start-graphql")
		startREST := viper.GetBool("start-rest")

		if startREST {
			return errors.New("REST endpoint is not currently supported")
		}
		if !startGraphQL && !startREST {
			return errors.New("at least one endpoint type should be started")
		}
@@ -57,7 +62,7 @@ var serverCmd = &cobra.Command{
		startREST := viper.GetBool("start-rest")

		if graphqlPort == restPort {
			router := createRouter()
			endpointNames := ""
			if startGraphQL {
				addGraphQLRoutes(router, endpoint)
@@ -70,11 +75,11 @@ var serverCmd = &cobra.Command{
				}
				endpointNames += "REST"
			}
			listenAndServe(maybeAddCORS(maybeAddRequestLogging(router)), graphqlPort, endpointNames)
		} else {
			finish := make(chan bool)
			if startGraphQL {
				router := createRouter()
				addGraphQLRoutes(router, endpoint)
				go listenAndServe(maybeAddRequestLogging(router), graphqlPort, "GraphQL")
			}
@@ -113,17 +118,19 @@ func Execute() {
		"TableCreate",
		"KeyspaceCreate",
	}, "list of supported table and keyspace management operations. options: TableCreate,TableDrop,TableAlterAdd,TableAlterDrop,KeyspaceCreate,KeyspaceDrop")
	flags.String("access-control-allow-origin", "", "Access-Control-Allow-Origin header value")

	// GraphQL specific flags
	flags.Bool("start-graphql", true, "start the GraphQL endpoint")
	flags.String("graphql-path", defaultGraphQLPath, "GraphQL endpoint path")
	flags.String("graphql-schema-path", defaultGraphQLSchemaPath, "GraphQL schema management path")
	flags.Int("graphql-port", 8080, "GraphQL endpoint port")

	// TODO:
	// REST specific flags
	// flags.Bool("start-rest", false, "start the REST endpoint")
	// flags.String("rest-path", defaultRESTPath, "REST endpoint path")
	// flags.Int("rest-port", 8080, "REST endpoint port")

	flags.VisitAll(func(flag *pflag.Flag) {
		if flag.Name != "config" {
@@ -134,6 +141,7 @@ func Execute() {
	cobra.OnInitialize(initialize)

	viper.SetEnvPrefix(envVarPrefix)
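	// Flag names use dashes; map them to underscore-style env var keys so,
	// e.g., "graphql-port" can be set via DATA_API_GRAPHQL_PORT.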
	viper.SetEnvKeyReplacer(strings.NewReplacer("-", "_"))
	viper.AutomaticEnv()

	if err := serverCmd.Execute(); err != nil {
@@ -216,6 +224,16 @@ func maybeAddRequestLogging(handler http.Handler) http.Handler {
	return handler
}

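// maybeAddCORS wraps the handler to set the Access-Control-Allow-Origin header
// on responses when the "access-control-allow-origin" setting is provided.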
func maybeAddCORS(handler http.Handler) http.Handler {
	if value := viper.GetString("access-control-allow-origin"); value != "" {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			w.Header().Set("Access-Control-Allow-Origin", value)
			handler.ServeHTTP(w, r)
		})
	}
	return handler
}

func initialize() {
	if cfgFile != "" {
		viper.SetConfigFile(cfgFile)
@@ -226,6 +244,23 @@ func initialize() {
	}
}

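// createRouter builds the HTTP router; when the "access-control-allow-origin"
// setting is provided, it also answers CORS preflight (OPTIONS) requests.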
func createRouter() *httprouter.Router {
	router := httprouter.New()
	if value := viper.GetString("access-control-allow-origin"); value != "" {
		router.GlobalOPTIONS = http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			if r.Header.Get("Access-Control-Request-Method") != "" {
				header := w.Header()
				header.Set("Access-Control-Allow-Methods", r.Header.Get("Access-Control-Request-Method"))
				header.Set("Access-Control-Allow-Headers", r.Header.Get("Access-Control-Request-Headers"))
				header.Set("Access-Control-Allow-Origin", value)
			}

			w.WriteHeader(http.StatusNoContent)
		})
	}
	return router
}

func listenAndServe(handler http.Handler, port int, endpointNames string) {
	logger.Info("server listening",
		"port", port,
2 changes: 1 addition & 1 deletion config/naming.go
@@ -104,7 +104,7 @@ func generateAvailableName(baseName string, nameMap map[string]string) string {
func isReserved(name string) bool {
	switch name {
	case "BasicType", "Bigint", "Blob", "Column", "ColumnInput", "ColumnKind", "Consistency", "ClusteringKeyInput",
		"DataType", "DataTypeInput", "Decimal", "QueryOptions", "Table", "Query", "Mutation", "Time",
		"Timestamp", "TimeUuid", "UpdateOptions", "Uuid", "Varint":
		return true
	}
6 changes: 4 additions & 2 deletions db/table.go
@@ -41,7 +41,7 @@ func (db *Db) CreateTable(info *CreateTableInfo, options *QueryOptions) (bool, error) {
	}

	if info.ClusteringKeys != nil {
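		// primaryKeys accumulates `, "name"` fragments; [2:] strips the leading ", "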
		primaryKeys = fmt.Sprintf("(%s)", primaryKeys[2:])

		for _, c := range info.ClusteringKeys {
			columns += fmt.Sprintf(`"%s" %s, `, c.Name, c.Type)
@@ -52,6 +52,8 @@ func (db *Db) CreateTable(info *CreateTableInfo, options *QueryOptions) (bool, error) {
			}
			clusteringOrder += fmt.Sprintf(`, "%s" %s`, c.Name, order)
		}
	} else {
		primaryKeys = primaryKeys[2:]
	}

	if info.Values != nil {
@@ -60,7 +62,7 @@ func (db *Db) CreateTable(info *CreateTableInfo, options *QueryOptions) (bool, error) {
		}
	}

	query := fmt.Sprintf(`CREATE TABLE "%s"."%s" (%sPRIMARY KEY (%s))`, info.Keyspace, info.Table, columns, primaryKeys)

	if clusteringOrder != "" {
		query += fmt.Sprintf(" WITH CLUSTERING ORDER BY (%s)", clusteringOrder[2:])