postgresql-style pre-startup init scripts hook #53

Closed
2 changes: 2 additions & 0 deletions 2.6/Dockerfile
@@ -27,6 +27,8 @@ RUN curl -SL "https://fastdl.mongodb.org/linux/mongodb-linux-x86_64-$MONGO_VERSI
&& tar -xvf mongo.tgz -C /usr/local --strip-components=1 \
&& rm mongo.tgz*

RUN mkdir /docker-entrypoint-initdb.d

VOLUME /data/db

COPY docker-entrypoint.sh /entrypoint.sh
14 changes: 14 additions & 0 deletions 2.6/docker-entrypoint.sh
@@ -13,6 +13,20 @@ if [ "$1" = 'mongod' ]; then
set -- $numa "$@"
fi

# internal start of server in order to allow set-up using mongo client
gosu mongodb mongod --fork --dbpath=/data/db --syslog
Member

Shouldn't this be gosu mongodb "$@" --fork --bind_ip 127.0.0.1? That way the temporary server is only reachable from inside the container, and we keep numa and any passed-in flags.

I'm not sure whether --syslog or --logpath /dev/stdout should be added; one of the two is required when using --fork.

Author

In the postgres image they don't pass the $@ args when starting the temporary server. I'm not sure what the reasoning is either way. I think the only really important thing is that the temporary server uses the same data dir as the real server will.

I think --logpath /dev/stdout might be best. All logs are supposed to go there under docker, and the main server's logs will too, so the behaviour would be consistent.

I'm happy with whatever you'd advise in both cases.

Member

The datadir of /data/db is automatic (docs.mongodb).

We actually need to adjust this image, postgres, and the other SQL images to start the temporary server with the passed-in options, so that users can override things like the data directory or run with --storageEngine=wiredTiger. I don't think using /dev/stdout for --logpath will work, since that stdout won't be the same as the stdout of the bash process running it. We might as well just stick the log in /var/log/mongodb/.

gosu mongodb "$@" --fork --bind_ip 127.0.0.1 --logpath /var/log/mongodb/mongo-init.log

The other option is to drop --fork and --logpath, background the server with &, and add a "try to connect" loop.
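
A minimal sketch of what such a "try to connect" loop could look like. The helper name wait_until, the retry count, and the one-second delay are all assumptions for illustration, not part of this PR; the probe command is passed in as arguments so the helper itself does not depend on mongo.

```shell
# wait_until TRIES CMD [ARGS...]
# Hypothetical helper: run CMD until it succeeds or TRIES attempts are used up.
# Returns 0 on the first success, 1 if every attempt failed.
wait_until() {
    tries="$1"
    shift
    while :; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        tries=$((tries - 1))
        if [ "$tries" -le 0 ]; then
            return 1
        fi
        sleep 1   # assumed delay between attempts
    done
}

# Possible use in the entrypoint (illustrative only, not run here):
#   gosu mongodb "$@" --bind_ip 127.0.0.1 &
#   wait_until 30 mongo --host 127.0.0.1 --eval 'quit(0)' || exit 1
```

Backgrounding with & keeps the server's stdout attached to the entrypoint's stdout, which is what avoids the --logpath question; the cost is that the script has to poll for readiness itself.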

Member

I did a little more playing with this, and the following works successfully: (from tianon/gosu#8 (comment))

$ docker run -it --rm mongo bash
root@201b1fb3ccb3:/# chown --dereference mongodb /dev/stdout
root@201b1fb3ccb3:/# gosu mongodb mongod --bind_ip 127.0.0.1 --logpath /dev/stdout
2017-02-14T00:03:55.866+0000 I CONTROL  [initandlisten] MongoDB starting : pid=9 port=27017 dbpath=/data/db 64-bit host=201b1fb3ccb3
2017-02-14T00:03:55.866+0000 I CONTROL  [initandlisten] db version v3.4.2
2017-02-14T00:03:55.866+0000 I CONTROL  [initandlisten] git version: 3f76e40c105fc223b3e5aac3e20dcd026b83b38b
2017-02-14T00:03:55.866+0000 I CONTROL  [initandlisten] OpenSSL version: OpenSSL 1.0.1t  3 May 2016
2017-02-14T00:03:55.866+0000 I CONTROL  [initandlisten] allocator: tcmalloc
2017-02-14T00:03:55.866+0000 I CONTROL  [initandlisten] modules: none
2017-02-14T00:03:55.866+0000 I CONTROL  [initandlisten] build environment:
2017-02-14T00:03:55.866+0000 I CONTROL  [initandlisten]     distmod: debian81
2017-02-14T00:03:55.866+0000 I CONTROL  [initandlisten]     distarch: x86_64
2017-02-14T00:03:55.866+0000 I CONTROL  [initandlisten]     target_arch: x86_64
2017-02-14T00:03:55.866+0000 I CONTROL  [initandlisten] options: { net: { bindIp: "127.0.0.1" }, systemLog: { destination: "file", path: "/dev/stdout" } }
2017-02-14T00:03:55.872+0000 I STORAGE  [initandlisten] 
2017-02-14T00:03:55.872+0000 I STORAGE  [initandlisten] ** WARNING: Using the XFS filesystem is strongly recommended with the WiredTiger storage engine
2017-02-14T00:03:55.872+0000 I STORAGE  [initandlisten] **          See http://dochub.mongodb.org/core/prodnotes-filesystem
2017-02-14T00:03:55.872+0000 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=15562M,session_max=20000,eviction=(threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
2017-02-14T00:03:56.106+0000 I CONTROL  [initandlisten] 
2017-02-14T00:03:56.106+0000 I CONTROL  [initandlisten] ** WARNING: Access control is not enabled for the database.
2017-02-14T00:03:56.106+0000 I CONTROL  [initandlisten] **          Read and write access to data and configuration is unrestricted.
2017-02-14T00:03:56.106+0000 I CONTROL  [initandlisten] 
2017-02-14T00:03:56.173+0000 I FTDC     [initandlisten] Initializing full-time diagnostic data capture with directory '/data/db/diagnostic.data'
2017-02-14T00:03:56.279+0000 I INDEX    [initandlisten] build index on: admin.system.version properties: { v: 2, key: { version: 1 }, name: "incompatible_with_version_32", ns: "admin.system.version" }
2017-02-14T00:03:56.279+0000 I INDEX    [initandlisten] 	 building index using bulk method; build may temporarily use up to 500 megabytes of RAM
2017-02-14T00:03:56.280+0000 I INDEX    [initandlisten] build index done.  scanned 0 total records. 0 secs
2017-02-14T00:03:56.281+0000 I COMMAND  [initandlisten] setting featureCompatibilityVersion to 3.4
2017-02-14T00:03:56.282+0000 I NETWORK  [thread1] waiting for connections on port 27017

The main remaining issue is that, as written, this code will re-run on every startup of MongoDB, so we need a reasonably non-hacky way to determine whether a database has already been initialized. In the PostgreSQL image, we check for a specific file that Postgres itself always creates. In the MySQL image, we took a check from upstream which looks for a mysql database within the configured data directory.
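
One possible shape for such a check, sketched under a loud assumption: that a missing or empty data directory means "not yet initialized". This is not what the PR does, and the helper name needs_init is hypothetical. Unlike the PostgreSQL check, it does not rely on any specific file the server creates.

```shell
# needs_init DIR
# Hypothetical helper: succeed (return 0) when DIR does not exist or has no
# entries, i.e. when we ASSUME no database has been initialized there yet.
needs_init() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        return 0
    fi
    # ls -A lists all entries except . and ..; empty output means empty dir
    [ -z "$(ls -A "$dir" 2>/dev/null)" ]
}

# Possible use in the entrypoint (illustrative only, not run here):
#   if needs_init /data/db; then
#       ...start temporary server, run init scripts, shut it down...
#   fi
```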

This image is slightly more complicated still, because --dbpath might be passed on the command line (or hidden in a config file passed via -f), and we need to handle that intelligently, so I think this might be layering hacks deeper and deeper. 😞 😢
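
To illustrate the command-line half of that problem, here is a rough sketch that pulls the data directory out of the arguments, accepting both the --dbpath=DIR and --dbpath DIR spellings and falling back to /data/db. The helper name find_dbpath is hypothetical, and the sketch deliberately ignores config files passed with -f, which is exactly the case described above as hard to handle.

```shell
# find_dbpath ARGS...
# Hypothetical helper: print the value of --dbpath found in ARGS, or the
# default /data/db when the flag is absent. Config files are NOT inspected.
find_dbpath() {
    dbpath=/data/db
    while [ "$#" -gt 0 ]; do
        case "$1" in
            --dbpath=*)
                dbpath="${1#--dbpath=}"
                ;;
            --dbpath)
                # value is the next argument, if there is one
                if [ "$#" -gt 1 ]; then
                    dbpath="$2"
                    shift
                fi
                ;;
        esac
        shift
    done
    printf '%s\n' "$dbpath"
}

# Possible use in the entrypoint (illustrative only, not run here):
#   dbpath="$(find_dbpath "$@")"
```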

Author

Oh, I didn't realise postgres only ran it once. I'd assumed the startup script should be idempotent, so it wouldn't matter if it ran every time.


for f in /docker-entrypoint-initdb.d/*; do
case "$f" in
*.sh) echo "$0: running $f"; . "$f" ;;
*.js) echo "$0: running $f"; mongo --nodb "$f" && echo ;;
*) echo "$0: ignoring $f" ;;
esac
done

# stop the temporary server daemon
gosu mongodb mongod --shutdown --dbpath=/data/db

exec gosu mongodb "$@"
fi
