Using multiple DSNs, choosing based on the logger #198

Closed
saifelse opened this issue Dec 10, 2018 · 22 comments

@saifelse (Contributor) commented Dec 10, 2018

I'm looking to upgrade from raven-python[flask] to sentry-sdk[flask]. We previously had two DSNs for our backend: one for errors and one for logging performance issues (e.g. slow requests / high DB query count).

We were previously able to configure this via:

import logging

from raven.contrib.flask import Sentry
from raven.handlers.logging import SentryHandler

performance_logger = logging.getLogger("benchling.performance")
performance_logger.setLevel(logging.WARNING)
sentry = Sentry(logging=True, level=logging.WARNING)

def init_sentry(app):
    sentry.init_app(app, dsn=app.config["SENTRY_DSN_SERVER"])
    performance_handler = SentryHandler(dsn=app.config["SENTRY_DSN_BACKEND_PERFORMANCE"])
    performance_logger.addHandler(performance_handler)

With the new architecture, this seems hard to do, since the DSN is configured once via sentry_sdk.init and the LoggingIntegration simply listens on the root logger. I was able to hack around this by monkey-patching the LoggingIntegration's handler as follows:

import logging

import sentry_sdk
from sentry_sdk.client import Client
from sentry_sdk.hub import Hub
from sentry_sdk.integrations.celery import CeleryIntegration
from sentry_sdk.integrations.flask import FlaskIntegration
from sentry_sdk.integrations.logging import LoggingIntegration


def register_clients_for_loggers(logger_name_to_client):
    """Monkeypatch LoggingIntegration's EventHandler to override the Client based on the record's logger"""
    hub = Hub.current
    logging_integration = hub.get_integration(LoggingIntegration)
    if not logging_integration:
        return
    handler = logging_integration._handler
    old_emit = handler.emit

    def new_emit(record):
        new_client = logger_name_to_client.get(record.name)
        previous_client = hub.client
        should_bind = new_client is not None
        try:
            if should_bind:
                hub.bind_client(new_client)
            old_emit(record)
        finally:
            if should_bind:
                hub.bind_client(previous_client)

    handler.emit = new_emit

def init_sentry(app):
    sentry_sdk.init(
        dsn=app.config["SENTRY_DSN_SERVER"],
        release=app.config["SENTRY_RELEASE"],
        environment=app.config["SENTRY_ENVIRONMENT"],
        integrations=[
            LoggingIntegration(
                level=logging.WARNING,  # Capture info and above as breadcrumbs
                event_level=logging.WARNING,  # Send warnings and errors as events
            ),
            CeleryIntegration(),
            FlaskIntegration(),
        ],
    )

    performance_logger = logging.getLogger("benchling.performance")
    performance_logger.setLevel(logging.WARNING)

    perf_client = Client(
        dsn=app.config["SENTRY_DSN_BACKEND_PERFORMANCE"],
        release=app.config["SENTRY_RELEASE"],
        environment=app.config["SENTRY_ENVIRONMENT"],
    )
    register_clients_for_loggers({performance_logger.name: perf_client})

Two questions:

  1. Does this approach seem reasonable, or is there a better way to handle this?
  2. If there isn't a better way, would it be possible to have sentry_sdk.init let you specify this mapping?
@untitaker (Member)

What you can do is:

perf_client = Client(..)
regular_client = Client(..)

def send_event(event):
    if event.get("logger") == "performance.logger":
        perf_client.capture_event(event)
    else:
        regular_client.capture_event(event)

init(transport=send_event)

We don't have a good story right now for using multiple clients transparently, although explicitly rebinding clients is something that we do aim to support:

with Hub(Hub.current) as hub:
    hub.bind_client(client)
    ...

# old hub + client is active here
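
For concreteness, here is a minimal sketch of what that explicit rebinding could look like for the performance-logger case, using the perf_client from above (the captured message is just illustrative):

from sentry_sdk import Hub

with Hub(Hub.current) as hub:
    hub.bind_client(perf_client)
    # everything captured inside this block is sent via perf_client's DSN
    hub.capture_message("slow request detected", level="warning")

# back outside the block, the previous hub and client are active again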

@saifelse (Contributor, Author) commented Mar 31, 2019

@untitaker: Finally picking this back up 😅, and thanks for the suggestion.

I wanted to double-check my understanding of your workaround. The gist is that three clients end up configured: two that exist only so their DSNs are set properly (which indirectly instantiates two HTTP transports), plus the one configured automatically via sentry_sdk.init, which is fully configured and uses a function as its transport that simply switches between the two explicitly instantiated clients.

However, a few things seem off about this approach:

  1. Client.capture_event actually does more than necessary; it seems like I'd really just want to call client.transport.capture_event(event), which seems easy enough.

  2. Using a function for the transport means that flush will not be called for these nested transports; several integrations (e.g. Celery, AWS Lambda) call it via hub.client.flush(). Note that my original approach would not have handled flushes correctly either.

It seems like the thing I truly want looks closer to:

from collections import defaultdict

import sentry_sdk

def make_transport_proxy_cls(make_transport=sentry_sdk.transport.make_transport, dsn_by_logger={}):
    """Return a custom Transport subclasss that routes to a child transport based on an event's logger.

    :param make_transport: a function that returns a Transport instance given sentry_sdk.init `options`
    :param dsn_by_logger: a dict mapping a logger to a specific DSN.
    """

    class TransportProxy(sentry_sdk.transport.Transport):
        def __init__(self, options):
            super(TransportProxy, self).__init__(options)
            default_dsn = options["dsn"]
            self._dsn_by_logger_name = defaultdict(lambda: default_dsn)
            self._transport_by_dsn = {default_dsn: self._make_transport(default_dsn)}
            for logger, dsn in dsn_by_logger.items():
                if dsn:
                    self._dsn_by_logger_name[logger.name] = dsn
                    self._transport_by_dsn[dsn] = self._make_transport(dsn)

        def _make_transport(self, dsn):
            return make_transport(dict(self.options, dsn=dsn))

        def flush(self, timeout, callback=None):
            for transport in self._transport_by_dsn.values():
                transport.flush(timeout, callback)

        def kill(self):
            for transport in self._transport_by_dsn.values():
                transport.kill()

        def capture_event(self, event):
            logger_name = event.get("logger")
            dsn = self._dsn_by_logger_name[logger_name]
            transport = self._transport_by_dsn[dsn]
            transport.capture_event(event)

    return TransportProxy

with usage like:

sentry_sdk.init(
    dsn=app.config["SENTRY_DSN_SERVER"],
    transport=make_transport_proxy_cls(
        dsn_by_logger={performance_logger: app.config["SENTRY_DSN_BACKEND_PERFORMANCE"]}
    ),
    ...
)

Getting a bit more in the weeds here, but does this approach seem sound?

@wss-chadical

@untitaker

We have the same scenario: multiple projects for different types of issues/messages. Currently, we are just switching between two projects on the fly using sentry_sdk.init().

This seems to work just fine, but I'm pretty sure we are heavily leaking memory somewhere by doing this...

Help please. :-)

+1

@untitaker (Member) commented Apr 17, 2019

@wss-chadical Calling init multiple times definitely leaves a lot of work for the GC and creates a lot of threads. If you do that, a good and simple improvement is to call Hub.current.client.close() before calling init() again, but I have no idea if that will fix all of your problems.
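
For what it's worth, a minimal sketch of that pattern (switch_project is just an illustrative helper; the DSN is whatever you pass in):

import sentry_sdk
from sentry_sdk import Hub

def switch_project(dsn):
    # Close the currently bound client first: this flushes its transport and
    # shuts down its worker thread before init() creates a new client.
    client = Hub.current.client
    if client is not None:
        client.close()
    sentry_sdk.init(dsn=dsn)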

@saifelse's approach is good, but whether it's suitable for you depends on whether you want to have separate client config besides DSN for each sentry project.

@untitaker (Member)

By the way, sorry @saifelse for the lack of response. Passing a function as a transport is only meant for testing purposes. I think your approach is fine. Something we could improve is to put a TransportMultiplexer class into the SDK which forwards each event to all transports, with each transport deciding by itself whether to send.
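
Nothing like this exists in the SDK today; purely as a rough sketch of the idea (the class name, the EXTRA_DSNS list, and the DSN value are placeholders):

from sentry_sdk.transport import Transport, make_transport

# Hypothetical: additional project DSNs to fan events out to, besides the one passed to init().
EXTRA_DSNS = ["https://examplePublicKey@o0.ingest.sentry.io/1234567"]

class TransportMultiplexer(Transport):
    """Sketch: forwards every event to all child transports."""

    def __init__(self, options):
        super(TransportMultiplexer, self).__init__(options)
        # transport=None so the children become plain HTTP transports
        # instead of recursing back into this class.
        dsns = [options["dsn"]] + EXTRA_DSNS
        self._transports = [
            make_transport(dict(options, dsn=dsn, transport=None)) for dsn in dsns
        ]

    def capture_event(self, event):
        # A real multiplexer would let each child decide whether to send;
        # this sketch simply forwards to everyone.
        for transport in self._transports:
            transport.capture_event(event)

    def flush(self, timeout, callback=None):
        for transport in self._transports:
            transport.flush(timeout, callback)

    def kill(self):
        for transport in self._transports:
            transport.kill()

It would be wired up with sentry_sdk.init(dsn=..., transport=TransportMultiplexer), same as any other custom transport class.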

@wss-chadical

@untitaker I'll start with the close() - Thank you 👍

@untitaker (Member)

To be clear, I still strongly recommend against this approach because it will likely impact performance. If you need multiple clients and can't use transports, I will post a solution, but it will take a while to write.

@wss-chadical

This is a data integration server (Zato); the transaction volume is not high, so a little extra overhead is not a problem. If Hub.current.client.close() reduces the thread/memory leaks, it will work for us.

@wss-chadical commented Apr 22, 2019

Hub.current.client.close() has slowed the bleeding, but not stopped it.

I decided to better understand the thread above... and it looks like I have it figured out. :-)

def send_event(event):
    if 'extra' in event and 'project' in event['extra'] and 'data-issues' in event['extra']['project'].lower():
        sio_data_issues.capture_event(event)
    else:
        sio_zato_esb_production.capture_event(event)

def sentry_init():
    global sio_zato_esb_production
    global sio_data_issues

    if sentry_sdk.Hub.current.client is None:
        sio_zato_esb_production = sentry_sdk.client.Client("https://[email protected]/34563456")
        sio_data_issues = sentry_sdk.client.Client("https://[email protected]/12341234")

        sentry_sdk.init(transport=send_event)

def test():
    sentry_init()
    with sentry_sdk.push_scope() as scope:
        # flag this to go to the data-issues project
        scope.set_extra('project', 'data-issues')
        sentry_sdk.capture_message('test')
    # raise below goes to zato-esb-production
    raise ValueError

@untitaker (Member)

Yes, this can work as well. You might encounter odd issues since overly long strings will be trimmed twice (because an event goes through two clients), but I am not sure it even matters. It's certainly a flexible solution.

@wss-chadical

That's good to know; I'll keep an eye on it. I do pass quite a bit of extra data.

It's very nice to have support here, guys.

@saifelse Thanks for getting this started!

@untitaker (Member)

I'll close this since the original question has been answered by now.

@kiddten commented Oct 27, 2020

What is the easiest way to use multiple DSNs nowadays? I don't need any special configuration, I just need two DSNs supported simultaneously.

@wss-chadical

@kiddten I'm still using the method from April 2019.

@yossivainshtein

We have the same use case; is there a supported way to implement this yet?
Is @saifelse's method with make_transport_proxy_cls still the best option?

mvdbeek added a commit to mvdbeek/galaxy that referenced this issue Sep 28, 2021
Replaces our custom Sentry WSGI middleware with middleware shipped by
sentry-sdk. Adds sentry-sdk ASGI middleware. Our custom sentry
middleware also allowed logging slow requests to sentry. I have dropped
this for now, and if we still want this in the future, we should
probably implement this as a custom logging middleware instead, which
would automatically appear in Sentry as well.

We used to hardcode the log level for events to WARNING, and the Sentry
default is ERROR now, which I think is a good idea. I've made this
configurable using the sentry_event_level config variable.

It does not seem to be possible to connect multiple dsn's in the same
app, unless we start hacking
(getsentry/sentry-python#198), so I dropped
the custom_dsn from the sentry error reporting plugin (which has been
broken anyway, so shouldn't be a big deal).
@th3hamm0r

@untitaker Are the above approaches still valid?

I stumbled over this because I'm trying to migrate a project that uses multiple DSNs from the legacy Raven SDK to the new Sentry SDK, and I've seen randomly occurring warnings in my pytest runs like this:

...
    File "/vagrant/.venv/lib/python3.8/site-packages/sentry_sdk/transport.py", line 82, in capture_envelope
      raise NotImplementedError()
  NotImplementedError
  
    warnings.warn(pytest.PytestUnhandledThreadExceptionWarning(msg))

While digging through Sentry's code I found out that since April 2019 there is a new capture_envelope() method (see #627) in the Transport class, which raises a NotImplementedError by default. The above approaches are handled as a _FunctionTransport, which does not implement capture_envelope() either. Is this a problem? Do we have to subclass Transport and implement both capture_envelope and capture_event, or is this a problem in the SDK?

@untitaker (Member)

@th3hamm0r You have to make a subclass of Transport and implement both of those methods to dispatch and delegate similarly to how the function-based transports work.

Either that, or you disable sending of sessions and transactions, which will leave you with error monitoring only.
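
For reference, a minimal sketch of such a subclass, routing on the event's logger name as in the earlier examples (the performance DSN shown here is a placeholder):

from sentry_sdk.transport import Transport, make_transport

class RoutingTransport(Transport):
    """Sketch: route events from one logger to a second project, forward
    everything else (including sessions/transactions) to the default DSN."""

    def __init__(self, options):
        super(RoutingTransport, self).__init__(options)
        # transport=None so the children are ordinary HTTP transports.
        self._default = make_transport(dict(options, transport=None))
        self._performance = make_transport(
            dict(options, dsn="https://examplePublicKey@o0.ingest.sentry.io/1234567", transport=None)
        )

    def _pick(self, event):
        if event is not None and event.get("logger") == "benchling.performance":
            return self._performance
        return self._default

    def capture_event(self, event):
        self._pick(event).capture_event(event)

    def capture_envelope(self, envelope):
        # Envelopes carry sessions and transactions; route on the contained
        # event if there is one, otherwise use the default transport.
        self._pick(envelope.get_event()).capture_envelope(envelope)

    def flush(self, timeout, callback=None):
        for transport in (self._default, self._performance):
            transport.flush(timeout, callback)

    def kill(self):
        for transport in (self._default, self._performance):
            transport.kill()

Passing RoutingTransport to sentry_sdk.init(transport=...) would then avoid the NotImplementedError from the base class's capture_envelope.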

@RubcovMindbox

@untitaker Hi! I also need to use two DSNs.
I set it up the way you recommended above:

from typing import Any, Dict

import sentry_sdk
from sentry_sdk import Client

errors_sentry_client = Client(
    dsn="https://errors",
)

autobags_sentry_client = Client(
    dsn="https://autobugs"
)


def send_event(event: Dict[str, Any]):
    if event.get("logger") == "errors_sentry_logger":
        errors_sentry_client.capture_event(event)
    else:
        autobags_sentry_client.capture_event(event)


sentry_sdk.init(
    transport=send_event,
    traces_sample_rate=1.0,
)

division_by_zero = 1 / 0  # not logged anywhere

But uncaught errors aren't logged anywhere with this configuration. Could you advise how to fix it?

@jeffchuber

I have multiple clients working, but I get this error whenever the server refreshes or an error is captured:

Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "/Users/jeff/.pyenv/versions/3.9.2/lib/python3.9/site-packages/sentry_sdk/integrations/atexit.py", line 62, in _shutdown
    client.close(callback=integration.callback)
  File "/Users/jeff/.pyenv/versions/3.9.2/lib/python3.9/site-packages/sentry_sdk/client.py", line 465, in close
    self.flush(timeout=timeout, callback=callback)
  File "/Users/jeff/.pyenv/versions/3.9.2/lib/python3.9/site-packages/sentry_sdk/client.py", line 486, in flush
    self.session_flusher.flush()
  File "/Users/jeff/.pyenv/versions/3.9.2/lib/python3.9/site-packages/sentry_sdk/sessions.py", line 104, in flush
    self.capture_func(envelope)
  File "/Users/jeff/.pyenv/versions/3.9.2/lib/python3.9/site-packages/sentry_sdk/client.py", line 120, in _capture_envelope
    self.transport.capture_envelope(envelope)
  File "/Users/jeff/.pyenv/versions/3.9.2/lib/python3.9/site-packages/sentry_sdk/transport.py", line 82, in capture_envelope
    raise NotImplementedError()
NotImplementedError

I tried implementing a custom transport, but that was not successful. Has anyone else solved this?

@sl0thentr0py (Member)

@jeffchuber Just implementing send_event is no longer sufficient, so the above advice is outdated.
Please use something similar to #1521.

@jeffchuber

@sl0thentr0py Thank you! I did not think to check the Discussions tab.

@sl0thentr0py (Member)

No worries! We just started using Discussions for more support-like threads, so it's fairly new.
