feat: Use Snuba events in group model #13905

Closed · wants to merge 8 commits
Changes from all commits
101 changes: 23 additions & 78 deletions src/sentry/models/group.py
@@ -20,10 +20,8 @@
from django.utils.http import urlencode
from django.utils.translation import ugettext_lazy as _

from sentry import eventtypes, tagstore, options
from sentry.constants import (
DEFAULT_LOGGER_NAME, EVENT_ORDERING_KEY, LOG_LEVELS, MAX_CULPRIT_LENGTH
)
from sentry import eventtypes, tagstore
from sentry.constants import DEFAULT_LOGGER_NAME, LOG_LEVELS, MAX_CULPRIT_LENGTH
from sentry.db.models import (
BaseManager, BoundedBigIntegerField, BoundedIntegerField, BoundedPositiveIntegerField,
FlexibleForeignKey, GzippedDictField, Model, sane_repr
@@ -158,24 +156,13 @@ def from_event_id(self, project, event_id):
Resolves the 32 character event_id string into
a Group for which it is found.
"""
from sentry.models import EventMapping, Event
from sentry.models import SnubaEvent
group_id = None

# Look up event_id in both Event and EventMapping,
# and bail when it matches one of them, prioritizing
# Event since it contains more history.
for model in Event, EventMapping:
Contributor:
I don't think we can remove the EventMapping check yet.
EventMapping is saved when the event is sampled (see this snippet from the event saving path):

    if is_sample:
        try:
            with transaction.atomic(using=router.db_for_write(EventMapping)):
                EventMapping.objects.create(project=project, group=group, event_id=event_id)
        except IntegrityError:
            logger.info(
                'duplicate.found',
                exc_info=True,
                extra={
                    'event_uuid': event_id,
                    'project_id': project.id,
                    'group_id': group.id if group else None,
                    'model': EventMapping.__name__,
                }
            )

Even after removing events from Postgres, EventMapping is going to stay.

Member Author:
Same as below, won't all the events be in Snuba?

try:
group_id = model.objects.filter(
project_id=project.id,
event_id=event_id,
).values_list('group_id', flat=True)[0]

# It's possible that group_id is NULL
if group_id is not None:
break
except IndexError:
pass
event = SnubaEvent.objects.from_event_id(event_id, project.id)
Contributor:
Before merging this, could you please check who is using this method and ensure we won't increase the load on Snuba too much? In particular, ensure this does not happen in the ingestion flow.

Member Author:
This one seems fine

Contributor:
Could you please provide some details about the research you have done? What did you look into?

Member Author:
Yeah, sorry. This function seems to only be used in these three places, none of which are related to ingestion:

src/sentry/web/frontend/error_page_embed.py:                    report.group = Group.objects.from_event_id(report.project, report.event_id)
src/sentry/api/endpoints/project_group_index.py:                    matching_group = Group.objects.from_event_id(project, event_id)
src/sentry/api/endpoints/project_user_reports.py:                report.group = Group.objects.from_event_id(project, report.event_id)


if event:
group_id = event.group_id

if group_id is None:
# Raise a Group.DoesNotExist here since it makes
@@ -186,18 +173,20 @@ def from_event_id(self, project, event_id):
return Group.objects.get(id=group_id)
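
For reference, here is roughly how the rewritten lookup reads once the old Event/EventMapping loop is replaced by the single SnubaEvent call. This is a sketch assembled from the hunks in this diff, not the verbatim final code; the exact comment wording and the DoesNotExist construction are assumptions.

    def from_event_id(self, project, event_id):
        """
        Resolves the 32 character event_id string into
        a Group for which it is found.
        """
        from sentry.models import SnubaEvent

        group_id = None
        # All stored events are queryable in Snuba, so a single lookup
        # replaces the old Event/EventMapping fallback chain.
        event = SnubaEvent.objects.from_event_id(event_id, project.id)
        if event:
            group_id = event.group_id

        if group_id is None:
            # Raise Group.DoesNotExist so callers get a consistent error
            # whether the event or the group is missing.
            raise Group.DoesNotExist()

        return Group.objects.get(id=group_id)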

def filter_by_event_id(self, project_ids, event_id):
from sentry.models import EventMapping, Event
Contributor:
Same issue, I think the check on EventMapping is still valid because it is the only way to fetch sampled events.

Member Author:
I was under the impression that events are not sampled in Snuba (unlike Postgres), so we would not need to additionally check EventMapping. Is that not correct?

Contributor:
Good question. @tkaemming?

Contributor:
This is correct, to my knowledge.

I guess if we're going to be introducing a hard dependency on Snuba for events we should probably just go ahead and get rid of all vestiges of sampling (incl. EventMapping.)

group_ids = set()
# see above for explanation as to why we're
# looking at both Event and EventMapping
for model in Event, EventMapping:
group_ids.update(
model.objects.filter(
project_id__in=project_ids,
event_id=event_id,
group_id__isnull=False,
).values_list('group_id', flat=True)
)
from sentry.utils import snuba

group_ids = set([evt['issue'] for evt in snuba.raw_query(
start=datetime.utcfromtimestamp(0),
end=datetime.utcnow(),
selected_columns=['issue'],
conditions=[['issue', 'IS NOT NULL', None]],
filter_keys={
'event_id': [event_id],
'project_id': project_ids,
},
limit=1000,
Contributor:
Any reason for this limit specifically? And same question as before: please ensure we do not go through this code during event ingestion.

Member Author (@lynnagara, Jul 9, 2019):
It's just some arbitrarily large number

referrer="Group.filter_by_event_id",
)['data']])

return Group.objects.filter(id__in=group_ids)
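
For readers unfamiliar with snuba.raw_query, here is a rough illustration of the response shape the code above relies on. Only the 'data' key and the 'issue' column are confirmed by the code; the concrete values are made up.

    # Hypothetical raw_query result for an event_id stored in two projects.
    response = {
        'data': [
            {'issue': 123},
            {'issue': 456},
        ],
    }

    group_ids = set(row['issue'] for row in response['data'])
    # -> {123, 456}; the manager then returns Group.objects.filter(id__in=group_ids)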

@@ -325,11 +314,6 @@ def qualified_short_id(self):
if self.short_id is not None:
return '%s-%s' % (self.project.slug.upper(), base32_encode(self.short_id), )

@property
def event_set(self):
from sentry.models import Event
return Event.objects.filter(group_id=self.id)

def is_over_resolve_age(self):
resolve_age = self.project.get_option('sentry:resolve_age', None)
if not resolve_age:
@@ -390,58 +374,19 @@ def get_score(self):
return type(self).calculate_score(self.times_seen, self.last_seen)

def get_latest_event(self):
Contributor:
This may be a little trickier to remove.
I saw it is used in the plugin code:

src/sentry/plugins/base/v1.py: event = group.get_latest_event() or Event()
src/sentry/plugins/bases/issue.py: event = group.get_latest_event()
src/sentry/plugins/bases/issue2.py: event = group.get_latest_event()
src/sentry/plugins/bases/issue2.py: event = group.get_latest_event()
src/sentry/tasks/email.py: event = group.get_latest_event()

I can think of two possible issues:

  1. load on Snuba
  2. the event may not have been stored in Snuba yet even though it is already in Postgres. The Postgres write happens before we execute plugins in post_process.py, and at that point the event has only just been produced to Kafka, so it may or may not have been consumed yet. This should be verified.

Member Author:
Re 2: We might get away with this since these all seem to be view-related functions. The event object is passed to any post-process plugin code, so we don't need to call .get_latest_event() there.

Member Author:
Hmm, this might come up in Slack unfurls.

Contributor:
Actually, post_process is synchronized on ClickHouse storage, so it should be fine: https://github.com/getsentry/sentry/blob/master/src/sentry/eventstream/kafka/consumer.py#L119-L145

from sentry.models import Event

if not hasattr(self, '_latest_event'):
latest_events = sorted(
Event.objects.filter(
group_id=self.id,
).order_by('-datetime')[0:5],
key=EVENT_ORDERING_KEY,
reverse=True,
)
try:
self._latest_event = latest_events[0]
except IndexError:
self._latest_event = None
self._latest_event = self.get_latest_event_for_environments()

return self._latest_event

def get_latest_event_for_environments(self, environments=()):
use_snuba = options.get('snuba.events-queries.enabled')

# Fetch without environment if Snuba is not enabled
if not use_snuba:
return self.get_latest_event()

return get_oldest_or_latest_event_for_environments(
EventOrdering.LATEST,
environments=environments,
issue_id=self.id,
project_id=self.project_id)
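
Putting the surviving pieces together, the latest-event path after this change reads roughly as below. This is a sketch assembled from the diff; get_oldest_or_latest_event_for_environments and EventOrdering are already available in this module, as the existing code shows.

    def get_latest_event(self):
        # Cached per Group instance; now always resolved through Snuba.
        if not hasattr(self, '_latest_event'):
            self._latest_event = self.get_latest_event_for_environments()
        return self._latest_event

    def get_latest_event_for_environments(self, environments=()):
        return get_oldest_or_latest_event_for_environments(
            EventOrdering.LATEST,
            environments=environments,
            issue_id=self.id,
            project_id=self.project_id,
        )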

def get_oldest_event(self):
Member:
Could plugins in sentry-plugins or other plugin repos be using this?

Member Author:
Good point. I checked the whole Sentry org on GitHub and there don't seem to be any other usages of this.

from sentry.models import Event

if not hasattr(self, '_oldest_event'):
oldest_events = sorted(
Event.objects.filter(
group_id=self.id,
).order_by('datetime')[0:5],
key=EVENT_ORDERING_KEY,
)
try:
self._oldest_event = oldest_events[0]
except IndexError:
self._oldest_event = None
return self._oldest_event

def get_oldest_event_for_environments(self, environments=()):
use_snuba = options.get('snuba.events-queries.enabled')

# Fetch without environment if Snuba is not enabled
if not use_snuba:
return self.get_oldest_event()

return get_oldest_or_latest_event_for_environments(
EventOrdering.OLDEST,
environments=environments,
4 changes: 1 addition & 3 deletions tests/integration/tests.py
@@ -150,9 +150,7 @@ def queue_event(method, url, body, headers):
assert request.call_count is 1
assert Group.objects.count() == 1
group = Group.objects.get()
assert group.event_set.count() == 1
instance = group.event_set.get()
assert instance.data['logentry']['formatted'] == 'foo'
assert group.data['title'] == 'foo'


class SentryRemoteTest(TestCase):
48 changes: 34 additions & 14 deletions tests/sentry/api/endpoints/test_group_integration_details.py
@@ -2,26 +2,44 @@

import six
import mock
from datetime import timedelta
from django.utils import timezone
import copy

from sentry.integrations.example.integration import ExampleIntegration
from sentry.integrations.exceptions import IntegrationError
from sentry.models import Activity, ExternalIssue, GroupLink, Integration
from sentry.testutils import APITestCase
from sentry.utils.http import absolute_uri
from sentry.testutils.factories import DEFAULT_EVENT_DATA


class GroupIntegrationDetailsTest(APITestCase):
def setUp(self):
super(GroupIntegrationDetailsTest, self).setUp()
self.min_ago = timezone.now() - timedelta(minutes=1)
self.event = self.store_event(
data={
'event_id': 'a' * 32,
'timestamp': self.min_ago.isoformat()[:19],
'message': 'message',
'stacktrace': copy.deepcopy(DEFAULT_EVENT_DATA['stacktrace']),
},
project_id=self.project.id
)
self.group = self.event.group
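
The test changes in this and the following files all follow the same pattern: the create_group()/create_event() fixtures are replaced with store_event(), which runs the full event-saving path (so the event is visible to Snuba-backed queries), and the Group is then taken from the stored event. A minimal sketch of the pattern, with illustrative field values and a hypothetical test class name:

    from datetime import timedelta

    from django.utils import timezone

    from sentry.testutils import APITestCase


    class ExampleTest(APITestCase):  # hypothetical test case showing the new fixture pattern
        def setUp(self):
            super(ExampleTest, self).setUp()
            min_ago = (timezone.now() - timedelta(minutes=1)).isoformat()[:19]
            self.event = self.store_event(
                data={
                    'event_id': 'a' * 32,
                    'message': 'message',
                    'timestamp': min_ago,
                },
                project_id=self.project.id,
            )
            self.group = self.event.group  # the Group is derived from the stored event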

def test_simple_get_link(self):
self.login_as(user=self.user)
org = self.organization
group = self.create_group()
integration = Integration.objects.create(
provider='example',
name='Example',
)
integration.add_organization(org, self.user)

path = u'/api/0/issues/{}/integrations/{}/?action=link'.format(group.id, integration.id)
path = u'/api/0/issues/{}/integrations/{}/?action=link'.format(
self.group.id, integration.id)

with self.feature('organizations:integrations-issue-basic'):
response = self.client.get(path)
@@ -61,13 +79,12 @@ def test_simple_get_link(self):
def test_simple_get_create(self):
self.login_as(user=self.user)
org = self.organization
group = self.create_group()
self.create_event(group=group)
integration = Integration.objects.create(
provider='example',
name='Example',
)
integration.add_organization(org, self.user)
group = self.group

path = u'/api/0/issues/{}/integrations/{}/?action=create'.format(group.id, integration.id)

@@ -99,7 +116,7 @@ def test_simple_get_create(self):
'required': True,
}, {
'default': ('Sentry Issue: [%s](%s)\n\n```\n'
'Stacktrace (most recent call last):\n\n '
'Stacktrace (most recent call first):\n\n '
Member:
Why does this change?

Member Author:
I'm not actually sure, @tkaemming - do you know?

Contributor (@tkaemming, Jul 9, 2019):
Hm... seems like that comes from here. It seems the change from create_group/create_event may have triggered something like the platform being saved differently? (I don't know if that's the case; I didn't follow the fixture creation down to the source.)

'File "sentry/models/foo.py", line 29, in build_msg\n '
'string_max_length=self.string_max_length)\n\nmessage\n```'
) % (group.qualified_short_id, absolute_uri(group.get_absolute_url(params={'referrer': 'example_integration'}))),
@@ -121,15 +138,14 @@ def test_simple_get_create(self):
def test_get_create_with_error(self):
self.login_as(user=self.user)
org = self.organization
group = self.create_group()
self.create_event(group=group)
integration = Integration.objects.create(
provider='example',
name='Example',
)
integration.add_organization(org, self.user)

path = u'/api/0/issues/{}/integrations/{}/?action=create'.format(group.id, integration.id)
path = u'/api/0/issues/{}/integrations/{}/?action=create'.format(
self.group.id, integration.id)

with self.feature('organizations:integrations-issue-basic'):
with mock.patch.object(ExampleIntegration, 'get_create_issue_config', side_effect=IntegrationError('oops')):
@@ -141,15 +157,14 @@ def test_get_create_with_error(self):
def test_get_feature_disabled(self):
self.login_as(user=self.user)
org = self.organization
group = self.create_group()
self.create_event(group=group)
integration = Integration.objects.create(
provider='example',
name='Example',
)
integration.add_organization(org, self.user)

path = u'/api/0/issues/{}/integrations/{}/?action=create'.format(group.id, integration.id)
path = u'/api/0/issues/{}/integrations/{}/?action=create'.format(
self.group.id, integration.id)

with self.feature({'organizations:integrations-issue-basic': False}):
response = self.client.get(path)
@@ -167,7 +182,6 @@ def test_simple_put(self):
integration.add_organization(org, self.user)

path = u'/api/0/issues/{}/integrations/{}/'.format(group.id, integration.id)

with self.feature('organizations:integrations-issue-basic'):
response = self.client.put(path, data={
'externalIssue': 'APP-123'
@@ -367,8 +381,14 @@ def assert_default_project(path, action, expected_project_field):

self.login_as(user=self.user)
org = self.organization
group = self.create_group()
self.create_event(group=group)
event = self.store_event(
data={
'event_id': 'a' * 32,
'timestamp': self.min_ago.isoformat()[:19],
},
project_id=self.project.id
)
group = event.group
integration = Integration.objects.create(
provider='example',
name='Example',
14 changes: 11 additions & 3 deletions tests/sentry/api/endpoints/test_shared_group_details.py
@@ -1,6 +1,8 @@
from __future__ import absolute_import, print_function

import six
from datetime import timedelta
from django.utils import timezone

from sentry.testutils import APITestCase
from sentry.models import GroupShare
@@ -10,8 +12,14 @@ class SharedGroupDetailsTest(APITestCase):
def test_simple(self):
self.login_as(user=self.user)

group = self.create_group()
event = self.create_event(group=group)
min_ago = (timezone.now() - timedelta(minutes=1)).isoformat()[:19]
event = self.store_event(
data={
'timestamp': min_ago,
},
project_id=self.project.id,
)
group = event.group

share_id = group.get_share_id()
assert share_id is None
@@ -29,7 +37,7 @@ def test_simple(self):

assert response.status_code == 200, response.content
assert response.data['id'] == six.text_type(group.id)
assert response.data['latestEvent']['id'] == six.text_type(event.id)
assert response.data['latestEvent']['id'] == six.text_type(event.event_id)
assert response.data['project']['slug'] == group.project.slug
assert response.data['project']['organization']['slug'] == group.organization.slug

22 changes: 19 additions & 3 deletions tests/sentry/integrations/bitbucket/test_issues.py
@@ -1,8 +1,14 @@
from __future__ import absolute_import

from datetime import timedelta
import copy

from django.utils import timezone

from sentry.integrations.bitbucket.issues import ISSUE_TYPES, PRIORITIES
from sentry.models import ExternalIssue, Integration
from sentry.testutils import APITestCase
from sentry.testutils.factories import DEFAULT_EVENT_DATA

import json
import responses
@@ -24,8 +30,18 @@ def setUp(self):
'subject': self.subject,
}
)
self.group = self.create_group()
self.create_event(group=self.group)
min_ago = (timezone.now() - timedelta(minutes=1)).isoformat()[:19]
event = self.store_event(
data={
'event_id': 'a' * 32,
'message': 'message',
'timestamp': min_ago,
'stacktrace': copy.deepcopy(DEFAULT_EVENT_DATA['stacktrace']),

},
project_id=self.project.id,
)
self.group = event.group
self.repo_choices = [('myaccount/repo1', 'myaccount/repo1'),
('myaccount/repo2', 'myaccount/repo2')]
self.org_integration = self.integration.add_organization(self.organization)
@@ -203,7 +219,7 @@ def test_get_create_issue_config(self):
}, {
'name': 'description',
'label': 'Description',
'default': u'Sentry Issue: [BAR-1](http://testserver/organizations/baz/issues/%d/?referrer=bitbucket_integration)\n\n```\nStacktrace (most recent call last):\n\n File "sentry/models/foo.py", line 29, in build_msg\n string_max_length=self.string_max_length)\n\nmessage\n```' % self.group.id,
'default': u'Sentry Issue: [BAR-1](http://testserver/organizations/baz/issues/%d/?referrer=bitbucket_integration)\n\n```\nStacktrace (most recent call first):\n\n File "sentry/models/foo.py", line 29, in build_msg\n string_max_length=self.string_max_length)\n\nmessage\n```' % self.group.id,
'type': 'textarea',
'autosize': True,
'maxRows': 10,