Skip to content

Daily buckets #180

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jan 5, 2017
Merged

Daily buckets #180

merged 5 commits into from
Jan 5, 2017

Conversation

tomwilkie
Copy link
Contributor

@tomwilkie tomwilkie commented Jan 4, 2017

Second attempt at #178, part of #10

juliusv and others added 3 commits January 4, 2017 11:39
There can be slight overlap around the switchover time, but I deemed it
safer (and not incorrect) than risking gaps due to some calculation
error.
Copy link
Contributor

@jml jml left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's "Approve", but Github makes it too hard to review all the comments. Ping me when you've replied and I'll try to follow up promptly.

@@ -206,6 +208,10 @@ type AWSStore struct {
tableName string
bucketName string

// Around which time to start bucketing indexes by day instead of by hour.
// Only the day matters, not the time within the day.
dailyBucketsFrom model.Time
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we model this with a type that only specifies day?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can make it an int, but this just pushes the problem into main.go. I think I prefer it like this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood. I had to dig into the model code to make sense of that. It's just an alias to a number of milliseconds, rather than, say, a struct. I guess we could make our own type alias, Day, to make it clearer, but not worth it IMO.

@@ -206,6 +208,10 @@ type AWSStore struct {
tableName string
bucketName string

// Around which time to start bucketing indexes by day instead of by hour.
// Only the day matters, not the time within the day.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before or after this time?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After this time; will update comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

fromDay = from.Unix() / secondsInDay
throughDay = through.Unix() / secondsInDay

firstDailyBucket = c.dailyBucketsFrom.Unix() / secondsInDay
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be a personal taste thing, but I'd just make dailyBucketsFrom a parameter. Would reduce total LoC because you wouldn't have to construct an AWSStore in your tests. Up to you though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comment to @juliusv was the exact opposite - mainly because of whats coming in #181.

@@ -332,25 +339,45 @@ func (c *AWSStore) CreateTables() error {
return err
}

func bigBuckets(from, through model.Time) []int64 {
func (c *AWSStore) bigBuckets(from, through model.Time) []string {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reckon this function / method could use a comment saying what it does. Especially, it should give examples of the expected bucket strings. Or point to the tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -146,3 +146,44 @@ func diff(want, have interface{}) string {
})
return "\n" + text
}

func TestBigBuckets(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really glad you wrote tests!

I found them a bit hard to understand their purpose at first, so I'm going to suggest some commentary inside.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

dailyBucketsFrom: model.TimeFromUnix(0).Add(1*24*time.Hour) - 1,
buckets: []string{"0", "d0", "d1", "d2"},
},
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// If `dailyBucketsFrom` is after the interval, everything will be bucketed by hour.

through: model.TimeFromUnix(0).Add(2 * 24 * time.Hour),
dailyBucketsFrom: model.TimeFromUnix(0).Add(99 * 24 * time.Hour),
buckets: []string{"0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24", "25", "26", "27", "28", "29", "30", "31", "32", "33", "34", "35", "36", "37", "38", "39", "40", "41", "42", "43", "44", "45", "46", "47", "48"},
},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add another case that checks when dailyBucketsFrom is before the interval?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

var (
secondsInHour = int64(time.Hour / time.Second)
fromHour = from.Unix() / secondsInHour
throughHour = through.Unix() / secondsInHour
result []int64

secondsInDay = int64(24 * time.Hour / time.Second)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know Go is a bit fiddly about this, but wdyt about making secondsInDay and secondsInHour consts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

if i > lastHourlyBucket {
break
}
result = append(result, strconv.Itoa(int(i)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these be prefixed with h? Or is there a constraint which prevents that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Backwards compatibility prevents us doing that.

@@ -246,10 +249,15 @@ func setupChunkStore(cfg cfg) (chunk.Store, error) {
Expiration: cfg.memcachedExpiration,
}
}
dailyBucketsFrom, err := time.Parse("2006-01-02", cfg.dynamodbDailyBucketsFrom)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"2006-12-31" might communicate intent a bit better.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thats not how go date parsing works - https://golang.org/pkg/time/#Parse

The layout defines the format by showing how the reference time, defined to be:

Mon Jan 2 15:04:05 -0700 MST 2006

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wow, OK. That's ... interesting.

@tomwilkie
Copy link
Contributor Author

@jml PTAL

Copy link
Contributor

@jml jml left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

buckets: []string{"0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24", "d1", "d2"},
},

{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please add a comment describing the behaviour this is testing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we'll have to wait for @juliusv next week

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I guess I just wanted to show that 24 1-hour buckets disappear when you make the dailyBucketsFriom one day earlier than in the case above (but the one overlapping 1h bucket still remains). So maybe the following?

"// Moving dailyBucketsFrom to the previous day compared to the above makes 24 1-hour buckets disappear."

But yeah, the incremental value of this test case is perhaps debatable :)

@tomwilkie tomwilkie merged commit 3592422 into master Jan 5, 2017
@tomwilkie tomwilkie deleted the daily-buckets branch January 5, 2017 11:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants