I am creating a couple of larger database tables with at least hundreds of millions of observations, and growing. Some tables have one row per minute, others per millisecond. Timestamps are not necessarily unique.

Should I create separate year, month, or date and time columns, or is a single datetime column enough? At what size would you partition the tables?

Raw data is in CSV.

Currently I'm aiming for Postgres and DuckDB. Does TimescaleDB make a significant difference?
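
For reference, this is roughly the shape I have in mind (table and column names are just placeholders), with one timestamptz column instead of separate year/month/day columns:

    -- Postgres: a single timestamptz column, not split into year/month/day
    CREATE TABLE observations (
        ts        timestamptz NOT NULL,   -- millisecond precision, not unique
        series_id integer     NOT NULL,
        value     double precision
    );

    -- bulk load from CSV via psql (file path is a placeholder)
    \copy observations (ts, series_id, value) FROM 'data.csv' WITH (FORMAT csv, HEADER true)

    -- DuckDB can ingest the same CSV directly
    CREATE TABLE observations AS SELECT * FROM read_csv_auto('data.csv');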

  • tmat256@lemmings.world

    How are you planning on querying the data? I would plan your columns and indexes based on that answer. I’m going to assume arbitrary time ranges.

    Just indexing a standard datetime column should be sufficient, but that's going to depend on a few things: that you're not querying date ranges 10,000 times per second (e.g. as a website data source), and that you're not inserting 100,000 records per second. Even then you might be able to get away with it if you're willing to throw more hardware at it.
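
    To be concrete, "indexing a standard datetime column" is just a plain B-tree index on the timestamp, something like this (table and column names are only placeholders):

        -- one index on the timestamp column covers arbitrary range scans
        CREATE INDEX observations_ts_idx ON observations (ts);

        -- the kind of query it serves
        SELECT series_id, avg(value)
        FROM observations
        WHERE ts >= '2024-01-01' AND ts < '2024-02-01'
        GROUP BY series_id;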

    The answer to most database design questions at larger scales depends on the scale. If you need to query this data once for a report, you do it in a spreadsheet. If you're building the next Facebook API, then you shard the crap out of it into a time-series database and add layers of caching.

    My suggestion is to build it simple first and test. Don’t make assumptions about scale before you have data to back them up…
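
    For example, load a representative slice of the data and check the query plan before reaching for partitioning or extensions (the query is just an illustration):

        -- verify the index is actually used for your real queries
        EXPLAIN ANALYZE
        SELECT count(*)
        FROM observations
        WHERE ts >= now() - interval '1 day';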

  • plumbus@feddit.org

    I thought InfluxDB was the choice for such use cases. But I’m not an expert…