Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

IDK, "which ZIP codes do we have customers in?" seems like a reasonable thing to want to know


The very next ask will be "order the zipcodes by number of customers" at which point you'll be back to aggregations, which is where you should have started


Anti-Patterns You Should Avoid: overengineering for potential future requirements. Are there real-life cases where you should design with the future in mind? Yes. Are there real-life cases where DISTINCT is the best choice by whatever metric you prioritize at the time? Also yes.


> Are there real-life cases where DISTINCT is the best choice by whatever metric you prioritize at the time

Indeed, along that line, I would say that DISTINCT can be used to convey intent... and doing that in code is important.

- I want to know the zipcodes we have customers in - DISTINCT

- I want to know how many customers we have in each zipcode - aggregates

Can you do the first with the second? Sure.. but the first makes it clear what your goal is.


Partly in jest, but maybe we need a NON-DISTINCT signaller to convey the inverse and return duplicate values only.

SOMEWHAT-DISTINCT with a fuzzy threshold would also be useful.


I hear you. It's not all _that_ uncommon for me to query for "things with more than one instance". Although, to be fair, it's more common for me to that when grep/sort/uniqing logs on the command line.


Here we start to get close to analytics sql vs application sql, and I think that's a whole separate beast itself with different patterns and anti-patterns.


Ah, yeah, you beat me to it. I do reporting, not applications.


distinct seems like an aggregation to me


Whole seconds will have been wasted!


I do reporting, not application development. If somebody wants to know different information I'd write a different query.


count(id) group by post_code order by 1


In OP's defense, "becoming suspicious" doesn't mean it's always wrong. I would definitely suggest an explaining comment if someone is using DISTINCT in a multi-column query.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: