Model quality tends to be pretty subjective, and people also complain that benchmarks get gamed, so there's always a way to downplay how valuable or impressive a model is. At least context window length and generation speed are concrete improvements.