It's rife across the whole benchmarking industry. Lies, damn lies, and benchmarks.
The big problem, which has particularly afflicted Qualcomm and Intel in the past, is the huge gap between peak and sustained performance. Running at peak triggers thermal limits that force dynamic downclocking to below normal speeds, and that throttled state then persists for a surprisingly long time before normal performance resumes. Certain groups would detect known benchmarking code and relax the acceptable thermal parameters while the benchmark ran (i.e. allow the device to become unusually hot).
As others have mentioned, this has all the hallmarks of an execution failure, whether by Google, Samsung, both, or some unknown third party. Google have a cultural problem of believing far too much in theoretical, untested solutions instead of aiming for the boring, known-to-work option that is only 95% as good on paper, only to find when building it that reality differs enough to more than nullify the advantages. It has bitten them in this area before, and it more than likely will again.
Now that you mention it... I have only seen one review of workstations that benchmarked the machines for 24 hours to make sure they didn't thermally saturate and throttle, and I'm wondering why that isn't more of a metric for non-mobile stuff. Heck, even mobile workstations may be put under heavy load for long durations, and this would show whether their cooling keeps up or falls short.
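For anyone who wants to try this kind of soak test themselves, here is a rough sketch of the idea, not anything a review site is known to use. It runs a fixed CPU-bound workload back to back, counts iterations per time window, and compares the late windows against the best one. The names `busy_work`, `measure_windows`, and `sustained_ratio` are made up for illustration, and the workload is an arbitrary stand-in for whatever you actually care about.

```python
import statistics
import time


def busy_work(n=200_000):
    """Fixed CPU-bound unit of work (an arbitrary stand-in workload)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def measure_windows(workload, windows, window_seconds):
    """Run `workload` back to back; return iterations completed in each window."""
    counts = []
    for _ in range(windows):
        deadline = time.perf_counter() + window_seconds
        done = 0
        while time.perf_counter() < deadline:
            workload()
            done += 1
        counts.append(done)
    return counts


def sustained_ratio(counts, tail_fraction=0.25):
    """Median throughput of the final windows divided by the best window.

    A value near 1.0 suggests the cooling keeps up; a value well below 1.0
    suggests throughput dropped under sustained load.
    """
    if not counts:
        raise ValueError("need at least one window")
    tail_len = max(1, int(len(counts) * tail_fraction))
    sustained = statistics.median(counts[-tail_len:])
    peak = max(counts)
    return sustained / peak if peak else 0.0
```

Something like `sustained_ratio(measure_windows(busy_work, windows=288, window_seconds=300))` would cover roughly 24 hours in five-minute windows. A single-threaded Python loop will not heat-soak a big workstation on its own, so in practice you would swap in a workload that loads every core (and the GPU, if that is what you are testing).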
No, Apple is not the same for this. Their devices do not suffer from anything like the same downtime after running at peak, but they can do nothing to stop developers draining the battery, and that's what is going on. If they were to do that, developers would have a giant hissy fit about it.
I have worked with prototypes from other manufacturers that had to be recalled after burning people. I have also found bugs in SoCs that their manufacturers were lying to OEMs about, and had to explain to different OEMs what was going on, after they had shipped millions of units.
The simple truth is that modern developers are irresponsible, so the manufacturers get to play everyone off against each other and spread as much self-serving nonsense as possible.