Galaxy S9 Exynos 9810 Hands-On – Awkward First Results

Following our launch article I promised an update on the performance scores of the Exynos 9810 variant of the Galaxy S9. I was able to have some time with one of the demo devices at the launch event and thoroughly benchmark it with a few of our common tests.

Samsung Exynos SoCs Specifications

| SoC | Exynos 9810 | Exynos 8895 |
|---|---|---|
| CPU | 4x Exynos M3; one core: 2.704 GHz, two cores: 2.314 GHz, four cores: 1.794 GHz; 4x 512KB L2; 4096KB L3 DSU | 4x Exynos M2 @ 2.314 GHz; 2048KB L2 |
| | 4x Cortex A55 @ 1.95 GHz; no L2; 512KB L3 DSU | 4x Cortex A53 @ 1.690 GHz; 512KB L2 |
| GPU | Mali G72MP18 @ 572 MHz | Mali G71MP20 @ 546 MHz |

As a refresher, earlier this year Samsung LSI dropped a bombshell by claiming an astounding 2x single-thread performance improvement with the new Exynos 9810. While this initially caused a lot of controversy and discussion about the validity of the claim, we have since exclusively covered the high-level microarchitectural features of the new Exynos M3 core, and by then it was clear that the performance claims were not just marketing. The new Samsung CPU core is the first “very wide” CPU microarchitecture to power Android SoCs and the first to finally follow in Apple’s footsteps in maximising single-thread performance. As a result it stands to be a very interesting – and ideally very powerful – SoC for the Android market.

Determining Clock Speeds

Firstly, one of the biggest questions for me was confirming the final clock that Samsung would use on the Galaxy S9. We detected the clock as 2704 MHz, which is about 200 MHz less than the 2.9 GHz that Samsung’s LSI division advertises for the chipset. What makes the story more compelling is that the 2.7 GHz clock is only achievable when a single core in the cluster is active. In other words, Samsung scales the maximum frequency with the number of active cores in the big cluster. With two active cores the frequency drops to 2314 MHz, while with three or four active cores they clock down to only 1794 MHz.
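The frequency scheme described above can be sketched as a small lookup table. This is only an illustration of the behaviour we observed, not Samsung's actual DVFS code. The function and constant names are hypothetical, and the values are simply the readings quoted above.

```python
# Observed maximum big-cluster clocks (MHz), keyed by the number of active M3 cores.
# Values are the readings quoted in the text; all names here are illustrative only.
OBSERVED_MAX_MHZ = {1: 2704, 2: 2314, 3: 1794, 4: 1794}
ADVERTISED_MHZ = 2900  # the 2.9 GHz figure advertised by Samsung LSI


def max_big_core_mhz(active_cores: int) -> int:
    """Return the observed frequency ceiling for a given number of active big cores."""
    if active_cores not in OBSERVED_MAX_MHZ:
        raise ValueError("the big cluster has between 1 and 4 cores")
    return OBSERVED_MAX_MHZ[active_cores]


for n in range(1, 5):
    mhz = max_big_core_mhz(n)
    print(f"{n} active core(s): {mhz} MHz ({ADVERTISED_MHZ - mhz} MHz below advertised)")
```

Even the single-core ceiling sits 196 MHz below the advertised figure, and the gap widens quickly as more big cores wake up.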

We can also confirm that the Mali G72MP18 GPU is running at a very conservative 572MHz. This is not what we had expected – the previous generation Exynos 8895 had a larger MP20 configuration, running at a similar 546MHz. The resulting performance gains for the GPU thus seem to be even lower than we had expected, as I was betting on a ~650-700 MHz clock for the graphics.

Memory Latency

I was also able to confirm the cache configurations of the CPUs with the help of our latency test. The L1D cache of the M3 cores is 64KB, up from the 32KB on the previous generation. The M3 cores also come with 512KB of private L2 caches, and a shared 4MB L3 cache.
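A latency test exposes these sizes because access latency steps up each time the working set outgrows a cache level. The sketch below is a deliberately simplified model of that idea, using only the M3 capacities listed above. It ignores associativity, prefetching and any exclusivity between levels, and the names are hypothetical.

```python
# Cache capacities (KB) observed for the M3 cores in the latency test above.
M3_CACHE_LEVELS_KB = [("L1D", 64), ("L2", 512), ("L3", 4096)]


def smallest_level_that_fits(working_set_kb: float) -> str:
    """Return the first level whose capacity can hold the working set, else DRAM."""
    for name, capacity_kb in M3_CACHE_LEVELS_KB:
        if working_set_kb <= capacity_kb:
            return name
    return "DRAM"


# A latency curve would be expected to step up roughly at these boundaries.
for size_kb in (32, 64, 128, 512, 1024, 4096, 8192):
    print(f"{size_kb:>5} KB -> {smallest_level_that_fits(size_kb)}")
```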

The little A55 cores came as a surprise, as they look to be in a separate cluster rather than in a single DynamIQ cluster with the big cores. This creates something similar to a big.LITTLE design, but each part of the 4+4 is its own DynamIQ cluster. Here it looks like Samsung has decided not to employ the optional L2 caches for the Cortex A55s, and instead the cluster relies solely on the shared 512KB L3 cache of the DSU. The latency scores to DRAM are outlandishly good and the best we’ve ever seen among current Android SoCs, so Samsung has definitely introduced a new generation of interconnect or memory controllers.

Parsing the Benchmark Results: Geekbench Looks Good

[Chart: GeekBench 4 Single Core]

In our testing we were able to confirm the GeekBench 4 scores already leaked, where we saw the Exynos 9810 achieving excellent performance gains and vastly outpacing the Snapdragon 845, and coming into the territory of the Apple A10 and A11. Meanwhile versus the last-generation Exynos 8895, the floating point performance increases handily exceed Samsung’s projected gains of 2x as we see a 114% improvement even at the lowered 2.7GHz frequency.
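To make the comparison with Samsung's claim explicit: a "2x" claim is a +100% gain, while a 114% improvement corresponds to a 2.14x multiplier. A minimal sketch of that conversion, with hypothetical helper names:

```python
def gain_to_multiplier(percent_gain: float) -> float:
    """Convert a percentage improvement into a performance multiplier."""
    return 1.0 + percent_gain / 100.0


CLAIMED_MULTIPLIER = 2.0                         # Samsung's projected 2x single-thread gain
measured_multiplier = gain_to_multiplier(114.0)  # floating point gain quoted above

print(f"measured {measured_multiplier:.2f}x vs claimed {CLAIMED_MULTIPLIER:.1f}x")
```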

[Charts: Geekbench 4 (Single Threaded) Integer Score/MHz; Geekbench 4 (Single Threaded) Floating Point Score/MHz]

When looking at the performance per clock it is clear how the Exynos M3 distinguishes itself as a much wider microarchitecture compared to any other existing CPU which powers Android SoCs.
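As a rough cross-check using only figures quoted in this article, the per-clock gain can be estimated by dividing the score gain by the clock ratio. The sketch assumes the Exynos 8895's M2 cores ran the test at 2314 MHz and the M3 at 2704 MHz, which is an assumption rather than something we logged per run. The function name is hypothetical.

```python
def per_clock_gain(score_multiplier: float, new_mhz: float, old_mhz: float) -> float:
    """Normalise a score multiplier by the change in clock frequency."""
    return score_multiplier / (new_mhz / old_mhz)


FP_SCORE_MULTIPLIER = 2.14  # the 114% floating point improvement quoted earlier

fp_per_clock = per_clock_gain(FP_SCORE_MULTIPLIER, new_mhz=2704, old_mhz=2314)
print(f"floating point gain per clock: {fp_per_clock:.2f}x")
```

Under these assumptions the floating point gain per clock works out to about 1.83x, meaning most of the improvement would come from the wider core rather than from the higher frequency.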

Parsing the Benchmark Results: PCMark and Web Tests

Finally I stumbled upon some very questionable performance figures when testing system performance. I’m not going to go into the details for every benchmark as they are generally all painting the same picture:

[Charts: PCMark Work 2.0 (Web Browsing 2.0, Writing 2.0, Data Manipulation, Photo Editing 2.0); WebXPRT 2015 - OS WebView; Speedometer 2.0 - OS WebView]

What seems clear is that something is very, very wrong with the Exynos 9810 S9+ that I tested. It was barely able to distinguish itself from last year’s Exynos 8895, let alone the Snapdragon 845 in the Qualcomm Reference Device which we previewed earlier this month. I looked through the system and monitored frequencies, and the big cores were indeed reaching the maximum 2.7GHz core frequency. The only explanation I have right now is that the DVFS configuration, as well as the scheduler, are currently so conservatively tuned that there is barely any activity on the big cores.

I dug a bit more through the system and found out Samsung uses some new scheduler called “eHMP”. I’m not sure if this is something based on EAS but the system did use schedutil as a frequency governor.
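For readers who want to do this kind of inspection themselves, the frequency governor and current clocks are exposed through the standard Linux cpufreq sysfs interface. The following is a minimal sketch, not the tool used for this article. It assumes the files are readable from your shell, which on a retail Android device may require elevated permissions. The helper name and the returned keys are hypothetical.

```python
from pathlib import Path

CPUFREQ_ROOT = Path("/sys/devices/system/cpu")  # standard Linux cpufreq location


def read_cpufreq(cpu: int, root: Path = CPUFREQ_ROOT) -> dict:
    """Read the governor plus current and maximum frequency for one CPU."""
    base = root / f"cpu{cpu}" / "cpufreq"
    governor = (base / "scaling_governor").read_text().strip()
    cur_khz = int((base / "scaling_cur_freq").read_text())  # sysfs reports kHz
    max_khz = int((base / "cpuinfo_max_freq").read_text())
    return {"governor": governor, "cur_mhz": cur_khz / 1000, "max_mhz": max_khz / 1000}


if __name__ == "__main__":
    for cpu in range(8):  # the Exynos 9810 is a 4+4 design
        try:
            print(cpu, read_cpufreq(cpu))
        except (OSError, ValueError):
            print(cpu, "cpufreq information not readable")
```

Polling this in a loop while a benchmark runs shows whether the big cores are actually being ramped up, which is the behaviour in question here.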

One of the Samsung spokesmen confirmed to me that the demo units were running special firmware for MWC and that they might not be optimized. I’m having a bit of a hard time believing they would so drastically limit the performance of the device for the show demo units, and even less that they would mess around with the scheduler settings. I did get confirmation that Samsung is planning to “tune down” the Exynos variant to match the Snapdragon performance. However, the current scores which I got on these devices make absolutely no sense, so I do hope this is just a mistake that will be resolved in shipping firmware and that we get to see the full potential of the SoC.

Parsing the Benchmark Results: Graphics

On the GPU side, the lower core count of the new Mali G72MP18 is a surprise, as the minor clock bump is negated by the fact that the new SoC has two fewer GPU cores than the 8895. If the performance per clock per core of the G71 and G72 were the same, this would actually mean a downgrade in raw GPU power from the Exynos 8895. Any increase would therefore have to come solely from the architectural changes of the new G72 GPU, power efficiency improvements, and possibly SoC memory subsystem improvements.
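The arithmetic behind that statement can be sketched by using cores multiplied by clock as a crude proxy for raw throughput. This assumes identical per-core, per-clock performance between G71 and G72, which is exactly the hypothetical being discussed. The function name is illustrative.

```python
def raw_gpu_throughput(cores: int, clock_mhz: float) -> float:
    """Cores x clock as a crude proxy for raw GPU throughput (in core-MHz)."""
    return cores * clock_mhz


g72mp18 = raw_gpu_throughput(18, 572)  # Exynos 9810
g71mp20 = raw_gpu_throughput(20, 546)  # Exynos 8895

ratio = g72mp18 / g71mp20
print(f"9810: {g72mp18:.0f}, 8895: {g71mp20:.0f}, ratio: {ratio:.3f}")
```

By this proxy the new configuration has roughly 5.7% less raw throughput than the Exynos 8895's GPU (10296 versus 10920 core-MHz).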

[Chart: GFXBench Manhattan 3.1 Off-screen - Peak]

In Manhattan 3.1 the Exynos 9810 sees a mere 7% increase and lags far behind the new Snapdragon 845’s Adreno 630.

[Chart: GFXBench T-Rex 2.7 Off-screen - Peak]

In T-Rex the increase is 18%, so this might be one of the benchmarks from which Samsung sourced its 20% improvement figure. Here the Exynos is closer to the performance of the Snapdragon 845.
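Combining the measured gains with the core counts and clocks of the two GPUs gives a rough idea of how much of the improvement is attributable to everything other than raw cores multiplied by clock (architecture, memory subsystem, and so on). This is a back-of-the-envelope sketch under that simple proxy, not a measurement. The function name is hypothetical.

```python
def implied_per_core_clock_gain(measured_gain_pct: float,
                                new_cores: int, new_mhz: float,
                                old_cores: int, old_mhz: float) -> float:
    """Measured gain divided by the change in cores x clock, as a multiplier."""
    raw_ratio = (new_cores * new_mhz) / (old_cores * old_mhz)
    return (1.0 + measured_gain_pct / 100.0) / raw_ratio


# G72MP18 @ 572 MHz (Exynos 9810) versus G71MP20 @ 546 MHz (Exynos 8895)
manhattan = implied_per_core_clock_gain(7.0, 18, 572, 20, 546)
t_rex = implied_per_core_clock_gain(18.0, 18, 572, 20, 546)

print(f"Manhattan 3.1: {manhattan:.3f}x per core per clock")
print(f"T-Rex:         {t_rex:.3f}x per core per clock")
```

Under this proxy the 7% and 18% gains would correspond to roughly 13% and 25% more work per core per clock, respectively.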

Measuring Power

I wasn’t able to properly measure power on the event demo devices, as they had different interface settings than my tool had been programmed with, so I was only able to make some inaccurate estimates based on a coarse current readout from the system.

For CPU workloads, our usual CPU power virus used 3.1W in a 1-core load at 2.7 GHz. A 2-core load at 2.3 GHz seemed to float around 3.1-3.5W, and a 4-core load at 1.8 GHz maintained this power consumption.

Over the following days I will need more time with the device, and will hopefully get some SPEC figures to paint a more accurate picture. For now the results could swing either way and be either positive or negative for the M3 cores. It’s clear that the higher frequencies carry a very large power penalty, and Samsung should want to operate more in the low-to-mid frequencies, hence the current frequency scheme.
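The power penalty can be illustrated by dividing the estimated power by the total number of core-cycles per second, which gives a rough energy cost per core-cycle (watts per GHz is nanojoules per cycle). This sketch uses only the coarse estimates quoted above, treats the 4-core load as drawing the same 3.1-3.5W range, and attributes all power evenly to the active cores, ignoring static and uncore power. Treat the output as indicative only. The function name is hypothetical.

```python
def nj_per_core_cycle(power_w: float, cores: int, freq_ghz: float) -> float:
    """Rough energy per core-cycle in nanojoules: W / (cores * GHz)."""
    return power_w / (cores * freq_ghz)


# (active cores, clock in GHz, low power estimate in W, high power estimate in W)
ESTIMATES = [
    (1, 2.704, 3.1, 3.1),
    (2, 2.314, 3.1, 3.5),
    (4, 1.794, 3.1, 3.5),
]

for cores, ghz, low_w, high_w in ESTIMATES:
    low = nj_per_core_cycle(low_w, cores, ghz)
    high = nj_per_core_cycle(high_w, cores, ghz)
    print(f"{cores} core(s) @ {ghz} GHz: {low:.2f}-{high:.2f} nJ per core-cycle")
```

With these numbers a cycle at 2.7 GHz costs more than twice as much energy as one at 1.8 GHz, which is consistent with Samsung restricting the top frequency to single-core loads.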

On the GPU side, power in Manhattan fluctuated between 4.5 and 5.2W, which is an improvement over the Exynos 8895. But again, this is still at a disadvantage compared to the Snapdragon 845.

Quick Thoughts

Overall, today’s quick benchmarking session opened up more questions than it managed to answer. Hopefully with more time we will be able to investigate the workings of the new SoC and, fingers crossed, today’s results are not representative of the shipping product, as that would otherwise be an utterly massive disappointment.
