Monday, 16 December 2019

The Snapdragon 865 Performance Preview: Setting the Stage for Flagship Android 2020 - AnandTech

Earlier this month we had the pleasure of attending Qualcomm’s Maui launch event for the new Snapdragon 865 and 765 mobile platforms. The new chipsets promise a lot of upgrades in terms of performance and features, and will undoubtedly be the silicon on which the vast majority of 2020 flagship devices base their designs. We’ve covered the improvements and changes of the new chipset in our dedicated launch article, so be sure to read that piece if you’re not yet familiar with the Snapdragon 865.

As has seemingly become a tradition with Qualcomm, following the launch event we’ve been given the opportunity to have some hands-on time with the company’s reference devices, and had the chance to run the phones through our benchmark suite. The QRD865 is a reference phone made by Qualcomm and integrates the new flagship chip. The device offers insight into what we should be expecting from commercial devices in 2020, and today’s piece particularly focuses on the performance improvements of the new generation.

A quick recap of the Snapdragon 865 if you haven’t read the more thorough examination of the changes:

Qualcomm Snapdragon Flagship SoCs 2019-2020

SoC | Snapdragon 865 | Snapdragon 855
CPU | 1x Cortex A77 @ 2.84GHz, 1x512KB pL2; 3x Cortex A77 @ 2.42GHz, 3x256KB pL2; 4x Cortex A55 @ 1.80GHz, 4x128KB pL2; 4MB sL3 @ ?MHz | 1x Kryo 485 Gold (A76 derivative) @ 2.84GHz, 1x512KB pL2; 3x Kryo 485 Gold (A76 derivative) @ 2.42GHz, 3x256KB pL2; 4x Kryo 485 Silver (A55 derivative) @ 1.80GHz, 4x128KB pL2; 2MB sL3 @ 1612MHz
GPU | Adreno 650 @ 587 MHz; +25% perf, +50% ALUs, +50% pixel/clock, +0% texels/clock | Adreno 640 @ 585 MHz
DSP / NPU | Hexagon 698; 15 TOPS AI (Total CPU+GPU+HVX+Tensor) | Hexagon 690; 7 TOPS AI (Total CPU+GPU+HVX+Tensor)
Memory Controller | 4x 16-bit CH; @ 2133MHz LPDDR4X / 33.4GB/s or @ 2750MHz LPDDR5 / 44.0GB/s; 3MB system level cache | 4x 16-bit CH; @ 1866MHz LPDDR4X / 29.9GB/s; 3MB system level cache
ISP/Camera | Dual 14-bit Spectra 480 ISP; 1x 200MP; 64MP ZSL or 2x 25MP ZSL; 4K video & 64MP burst capture | Dual 14-bit Spectra 380 ISP; 1x 192MP; 1x 48MP ZSL or 2x 22MP ZSL
Encode/Decode | 8K30 / 4K120 10-bit H.265; Dolby Vision, HDR10+, HDR10, HLG; 720p960 infinite recording | 4K60 10-bit H.265; HDR10, HDR10+, HLG; 720p480
Integrated Modem | none (paired with external X55 only); LTE Category 24/22: DL = 2500 Mbps (7x20MHz CA, 1024-QAM), UL = 316 Mbps (3x20MHz CA, 256-QAM); 5G NR Sub-6 + mmWave: DL = 7000 Mbps, UL = 3000 Mbps | Snapdragon X24 LTE (Category 20); DL = 2000 Mbps (7x20MHz CA, 256-QAM, 4x4); UL = 316 Mbps (3x20MHz CA, 256-QAM)
Mfc. Process | TSMC 7nm (N7P) | TSMC 7nm (N7)

The Snapdragon 865 is the successor to last year’s Snapdragon 855, and thus represents Qualcomm’s latest flagship chipset offering the newest IP and technologies. On the CPU side, Qualcomm has integrated Arm’s newest Cortex-A77 CPU cores, replacing the A76-based IP from last year. This year Qualcomm has decided against requesting any microarchitectural changes to the IP, so unlike the semi-custom Kryo 485 / A76-based CPUs, which differed from the stock design in some aspects, the new A77 in the Snapdragon 865 represents the default IP configuration that Arm offers.

Clock frequencies and core cache configurations haven’t changed this year – there’s still a single “Prime” A77 CPU core with 512KB cache running at a higher 2.84GHz and three “Performance” or “Gold” cores with reduced 256KB caches at a lower 2.42GHz. The four little cores remain A55s, with the same cache configuration and the same 1.8GHz clock. The L3 cache of the CPU cluster has been doubled from 2 to 4MB. In general, Qualcomm’s advertised 25% performance uplift on the CPU side solely comes from the IPC increases of the new A77 cores.

The GPU this year features an updated Adreno 650 design which increases ALU and pixel rendering units by 50%. The end-result in terms of performance is a promised 25% upgrade – it’s likely that the company is running the new block at a lower frequency than what we’ve seen on the Snapdragon 855, although we won’t be able to confirm this until we have access to commercial devices early next year.

A big performance upgrade on the new chip is the quadrupling of the processing power of the new Tensor cores in the Hexagon 698. Qualcomm advertises 15 TOPS throughput for all computing blocks on the SoC and we estimate that the new Tensor cores roughly represent 10 TOPS out of that figure.

In general, the Snapdragon 865 promises to be a very versatile chip and comes with a lot of new improvements – particularly 5G connectivity and new camera capabilities are promised to be the key features of the new SoC. Today’s focus lies solely on the performance of the chip, so let’s move on to our first test results and analysis.

New Memory Controllers & LPDDR5: A Big Improvement

One of the larger changes in the SoC this generation was the integration of a new hybrid LPDDR5 and LPDDR4X memory controller. On the QRD865 device we tested, the chip was naturally equipped with the new LP5 standard. Qualcomm was actually downplaying the importance of LP5 itself: the new standard does bring higher memory speeds and thus better bandwidth, however latency should be the same, and the power efficiency benefits, while there, shouldn’t be overplayed. Nevertheless, Qualcomm did claim they focused more on improving their memory controllers, and this year we’re finally seeing the new chip address one of the weaknesses exhibited by the past two generations: memory latency.
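As a rough illustration of where the bandwidth figures in the spec table come from, the sketch below computes theoretical peak DRAM bandwidth from the channel layout and clock. It assumes two transfers per clock (double data rate) and decimal gigabytes; the function name and parameters are illustrative and not taken from any Qualcomm tool.

```python
def peak_bandwidth_gbs(channels: int, channel_bits: int, clock_mhz: float) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (decimal), assuming double data rate."""
    transfers_per_second = clock_mhz * 1e6 * 2        # assumption: two transfers per clock
    bytes_per_transfer = channels * channel_bits / 8  # total bus width in bytes
    return transfers_per_second * bytes_per_transfer / 1e9


# Configurations from the spec table: 4x 16-bit channels.
lpddr5_s865 = peak_bandwidth_gbs(4, 16, 2750)   # Snapdragon 865 with LPDDR5
lpddr4x_s855 = peak_bandwidth_gbs(4, 16, 1866)  # Snapdragon 855 with LPDDR4X
print(f"{lpddr5_s865:.1f} GB/s, {lpddr4x_s855:.1f} GB/s")
```

Under these assumptions the results line up with the 44.0GB/s and 29.9GB/s entries in the table. The same formula gives about 34.1GB/s for LPDDR4X at 2133MHz, a little above the 33.4GB/s quoted there. These are theoretical peaks, not achievable bandwidth.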

We had criticised Qualcomm’s Snapdragon 845 and 855 for having quite bad memory latency – ever since the company had introduced their system level cache architecture to the designs, this aspect of the memory subsystem had seen some rather mediocre characteristics. There’s been a lot of argument in regards to how much this actually affects performance, with Qualcomm themselves naturally downplaying the differences. Arm generally notes a 1% performance difference for each 5ns of latency to DRAM; if the latency differences are big, this can sum up to a noticeable difference.
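To put Arm’s rule of thumb into numbers, here is a minimal sketch that applies it linearly. The linear extrapolation is an assumption made for illustration, as real workloads will vary widely, and the 20ns input is a hypothetical example value.

```python
def estimated_perf_change_pct(latency_delta_ns: float, pct_per_5ns: float = 1.0) -> float:
    """Estimate the performance change (%) for a DRAM latency change.

    Uses the ~1% per 5ns rule of thumb quoted above, extrapolated linearly
    (an assumption; the real relationship depends on the workload).
    """
    return latency_delta_ns / 5.0 * pct_per_5ns


# Hypothetical example: a 20ns latency reduction.
print(estimated_perf_change_pct(20))  # -> 4.0
```

By this estimate, a latency change in the tens of nanoseconds translates into a performance difference of a few percent on average, with memory-bound workloads likely seeing considerably more.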



Looking at the new Snapdragon 865, the first thing that pops up when comparing the two latency charts is the doubled L3 cache of the new chip. It’s to be noted that it does look like there’s still some sort of logical partitioning going on and 512KB of the cache may be dedicated to the little cores, as random-access latencies start going up at 1.5MB for the S855 and 3.5MB for the S865.

Further down in the deeper memory regions, we’re seeing some very big changes in latency. Qualcomm has been able to shave off around 35ns in the full random-access test, and we’re estimating that the structural latency of the chip now falls in at ~109ns – a 20ns improvement over its predecessor. While that’s a very good improvement in itself, it’s still slightly behind the designs of HiSilicon, Apple and Samsung. So, while Qualcomm still is the last of the bunch in regards to its memory subsystem, it’s no longer trailing behind by such a large margin. Keep in mind the results of the Kirin 990 here as we go into more detailed analysis of memory-intensive workloads in SPEC on the next page.

Furthermore, what’s very interesting about Qualcomm’s results in the DRAM region is the behaviour of the TLB+CLR Trash test. This test always hits the same cache line within a page across different pages, forcing a cache line replacement. The oddity is that the Snapdragon 865 behaves very differently to the 855 here, with the results showcasing a separate “step” between 4MB and ~32MB. This result is more of an artefact of the test only hitting a single cache line per page rather than the chip actually having some sort of 32MB hidden cache. My theory is that Qualcomm has done some sort of optimisation to the cache-line replacement policy at the memory controller level, and instead of the test hitting DRAM, the data is actually residing in the SLC. It’s a very interesting result and so far, it’s the first and only chipset to exhibit such behaviour. If it’s indeed the SLC, the latency would fall in at around 25-35ns, with the non-uniform latency likely being a result of the four cache slices dedicated to the four memory controllers.
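The sketch below models the access pattern described above, to show how little data such a test actually touches. It is not the benchmark’s implementation: the 4KB page size and 64-byte cache line are assumptions not stated in the text, and the function names are made up for illustration.

```python
import random

PAGE_BYTES = 4096       # assumed page size
CACHE_LINE_BYTES = 64   # assumed cache-line size


def single_line_per_page_addresses(region_bytes, line_offset=0, seed=0):
    """Byte offsets touching one fixed cache line in every page, in random page order."""
    pages = region_bytes // PAGE_BYTES
    order = list(range(pages))
    random.Random(seed).shuffle(order)
    return [page * PAGE_BYTES + line_offset * CACHE_LINE_BYTES for page in order]


def touched_footprint_bytes(addresses):
    """Total size of the distinct cache lines that the address list touches."""
    lines = {address // CACHE_LINE_BYTES for address in addresses}
    return len(lines) * CACHE_LINE_BYTES


addrs = single_line_per_page_addresses(32 * 1024 * 1024)  # a 32MB test region
footprint = touched_footprint_bytes(addrs)
print(len(addrs), "pages,", footprint // 1024, "KB of cache lines touched")
```

Under these assumptions, a 32MB region is only 8192 distinct cache lines, or 512KB of data. That is small enough to fit in a 3MB system level cache, which is consistent with the theory above but does not prove it.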

Overall, it looks like Qualcomm has made rather big changes to the memory subsystem this year, and we’re looking forward to seeing the impact on performance.

We’re moving on to SPEC2006, analysing the new single-threaded performance of the new Cortex-A77 cores. As the new CPU is running at the same clock as the A76-derived design of the Snapdragon 855, any improvements we’ll be seeing today are likely due to the IPC improvements of the core, the doubled L3 cache, as well as the enhancements to the memory controllers and memory subsystem of the chip.

Disclaimer About Power Figures Today:

The power figures presented today were captured using the same methodology we generally use on commercial devices, however this year we’ve noted a large discrepancy between the figures reported by the QRD865’s fuel-gauge and the actual power consumption of the device. Generally, we’ve noted a discrepancy factor of roughly 3x. We’ve reached out to Qualcomm and they confirmed in a quick test of their own that there’s a discrepancy of >2.5x. Furthermore, the QRD865 phones this year again suffered from excessive idle power figures of >1.3W.

I’ve attempted to compensate for this in the data as best I could, however the figures published today are merely preliminary and of lower confidence than usual. For what it’s worth, last year the QRD855 data was within 5% of the commercial phones’ measurements. We’ll naturally be re-testing everything once we get our hands on final commercial devices.

In the SPECint2006 suite, we’re seeing some noticeable performance improvements across the board, with some benchmarks posting some larger than expected increases. The biggest improvements are seen in the memory intensive workloads. 429.mcf is DRAM latency bound and sees a massive improvement of up to 46% compared to the Snapdragon 855.

What’s interesting to see is that some execution-bound benchmarks such as 456.hmmer are seeing a 28% upgrade. The A77 has an added 4th ALU which represents a 33% throughput increase in simple integer operations, which I don’t doubt is a major reason for the improvements seen here.

The improvements aren’t across the board, with 400.perlbench in particular seeing even a slight degradation for some reason. 403.gcc also saw a smaller 12% increase – it’s likely these benchmarks are bound by other aspects of the microarchitecture.

The power consumption and energy efficiency, if the numbers are correct, roughly match our expectations of the microarchitecture. Power has gone up with performance, but because of the higher performance and shorter runtime of the workloads, energy usage has remained roughly flat. In several tests efficiency has actually improved compared to the Snapdragon 855, but we’ll have to wait on commercial devices in order to make some definitive conclusions here.
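The relationship between power, performance and energy described above can be sketched as follows. For a fixed workload, energy is power multiplied by runtime, and runtime shrinks in proportion to performance. The ratios used are hypothetical example values, not measurements from this article.

```python
def relative_energy(perf_ratio: float, power_ratio: float) -> float:
    """Energy relative to a baseline for a fixed workload.

    E = P * t, and t scales with 1 / performance, so the energy ratio
    is simply power_ratio / perf_ratio.
    """
    return power_ratio / perf_ratio


# Hypothetical example values, not measured data:
print(relative_energy(perf_ratio=1.25, power_ratio=1.25))  # same energy as baseline
print(relative_energy(perf_ratio=1.25, power_ratio=1.20))  # slightly less energy
```

In other words, a chip that draws 25% more power but finishes 25% faster uses the same energy for the task. Any power increase smaller than the performance increase shows up as an efficiency gain.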

In the SPECfp2006 suite, we’re also seeing some very varied improvements. The biggest change happened to 470.lbm, which has a very big hot loop and is memory bandwidth hungry. I think the A77’s new MOP-cache would help a lot here in regards to instruction throughput, and the improved memory subsystem makes the massive 65% performance jump possible.

Arm had actually advertised IPC improvements of ~25% and ~35% for the int and FP suites of SPEC2006. On the int side, we’re indeed hitting 25% on the Snapdragon 865 compared to the S855; however, on the FP side we’re a bit short, as the increase falls in at around 29%. The performance increases here strongly depend on the SoC and in particular on the memory subsystem: compared to the Kirin 990’s A76 implementation the increases are only 20% and 24%, but HiSilicon’s chip also has a stronger memory subsystem, which allows it to gain quite a bit more performance over the A76s in the S855.

The overall results for SPEC2006 are very good for the Snapdragon 865. Performance is exactly where Qualcomm advertised it would land, and we’re seeing a 25% increase in SPECint2006 and a 29% increase in SPECfp2006. On the integer side, the A77 still trails Apple’s Monsoon cores in the A11, but the new Arm design has now been able to trounce it in the FP suite. We’re still a bit far away from the microarchitectures catching up to Apple’s latest designs, but if Arm keeps up this 25-30% yearly improvement rate, we should be getting there in a few more iterations.

The power and energy efficiency figures, again, taken with a grain of salt, are also very much in line with expectations. Power has slightly increased with performance this generation, however due to the performance increase, energy efficiency has remained relatively flat, or has even seen a slight improvement.

System performance on the QRD865 was a bit of a tricky topic, as we’ve seen that the same chipset can differ quite a lot depending on the software implementation done by the vendor. For the performance preview this year, Qualcomm again integrated a “Performance” mode on the test devices, alongside the default scheduler and DVFS behaviour of the BSP delivered to vendors.

There’s a fine line between genuine “Performance” modes as implemented on commercial devices such as from Samsung and Huawei, which make tunings to the DVFS and schedulers which increase performance while remaining reasonable in their aggressiveness, and more absurd “cheating” performance modes such as implemented by OPPO for example, which simply ramp up the minimum frequencies of the chip.

Qualcomm’s performance mode on the QRD865 is walking this fine line – it’s extremely aggressive in that it’s ramping up the chipset to maximum frequency in ~30ms. It’s also having the little cores start at a notably higher frequency than in the default mode. Nevertheless, it’s still a legitimate operation mode, although I do not expect very many devices to be configured in this way.

The default mode on the other hand is quite similar to what we’ve seen on the Snapdragon 855 QRD last year, but the issue is that this was also rather conservative and many popular devices such as the Galaxy S10 were configured to be more aggressive. Whilst the default config of the QRD865 should be representative of most devices next year, I do expect many of them to do better than the figures represented by this config.

PCMark Work 2.0 - Web Browsing 2.0

Starting off with the web browsing test, we’re seeing the big difference in performance scaling between the two chipsets. The test here is mostly sensitive to the performance scaling of the A55 cores. The QRD865 in the default mode is more conservative than some existing S855 devices, which is why it performs worse in those situations. On the other hand, the QRD865’s performance mode is extremely aggressive and achieves the best results amongst our current device range. I expect commercial devices to fall in somewhere between the two extremes.

PCMark Work 2.0 - Video Editing

The video editing test nowadays is no longer performance sensitive and most devices fall in the same result range.

PCMark Work 2.0 - Writing 2.0

The writing test is amongst the most important and representative of daily performance of a device, and here the QRD865 does well in both configurations. The Mate 30 Pro with the Kirin 990 is the only other competitive device at this performance level.

PCMark Work 2.0 - Photo Editing 2.0

The Photo Editing test makes use of RenderScript and GPU acceleration, and here it seems the new QRD865 makes some big improvements. Performance is a step-function higher than previous generation devices.

PCMark Work 2.0 - Data Manipulation

Finally, the data manipulation test oddly enough falls in the middle of the pack for both performance modes. I’m not too sure as to why this is, but we’ve seen the test being quite sensitive to scheduler or even OS configurations.

PCMark Work 2.0 - Performance

Generally, the QRD865 phone landed at the top of the rankings in PCMark.

Web Benchmarks

Speedometer 2.0 - OS WebView
WebXPRT 3 - OS WebView
JetStream 2 - OS WebView

The web benchmarks results presented here were somewhat disappointing. The QRD865 really didn’t manage to differentiate itself from the rest of the Android pack even though it was supposed to be roughly 20-25% ahead in theory. I’m not sure what the limitation here is, but the 5-10% increases are well below what we had hoped for. For now, it seems like the performance gap to Apple’s chips remains significant.

System Performance Conclusion

Overall, we expect system performance of Snapdragon 865 devices to be excellent. Commercial devices will likely differ somewhat in terms of their scores as I do not expect them to be configured exactly the same as the QRD865. I was rather disappointed with the web benchmarks as the improvements were quite meagre – in hindsight it might be a reason as to why Arm didn’t talk about them at all during the Cortex-A77 launch.

AIMark 3

AIMark makes use of various vendor SDKs to implement the benchmarks. This means that the end-results really aren’t a proper apples-to-apples comparison, however it represents an approach that actually will be used by some vendors in their in-house applications or even some rare third-party app.

Disclaimer: We didn't manage to run AIMark 3 ourselves; the scores below are credited to 肥威 @ Weibo.

鲁大师 / Master Lu - AIMark 3 - InceptionV3
鲁大师 / Master Lu - AIMark 3 - ResNet34
鲁大师 / Master Lu - AIMark 3 - MobileNet-SSD
鲁大师 / Master Lu - AIMark 3 - DeepLabV3

AIBenchmark 3

AIBenchmark takes a different approach to benchmarking. Here the test uses the hardware agnostic NNAPI in order to accelerate inferencing, meaning it doesn’t use any proprietary aspects of a given hardware except for the drivers that actually enable the abstraction between software and hardware. This approach is more apples-to-apples, but also means that we can’t do cross-platform comparisons, like testing iPhones.

We’re publishing one-shot inference times. The difference here to sustained performance inference times is that these figures have more timing overhead on the part of the software stack from initialising the test to actually executing the computation.
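The difference between one-shot and sustained timings can be illustrated with a small timing harness. This is a generic sketch, not how AIBenchmark itself measures: `FakeModel` is a made-up stand-in whose first call pays a one-time setup cost, and `one_shot_and_sustained_ms` is a hypothetical helper name.

```python
import time


class FakeModel:
    """Stand-in workload (hypothetical): the first call pays a one-time setup cost."""

    def __init__(self):
        self.ready = False
        self.calls = 0

    def __call__(self):
        self.calls += 1
        if not self.ready:
            time.sleep(0.02)  # pretend initialisation overhead of the software stack
            self.ready = True
        sum(i * i for i in range(1000))  # pretend computation


def one_shot_and_sustained_ms(run_inference, warm_runs=20):
    """Return (first-call time, average time of the following calls) in milliseconds."""
    start = time.perf_counter()
    run_inference()  # includes any one-time setup
    one_shot = (time.perf_counter() - start) * 1e3

    start = time.perf_counter()
    for _ in range(warm_runs):
        run_inference()
    sustained = (time.perf_counter() - start) * 1e3 / warm_runs
    return one_shot, sustained


one_shot, sustained = one_shot_and_sustained_ms(FakeModel())
print(f"one-shot: {one_shot:.2f} ms, sustained: {sustained:.3f} ms")
```

The one-shot figure folds the setup cost into a single measurement, which is why it tends to be higher and says more about the software stack than the sustained figure does.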

AIBenchmark 3 - NNAPI CPU

We’re segregating the AIBenchmark scores by execution block, starting off with the regular CPU workloads that simply use TensorFlow libraries and do not attempt to run on specialized hardware blocks.

AIBenchmark 3 - 1 - The Life - CPU/FP
AIBenchmark 3 - 2 - Zoo - CPU/FP
AIBenchmark 3 - 3 - Pioneers - CPU/INT
AIBenchmark 3 - 4 - Let's Play - CPU/FP
AIBenchmark 3 - 7 - Ms. Universe - CPU/FP
AIBenchmark 3 - 7 - Ms. Universe - CPU/INT
AIBenchmark 3 - 8 - Blur iT! - CPU/FP

Starting off with the CPU accelerated benchmarks, we’re seeing some large improvements of the Snapdragon 865. It’s particularly the FP workloads that are seeing some big performance increases, and it seems these improvements are likely linked to the microarchitectural improvements of the A77.

AIBenchmark 3 - NNAPI INT8

AIBenchmark 3 - 1 - The Life - INT8
AIBenchmark 3 - 2 - Zoo - INT8
AIBenchmark 3 - 3 - Pioneers - INT8
AIBenchmark 3 - 5 - Masterpiece - INT8
AIBenchmark 3 - 6 - Cartoons - INT8

INT8 workload acceleration in AI Benchmark happens on the HVX cores of the DSP rather than the Tensor cores, which the benchmark currently doesn’t support. The performance increases here are relatively in line with what we expect in terms of iterative clock frequency increases of the IP block.

AIBenchmark 3 - NNAPI FP16

AIBenchmark 3 - 1 - The Life - FP16
AIBenchmark 3 - 2 - Zoo - FP16
AIBenchmark 3 - 3 - Pioneers - FP16
AIBenchmark 3 - 5 - Masterpiece - FP16
AIBenchmark 3 - 6 - Cartoons - FP16
AIBenchmark 3 - 9 - Berlin Driving - FP16
AIBenchmark 3 - 10 - WESPE-dn - FP16

FP16 acceleration on the Snapdragon 865 through NNAPI is likely facilitated through the GPU, and we’re seeing iterative improvements in the scores. Huawei’s Mate 30 Pro is in the lead in the vast majority of the tests as it’s able to make use of its NPU, which supports FP16 acceleration, and its performance here is quite significantly ahead of the Qualcomm chipsets.

AIBenchmark 3 - NNAPI FP32

AIBenchmark 3 - 10 - WESPE-dn - FP32

Finally, the FP32 test should be accelerated by the GPU. Oddly enough here the QRD865 doesn’t fare as well as some of the best S855 devices. It’s to be noted that the results here today were based on an early software stack for the S865 – it’s possible and even very likely that things will improve over the coming months, and the results will be different on commercial devices.

Overall, there’s again a conundrum for us in regards to AI benchmarks today: the tests need to be continuously developed in order to properly support the hardware. The test currently doesn’t make use of the Tensor cores of the Snapdragon 865, so it’s not able to showcase one of the biggest areas of improvement for the chipset. In that sense, the benchmarks don’t really mean very much, and the true power of the chipset will only be exhibited by first-party applications, such as the camera apps of the upcoming Snapdragon 865 devices.

On the GPU side of things, testing the QRD865 is a bit complicated as we simply didn’t have enough time to run the device through our usual test methodology where we stress both peak as well as sustained performance of the chip. Thus, the results we’re able to present today solely address the peak performance characteristics of the new Adreno 650 GPU.

Disclaimer On Power: As with the CPU results, the GPU power measurements on the QRD865 are not as high confidence as on a commercial device, and the preliminary power and efficiency figures posted below might differ in final devices.

3DMark Sling Shot 3.1 Extreme Unlimited - Physics

The 3DMark Physics test is a CPU-bound benchmark within a GPU power constrained scenario. The QRD865 here oddly enough doesn’t showcase major improvements compared to its predecessor, in some cases actually being slightly slower than the Pixel 4 XL and also falling behind the Kirin 990 powered Mate 30 Pro even though the new Snapdragon has a microarchitectural advantage. It seems the A77 does very little in terms of improving the bottlenecks of this test.

3DMark Sling Shot 3.1 Extreme Unlimited - Graphics

In the 3DMark Graphics test, the QRD865 results are more in line with what we expect of the GPU. Depending on which S855 you compare to, we’re seeing 15-22% improvements in the peak performance.

GFXBench Aztec Ruins - High - Vulkan/Metal - Off-screen

In the GFXBench Aztec High benchmark, the improvement over the Snapdragon 855 is roughly 26%. There’s one apparent issue here when looking at the chart rankings; although there’s an improvement in the peak performance, the end result is that the QRD865 still isn’t able to reach the sustained performance of Apple’s latest A13 phones.

GFXBench Aztec High Offscreen Power Efficiency (System Active Power)

Device | Mfc. Process | FPS | Avg. Power (W) | Perf/W Efficiency
iPhone 11 Pro (A13) Warm | N7P | 26.14 | 3.83 | 6.82 fps/W
iPhone 11 Pro (A13) Cold / Peak | N7P | 34.00 | 6.21 | 5.47 fps/W
iPhone XS (A12) Warm | N7 | 19.32 | 3.81 | 5.07 fps/W
iPhone XS (A12) Cold / Peak | N7 | 26.59 | 5.56 | 4.78 fps/W
QRD865 (Snapdragon 865) | N7P | 20.38 | 4.58 | 4.44 fps/W
Mate 30 Pro (Kirin 990 4G) | N7 | 16.50 | 3.96 | 4.16 fps/W
Galaxy S10+ (Snapdragon 855) | N7 | 16.17 | 4.69 | 3.44 fps/W
Galaxy S10+ (Exynos 9820) | 8LPP | 15.59 | 4.80 | 3.24 fps/W

Looking at the estimated power draw of the phone, it indeed does look like Qualcomm has been able to sustain the same power levels as the S855, but the improvements in performance and efficiency here aren’t enough to catch up to either the A12 or A13, with Apple being ahead in performance, power and efficiency alike.
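The efficiency column and the generational uplift can be recomputed from the Aztec High table above with a few lines. The FPS and power values are copied from the table; the helper names are illustrative. Small deviations from the published Perf/W column are to be expected, presumably because the published FPS and power figures are themselves rounded.

```python
# (fps, average power in W), copied from the Aztec High table
aztec_high = {
    "QRD865 (Snapdragon 865)": (20.38, 4.58),
    "Galaxy S10+ (Snapdragon 855)": (16.17, 4.69),
    "iPhone 11 Pro (A13) Warm": (26.14, 3.83),
}


def fps_per_watt(fps: float, watts: float) -> float:
    """Efficiency as frames per second per watt of system active power."""
    return fps / watts


def uplift_pct(new: float, old: float) -> float:
    """Percentage improvement of new over old."""
    return (new / old - 1.0) * 100.0


for device, (fps, watts) in aztec_high.items():
    print(f"{device}: {fps_per_watt(fps, watts):.2f} fps/W")

s865_fps = aztec_high["QRD865 (Snapdragon 865)"][0]
s855_fps = aztec_high["Galaxy S10+ (Snapdragon 855)"][0]
print(f"S865 over S855: {uplift_pct(s865_fps, s855_fps):.0f}%")
```

The uplift works out to roughly 26% over the Snapdragon 855, matching the figure quoted for this test, while the A13’s warm (sustained) result remains ahead of the QRD865’s peak in both FPS and efficiency.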

GFXBench Aztec Ruins - Normal - Vulkan/Metal - Off-screen

GFXBench Aztec Normal Offscreen Power Efficiency (System Active Power)

Device | Mfc. Process | FPS | Avg. Power (W) | Perf/W Efficiency
iPhone 11 Pro (A13) Warm | N7P | 73.27 | 4.07 | 18.00 fps/W
iPhone 11 Pro (A13) Cold / Peak | N7P | 91.62 | 6.08 | 15.06 fps/W
iPhone XS (A12) Warm | N7 | 55.70 | 3.88 | 14.35 fps/W
iPhone XS (A12) Cold / Peak | N7 | 76.00 | 5.59 | 13.59 fps/W
QRD865 (Snapdragon 865) | N7P | 53.65 | 4.65 | 11.53 fps/W
Mate 30 Pro (Kirin 990 4G) | N7 | 41.68 | 4.01 | 10.39 fps/W
Galaxy S10+ (Snapdragon 855) | N7 | 40.63 | 4.14 | 9.81 fps/W
Galaxy S10+ (Exynos 9820) | 8LPP | 40.18 | 4.62 | 8.69 fps/W

We’re seeing a similar scenario in the Normal variant of the Aztec test. Although the performance improvements here do match the promised figures, it’s not enough to catch up to Apple’s two latest SoC generations.

GFXBench Manhattan 3.1 Off-screen

GFXBench Manhattan 3.1 Offscreen Power Efficiency (System Active Power)

Device | Mfc. Process | FPS | Avg. Power (W) | Perf/W Efficiency
iPhone 11 Pro (A13) Warm | N7P | 100.58 | 4.21 | 23.89 fps/W
iPhone 11 Pro (A13) Cold / Peak | N7P | 123.54 | 6.04 | 20.45 fps/W
iPhone XS (A12) Warm | N7 | 76.51 | 3.79 | 20.18 fps/W
iPhone XS (A12) Cold / Peak | N7 | 103.83 | 5.98 | 17.36 fps/W
QRD865 (Snapdragon 865) | N7P | 89.38 | 5.17 | 17.28 fps/W
Mate 30 Pro (Kirin 990 4G) | N7 | 75.69 | 5.04 | 15.01 fps/W
Galaxy S10+ (Snapdragon 855) | N7 | 70.67 | 4.88 | 14.46 fps/W
Galaxy S10+ (Exynos 9820) | 8LPP | 68.87 | 5.10 | 13.48 fps/W
Galaxy S9+ (Snapdragon 845) | 10LPP | 61.16 | 5.01 | 11.99 fps/W
Mate 20 Pro (Kirin 980) | N7 | 54.54 | 4.57 | 11.93 fps/W
Galaxy S9 (Exynos 9810) | 10LPP | 46.04 | 4.08 | 11.28 fps/W
Galaxy S8 (Snapdragon 835) | 10LPE | 38.90 | 3.79 | 10.26 fps/W
Galaxy S8 (Exynos 8895) | 10LPE | 42.49 | 7.35 | 5.78 fps/W

Even on the more traditional tests such as Manhattan 3.1, although again the Adreno 650 is able to showcase good improvements this generation, it seems that Qualcomm didn’t aim quite high enough.

GFXBench T-Rex 2.7 Off-screen

GFXBench T-Rex Offscreen Power Efficiency (System Active Power)

Device | Mfc. Process | FPS | Avg. Power (W) | Perf/W Efficiency
iPhone 11 Pro (A13) Warm | N7P | 289.03 | 4.78 | 60.46 fps/W
iPhone 11 Pro (A13) Cold / Peak | N7P | 328.90 | 5.93 | 55.46 fps/W
iPhone XS (A12) Warm | N7 | 197.80 | 3.95 | 50.07 fps/W
iPhone XS (A12) Cold / Peak | N7 | 271.86 | 6.10 | 44.56 fps/W
QRD865 (Snapdragon 865) | N7P | 206.07 | 4.70 | 43.84 fps/W
Galaxy S10+ (Snapdragon 855) | N7 | 167.16 | 4.10 | 40.70 fps/W
Mate 30 Pro (Kirin 990 4G) | N7 | 152.27 | 4.34 | 35.08 fps/W
Galaxy S9+ (Snapdragon 845) | 10LPP | 150.40 | 4.42 | 34.00 fps/W
Galaxy S10+ (Exynos 9820) | 8LPP | 166.00 | 4.96 | 33.40 fps/W
Galaxy S9 (Exynos 9810) | 10LPP | 141.91 | 4.34 | 32.67 fps/W
Galaxy S8 (Snapdragon 835) | 10LPE | 108.20 | 3.45 | 31.31 fps/W
Mate 20 Pro (Kirin 980) | N7 | 135.75 | 4.64 | 29.25 fps/W
Galaxy S8 (Exynos 8895) | 10LPE | 121.00 | 5.86 | 20.65 fps/W

Lastly, the T-Rex benchmark, which is the least compute-heavy workload tested here and is mostly bottlenecked by texture and fillrate throughput, sees a 23% increase for the Snapdragon 865.

Overall GPU Conclusion – Good Improvements – Competitively Not Enough

Overall, we were able to verify the Snapdragon 865’s performance improvements and Qualcomm’s 25% claims seem to be largely accurate. The issue is that this doesn’t seem to be enough to keep up with the large improvements that Apple has been able to showcase over the last two generations.

During the chipset’s launch, Qualcomm was eager to mention that their product is able to showcase better long-term sustained performance than a competitor which “throttles within minutes”. While we don’t have confirmation as to whom exactly they were referring to, the data and narrative here only matches Apple’s device behaviour. Whilst we weren’t able to test the sustained performance of the QRD865 today, it unfortunately doesn’t really matter for Qualcomm as the Snapdragon 865 and Adreno 650’s peak performance falls in at a lower level than Apple’s A13 sustained performance.

Apple isn’t the only one Qualcomm has to worry about; the 25% performance increases this generation are within reach of Arm’s Mali-G77. In theory, Samsung’s Exynos 990 should be able to catch up with the Snapdragon 865. Qualcomm had been regarded as the mobile GPU leader over the last few years, but it’s clear that development has slowed down quite a lot recently, and the Adreno family has lost its crown.

Today’s preview focused solely on the performance metrics of the new chipset, which only cover a very small subset of the new features that the chip will be bringing to devices next year. A lot of the talking-points of the new SoC such as 5G connectivity, or the new camera and media capabilities, are aspects for which we’ll have to wait on commercial devices.

For what we’ve been able to test today, the Snapdragon 865 seems very solid. The new Cortex-A77 CPU does bring larger IPC improvements to the table, and thanks to the Snapdragon 865’s improved memory subsystem, the chip has been able to showcase healthy performance increases. I did find it odd that the web benchmarks didn’t quite perform as well as I had expected – I don’t know if the new microarchitecture just doesn’t improve these workloads as much, or if it might have been a software issue on the QRD865 phone; we’ll have to wait for commercial devices to have a clearer picture of the situation. System performance of the new chip certainly shouldn’t be disappointing, and even on a conservative baseline configuration, 2020 flagships should see an increase in responsiveness compared to the Snapdragon 855.

AI performance of the new chip is also improved – although our limited benchmark suite here isn’t able to fully expose the hardware improvements that the S865 brings with it. It’s likely that first-party camera applications will be the first real workloads that will be able to showcase the new capabilities of the chip.

On the GPU side, the improvements are also quite solid, but I just have a feeling that the narrative here isn’t quite the same anymore for Qualcomm, as Apple’s the elephant in the room now here as well. During the launch of the chipset the company was quite eager to promote that its sustained performance is better than the competition. While we weren’t able to test this aspect of the Snapdragon 865 on the QRD865 due to time constraints, the simple fact is that the chip’s peak performance remains inferior to Apple’s sustained performance, with the fruit company essentially dominating an area where previously Qualcomm was king. In this regard, I hope Qualcomm is able to catch up in the future, as the differences here are seemingly getting bigger each year.

Overall, the Snapdragon 865 seems like a very well-balanced chip and I have no doubt it’ll serve as a very competitive foundation for 2020 flagships. Qualcomm’s strengths lie in the fact that they’re able to deliver a complete solution with 5G connectivity – we do however hope that in the future the company will be able to offer more solid performance upgrades; the competition out there is getting tough.



https://news.google.com/__i/rss/rd/articles/CBMid2h0dHBzOi8vd3d3LmFuYW5kdGVjaC5jb20vc2hvdy8xNTIwNy90aGUtc25hcGRyYWdvbi04NjUtcGVyZm9ybWFuY2UtcHJldmlldy1zZXR0aW5nLXRoZS1zdGFnZS1mb3ItZmxhZ3NoaXAtYW5kcm9pZC0yMDIw0gEA?oc=5

2019-12-16 12:30:00Z