Synthetic Benchmarks

CPU Benchmarks

  • front node: Intel Core i9-13900H CPU (see its topology in the Description section)
  • az4-n4090 and az4-a7900 partitions: AMD Ryzen 9 7945HX CPU (see its topology in the Description section)
  • iml-ia770 partition: Intel Core Ultra 9 185H CPU (see its topology in the Description section)
  • az5-a890m partition: AMD Ryzen AI 9 HX 370 CPU

Memory Throughput

Memory throughput is measured with the bandwidth benchmark. In every run, each thread is explicitly pinned to its own CPU core, even when a core exposes more than one processing unit (PU).
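The pinning policy described above (one thread per core, ignoring extra PUs) can be sketched as follows. This is a minimal illustration, not the bandwidth benchmark's actual code: the `pu_to_core` mapping and both helper names are hypothetical, and `os.sched_setaffinity` is only available on Linux.

```python
import os

def one_pu_per_core(pu_to_core):
    """Return one PU id per physical core (the lowest-numbered PU of each core).

    `pu_to_core` maps a PU (logical CPU) id to its core id. It is a
    hypothetical input, e.g. derived from the node topology.
    """
    first_pu = {}
    for pu in sorted(pu_to_core):
        # Keep only the first PU seen for each core.
        first_pu.setdefault(pu_to_core[pu], pu)
    return [first_pu[core] for core in sorted(first_pu)]

def pin_to_pu(pu):
    """Restrict the caller to a single PU. Returns False where unsupported."""
    if hasattr(os, "sched_setaffinity"):
        os.sched_setaffinity(0, {pu})  # 0 designates the caller
        return True
    return False

# Example: 2 cores with 2 PUs each (PUs 0,1 on core 0 and PUs 2,3 on core 1).
print(one_pu_per_core({0: 0, 1: 0, 2: 1, 3: 1}))  # [0, 2]
```

With such a list, a 2-core run would start two threads and pin them to PUs 0 and 2, leaving the sibling PUs idle.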

AMD Ryzen 9 7945HX

az4-mixed RAM is: Corsair Vengeance SO-DIMM 96 GB (2 x 48 GB, dual channel) DDR5 5200 MT/s CL44.

throughput graph 1t
AMD Ryzen 9 7945HX: 1 core (raw CSV file).

throughput graph 2t
AMD Ryzen 9 7945HX: 2 cores (raw CSV file).

throughput graph 3t
AMD Ryzen 9 7945HX: 3 cores (raw CSV file).

throughput graph 4t
AMD Ryzen 9 7945HX: 4 cores (raw CSV file).

throughput graph 5t
AMD Ryzen 9 7945HX: 5 cores (raw CSV file).

throughput graph 6t
AMD Ryzen 9 7945HX: 6 cores (raw CSV file).

throughput graph 7t
AMD Ryzen 9 7945HX: 7 cores (raw CSV file).

throughput graph 8t
AMD Ryzen 9 7945HX: 8 cores (raw CSV file).

throughput graph 9t
AMD Ryzen 9 7945HX: 9 cores (raw CSV file).

throughput graph 10t
AMD Ryzen 9 7945HX: 10 cores (raw CSV file).

throughput graph 11t
AMD Ryzen 9 7945HX: 11 cores (raw CSV file).

throughput graph 12t
AMD Ryzen 9 7945HX: 12 cores (raw CSV file).

throughput graph 13t
AMD Ryzen 9 7945HX: 13 cores (raw CSV file).

throughput graph 14t
AMD Ryzen 9 7945HX: 14 cores (raw CSV file).

throughput graph 15t
AMD Ryzen 9 7945HX: 15 cores (raw CSV file).

throughput graph 16t
AMD Ryzen 9 7945HX: 16 cores (raw CSV file).

Intel Core i9-13900H

front node RAM is: Corsair Vengeance SO-DIMM 96 GB (2 x 48 GB, dual channel) DDR5 5200 MT/s CL44.

throughput graph p-core 1t
Intel Core i9-13900H: p-core, 1 core (raw CSV file).

throughput graph p-core 2t
Intel Core i9-13900H: p-core, 2 cores (raw CSV file).

throughput graph p-core 3t
Intel Core i9-13900H: p-core, 3 cores (raw CSV file).

throughput graph p-core 4t
Intel Core i9-13900H: p-core, 4 cores (raw CSV file).

throughput graph p-core 5t
Intel Core i9-13900H: p-core, 5 cores (raw CSV file).

throughput graph p-core 6t
Intel Core i9-13900H: p-core, 6 cores (raw CSV file).

throughput graph e-core 1t
Intel Core i9-13900H: e-core, 1 core (raw CSV file).

throughput graph e-core 2t
Intel Core i9-13900H: e-core, 2 cores (raw CSV file).

throughput graph e-core 3t
Intel Core i9-13900H: e-core, 3 cores (raw CSV file).

throughput graph e-core 4t
Intel Core i9-13900H: e-core, 4 cores (raw CSV file).

throughput graph e-core 5t
Intel Core i9-13900H: e-core, 5 cores (raw CSV file).

throughput graph e-core 6t
Intel Core i9-13900H: e-core, 6 cores (raw CSV file).

throughput graph e-core 7t
Intel Core i9-13900H: e-core, 7 cores (raw CSV file).

throughput graph e-core 8t
Intel Core i9-13900H: e-core, 8 cores (raw CSV file).

Intel Core Ultra 9 185H

iml-ia770 RAM is: 32 GB (2 x 16 GB, dual channel) DDR5 5600 MT/s.

throughput graph p-core 1t
Intel Core Ultra 9 185H: p-core, 1 core (raw CSV file).

throughput graph p-core 2t
Intel Core Ultra 9 185H: p-core, 2 cores (raw CSV file).

throughput graph p-core 3t
Intel Core Ultra 9 185H: p-core, 3 cores (raw CSV file).

throughput graph p-core 4t
Intel Core Ultra 9 185H: p-core, 4 cores (raw CSV file).

throughput graph p-core 5t
Intel Core Ultra 9 185H: p-core, 5 cores (raw CSV file).

throughput graph p-core 6t
Intel Core Ultra 9 185H: p-core, 6 cores (raw CSV file).

throughput graph e-core 1t
Intel Core Ultra 9 185H: e-core, 1 core (raw CSV file).

throughput graph e-core 2t
Intel Core Ultra 9 185H: e-core, 2 cores (raw CSV file).

throughput graph e-core 3t
Intel Core Ultra 9 185H: e-core, 3 cores (raw CSV file).

throughput graph e-core 4t
Intel Core Ultra 9 185H: e-core, 4 cores (raw CSV file).

throughput graph e-core 5t
Intel Core Ultra 9 185H: e-core, 5 cores (raw CSV file).

throughput graph e-core 6t
Intel Core Ultra 9 185H: e-core, 6 cores (raw CSV file).

throughput graph e-core 7t
Intel Core Ultra 9 185H: e-core, 7 cores (raw CSV file).

throughput graph e-core 8t
Intel Core Ultra 9 185H: e-core, 8 cores (raw CSV file).

throughput graph lpe-core 1t
Intel Core Ultra 9 185H: LP e-core, 1 core (raw CSV file).

throughput graph lpe-core 2t
Intel Core Ultra 9 185H: LP e-core, 2 cores (raw CSV file).

AMD Ryzen AI 9 HX 370

az5-a890m node RAM is: 4 x 8 GB, quad channel LPDDR5x 7500 MT/s.

throughput graph p-core 1t
AMD Ryzen AI 9 HX 370: p-core, 1 core (raw CSV file).

throughput graph p-core 2t
AMD Ryzen AI 9 HX 370: p-core, 2 cores (raw CSV file).

throughput graph p-core 3t
AMD Ryzen AI 9 HX 370: p-core, 3 cores (raw CSV file).

throughput graph p-core 4t
AMD Ryzen AI 9 HX 370: p-core, 4 cores (raw CSV file).

throughput graph e-core 1t
AMD Ryzen AI 9 HX 370: e-core, 1 core (raw CSV file).

throughput graph e-core 2t
AMD Ryzen AI 9 HX 370: e-core, 2 cores (raw CSV file).

throughput graph e-core 3t
AMD Ryzen AI 9 HX 370: e-core, 3 cores (raw CSV file).

throughput graph e-core 4t
AMD Ryzen AI 9 HX 370: e-core, 4 cores (raw CSV file).

throughput graph e-core 5t
AMD Ryzen AI 9 HX 370: e-core, 5 cores (raw CSV file).

throughput graph e-core 6t
AMD Ryzen AI 9 HX 370: e-core, 6 cores (raw CSV file).

throughput graph e-core 7t
AMD Ryzen AI 9 HX 370: e-core, 7 cores (raw CSV file).

throughput graph e-core 8t
AMD Ryzen AI 9 HX 370: e-core, 8 cores (raw CSV file).

Throughput Comparison between the different Core Types

In the following charts, c, p, e and LPe stand for "core", "p-core", "e-core" and "LP e-core", respectively, and indicate the type of cores used in the benchmark. The number just before each abbreviation is the number of cores involved. For instance, 4e means that 4 e-cores were used.
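As an illustration of this notation, the following sketch decodes a chart label such as 4e or 6p+8e+2LPe into core counts and core types. The function name `parse_core_label` is hypothetical and is not part of any benchmark tool.

```python
import re

# One term of a label: a core count followed by a core-type abbreviation.
_TERM = re.compile(r"(\d+)(LPe|c|p|e)")
_NAMES = {"c": "core", "p": "p-core", "e": "e-core", "LPe": "LP e-core"}

def parse_core_label(label):
    """Decode a label like '6p+8e+2LPe' into [(count, core type), ...]."""
    result = []
    for term in label.split("+"):
        match = _TERM.fullmatch(term)
        if match is None:
            raise ValueError(f"unrecognized core label term: {term!r}")
        result.append((int(match.group(1)), _NAMES[match.group(2)]))
    return result

print(parse_core_label("4e"))          # [(4, 'e-core')]
print(parse_core_label("6p+8e+2LPe"))  # [(6, 'p-core'), (8, 'e-core'), (2, 'LP e-core')]
```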

{ "$schema": "https://vega.github.io/schema/vega-lite/v5.json", "description": "CPU memory throughput depending on the CPU", "width": 1000, "height": 300, "data": { "values": [ {"category": "Ryzen 9 7945HX 1c", "group": "1-read", "value": 328}, {"category": "Ryzen 9 7945HX 1c", "group": "2-write", "value": 175}, {"category": "Ryzen 9 7945HX 1c", "group": "3-copy", "value": 348}, {"category": "Ryzen 9 7945HX 1c", "group": "4-scale", "value": 350}, {"category": "Ryzen 9 7945HX 1c", "group": "5-add", "value": 412}, {"category": "Ryzen 9 7945HX 1c", "group": "6-triad", "value": 465}, {"category": "Core i9-13900H 1p", "group": "1-read", "value": 277}, {"category": "Core i9-13900H 1p", "group": "2-write", "value": 243}, {"category": "Core i9-13900H 1p", "group": "3-copy", "value": 384}, {"category": "Core i9-13900H 1p", "group": "4-scale", "value": 260}, {"category": "Core i9-13900H 1p", "group": "5-add", "value": 310}, {"category": "Core i9-13900H 1p", "group": "6-triad", "value": 283}, {"category": "Core i9-13900H 1e", "group": "1-read", "value": 86}, {"category": "Core i9-13900H 1e", "group": "2-write", "value": 71}, {"category": "Core i9-13900H 1e", "group": "3-copy", "value": 160}, {"category": "Core i9-13900H 1e", "group": "4-scale", "value": 150}, {"category": "Core i9-13900H 1e", "group": "5-add", "value": 125}, {"category": "Core i9-13900H 1e", "group": "6-triad", "value": 125}, {"category": "Core Ultra 9 185H 1p", "group": "1-read", "value": 459}, {"category": "Core Ultra 9 185H 1p", "group": "2-write", "value": 307}, {"category": "Core Ultra 9 185H 1p", "group": "3-copy", "value": 517}, {"category": "Core Ultra 9 185H 1p", "group": "4-scale", "value": 338}, {"category": "Core Ultra 9 185H 1p", "group": "5-add", "value": 409}, {"category": "Core Ultra 9 185H 1p", "group": "6-triad", "value": 376}, {"category": "Core Ultra 9 185H 1e", "group": "1-read", "value": 121}, {"category": "Core Ultra 9 185H 1e", "group": "2-write", "value": 121}, {"category": "Core 
Ultra 9 185H 1e", "group": "3-copy", "value": 230}, {"category": "Core Ultra 9 185H 1e", "group": "4-scale", "value": 224}, {"category": "Core Ultra 9 185H 1e", "group": "5-add", "value": 182}, {"category": "Core Ultra 9 185H 1e", "group": "6-triad", "value": 177}, {"category": "Core Ultra 9 185H 1LPe", "group": "1-read", "value": 80}, {"category": "Core Ultra 9 185H 1LPe", "group": "2-write", "value": 80}, {"category": "Core Ultra 9 185H 1LPe", "group": "3-copy", "value": 152}, {"category": "Core Ultra 9 185H 1LPe", "group": "4-scale", "value": 143}, {"category": "Core Ultra 9 185H 1LPe", "group": "5-add", "value": 120}, {"category": "Core Ultra 9 185H 1LPe", "group": "6-triad", "value": 116}, {"category": "AI 9 HX 370 1p", "group": "1-read", "value": 280}, {"category": "AI 9 HX 370 1p", "group": "2-write", "value": 280}, {"category": "AI 9 HX 370 1p", "group": "3-copy", "value": 554}, {"category": "AI 9 HX 370 1p", "group": "4-scale", "value": 554}, {"category": "AI 9 HX 370 1p", "group": "5-add", "value": 420}, {"category": "AI 9 HX 370 1p", "group": "6-triad", "value": 418}, {"category": "AI 9 HX 370 1e", "group": "1-read", "value": 212}, {"category": "AI 9 HX 370 1e", "group": "2-write", "value": 212}, {"category": "AI 9 HX 370 1e", "group": "3-copy", "value": 421}, {"category": "AI 9 HX 370 1e", "group": "4-scale", "value": 410}, {"category": "AI 9 HX 370 1e", "group": "5-add", "value": 318}, {"category": "AI 9 HX 370 1e", "group": "6-triad", "value": 318} ] }, "mark": "bar", "encoding": { "x": {"field": "category", "type": "nominal", "sort": "none", "axis": {"title": "CPU", "labelAngle": -0}}, "y": {"field": "value", "type": "quantitative", "axis": {"title": "Throughput (GB/s)"}, "scale": {"type": "linear"}}, "xOffset": {"field": "group", "sort": "none"}, "color": {"field": "group", "title": "ubench"} }, "layer": [{ "mark": "bar" }, { "mark": { "type": "text", "align": "center", "baseline": "middle", "dy": -10 }, "encoding": { "text": {"field": "value", 
"type": "quantitative", "sort": "none"} } }] }
Buffer of 16 KB (higher is better).

For every core type, each core has its own private L1d cache.
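The chart groups (read, write, copy, scale, add, triad) name the micro-benchmark kernels. The sketch below shows how a throughput figure in GB/s can be derived from a timed kernel run, assuming STREAM-style definitions (copy: a[i] = b[i], scale: a[i] = s * b[i], add: a[i] = b[i] + c[i], triad: a[i] = b[i] + s * c[i]) and counting only explicit loads and stores. Whether the bandwidth benchmark uses exactly this accounting is an assumption, and all names here are hypothetical.

```python
# Arrays touched per element as (loads, stores). STREAM-style accounting that
# ignores write-allocate traffic: an assumption, not taken from the benchmark.
KERNELS = {
    "read":  (1, 0),
    "write": (0, 1),
    "copy":  (1, 1),
    "scale": (1, 1),
    "add":   (2, 1),
    "triad": (2, 1),
}

def bytes_moved(kernel, n_elements, element_size=8):
    """Bytes explicitly loaded and stored by one pass of `kernel`."""
    loads, stores = KERNELS[kernel]
    return (loads + stores) * n_elements * element_size

def throughput_gb_s(kernel, n_elements, seconds, element_size=8):
    """Throughput in GB/s (1 GB = 1e9 bytes) for one timed pass."""
    return bytes_moved(kernel, n_elements, element_size) / seconds / 1e9

# Example: a triad pass over 1,000,000 f64 elements moves 24 MB.
print(bytes_moved("triad", 1_000_000))  # 24000000
```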

{ "$schema": "https://vega.github.io/schema/vega-lite/v5.json", "description": "CPU memory throughput depending on the CPU", "width": 1000, "height": 300, "data": { "values": [ {"category": "Ryzen 9 7945HX 1c", "group": "1-read", "value": 175}, {"category": "Ryzen 9 7945HX 1c", "group": "2-write", "value": 173}, {"category": "Ryzen 9 7945HX 1c", "group": "3-copy", "value": 171}, {"category": "Ryzen 9 7945HX 1c", "group": "4-scale", "value": 173}, {"category": "Ryzen 9 7945HX 1c", "group": "5-add", "value": 174}, {"category": "Ryzen 9 7945HX 1c", "group": "6-triad", "value": 174}, {"category": "Core i9-13900H 1p", "group": "1-read", "value": 151}, {"category": "Core i9-13900H 1p", "group": "2-write", "value": 62}, {"category": "Core i9-13900H 1p", "group": "3-copy", "value": 100}, {"category": "Core i9-13900H 1p", "group": "4-scale", "value": 90}, {"category": "Core i9-13900H 1p", "group": "5-add", "value": 131}, {"category": "Core i9-13900H 1p", "group": "6-triad", "value": 127}, {"category": "Core i9-13900H 4e", "group": "1-read", "value": 149}, {"category": "Core i9-13900H 4e", "group": "2-write", "value": 42}, {"category": "Core i9-13900H 4e", "group": "3-copy", "value": 83}, {"category": "Core i9-13900H 4e", "group": "4-scale", "value": 81}, {"category": "Core i9-13900H 4e", "group": "5-add", "value": 115}, {"category": "Core i9-13900H 4e", "group": "6-triad", "value": 115}, {"category": "Core Ultra 9 185H 1p", "group": "1-read", "value": 201}, {"category": "Core Ultra 9 185H 1p", "group": "2-write", "value": 78}, {"category": "Core Ultra 9 185H 1p", "group": "3-copy", "value": 133}, {"category": "Core Ultra 9 185H 1p", "group": "4-scale", "value": 127}, {"category": "Core Ultra 9 185H 1p", "group": "5-add", "value": 169}, {"category": "Core Ultra 9 185H 1p", "group": "6-triad", "value": 161}, {"category": "Core Ultra 9 185H 4e", "group": "1-read", "value": 227}, {"category": "Core Ultra 9 185H 4e", "group": "2-write", "value": 84}, {"category": "Core Ultra 9 
185H 4e", "group": "3-copy", "value": 149}, {"category": "Core Ultra 9 185H 4e", "group": "4-scale", "value": 149}, {"category": "Core Ultra 9 185H 4e", "group": "5-add", "value": 226}, {"category": "Core Ultra 9 185H 4e", "group": "6-triad", "value": 225}, {"category": "Core Ultra 9 185H 2LPe", "group": "1-read", "value": 119}, {"category": "Core Ultra 9 185H 2LPe", "group": "2-write", "value": 43}, {"category": "Core Ultra 9 185H 2LPe", "group": "3-copy", "value": 74}, {"category": "Core Ultra 9 185H 2LPe", "group": "4-scale", "value": 73}, {"category": "Core Ultra 9 185H 2LPe", "group": "5-add", "value": 89}, {"category": "Core Ultra 9 185H 2LPe", "group": "6-triad", "value": 88}, {"category": "AI 9 HX 370 1p", "group": "1-read", "value": 275}, {"category": "AI 9 HX 370 1p", "group": "2-write", "value": 273}, {"category": "AI 9 HX 370 1p", "group": "3-copy", "value": 275}, {"category": "AI 9 HX 370 1p", "group": "4-scale", "value": 276}, {"category": "AI 9 HX 370 1p", "group": "5-add", "value": 268}, {"category": "AI 9 HX 370 1p", "group": "6-triad", "value": 266}, {"category": "AI 9 HX 370 1e", "group": "1-read", "value": 200}, {"category": "AI 9 HX 370 1e", "group": "2-write", "value": 166}, {"category": "AI 9 HX 370 1e", "group": "3-copy", "value": 209}, {"category": "AI 9 HX 370 1e", "group": "4-scale", "value": 209}, {"category": "AI 9 HX 370 1e", "group": "5-add", "value": 203}, {"category": "AI 9 HX 370 1e", "group": "6-triad", "value": 205} ] }, "mark": "bar", "encoding": { "x": {"field": "category", "type": "nominal", "sort": "none", "axis": {"title": "CPU", "labelAngle": -0}}, "y": {"field": "value", "type": "quantitative", "axis": {"title": "Throughput (GB/s)"}, "scale": {"type": "linear"}}, "xOffset": {"field": "group", "sort": "none"}, "color": {"field": "group", "title": "ubench"} }, "layer": [{ "mark": "bar" }, { "mark": { "type": "text", "align": "center", "baseline": "middle", "dy": -10 }, "encoding": { "text": {"field": "value", "type": 
"quantitative", "sort": "none"} } }] }
Buffer of 512 KB (higher is better).

The configuration of the L2 cache depends on the core type:

  • Ryzen 9 7945HX cores, Core i9-13900H p-cores, Core Ultra 9 185H p-cores and Ryzen AI 9 HX 370 p-/e-cores: an L2 cache is dedicated to each core
  • Core i9-13900H and Core Ultra 9 185H e-cores: 4 cores share one L2 cache
  • Core Ultra 9 185H LP e-cores: 2 cores share one L2 cache
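The 512 KB chart appears to use, for each core type, as many cores as share one L2 cache (1c, 1p, 4e, 2LPe, and so on). The sketch below encodes the sharing described in the list above and rebuilds those labels. The table and function names are hypothetical, and reading the plain "Core i9-13900H" entry as its p-cores is an assumption.

```python
# Number of cores sharing one L2 cache, per (CPU, core type), as listed above.
L2_SHARING = {
    ("Ryzen 9 7945HX", "c"): 1,
    ("Core i9-13900H", "p"): 1,      # assumed to mean the p-cores
    ("Core i9-13900H", "e"): 4,
    ("Core Ultra 9 185H", "p"): 1,
    ("Core Ultra 9 185H", "e"): 4,
    ("Core Ultra 9 185H", "LPe"): 2,
    ("Ryzen AI 9 HX 370", "p"): 1,
    ("Ryzen AI 9 HX 370", "e"): 1,
}

def l2_group_label(cpu, core_type):
    """Label of the core group that fills exactly one L2 cache, e.g. '4e'."""
    return f"{L2_SHARING[(cpu, core_type)]}{core_type}"

print(l2_group_label("Core Ultra 9 185H", "e"))    # 4e
print(l2_group_label("Core Ultra 9 185H", "LPe"))  # 2LPe
```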

{ "$schema": "https://vega.github.io/schema/vega-lite/v5.json", "description": "CPU memory throughput depending on the CPU", "width": 1000, "height": 300, "data": { "values": [ {"category": "Ryzen 9 7945HX 8c", "group": "1-read", "value": 683}, {"category": "Ryzen 9 7945HX 8c", "group": "2-write", "value": 752}, {"category": "Ryzen 9 7945HX 8c", "group": "3-copy", "value": 780}, {"category": "Ryzen 9 7945HX 8c", "group": "4-scale", "value": 651}, {"category": "Ryzen 9 7945HX 8c", "group": "5-add", "value": 673}, {"category": "Ryzen 9 7945HX 8c", "group": "6-triad", "value": 679}, {"category": "Core i9-13900H 6p", "group": "1-read", "value": 234}, {"category": "Core i9-13900H 6p", "group": "2-write", "value": 183}, {"category": "Core i9-13900H 6p", "group": "3-copy", "value": 226}, {"category": "Core i9-13900H 6p", "group": "4-scale", "value": 205}, {"category": "Core i9-13900H 6p", "group": "5-add", "value": 238}, {"category": "Core i9-13900H 6p", "group": "6-triad", "value": 237}, {"category": "Core i9-13900H 8e", "group": "1-read", "value": 62}, {"category": "Core i9-13900H 8e", "group": "2-write", "value": 72}, {"category": "Core i9-13900H 8e", "group": "3-copy", "value": 90}, {"category": "Core i9-13900H 8e", "group": "4-scale", "value": 99}, {"category": "Core i9-13900H 8e", "group": "5-add", "value": 89}, {"category": "Core i9-13900H 8e", "group": "6-triad", "value": 81}, {"category": "Core Ultra 9 185H 6p", "group": "1-read", "value": 296}, {"category": "Core Ultra 9 185H 6p", "group": "2-write", "value": 195}, {"category": "Core Ultra 9 185H 6p", "group": "3-copy", "value": 273}, {"category": "Core Ultra 9 185H 6p", "group": "4-scale", "value": 270}, {"category": "Core Ultra 9 185H 6p", "group": "5-add", "value": 234}, {"category": "Core Ultra 9 185H 6p", "group": "6-triad", "value": 240}, {"category": "Core Ultra 9 185H 8e", "group": "1-read", "value": 100}, {"category": "Core Ultra 9 185H 8e", "group": "2-write", "value": 74}, {"category": "Core Ultra 9 
185H 8e", "group": "3-copy", "value": 145}, {"category": "Core Ultra 9 185H 8e", "group": "4-scale", "value": 145}, {"category": "Core Ultra 9 185H 8e", "group": "5-add", "value": 151}, {"category": "Core Ultra 9 185H 8e", "group": "6-triad", "value": 151}, {"category": "AI 9 HX 370 4p", "group": "1-read", "value": 500}, {"category": "AI 9 HX 370 4p", "group": "2-write", "value": 450}, {"category": "AI 9 HX 370 4p", "group": "3-copy", "value": 459}, {"category": "AI 9 HX 370 4p", "group": "4-scale", "value": 457}, {"category": "AI 9 HX 370 4p", "group": "5-add", "value": 493}, {"category": "AI 9 HX 370 4p", "group": "6-triad", "value": 463}, {"category": "AI 9 HX 370 8e", "group": "1-read", "value": null}, {"category": "AI 9 HX 370 8e", "group": "2-write", "value": null}, {"category": "AI 9 HX 370 8e", "group": "3-copy", "value": null}, {"category": "AI 9 HX 370 8e", "group": "4-scale", "value": null}, {"category": "AI 9 HX 370 8e", "group": "5-add", "value": null}, {"category": "AI 9 HX 370 8e", "group": "6-triad", "value": null} ] }, "mark": "bar", "encoding": { "x": {"field": "category", "type": "nominal", "sort": "none", "axis": {"title": "CPU", "labelAngle": -0}}, "y": {"field": "value", "type": "quantitative", "axis": {"title": "Throughput (GB/s)"}, "scale": {"type": "linear"}}, "xOffset": {"field": "group", "sort": "none"}, "color": {"field": "group", "title": "ubench"} }, "layer": [{ "mark": "bar" }, { "mark": { "type": "text", "align": "center", "baseline": "middle", "dy": -10 }, "encoding": { "text": {"field": "value", "type": "quantitative", "sort": "none"} } }] }
Buffer of 20 MB (higher is better), except for the AI 9 HX 370 p-cores, where a 10 MB buffer is used.

The configuration of the L3 cache depends on the core type:

  • Ryzen 9 7945HX cores: an L3 cache is shared between 8 cores
  • Core i9-13900H p-cores, Core Ultra 9 185H p-cores and Core Ultra 9 185H e-cores: within each CPU, the L3 cache is shared by all of these cores
  • Core Ultra 9 185H LP e-cores: no L3 cache
  • Ryzen AI 9 HX 370 p-cores: an L3 cache is shared between 4 cores
  • Ryzen AI 9 HX 370 e-cores: an L3 cache is shared between 8 cores

{ "$schema": "https://vega.github.io/schema/vega-lite/v5.json", "description": "CPU memory throughput depending on the CPU", "width": 1000, "height": 300, "data": { "values": [ {"category": "Ryzen 9 7945HX 16c", "group": "1-read", "value": 48}, {"category": "Ryzen 9 7945HX 16c", "group": "2-write", "value": 56}, {"category": "Ryzen 9 7945HX 16c", "group": "3-copy", "value": 49}, {"category": "Ryzen 9 7945HX 16c", "group": "4-scale", "value": 48}, {"category": "Ryzen 9 7945HX 16c", "group": "5-add", "value": 47}, {"category": "Ryzen 9 7945HX 16c", "group": "6-triad", "value": 48}, {"category": "Core i9-13900H 6p", "group": "1-read", "value": 60}, {"category": "Core i9-13900H 6p", "group": "2-write", "value": 73}, {"category": "Core i9-13900H 6p", "group": "3-copy", "value": 68}, {"category": "Core i9-13900H 6p", "group": "4-scale", "value": 67}, {"category": "Core i9-13900H 6p", "group": "5-add", "value": 66}, {"category": "Core i9-13900H 6p", "group": "6-triad", "value": 66}, {"category": "Core i9-13900H 8e", "group": "1-read", "value": 35}, {"category": "Core i9-13900H 8e", "group": "2-write", "value": 73}, {"category": "Core i9-13900H 8e", "group": "3-copy", "value": 61}, {"category": "Core i9-13900H 8e", "group": "4-scale", "value": 44}, {"category": "Core i9-13900H 8e", "group": "5-add", "value": 41}, {"category": "Core i9-13900H 8e", "group": "6-triad", "value": 41}, {"category": "Core Ultra 9 185H 6p", "group": "1-read", "value": 74}, {"category": "Core Ultra 9 185H 6p", "group": "2-write", "value": 73}, {"category": "Core Ultra 9 185H 6p", "group": "3-copy", "value": 71}, {"category": "Core Ultra 9 185H 6p", "group": "4-scale", "value": 70}, {"category": "Core Ultra 9 185H 6p", "group": "5-add", "value": 72}, {"category": "Core Ultra 9 185H 6p", "group": "6-triad", "value": 72}, {"category": "Core Ultra 9 185H 8e", "group": "1-read", "value": 42}, {"category": "Core Ultra 9 185H 8e", "group": "2-write", "value": 73}, {"category": "Core Ultra 9 185H 8e", 
"group": "3-copy", "value": 58}, {"category": "Core Ultra 9 185H 8e", "group": "4-scale", "value": 58}, {"category": "Core Ultra 9 185H 8e", "group": "5-add", "value": 51}, {"category": "Core Ultra 9 185H 8e", "group": "6-triad", "value": 51}, {"category": "Core Ultra 9 185H 2LPe", "group": "1-read", "value": 17}, {"category": "Core Ultra 9 185H 2LPe", "group": "2-write", "value": 29}, {"category": "Core Ultra 9 185H 2LPe", "group": "3-copy", "value": 29}, {"category": "Core Ultra 9 185H 2LPe", "group": "4-scale", "value": 27}, {"category": "Core Ultra 9 185H 2LPe", "group": "5-add", "value": 25}, {"category": "Core Ultra 9 185H 2LPe", "group": "6-triad", "value": 24}, {"category": "AI 9 HX 370 4p", "group": "1-read", "value": 61}, {"category": "AI 9 HX 370 4p", "group": "2-write", "value": 98}, {"category": "AI 9 HX 370 4p", "group": "3-copy", "value": 80}, {"category": "AI 9 HX 370 4p", "group": "4-scale", "value": 80}, {"category": "AI 9 HX 370 4p", "group": "5-add", "value": 73}, {"category": "AI 9 HX 370 4p", "group": "6-triad", "value": 72}, {"category": "AI 9 HX 370 8e", "group": "1-read", "value": 61}, {"category": "AI 9 HX 370 8e", "group": "2-write", "value": 86}, {"category": "AI 9 HX 370 8e", "group": "3-copy", "value": 76}, {"category": "AI 9 HX 370 8e", "group": "4-scale", "value": 77}, {"category": "AI 9 HX 370 8e", "group": "5-add", "value": 70}, {"category": "AI 9 HX 370 8e", "group": "6-triad", "value": 70} ] }, "mark": "bar", "encoding": { "x": {"field": "category", "type": "nominal", "sort": "none", "axis": {"title": "CPU", "labelAngle": -0}}, "y": {"field": "value", "type": "quantitative", "axis": {"title": "Throughput (GB/s)"}, "scale": {"type": "linear"}}, "xOffset": {"field": "group", "sort": "none"}, "color": {"field": "group", "title": "ubench"} }, "layer": [{ "mark": "bar" }, { "mark": { "type": "text", "align": "center", "baseline": "middle", "dy": -10 }, "encoding": { "text": {"field": "value", "type": "quantitative", "sort": "none"} } 
}] }
Buffer of 2 GB (higher is better).

RAM throughput depends mainly on the type of memory modules:

  • Ryzen 9 7945HX and Core i9-13900H: Corsair Vengeance SO-DIMM 96 GB (2 x 48 GB, dual channel) DDR5 5200 MT/s CL44
  • Core Ultra 9 185H: 32 GB (2 x 16 GB, dual channel) DDR5 5600 MT/s
  • Ryzen AI 9 HX 370: 4 x 8 GB, quad channel LPDDR5x 7500 MT/s
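To put the measured values in perspective, the theoretical peak DRAM bandwidth can be estimated as transfer rate times bus width. The sketch below assumes a 128-bit total data bus for dual-channel DDR5 (2 x 64 bits) and, as a further assumption, 128 bits for the quad-channel LPDDR5x configuration (4 x 32 bits). The function name is hypothetical.

```python
def peak_dram_bandwidth_gb_s(transfer_rate_mt_s, bus_width_bits):
    """Theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes).

    transfer_rate_mt_s: memory transfer rate in MT/s (e.g. 5200 for DDR5-5200)
    bus_width_bits:     total data bus width over all channels (assumed value)
    """
    bytes_per_transfer = bus_width_bits / 8
    return transfer_rate_mt_s * 1e6 * bytes_per_transfer / 1e9

# Bus widths are assumptions, see the text above.
for name, rate, width in [
    ("DDR5 5200 MT/s, dual channel", 5200, 128),     # about 83.2 GB/s
    ("DDR5 5600 MT/s, dual channel", 5600, 128),     # about 89.6 GB/s
    ("LPDDR5x 7500 MT/s, quad channel", 7500, 128),  # about 120 GB/s
]:
    print(f"{name}: {peak_dram_bandwidth_gb_s(rate, width):.1f} GB/s")
```

Under these assumptions, the values measured with the 2 GB buffer stay below the theoretical peak, as expected.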

Raw Data

| Benchmark | # Cores | AMD Ryzen 9 7945HX | Intel Core i9-13900H - p-cores | Intel Core i9-13900H - e-cores | Intel Core Ultra 9 185H - p-cores | Intel Core Ultra 9 185H - e-cores | Intel Core Ultra 9 185H - LP e-cores | AMD Ryzen AI 9 HX 370 - p-cores (Zen5) | AMD Ryzen AI 9 HX 370 - e-cores (Zen5c) |
|---|---|---|---|---|---|---|---|---|---|
| bandwidth | 1 | download | download | download | download | download | download | download | download |
| bandwidth | 2 | download | download | download | download | download | download | download | download |
| bandwidth | 3 | download | download | download | download | download | - | download | download |
| bandwidth | 4 | download | download | download | download | download | - | download | download |
| bandwidth | 5 | download | download | download | download | download | - | - | download |
| bandwidth | 6 | download | download | download | download | download | - | - | download |
| bandwidth | 7 | download | - | download | - | download | - | - | download |
| bandwidth | 8 | download | - | download | - | download | - | - | download |
| bandwidth | 9 | download | - | - | - | - | - | - | - |
| bandwidth | 10 | download | - | - | - | - | - | - | - |
| bandwidth | 11 | download | - | - | - | - | - | - | - |
| bandwidth | 12 | download | - | - | - | - | - | - | - |
| bandwidth | 13 | download | - | - | - | - | - | - | - |
| bandwidth | 14 | download | - | - | - | - | - | - | - |
| bandwidth | 15 | download | - | - | - | - | - | - | - |
| bandwidth | 16 | download | - | - | - | - | - | - | - |

Peak Performance

Peak CPU performance is measured with the cpufp benchmark for the following operations:

  • FMA - Fused Multiply–Add, performs the following operation: \(d = a \times b + c\) on 64-bit or 32-bit floating-point numbers (referred to as f64 & f32 here)
  • DPA2 - Performs the dot product of two 2-element vectors of 16-bit integers (i16) and accumulates the result in a 32-bit integer (i32): \(c^{i32} = c^{i32} + \sum^2_{s = 1}{ a_s^{i16} \times b_s^{i16}}\)
  • DPA4 - Performs the dot product of two 4-element vectors of 8-bit integers (i8) and accumulates the result in a 32-bit integer (i32): \(c^{i32} = c^{i32} + \sum^4_{s = 1}{ a_s^{i8} \times b_s^{i8}}\)
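The three operations above can be written as scalar reference functions. This is only a sketch of their arithmetic, not of how cpufp issues SIMD instructions. Two points are assumptions: the 32-bit accumulator is modelled with wrap-around (no saturation), and the `fma` helper is not truly fused because Python rounds the product before the addition.

```python
def to_i32(x):
    """Wrap a Python int to a signed 32-bit integer (two's complement)."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x >= 0x80000000 else x

def fma(a, b, c):
    """d = a * b + c. Not fused here: the product is rounded first."""
    return a * b + c

def dpa(c, a, b, width):
    """c + sum(a[s] * b[s]) over `width` elements, wrapped to i32 (assumed)."""
    if len(a) != width or len(b) != width:
        raise ValueError(f"expected {width} elements per operand")
    return to_i32(c + sum(x * y for x, y in zip(a, b)))

def dpa2(c, a, b):
    """Dot product of two 2-element i16 vectors, accumulated into c."""
    return dpa(c, a, b, 2)

def dpa4(c, a, b):
    """Dot product of two 4-element i8 vectors, accumulated into c."""
    return dpa(c, a, b, 4)

print(fma(2.0, 3.0, 1.0))                   # 7.0
print(dpa2(10, [1, 2], [3, 4]))             # 21
print(dpa4(0, [1, 2, 3, 4], [5, 6, 7, 8]))  # 70
```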

In the following charts, c, p, e and LPe stand for "core", "p-core", "e-core" and "LP e-core", respectively, and indicate the type of cores used in the benchmark. The number just before each abbreviation is the number of cores involved. For instance, 4e means that 4 e-cores were used.

{ "$schema": "https://vega.github.io/schema/vega-lite/v5.json", "description": "Peak performance on CPU (mono-core) depending on the type of operation.", "width": 1000, "height": 300, "data": { "values": [ {"category":"Ryzen 9 7945HX 1c", "group": "1-FMA f64", "value": 87}, {"category":"Ryzen 9 7945HX 1c", "group": "2-FMA f32", "value": 174}, {"category":"Ryzen 9 7945HX 1c", "group": "3-DPA2 i32/i16", "value": 347}, {"category":"Ryzen 9 7945HX 1c", "group": "4-DPA4 i32/i8", "value": 695}, {"category": "Core i9-13900H 1p", "group": "1-FMA f64", "value": 61}, {"category": "Core i9-13900H 1p", "group": "2-FMA f32", "value": 121}, {"category": "Core i9-13900H 1p", "group": "3-DPA2 i32/i16", "value": 242}, {"category": "Core i9-13900H 1p", "group": "4-DPA4 i32/i8", "value": 484}, {"category": "Core i9-13900H 1e", "group": "1-FMA f64", "value": 22}, {"category": "Core i9-13900H 1e", "group": "2-FMA f32", "value": 43}, {"category": "Core i9-13900H 1e", "group": "3-DPA2 i32/i16", "value": 43}, {"category": "Core i9-13900H 1e", "group": "4-DPA4 i32/i8", "value": 86}, {"category":"Ultra 9 185H 1p", "group": "1-FMA f64", "value": 76}, {"category":"Ultra 9 185H 1p", "group": "2-FMA f32", "value": 153}, {"category":"Ultra 9 185H 1p", "group": "3-DPA2 i32/i16", "value": 305}, {"category":"Ultra 9 185H 1p", "group": "4-DPA4 i32/i8", "value": 611}, {"category":"Ultra 9 185H 1e", "group": "1-FMA f64", "value": 30}, {"category":"Ultra 9 185H 1e", "group": "2-FMA f32", "value": 60}, {"category":"Ultra 9 185H 1e", "group": "3-DPA2 i32/i16", "value": 121}, {"category":"Ultra 9 185H 1e", "group": "4-DPA4 i32/i8", "value": 240}, {"category":"Ultra 9 185H 1LPe", "group": "1-FMA f64", "value": 18}, {"category":"Ultra 9 185H 1LPe", "group": "2-FMA f32", "value": 35}, {"category":"Ultra 9 185H 1LPe", "group": "3-DPA2 i32/i16", "value": 70}, {"category":"Ultra 9 185H 1LPe", "group": "4-DPA4 i32/i8", "value": 139}, {"category":"AI 9 HX 370 1p", "group": "1-FMA f64", "value": 70}, 
{"category":"AI 9 HX 370 1p", "group": "2-FMA f32", "value": 140}, {"category":"AI 9 HX 370 1p", "group": "3-DPA2 i32/i16", "value": 280}, {"category":"AI 9 HX 370 1p", "group": "4-DPA4 i32/i8", "value": 559}, {"category":"AI 9 HX 370 1e", "group": "1-FMA f64", "value": 53}, {"category":"AI 9 HX 370 1e", "group": "2-FMA f32", "value": 106}, {"category":"AI 9 HX 370 1e", "group": "3-DPA2 i32/i16", "value": 211}, {"category":"AI 9 HX 370 1e", "group": "4-DPA4 i32/i8", "value": 422} ] }, "mark": "bar", "encoding": { "x": {"field": "category", "type": "nominal", "sort": "none", "axis": {"title": "CPU", "labelAngle": -0}}, "y": {"field": "value", "type": "quantitative", "axis": {"title": "Performance (Gop/s)"}, "scale": {"type": "linear"}}, "xOffset": {"field": "group", "sort": "none"}, "color": {"field": "group", "title": "Operation"} }, "layer": [{ "mark": "bar" }, { "mark": { "type": "text", "align": "center", "baseline": "middle", "dy": -10 }, "encoding": { "text": {"field": "value", "type": "quantitative", "sort": "none"} } }] }
CPU single-core peak performance depending on the type of operation (higher is better).

{ "$schema": "https://vega.github.io/schema/vega-lite/v5.json", "description": "Peak performance on CPU (multi-core) depending on the type of operation.", "width": 1000, "height": 300, "data": { "values": [ {"category":"Ryzen 9 7945HX 16c", "group": "1-FMA f64", "value": 1281}, {"category":"Ryzen 9 7945HX 16c", "group": "2-FMA f32", "value": 2574}, {"category":"Ryzen 9 7945HX 16c", "group": "3-DPA2 i32/i16", "value": 5176}, {"category":"Ryzen 9 7945HX 16c", "group": "4-DPA4 i32/i8", "value": 10350}, {"category": "Core i9-13900H 6p", "group": "1-FMA f64", "value": 363}, {"category": "Core i9-13900H 6p", "group": "2-FMA f32", "value": 726}, {"category": "Core i9-13900H 6p", "group": "3-DPA2 i32/i16", "value": 1453}, {"category": "Core i9-13900H 6p", "group": "4-DPA4 i32/i8", "value": 2906}, {"category": "Core i9-13900H 8e", "group": "1-FMA f64", "value": 172}, {"category": "Core i9-13900H 8e", "group": "2-FMA f32", "value": 344}, {"category": "Core i9-13900H 8e", "group": "3-DPA2 i32/i16", "value": 344}, {"category": "Core i9-13900H 8e", "group": "4-DPA4 i32/i8", "value": 688}, {"category":"Ultra 9 185H 6p", "group": "1-FMA f64", "value": 431}, {"category":"Ultra 9 185H 6p", "group": "2-FMA f32", "value": 860}, {"category":"Ultra 9 185H 6p", "group": "3-DPA2 i32/i16", "value": 1724}, {"category":"Ultra 9 185H 6p", "group": "4-DPA4 i32/i8", "value": 3452}, {"category":"Ultra 9 185H 8e", "group": "1-FMA f64", "value": 210}, {"category":"Ultra 9 185H 8e", "group": "2-FMA f32", "value": 421}, {"category":"Ultra 9 185H 8e", "group": "3-DPA2 i32/i16", "value": 836}, {"category":"Ultra 9 185H 8e", "group": "4-DPA4 i32/i8", "value": 1682}, {"category":"Ultra 9 185H 2LPe", "group": "1-FMA f64", "value": 37}, {"category":"Ultra 9 185H 2LPe", "group": "2-FMA f32", "value": 75}, {"category":"Ultra 9 185H 2LPe", "group": "3-DPA2 i32/i16", "value": 149}, {"category":"Ultra 9 185H 2LPe", "group": "4-DPA4 i32/i8", "value": 298}, {"category":"AI 9 HX 370 4p", "group": "1-FMA f64", 
"value": 279}, {"category":"AI 9 HX 370 4p", "group": "2-FMA f32", "value": 559}, {"category":"AI 9 HX 370 4p", "group": "3-DPA2 i32/i16", "value": 1112}, {"category":"AI 9 HX 370 4p", "group": "4-DPA4 i32/i8", "value": 2236}, {"category":"AI 9 HX 370 8e", "group": "1-FMA f64", "value": 422}, {"category":"AI 9 HX 370 8e", "group": "2-FMA f32", "value": 844}, {"category":"AI 9 HX 370 8e", "group": "3-DPA2 i32/i16", "value": 1688}, {"category":"AI 9 HX 370 8e", "group": "4-DPA4 i32/i8", "value": 3376} ] }, "mark": "bar", "encoding": { "x": {"field": "category", "type": "nominal", "sort": "none", "axis": {"title": "CPU", "labelAngle": -0}}, "y": {"field": "value", "type": "quantitative", "axis": {"title": "Performance (Gop/s)"}, "scale": {"type": "linear"}}, "xOffset": {"field": "group", "sort": "none"}, "color": {"field": "group", "title": "Operation"} }, "layer": [{ "mark": "bar" }, { "mark": { "type": "text", "align": "center", "baseline": "middle", "dy": -10 }, "encoding": { "text": {"field": "value", "type": "quantitative", "sort": "none"} } }] }
CPU multi-core peak performance depending on the type of operation (higher is better).
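One way to read the single-core and multi-core charts together is the parallel efficiency, i.e. the multi-core result divided by the single-core result times the number of cores. The sketch below applies it to two FMA f64 values taken from the charts; the function name is hypothetical.

```python
def parallel_efficiency(multi_core, single_core, n_cores):
    """Ratio of measured multi-core performance to ideal linear scaling."""
    return multi_core / (single_core * n_cores)

# FMA f64 values (Gop/s) read from the single-core and multi-core charts.
print(round(parallel_efficiency(1281, 87, 16), 2))  # Ryzen 9 7945HX, 16c: 0.92
print(round(parallel_efficiency(363, 61, 6), 2))    # Core i9-13900H, 6p: 0.99
```

A value below 1 means the cores do not scale perfectly. A plausible cause, not verified here, is a lower sustained clock frequency when all cores are loaded.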

{ "$schema": "https://vega.github.io/schema/vega-lite/v5.json", "description": "Peak performance on CPU (multi-core) depending on the type of operation.", "width": 1000, "height": 300, "data": { "values": [ {"category":"Ryzen 9 7945HX 16c", "group": "1-FMA f64", "value": 1281}, {"category":"Ryzen 9 7945HX 16c", "group": "2-FMA f32", "value": 2574}, {"category":"Ryzen 9 7945HX 16c", "group": "3-DPA2 i32/i16", "value": 5176}, {"category":"Ryzen 9 7945HX 16c", "group": "4-DPA4 i32/i8", "value": 10350}, {"category": "Core i9-13900H 6p+8e", "group": "1-FMA f64", "value": 535}, {"category": "Core i9-13900H 6p+8e", "group": "2-FMA f32", "value": 1070}, {"category": "Core i9-13900H 6p+8e", "group": "3-DPA2 i32/i16", "value": 1797}, {"category": "Core i9-13900H 6p+8e", "group": "4-DPA4 i32/i8", "value": 3594}, {"category":"Ultra 9 185H 6p+8e+2LPe", "group": "1-FMA f64", "value": 678}, {"category":"Ultra 9 185H 6p+8e+2LPe", "group": "2-FMA f32", "value": 1356}, {"category":"Ultra 9 185H 6p+8e+2LPe", "group": "3-DPA2 i32/i16", "value": 2709}, {"category":"Ultra 9 185H 6p+8e+2LPe", "group": "4-DPA4 i32/i8", "value": 5432}, {"category":"AI 9 HX 370 4p+8e", "group": "1-FMA f64", "value": 701}, {"category":"AI 9 HX 370 4p+8e", "group": "2-FMA f32", "value": 1403}, {"category":"AI 9 HX 370 4p+8e", "group": "3-DPA2 i32/i16", "value": 2800}, {"category":"AI 9 HX 370 4p+8e", "group": "4-DPA4 i32/i8", "value": 5612} ] }, "mark": "bar", "encoding": { "x": {"field": "category", "type": "nominal", "sort": "none", "axis": {"title": "CPU", "labelAngle": -0}}, "y": {"field": "value", "type": "quantitative", "axis": {"title": "Performance (Gop/s)"}, "scale": {"type": "linear"}}, "xOffset": {"field": "group", "sort": "none"}, "color": {"field": "group", "title": "Operation"} }, "layer": [{ "mark": "bar" }, { "mark": { "type": "text", "align": "center", "baseline": "middle", "dy": -10 }, "encoding": { "text": {"field": "value", "type": "quantitative", "sort": "none"} } }] }
CPU multi-core peak performance depending on the type of operation (higher is better).
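
The chart values can also be read as ratios between operation types. The following is a minimal Python sketch that expresses each operation's peak relative to FMA f64 on the same CPU. The numbers are copied from the chart; the dictionary layout and the function name `speedup_over_f64` are illustrative assumptions, not part of the benchmark.

```python
# Multi-core peak performance in Gop/s, copied from the chart above.
PEAK_GOPS = {
    "Ryzen 9 7945HX 16c": {
        "FMA f64": 1281, "FMA f32": 2574,
        "DPA2 i32/i16": 5176, "DPA4 i32/i8": 10350,
    },
    "Core i9-13900H 6p+8e": {
        "FMA f64": 535, "FMA f32": 1070,
        "DPA2 i32/i16": 1797, "DPA4 i32/i8": 3594,
    },
    "Ultra 9 185H 6p+8e+2LPe": {
        "FMA f64": 678, "FMA f32": 1356,
        "DPA2 i32/i16": 2709, "DPA4 i32/i8": 5432,
    },
    "AI 9 HX 370 4p+8e": {
        "FMA f64": 701, "FMA f32": 1403,
        "DPA2 i32/i16": 2800, "DPA4 i32/i8": 5612,
    },
}


def speedup_over_f64(cpu):
    """Return each operation's peak divided by the FMA f64 peak of the same CPU."""
    base = PEAK_GOPS[cpu]["FMA f64"]
    return {op: round(value / base, 2) for op, value in PEAK_GOPS[cpu].items()}


for cpu in PEAK_GOPS:
    print(cpu, speedup_over_f64(cpu))
```

With these values, the Ryzen 9 7945HX, the Ultra 9 185H and the AI 9 HX 370 stay close to a 1:2:4:8 progression, whereas the Core i9-13900H reaches only about 3.4x and 6.7x for the DPA2 and DPA4 operations.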

Raw Data

| Benchmark | AMD Ryzen 9 7945HX | Intel Core i9-13900H - p-cores | Intel Core i9-13900H - e-cores | Intel Core Ultra 9 185H - p-cores | Intel Core Ultra 9 185H - e-cores | Intel Core Ultra 9 185H - LP e-cores |
| --- | --- | --- | --- | --- | --- | --- |
| cpufp | download | download | download | download | download | download |

GPU Benchmarks

  • front node: Intel Iris Xe iGPU
  • az4-n4090 partition: AMD Radeon 610M iGPU and Nvidia GeForce RTX 4090 dGPU
  • az4-a7900 partition: AMD Radeon 610M iGPU and AMD Radeon RX 7900 XTX dGPU
  • iml-ia770 partition: Intel Arc Mobile iGPU and Intel Arc 770 eGPU
  • az5-a890m partition: AMD Radeon 890M iGPU

Memory Throughput

The memory throughput between the GPU and its global memory (VRAM) is measured with the clpeak benchmark. For an iGPU, the global memory is the system RAM, which is shared with the CPU.

{ "$schema": "https://vega.github.io/schema/vega-lite/v5.json", "description": "GPU memory throughput.", "width": 1000, "height": 300, "data": { "values": [ {"category": "Radeon 610M (iGPU)", "group": "1-float32x1", "value": 38}, {"category": "Radeon 610M (iGPU)", "group": "2-float32x2", "value": 48}, {"category": "Radeon 610M (iGPU)", "group": "3-float32x4", "value": 49}, {"category": "Radeon 610M (iGPU)", "group": "4-float32x8", "value": 43}, {"category": "Radeon 610M (iGPU)", "group": "5-float32x16", "value": 36}, {"category": "Iris Xe (iGPU)", "group": "1-float32x1", "value": 72}, {"category": "Iris Xe (iGPU)", "group": "2-float32x2", "value": 74}, {"category": "Iris Xe (iGPU)", "group": "3-float32x4", "value": 75}, {"category": "Iris Xe (iGPU)", "group": "4-float32x8", "value": 74}, {"category": "Iris Xe (iGPU)", "group": "5-float32x16", "value": 72}, {"category": "Arc Mobile (iGPU)", "group": "1-float32x1", "value": 65}, {"category": "Arc Mobile (iGPU)", "group": "2-float32x2", "value": 71}, {"category": "Arc Mobile (iGPU)", "group": "3-float32x4", "value": 69}, {"category": "Arc Mobile (iGPU)", "group": "4-float32x8", "value": 72}, {"category": "Arc Mobile (iGPU)", "group": "5-float32x16", "value": 56}, {"category": "Radeon 890M (iGPU)", "group": "1-float32x1", "value": 89}, {"category": "Radeon 890M (iGPU)", "group": "2-float32x2", "value": 95}, {"category": "Radeon 890M (iGPU)", "group": "3-float32x4", "value": 96}, {"category": "Radeon 890M (iGPU)", "group": "4-float32x8", "value": 93}, {"category": "Radeon 890M (iGPU)", "group": "5-float32x16", "value": 96}, {"category": "Arc 770 (eGPU)", "group": "1-float32x1", "value": 399}, {"category": "Arc 770 (eGPU)", "group": "2-float32x2", "value": 404}, {"category": "Arc 770 (eGPU)", "group": "3-float32x4", "value": 407}, {"category": "Arc 770 (eGPU)", "group": "4-float32x8", "value": 411}, {"category": "Arc 770 (eGPU)", "group": "5-float32x16", "value": 417}, {"category": "Radeon RX 7900 XTX", "group": 
"1-float32x1", "value": 708}, {"category": "Radeon RX 7900 XTX", "group": "2-float32x2", "value": 751}, {"category": "Radeon RX 7900 XTX", "group": "3-float32x4", "value": 757}, {"category": "Radeon RX 7900 XTX", "group": "4-float32x8", "value": 811}, {"category": "Radeon RX 7900 XTX", "group": "5-float32x16", "value": 857}, {"category": "GeForce RTX 4090", "group": "1-float32x1", "value": 895}, {"category": "GeForce RTX 4090", "group": "2-float32x2", "value": 924}, {"category": "GeForce RTX 4090", "group": "3-float32x4", "value": 941}, {"category": "GeForce RTX 4090", "group": "4-float32x8", "value": 952}, {"category": "GeForce RTX 4090", "group": "5-float32x16", "value": 963}, {"category": "GeForce RTX 4090 (PoCL)", "group": "1-float32x1", "value": 894}, {"category": "GeForce RTX 4090 (PoCL)", "group": "2-float32x2", "value": 922}, {"category": "GeForce RTX 4090 (PoCL)", "group": "3-float32x4", "value": 941}, {"category": "GeForce RTX 4090 (PoCL)", "group": "4-float32x8", "value": 951}, {"category": "GeForce RTX 4090 (PoCL)", "group": "5-float32x16", "value": 961} ] }, "mark": "bar", "encoding": { "x": {"field": "category", "type": "nominal", "sort": "none", "axis": {"title": "GPU", "labelAngle": -0}}, "y": {"field": "value", "type": "quantitative", "axis": {"title": "Throughput (GB/s)"}, "scale": {"type": "linear"}}, "xOffset": {"field": "group", "sort": "none"}, "color": {"field": "group", "title": "Datatype"} }, "layer": [{ "mark": "bar" }, { "mark": { "type": "text", "align": "center", "baseline": "middle", "dy": -10 }, "encoding": { "text": {"field": "value", "type": "quantitative", "sort": "none"} } }] }

GPU memory throughput (higher is better).
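
Because an iGPU shares the system RAM, its measured throughput can be compared with the theoretical RAM bandwidth. The Python sketch below does this for the two iGPUs whose RAM is documented on this page (dual-channel DDR5 at 5200 MT/s on the front node and on az4-mixed). It assumes a 64-bit data bus per channel, which is not stated in the text; the function names are hypothetical.

```python
def theoretical_bandwidth_gbs(transfer_rate_mts, channels=2, bus_width_bits=64):
    """Theoretical peak bandwidth in GB/s.

    bus_width_bits=64 per channel is an assumption, not taken from the page.
    """
    bytes_per_transfer = bus_width_bits / 8
    return transfer_rate_mts * 1e6 * channels * bytes_per_transfer / 1e9


def fraction_of_peak(measured_gbs, peak_gbs):
    """Measured throughput as a percentage of the theoretical peak."""
    return round(100.0 * measured_gbs / peak_gbs, 1)


# Best clpeak value per iGPU, copied from the chart above (GB/s).
BEST_MEASURED = {"Radeon 610M (iGPU)": 49, "Iris Xe (iGPU)": 75}

peak = theoretical_bandwidth_gbs(5200)
for gpu, measured in BEST_MEASURED.items():
    print(f"{gpu}: {measured} GB/s, {fraction_of_peak(measured, peak)} % of {peak:.1f} GB/s")
```

Under that assumption the theoretical peak is 83.2 GB/s, so the Iris Xe reaches roughly 90 % of it and the Radeon 610M roughly 59 %.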

Peak Performance

The GPU peak performance is measured with clpeak, an OpenCL benchmark that runs compute-intensive kernels to estimate the peak performance for each datatype.

{ "$schema": "https://vega.github.io/schema/vega-lite/v5.json", "description": "GPU peak performance (32-bit float).", "width": 1000, "height": 300, "data": { "values": [ {"category": "Radeon 610M (iGPU)", "group": "1-float16", "value": 1.1}, {"category": "Radeon 610M (iGPU)", "group": "2-float32", "value": 0.5}, {"category": "Radeon 610M (iGPU)", "group": "3-float64", "value": 0.3}, {"category": "Radeon 610M (iGPU)", "group": "4-int8", "value": 0.5}, {"category": "Radeon 610M (iGPU)", "group": "5-int16", "value": 0.5}, {"category": "Radeon 610M (iGPU)", "group": "6-int32", "value": 0.1}, {"category": "Iris Xe (iGPU)", "group": "1-float16", "value": 4.4}, {"category": "Iris Xe (iGPU)", "group": "2-float32", "value": 2.2}, {"category": "Iris Xe (iGPU)", "group": "3-float64", "value": null}, {"category": "Iris Xe (iGPU)", "group": "4-int8", "value": 1.5}, {"category": "Iris Xe (iGPU)", "group": "5-int16", "value": 4.2}, {"category": "Iris Xe (iGPU)", "group": "6-int32", "value": 0.8}, {"category": "Arc Mobile (iGPU)", "group": "1-float16", "value": 9.8}, {"category": "Arc Mobile (iGPU)", "group": "2-float32", "value": 4.8}, {"category": "Arc Mobile (iGPU)", "group": "3-float64", "value": 0.1}, {"category": "Arc Mobile (iGPU)", "group": "4-int8", "value": 2.9}, {"category": "Arc Mobile (iGPU)", "group": "5-int16", "value": 7.5}, {"category": "Arc Mobile (iGPU)", "group": "6-int32", "value": 1.3}, {"category": "Radeon 890M (iGPU)", "group": "1-float16", "value": 7.3}, {"category": "Radeon 890M (iGPU)", "group": "2-float32", "value": 4.3}, {"category": "Radeon 890M (iGPU)", "group": "3-float64", "value": 0.2}, {"category": "Radeon 890M (iGPU)", "group": "4-int8", "value": 3.2}, {"category": "Radeon 890M (iGPU)", "group": "5-int16", "value": 3.4}, {"category": "Radeon 890M (iGPU)", "group": "6-int32", "value": 1.0}, {"category": "Arc 770 (eGPU)", "group": "1-float16", "value": 19.5}, {"category": "Arc 770 (eGPU)", "group": "2-float32", "value": 13.0}, {"category": "Arc 770 (eGPU)", "group": "3-float64", "value": null}, {"category": "Arc 770 (eGPU)", "group": "4-int8", "value": 11.4}, {"category": "Arc 770 (eGPU)", "group": "5-int16", "value": 18.0}, {"category": "Arc 770 (eGPU)", "group": "6-int32", "value": 5.5}, {"category": "Radeon RX 7900 XTX", "group": "1-float16", "value": 63.9}, {"category": "Radeon RX 7900 XTX", "group": "2-float32", "value": 33.8}, {"category": "Radeon RX 7900 XTX", "group": "3-float64", "value": 1.1}, {"category": "Radeon RX 7900 XTX", "group": "4-int8", "value": 31.1}, {"category": "Radeon RX 7900 XTX", "group": "5-int16", "value": 31.1}, {"category": "Radeon RX 7900 XTX", "group": "6-int32", "value": 6.7}, {"category": "GeForce RTX 4090", "group": "1-float16", "value": null}, {"category": "GeForce RTX 4090", "group": "2-float32", "value": 81.4}, {"category": "GeForce RTX 4090", "group": "3-float64", "value": 1.4}, {"category": "GeForce RTX 4090", "group": "4-int8", "value": 38.9}, {"category": "GeForce RTX 4090", "group": "5-int16", "value": 37.8}, {"category": "GeForce RTX 4090", "group": "6-int32", "value": 44.9}, {"category": "GeForce RTX 4090 (PoCL)", "group": "1-float16", "value": 89.4}, {"category": "GeForce RTX 4090 (PoCL)", "group": "2-float32", "value": 80.8}, {"category": "GeForce RTX 4090 (PoCL)", "group": "3-float64", "value": 1.4}, {"category": "GeForce RTX 4090 (PoCL)", "group": "4-int8", "value": 21.2}, {"category": "GeForce RTX 4090 (PoCL)", "group": "5-int16", "value": 22.1}, {"category": "GeForce RTX 4090 (PoCL)", "group": "6-int32", "value": 30.8} ] }, "mark": "bar", "encoding": { "x": {"field": "category", "type": "nominal", "sort": "none", "axis": {"title": "GPU", "labelAngle": -0}}, "y": {"field": "value", "type": "quantitative", "axis": {"title": "Peak Performance (Top/s)"}, "scale": {"type": "linear"}}, "xOffset": {"field": "group", "sort": "none"}, "color": {"field": "group", "title": "Datatype"} }, "layer": [{ "mark": "bar" }, { "mark": { "type": "text", "align": "center", 
"baseline": "middle", "dy": -10 }, "encoding": { "text": {"field": "value", "type": "quantitative", "sort": "none"} } }] }

GPU peak performance (higher is better).

Warning

The Nvidia GeForce RTX 4090 GPU supports the float16 format in hardware, but the Nvidia OpenCL driver does not expose it, so no float16 value is reported for this driver. This is why the PoCL driver has also been installed.
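
One way to check whether a given OpenCL driver exposes half precision is to look for the standard `cl_khr_fp16` extension in the device's extension string. The Python sketch below assumes that a missing float16 result corresponds to this extension being absent, which the text above does not state explicitly. `report_fp16_support` relies on the third-party `pyopencl` package and is only an illustration; it is not part of clpeak.

```python
def supports_fp16(extensions):
    """Return True if a space-separated OpenCL extension string lists cl_khr_fp16."""
    return "cl_khr_fp16" in extensions.split()


def report_fp16_support():
    """Print, for every visible OpenCL device, whether cl_khr_fp16 is advertised.

    Needs pyopencl and at least one installed OpenCL platform, so it is
    defined here but not called automatically.
    """
    import pyopencl as cl  # third-party dependency, imported lazily

    for platform in cl.get_platforms():
        for device in platform.get_devices():
            flag = supports_fp16(device.extensions)
            print(f"{platform.name} / {device.name}: cl_khr_fp16={flag}")


# Example with hand-written extension strings (not real driver output):
print(supports_fp16("cl_khr_fp64 cl_khr_fp16"))                    # True
print(supports_fp16("cl_khr_fp64 cl_khr_global_int32_base_atomics"))  # False
```

Calling `report_fp16_support()` on a node where both the Nvidia and PoCL platforms are installed would list the same GPU twice, once per platform.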

Kernel Launch Latency

The kernel launch latency is the time between the moment a kernel execution is requested from CPU user space and the moment the kernel starts executing on the GPU (data buffer transfers are excluded). It is measured with the clpeak benchmark.
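
The definition above can be illustrated with OpenCL profiling events. The Python sketch below assumes that "requested" maps to the event's QUEUED timestamp and "starts executing" to its START timestamp; this mapping is an assumption and is not confirmed here as the way clpeak computes its result. `measure_launch_latency_us` relies on the third-party `pyopencl` package, and both function names are hypothetical.

```python
def launch_latency_us(queued_ns, start_ns):
    """Convert a pair of OpenCL profiling timestamps (nanoseconds) to a latency in microseconds."""
    return (start_ns - queued_ns) / 1000.0


def measure_launch_latency_us(iterations=100):
    """Average launch latency of an empty kernel on the default OpenCL device.

    Needs pyopencl and a working OpenCL platform, so it is defined here
    but not called automatically.
    """
    import pyopencl as cl  # third-party dependency, imported lazily

    ctx = cl.create_some_context(interactive=False)
    queue = cl.CommandQueue(
        ctx, properties=cl.command_queue_properties.PROFILING_ENABLE
    )
    program = cl.Program(ctx, "__kernel void noop() {}").build()

    total = 0.0
    for _ in range(iterations):
        event = program.noop(queue, (1,), None)  # one work-item, no buffers
        event.wait()
        total += launch_latency_us(event.profile.queued, event.profile.start)
    return total / iterations


# Hand-made timestamps: queued at 1000 ns, started at 5700 ns.
print(launch_latency_us(1000, 5700))
```

Since no buffer is passed to the kernel, no data transfer is included in the measured interval, which matches the exclusion stated in the definition.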

{ "$schema": "https://vega.github.io/schema/vega-lite/v5.json", "description": "Kernel launch latency.", "width": 1000, "height": 300, "data": { "values": [ {"category": "Radeon 610M (iGPU)", "group": "OpenCL", "value": null}, {"category": "Iris Xe (iGPU)", "group": "OpenCL", "value": 34.4}, {"category": "Arc Mobile (iGPU)", "group": "OpenCL", "value": 39.5}, {"category": "Radeon 890M (iGPU)", "group": "OpenCL", "value": 5.8}, {"category": "Arc 770 (eGPU)", "group": "OpenCL", "value": 4.4}, {"category": "Radeon RX 7900 XTX", "group": "OpenCL", "value": null}, {"category": "GeForce RTX 4090", "group": "OpenCL", "value": 4.7} ] }, "mark": "bar", "encoding": { "x": {"field": "category", "type": "nominal", "sort": "none", "axis": {"title": "GPU", "labelAngle": -0}}, "y": {"field": "value", "type": "quantitative", "axis": {"title": "Kernel latency (us)"}, "scale": {"type": "linear"}}, "xOffset": {"field": "group", "sort": "none"}, "color": {"field": "group", "title": "API"} }, "layer": [{ "mark": "bar" }, { "mark": { "type": "text", "align": "center", "baseline": "middle", "dy": -10 }, "encoding": { "text": {"field": "value", "type": "quantitative", "sort": "none"} } }] }

GPU kernel launch latency (lower is better).

The latency of the GeForce RTX 4090 with the PoCL driver is not reported in the previous histogram because it is abnormally high (3.6 ms).

Warning

Some OpenCL implementations do not report the correct kernel launch latency. This is why the Radeon 610M and the Radeon RX 7900 XTX do not appear in the previous histogram.

Raw Data

| Benchmark | AMD Radeon 610M | Intel Iris Xe | Intel Arc Mobile | AMD Radeon 890M | Intel Arc 770 | AMD Radeon RX 7900 XTX | Nvidia GeForce RTX 4090 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| clpeak | download | download | download [Xe][i915] | download | download [Xe][i915] | download | download |

SSD Benchmarks

  • front node: 3x Samsung 990 PRO 4 TB SSD
  • az4-n4090 partition: 1x Samsung 990 PRO 4 TB SSD
  • az4-a7900 partition: 1x Samsung 990 PRO 2 TB SSD
  • iml-ia770 partition: 1x Kingston OM8PGP41024Q-A0 1 TB SSD
  • az5-a890m partition: 1x Crucial P3 Plus CT1000P3PSSD8 1 TB SSD

{ "$schema": "https://vega.github.io/schema/vega-lite/v5.json", "description": "SSD throughput.", "width": 1000, "height": 300, "data": { "values": [ {"category": "Samsung 990 PRO 4 TB (front)", "group": "1-seq-read", "value": 3.5}, {"category": "Samsung 990 PRO 4 TB (front)", "group": "2-seq-write", "value": 2.5}, {"category": "Samsung 990 PRO 4 TB (front)", "group": "3-rnd-read", "value": 1.2}, {"category": "Samsung 990 PRO 4 TB (front)", "group": "4-rnd-write", "value": 0.9}, {"category": "Samsung 990 PRO 4 TB (az4-n4090)", "group": "1-seq-read", "value": 3.5}, {"category": "Samsung 990 PRO 4 TB (az4-n4090)", "group": "2-seq-write", "value": 1.9}, {"category": "Samsung 990 PRO 4 TB (az4-n4090)", "group": "3-rnd-read", "value": 1.2}, {"category": "Samsung 990 PRO 4 TB (az4-n4090)", "group": "4-rnd-write", "value": 0.7}, {"category": "Samsung 990 PRO 2 TB", "group": "1-seq-read", "value": 3.5}, {"category": "Samsung 990 PRO 2 TB", "group": "2-seq-write", "value": 1.8}, {"category": "Samsung 990 PRO 2 TB", "group": "3-rnd-read", "value": 1.2}, {"category": "Samsung 990 PRO 2 TB", "group": "4-rnd-write", "value": 0.7}, {"category": "Kingston OM8PGP41024Q-A0 1 TB", "group": "1-seq-read", "value": 3.7}, {"category": "Kingston OM8PGP41024Q-A0 1 TB", "group": "2-seq-write", "value": 3.4}, {"category": "Kingston OM8PGP41024Q-A0 1 TB", "group": "3-rnd-read", "value": 1.2}, {"category": "Kingston OM8PGP41024Q-A0 1 TB", "group": "4-rnd-write", "value": 0.8}, {"category": "Crucial P3 Plus CT1000P3PSSD8 1 TB", "group": "1-seq-read", "value": 4.8}, {"category": "Crucial P3 Plus CT1000P3PSSD8 1 TB", "group": "2-seq-write", "value": 2.1}, {"category": "Crucial P3 Plus CT1000P3PSSD8 1 TB", "group": "3-rnd-read", "value": 1.3}, {"category": "Crucial P3 Plus CT1000P3PSSD8 1 TB", "group": "4-rnd-write", "value": 0.7} ] }, "mark": "bar", "encoding": { "x": {"field": "category", "type": "nominal", "sort": "none", "axis": {"title": "SSD", "labelAngle": -0}}, "y": {"field": "value", 
"type": "quantitative", "axis": {"title": "Throughput (GB/s)"}, "scale": {"type": "linear"}}, "xOffset": {"field": "group", "sort": "none"}, "color": {"field": "group", "title": "Operation"} }, "layer": [{ "mark": "bar" }, { "mark": { "type": "text", "align": "center", "baseline": "middle", "dy": -10 }, "encoding": { "text": {"field": "value", "type": "quantitative", "sort": "none"} } }] }

SSD throughput (higher is better).

Raw Data

| Benchmark | Samsung 990 PRO 4 TB (front) | Samsung 990 PRO 4 TB (az4-n4090) | Samsung 990 PRO 2 TB | Kingston OM8PGP41024Q-A0 1 TB | Crucial P3 Plus CT1000P3PSSD8 1 TB |
| --- | --- | --- | --- | --- | --- |
| dd | download | download | download | download | download |
| iozone | download | download | download | download | download |