Intel Xeon Scalable Processors Supercharge Amazon Web Services’ Most Powerful Compute-Optimized Instances

Intel® Xeon® Scalable processors are deployed by today’s cloud service providers to deliver disruptive performance and efficiency across a diverse range of cloud workloads. Today, Intel announced that Amazon* Web Services’ (AWS) public cloud customers can now harness the workload-optimized performance of the Intel Xeon Scalable platform in Amazon’s latest cloud instance. Available now, Amazon EC2 C5 instances are AWS’ latest-generation, most powerful compute-optimized instances, delivering the best price-to-compute performance.

Today’s news marks the latest milestone in the long-term collaboration between Intel and AWS to drive innovation in the public and hybrid cloud domains.

Press Kit: Intel Xeon Scalable Processors

Intel Xeon Scalable processors are highly agile, high-performance compute engines that allow public cloud environments to transition seamlessly among general-purpose compute, high-performance computing (HPC) and AI/deep learning workloads. This agility gives public cloud users a wide range of options for their target workloads. Intel Xeon Scalable processors are uniquely architected for today’s evolving cloud data center infrastructure, offering energy efficiency and system-level performance averaging 1.65x that of prior-generation Intel Xeon processors1.

Together with Intel, AWS has optimized AI/deep learning engines with the latest version of the Intel® Math Kernel Library and the Intel Xeon Scalable processors to increase inference performance by over 100x2 on the new Amazon EC2 C5 instances.

The latest Intel Xeon processors incorporate several architectural features that boost the efficiency and performance of deep learning training and inference workloads in cloud environments. MXNet* and other deep learning frameworks are highly optimized to run on Amazon EC2 C5 instances. Compared with the prior generation, Intel Xeon Scalable processors deliver 2.4x higher deep learning inference performance and 2.2x higher deep learning training performance3.
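Throughput figures like these are typically reported as samples processed per second at a fixed batch size (the footnoted results use batch size 1 for inference and 256 for training). The sketch below illustrates that style of measurement only; it uses a NumPy matrix multiply as a hypothetical stand-in for a network layer and is not Intel’s or AWS’ benchmark code.

```python
import time
import numpy as np

def throughput(batch, features=512, iters=20):
    # Time repeated dense-layer work (a matmul) as a crude stand-in
    # for one forward pass of a network layer.
    x = np.random.rand(batch, features).astype(np.float32)
    w = np.random.rand(features, features).astype(np.float32)
    start = time.perf_counter()
    for _ in range(iters):
        x @ w
    elapsed = time.perf_counter() - start
    return batch * iters / elapsed  # samples per second

# Inference-style (batch 1) vs. training-style (batch 256) measurement
t_inference = throughput(batch=1)
t_training = throughput(batch=256)
```

Larger batches amortize per-call overhead across more samples, which is one reason inference (small batch) and training (large batch) throughput are quoted separately.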

HPC workloads running on Amazon EC2 C5 instances will increase the speed of research and reduce time-to-results. Intel Xeon Scalable processors feature Intel® Advanced Vector Extensions 512 (Intel® AVX-512) instructions that make them an ideal solution for compute-intensive scientific modeling, financial operations and distributed analytics that require high-performance floating-point calculations. Amazon EC2 C5 instances include up to 72 vCPUs (twice that of previous-generation compute-optimized instances), 144 GiB of memory, and a base clock frequency of 3.0 GHz to run the most demanding HPC workloads.
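Black-Scholes option pricing, one of the kernels cited in the performance footnotes below, is a concrete example of this kind of workload: a batch of prices reduces to elementwise logs, exponentials and square roots that vectorizing compilers and math libraries can map onto wide SIMD units such as AVX-512. The sketch below prices European calls with NumPy and SciPy; the parameter values are illustrative, not taken from Intel’s benchmark.

```python
import numpy as np
from scipy.special import erf

def norm_cdf(x):
    # Standard normal CDF expressed via erf, applied elementwise
    return 0.5 * (1.0 + erf(x / np.sqrt(2.0)))

def black_scholes_call(S, K, r, sigma, T):
    # Vectorized Black-Scholes call price; inputs may be scalars or arrays
    d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
    d2 = d1 - sigma * np.sqrt(T)
    return S * norm_cdf(d1) - K * np.exp(-r * T) * norm_cdf(d2)

# Price a large batch of options in a single vectorized call
spots = np.linspace(80.0, 120.0, 100_000)
prices = black_scholes_call(spots, K=100.0, r=0.05, sigma=0.2, T=1.0)
```

Because the entire batch is expressed as array arithmetic, the heavy lifting lands in the underlying math library, which is where an AVX-512-enabled build pays off.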

To learn more about the solutions Intel offers to enable today’s hyper-connected world, visit Intel’s cloud computing page.

1 1.65x Average Performance: Geomean based on normalized generational performance (estimated based on Intel internal testing of OLTP Brokerage, SAP SD* 2-Tier, HammerDB*, server-side Java*, SPEC*int_rate_base2006, SPEC*fp_rate_base2006, server virtualization, STREAM* triad, LAMMPS, DPDK L3 packet forwarding, Black-Scholes, and Intel® Distribution for LINPACK*).

2 Over 100x higher inference performance is based on Amazon internal data and was discussed in the video: https://aws.amazon.com/intel/. Intel does not control or audit third-party benchmark data or the web sites referenced in this document. You should visit the referenced web site and confirm whether referenced data are accurate.

3 2.4x deep learning inference and 2.2x deep learning training performance: inference throughput at batch size 1, training throughput at batch size 256. Source: Intel measured as of June 2017. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. Optimization Notice: Intel’s compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors.

New platform: 2S Intel® Xeon® Platinum 8180 CPU @ 2.50GHz (28 cores), HT disabled, turbo disabled, scaling governor set to “performance” via intel_pstate driver, 384GB DDR4-2666 ECC RAM. CentOS Linux release 7.3.1611 (Core), Linux kernel 3.10.0-514.10.2.el7.x86_64. SSD: Intel® SSD DC S3700 Series (800GB, 2.5in SATA 6Gb/s, 25nm, MLC). Performance measured with environment variables KMP_AFFINITY='granularity=fine,compact', OMP_NUM_THREADS=56; CPU frequency set with cpupower frequency-set -d 2.5G -u 3.8G -g performance. Deep learning framework: Neon, ZP/MKL_CHWN branch, commit id 52bd02acb947a2adabb8a227166a7da5d9123b6d. Dummy data was used. The main.py script was used for benchmarking, in mkl mode. ICC version 17.0.3 20170404, Intel MKL small libraries version 2018.0.20170425.

Baseline platform: 2S Intel® Xeon® CPU E5-2699 v4 @ 2.20GHz (22 cores), HT enabled, turbo disabled, scaling governor set to “performance” via acpi-cpufreq driver, 256GB DDR4-2133 ECC RAM. CentOS Linux release 7.3.1611 (Core), Linux kernel 3.10.0-514.10.2.el7.x86_64. SSD: Intel® SSD DC S3500 Series (480GB, 2.5in SATA 6Gb/s, 20nm, MLC). Performance measured with environment variables KMP_AFFINITY='granularity=fine,compact,1,0', OMP_NUM_THREADS=44; CPU frequency set with cpupower frequency-set -d 2.2G -u 2.2G -g performance. Deep learning framework: Neon, ZP/MKL_CHWN branch, commit id 52bd02acb947a2adabb8a227166a7da5d9123b6d. Dummy data was used. The main.py script was used for benchmarking, in mkl mode. ICC version 17.0.3 20170404, Intel MKL small libraries version 2018.0.20170425.

Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at intel.com/datacenter.

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.

For more information go to www.intel.com/xeonconfigs.