AVX-512 instructions can significantly boost performance in many workloads, but earlier CPU implementations of these instructions caused a significant frequency drop and an increase in power consumption. AMD's implementation of AVX-512 in its Zen 5-based Ryzen 9000-series processors, by contrast, does not cause any considerable clock speed drop or a massive increase in power draw, according to testing by InstLatX64.
As it turns out, AMD's Ryzen 9 9950X drops frequency by about 7% with heavy AVX-512 usage: it reduces clock speed from 5,700 MHz to 5,300 MHz, which is not substantial and which is in line with what AMD said in an interview with Tom's Hardware back in July. In contrast, Intel processors that do support AVX-512 (the company is known for disabling AVX-512 in Alder Lake and Raptor Lake CPUs for various reasons) usually drop their clocks dramatically when executing AVX-512 instructions.
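As a quick check on the size of that drop, the two clock speeds quoted above can be plugged into a few lines of Python. This is only arithmetic on the reported figures; the variable names are illustrative and not taken from any test tool.

```python
# Clock speeds reported for the Ryzen 9 9950X, in MHz.
normal_mhz = 5700   # clock speed before heavy AVX-512 usage
avx512_mhz = 5300   # clock speed under heavy AVX-512 usage

# Absolute and relative frequency drop.
drop_mhz = normal_mhz - avx512_mhz
drop_pct = 100 * drop_mhz / normal_mhz

print(f"Drop: {drop_mhz} MHz ({drop_pct:.1f}%)")  # Drop: 400 MHz (7.0%)
```

A 400 MHz reduction from 5,700 MHz works out to roughly 7%.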
To some degree, this happens because Intel's AVX-512-supporting CPUs are made on rather outdated process technologies. On the other hand, wide data paths are power hungry in their own right, so it remains to be seen how much power AMD's Ryzen 9000-series processors (which are made on TSMC's N4P, a 4nm-class process technology) consume when executing AVX-512 instructions.
AMD's Zen 5-based desktop processors have four full-width 512-bit execution units for AVX-512, which makes execution of such instructions very efficient (by contrast, some parts use double-pumped 256-bit units to execute 512-bit instructions), but this comes at the cost of die size.
High-performance desktops, workstations, and servers are often used for vector workloads from the AI and HPC realms, so implementing AVX-512 correctly was crucial for AMD when it designed its Zen 5 implementation for desktops and servers. However, AMD's mobile parts, such as the processors codenamed Strix Point, use double-pumped 256-bit data paths to execute AVX-512 instructions.
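The difference between a full-width and a double-pumped design can be illustrated with a small Python sketch. It is a deliberately simplified conceptual model, not a description of how Zen 5 actually schedules operations: the function names, the choice of 32-bit lanes, and the "passes" counter are all assumptions made for illustration. It shows that a 512-bit add produces the same result either way, but the double-pumped version needs two passes through a 256-bit unit.

```python
LANES_512 = 16  # a 512-bit vector holds sixteen 32-bit lanes
LANES_256 = 8   # a 256-bit unit handles eight 32-bit lanes per pass
MASK_32 = 0xFFFFFFFF  # wrap each lane to 32 bits, like a packed integer add


def add_full_width(a, b):
    """Model a native 512-bit unit: all 16 lanes are added in one pass."""
    assert len(a) == len(b) == LANES_512
    result = [(x + y) & MASK_32 for x, y in zip(a, b)]
    return result, 1  # (result, number of passes)


def add_double_pumped(a, b):
    """Model a double-pumped design: two passes through a 256-bit unit."""
    assert len(a) == len(b) == LANES_512
    result = []
    passes = 0
    for start in range(0, LANES_512, LANES_256):
        half_a = a[start:start + LANES_256]
        half_b = b[start:start + LANES_256]
        result.extend((x + y) & MASK_32 for x, y in zip(half_a, half_b))
        passes += 1
    return result, passes


a = list(range(LANES_512))
b = [10] * LANES_512
full_result, full_passes = add_full_width(a, b)
pumped_result, pumped_passes = add_double_pumped(a, b)

print(full_result == pumped_result)  # True
print(full_passes, pumped_passes)    # 1 2
```

In this model the software-visible result is identical, which is why the same AVX-512 code runs on both kinds of parts; only the throughput per instruction differs.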
While such an approach will probably confuse software developers and, to some degree, end users, it should be noted that by avoiding full-blown 512-bit data paths, AMD makes its cores slightly more compact. This allows it to pack more cores into its processors, and more cores bring higher performance to more users than AVX-512 alone does.