TPAUSE — Timed PAUSE

Opcode / Instruction Op/En 64/32 bit Mode Support CPUID Feature Flag Description
66 0F AE /6 TPAUSE r32, <edx>, <eax> A V/V WAITPKG Directs the processor to enter an implementation-dependent optimized state until the TSC reaches the value in EDX:EAX.

Instruction Operand Encoding1

1. The Mod field of the ModR/M byte must have value 11B.

Op/En Tuple Operand 1 Operand 2 Operand 3 Operand 4
A NA ModRM:r/m (r) NA NA NA

Description

TPAUSE instructs the processor to enter an implementation-dependent optimized state. There are two such optimized states to choose from: light-weight power/performance optimized state, and improved power/performance optimized state. The selection between the two is governed by the explicit input register bit[0] source operand.

TPAUSE is available when CPUID.7.0:ECX.WAITPKG[bit 5] is enumerated as 1. TPAUSE may be executed at any privilege level. This instruction’s operation is the same in non-64-bit modes and in 64-bit mode.

Unlike PAUSE, the TPAUSE instruction will not cause an abort when used inside a transactional region, described in the chapter Chapter 16, “Programming with Intel® Transactional Synchronization Extensions,” of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1.

The input register contains information such as the preferred optimized state the processor should enter as described in the following table. Bits other than bit 0 are reserved and will result in #GP if non-zero.

Bit Value State Name Wakeup Time Power Savings Other Benefits
bit[0] = 0 C0.2 Slower Larger Improves performance of the other SMT thread(s) on the same core.
bit[0] = 1 C0.1 Faster Smaller NA
bits[31:1] NA NA NA Reserved
Table 4-19. TPAUSE Input Register Bit Definitions

The instruction execution wakes up when the time-stamp counter reaches or exceeds the implicit EDX:EAX 64-bit input value.

Prior to executing the TPAUSE instruction, an operating system may specify the maximum delay it allows the processor to suspend its operation. It can do so by writing TSC-quanta value to the following 32-bit MSR (IA32_UMWAIT_CONTROL at MSR index E1H):

If the processor that executed a TPAUSE instruction wakes due to the expiration of the operating system time-limit, the instructions sets RFLAGS.CF; otherwise, that flag is cleared.

The following additional events cause the processor to exit the implementation-dependent optimized state: a store to the read-set range within the transactional region, an NMI or SMI, a debug exception, a machine check exception, the BINIT# signal, the INIT# signal, and the RESET# signal.

Other implementation-dependent events may cause the processor to exit the implementation-dependent optimized state proceeding to the instruction following TPAUSE. In addition, an external interrupt causes the processor to exit the implementation-dependent optimized state regardless of whether maskable-interrupts are inhibited (EFLAGS.IF =0). It should be noted that if maskable-interrupts are inhibited execution will proceed to the instruction following TPAUSE.

Operation

os_deadline ← TSC+(IA32_MWAIT_CONTROL[31:2]<<2)
instr_deadline ← UINT64(EDX:EAX)
IF os_deadline < instr_deadline:
    deadline ← os_deadline
    using_os_deadline ← 1
ELSE:
    deadline ← instr_deadline
    using_os_deadline ← 0
WHILE TSC < deadline:
    implementation_dependent_optimized_state(Source register, deadline, IA32_UMWAIT_CONTROL[0])
IF using_os_deadline AND TSC > deadline:
    RFLAGS.CF ← 1
ELSE:
    RFLAGS.CF ← 0
RFLAGS.AF,PF,SF,ZF,OF ← 0

Intel C/C++ Compiler Intrinsic Equivalent

TPAUSE uint8_t _tpause(uint32_t control, uint64_t counter);

Numeric Exceptions

None.

Exceptions (All Operating Modes)

#GP(0) If src[31:1] != 0.

If CR4.TSD = 1 and CPL != 0.

#UD If CPUID.7.0:ECX.WAITPKG[bit 5]=0.