diff mbox series

[RISU,RFC,v2,06/14] x86.risu: add MMX instructions

Message ID: 20190701043536.26019-7-jan.bobek@gmail.com
State: New, archived
Series: Support for generating x86 MMX/SSE/AVX test images

Commit Message

Jan Bobek July 1, 2019, 4:35 a.m. UTC
Add an x86 configuration file with all MMX instructions.

Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
---
 x86.risu | 96 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 96 insertions(+)
 create mode 100644 x86.risu

Comments

Richard Henderson July 3, 2019, 9:35 p.m. UTC | #1
On 7/1/19 6:35 AM, Jan Bobek wrote:
> Add an x86 configuration file with all MMX instructions.
> 
> Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
> ---
>  x86.risu | 96 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 96 insertions(+)
>  create mode 100644 x86.risu

Note that most of these MMX instructions affect the FPU, not the vector unit.
We would want to extend risu again to handle this.  You'd also need to seed the
FPU with random data.

I was thinking for a moment that this is really beyond what you've signed up
for, but on second thoughts it's not.  Decoding SSE is really tangled with
decoding MMX, via the 0x66 prefix, and you'll want to be able to verify that
you don't regress.

> +# State Management Instructions
> +EMMS            MMX     00001111 01110111 !emit { }

I'm not sure this is really testable, because of the state change.  But we'll
see what happens with the aforementioned dumping.

> +# Arithmetic Instructions
> +PADDB           MMX     00001111 11111100 !emit { modrm(); mem(size => 8); }
> +PADDW           MMX     00001111 11111101 !emit { modrm(); mem(size => 8); }
> +PADDD           MMX     00001111 11111110 !emit { modrm(); mem(size => 8); }
> +PADDQ           MMX     00001111 11010100 !emit { modrm(); mem(size => 8); }

PADDQ is sse2.


r~

Richard Henderson July 3, 2019, 9:49 p.m. UTC | #2
On 7/1/19 6:35 AM, Jan Bobek wrote:
> +MOVQ            MMX     00001111 011 d 1110 !emit { rex(w => 1); modrm(mod => MOD_DIRECT, rm => ~REG_ESP); }
> +MOVQ_mem        MMX     00001111 011 d 1110 !emit { rex(w => 1); modrm(mod => ~MOD_DIRECT); mem(size => 8); }

Oh, note that there are only 8 mmx registers, so the respective rex.{r,b} bit
can't be set.


r~
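
For readers following the REX point above, here is a minimal Python sketch of the prefix layout and of clearing the extension bit for whichever ModRM field names an MMX register. The helper names `rex_byte` and `clear_mmx_rex_bits` are hypothetical and not part of risugen. Which field holds the MMX register depends on the instruction form, so the caller passes that in as an assumption.

```python
REX_BASE = 0x40  # fixed high nibble 0100 shared by every REX prefix
REX_W, REX_R, REX_X, REX_B = 0x08, 0x04, 0x02, 0x01


def rex_byte(w=0, r=0, x=0, b=0):
    """Assemble a REX prefix byte with the layout 0100WRXB."""
    return (REX_BASE
            | (REX_W if w else 0)
            | (REX_R if r else 0)
            | (REX_X if x else 0)
            | (REX_B if b else 0))


def clear_mmx_rex_bits(rex, reg_is_mmx=False, rm_is_mmx=False):
    """Clear REX.R / REX.B when the field they extend selects mm0..mm7.

    Hypothetical helper: the flags say which ModRM field is an MMX register.
    """
    if reg_is_mmx:
        rex &= ~REX_R  # ModRM.reg needs only 3 bits for 8 MMX registers
    if rm_is_mmx:
        rex &= ~REX_B  # same for ModRM.rm in register-direct forms
    return rex & 0xFF
```

Assuming the `reg` field of the quoted MOVQ register form holds the MMX register and `rm` a general-purpose register, `clear_mmx_rex_bits(rex_byte(w=1, r=1, b=1), reg_is_mmx=True)` keeps REX.W and REX.B and clears only REX.R.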

Peter Maydell July 3, 2019, 10:01 p.m. UTC | #3
On Mon, 1 Jul 2019 at 05:43, Jan Bobek <jan.bobek@gmail.com> wrote:
>
> Add an x86 configuration file with all MMX instructions.
>
> Signed-off-by: Jan Bobek <jan.bobek@gmail.com>

> --- /dev/null
> +++ b/x86.risu
> @@ -0,0 +1,96 @@
> +###############################################################################
> +# Copyright (c) 2019 Linaro Limited

I'm guessing from your email address that this copyright line probably
isn't right :-)

thanks
-- PMM

Jan Bobek July 10, 2019, 6:29 p.m. UTC | #4
On 7/3/19 5:35 PM, Richard Henderson wrote:
> On 7/1/19 6:35 AM, Jan Bobek wrote:
>> Add an x86 configuration file with all MMX instructions.
>>
>> Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
>> ---
>>  x86.risu | 96 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 96 insertions(+)
>>  create mode 100644 x86.risu
> 
> Note that most of these MMX instructions affect the FPU, not the vector unit.
> We would want to extend risu again to handle this.  You'd also need to seed the
> FPU with random data.
> 
> I was thinking for a moment that this is really beyond what you've signed up
> for, but on second thoughts it's not.  Decoding SSE is really tangled with
> decoding MMX, via the 0x66 prefix, and you'll want to be able to verify that
> you don't regress.

Honestly, I added MMX instructions just for completeness; I figured it can't
hurt, and you can always filter them out via command-line switches. You have
a point with the regression testing, though...

>> +# State Management Instructions
>> +EMMS            MMX     00001111 01110111 !emit { }
> 
> I'm not sure this is really testable, because of the state change.  But we'll
> see what happens with the aforementioned dumping.
> 
>> +# Arithmetic Instructions
>> +PADDB           MMX     00001111 11111100 !emit { modrm(); mem(size => 8); }
>> +PADDW           MMX     00001111 11111101 !emit { modrm(); mem(size => 8); }
>> +PADDD           MMX     00001111 11111110 !emit { modrm(); mem(size => 8); }
>> +PADDQ           MMX     00001111 11010100 !emit { modrm(); mem(size => 8); }

Not this one, at least according to the Intel docs:

NP 0F D4 /r: PADDQ mm, mm/m64          (MMX)
66 0F D4 /r: PADDQ xmm1, xmm2/m128     (SSE2)

The SSE2 version is added in a later patch.

-Jan
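
The two encodings quoted above differ only in the mandatory 0x66 prefix. Here is a minimal Python sketch of that distinction. The function name `paddq_operand_form` is hypothetical, and the sketch deliberately ignores REX and any other legacy prefixes. It classifies the operand form only and says nothing about which CPUID flag applies.

```python
def paddq_operand_form(code: bytes) -> str:
    """Return "mm" or "xmm" for a PADDQ encoding, keyed on the 0x66 prefix.

    Only the two bare forms quoted in the thread are recognised.
    """
    if code.startswith(b"\x66\x0f\xd4"):
        return "xmm"  # 66 0F D4 /r: PADDQ xmm1, xmm2/m128
    if code.startswith(b"\x0f\xd4"):
        return "mm"   # NP 0F D4 /r: PADDQ mm, mm/m64
    raise ValueError("not a PADDQ encoding this sketch recognises")
```

For example, `paddq_operand_form(bytes.fromhex("0fd4c1"))` selects the 64-bit MMX-register form, while the same bytes with a leading `66` select the XMM form.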

Jan Bobek July 10, 2019, 6:32 p.m. UTC | #5
On 7/3/19 5:49 PM, Richard Henderson wrote:
> On 7/1/19 6:35 AM, Jan Bobek wrote:
>> +MOVQ            MMX     00001111 011 d 1110 !emit { rex(w => 1); modrm(mod => MOD_DIRECT, rm => ~REG_ESP); }
>> +MOVQ_mem        MMX     00001111 011 d 1110 !emit { rex(w => 1); modrm(mod => ~MOD_DIRECT); mem(size => 8); }
> 
> Oh, note that there are only 8 mmx registers, so the respective rex.{r,b} bit
> can't be set.

Actually, my CPU chewed it without choking even when the bits were
set, but it will be taken care of in v3.

-Jan

Jan Bobek July 10, 2019, 6:35 p.m. UTC | #6
On 7/3/19 6:01 PM, Peter Maydell wrote:
> On Mon, 1 Jul 2019 at 05:43, Jan Bobek <jan.bobek@gmail.com> wrote:
>>
>> Add an x86 configuration file with all MMX instructions.
>>
>> Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
> 
>> --- /dev/null
>> +++ b/x86.risu
>> @@ -0,0 +1,96 @@
>> +###############################################################################
>> +# Copyright (c) 2019 Linaro Limited
> 
> I'm guessing from your email address that this copyright line probably
> isn't right :-)

Haha indeed, I just copy-pasted it from the other files; the same goes for
the rest of the source files.

Any suggestions on what it should be? I'm not currently employed by
anyone (as Google keeps reminding us).

-Jan

Alex Bennée July 11, 2019, 6:45 a.m. UTC | #7
Jan Bobek <jan.bobek@gmail.com> writes:

> On 7/3/19 6:01 PM, Peter Maydell wrote:
>> On Mon, 1 Jul 2019 at 05:43, Jan Bobek <jan.bobek@gmail.com> wrote:
>>>
>>> Add an x86 configuration file with all MMX instructions.
>>>
>>> Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
>>
>>> --- /dev/null
>>> +++ b/x86.risu
>>> @@ -0,0 +1,96 @@
>>> +###############################################################################
>>> +# Copyright (c) 2019 Linaro Limited
>>
>> I'm guessing from your email address that this copyright line probably
>> isn't right :-)
>
> Haha indeed, I just copy-pasted it from the other files; the same goes for
> the rest of the source files.
>
> Any suggestions on what it should be? I'm not currently employed by
> anyone (as Google keeps reminding us).

It should be (c) 2019 Jan Bobek as you wrote it. The license text should
be the same (assuming you are happy to license it, which I assume you
are given you are contributing to RISU ;-)

>
> -Jan


--
Alex Bennée

Richard Henderson July 11, 2019, 9:32 a.m. UTC | #8
On 7/10/19 8:29 PM, Jan Bobek wrote:
>>> +# Arithmetic Instructions
>>> +PADDB           MMX     00001111 11111100 !emit { modrm(); mem(size => 8); }
>>> +PADDW           MMX     00001111 11111101 !emit { modrm(); mem(size => 8); }
>>> +PADDD           MMX     00001111 11111110 !emit { modrm(); mem(size => 8); }
>>> +PADDQ           MMX     00001111 11010100 !emit { modrm(); mem(size => 8); }
> 
> Not this one, at least according to the Intel docs:
> 
> NP 0F D4 /r: PADDQ mm, mm/m64          (MMX)
> 66 0F D4 /r: PADDQ xmm1, xmm2/m128     (SSE2)
> 
> The SSE2 version is added in a later patch.

That's not how I read the Intel docs.

In the CPUID feature flag column of the MMX PADDQ, I see SSE2.  While the insn
affects the mmx registers, it was not added with the original MMX instruction set.


r~

Richard Henderson July 11, 2019, 9:34 a.m. UTC | #9
On 7/10/19 8:32 PM, Jan Bobek wrote:
> On 7/3/19 5:49 PM, Richard Henderson wrote:
>> On 7/1/19 6:35 AM, Jan Bobek wrote:
>>> +MOVQ            MMX     00001111 011 d 1110 !emit { rex(w => 1); modrm(mod => MOD_DIRECT, rm => ~REG_ESP); }
>>> +MOVQ_mem        MMX     00001111 011 d 1110 !emit { rex(w => 1); modrm(mod => ~MOD_DIRECT); mem(size => 8); }
>>
>> Oh, note that there are only 8 mmx registers, so the respective rex.{r,b} bit
>> can't be set.
> 
> Actually, my CPU chewed it without choking even when the bits were
> set, but it will be taken care of in v3.

That's interesting data.

I wonder if it's worth retaining this as a feature in order to check qemu's
implementation?


r~

Alex Bennée July 11, 2019, 9:44 a.m. UTC | #10
Richard Henderson <richard.henderson@linaro.org> writes:

> On 7/10/19 8:32 PM, Jan Bobek wrote:
>> On 7/3/19 5:49 PM, Richard Henderson wrote:
>>> On 7/1/19 6:35 AM, Jan Bobek wrote:
>>>> +MOVQ            MMX     00001111 011 d 1110 !emit { rex(w => 1); modrm(mod => MOD_DIRECT, rm => ~REG_ESP); }
>>>> +MOVQ_mem        MMX     00001111 011 d 1110 !emit { rex(w => 1); modrm(mod => ~MOD_DIRECT); mem(size => 8); }
>>>
>>> Oh, note that there are only 8 mmx registers, so the respective rex.{r,b} bit
>>> can't be set.
>>
>> Actually, my CPU chewed it without choking even when the bits were
>> set, but it will be taken care of in v3.
>
> That's interesting data.
>
> I wonder if it's worth retaining this as a feature in order to check qemu's
> implementation?

We could be some time, c.f. BlackHat 2017

  https://www.youtube.com/watch?v=KrksBdWcZgQ

I suspect if we set https://github.com/xoreaxeaxeax/sandsifter on QEMU
we might find a few breakages.

>
>
> r~


--
Alex Bennée

Jan Bobek July 11, 2019, 1:29 p.m. UTC | #11
On 7/11/19 5:32 AM, Richard Henderson wrote:
> On 7/10/19 8:29 PM, Jan Bobek wrote:
>>>> +# Arithmetic Instructions
>>>> +PADDB           MMX     00001111 11111100 !emit { modrm(); mem(size => 8); }
>>>> +PADDW           MMX     00001111 11111101 !emit { modrm(); mem(size => 8); }
>>>> +PADDD           MMX     00001111 11111110 !emit { modrm(); mem(size => 8); }
>>>> +PADDQ           MMX     00001111 11010100 !emit { modrm(); mem(size => 8); }
>>
>> Not this one, at least according to the Intel docs:
>>
>> NP 0F D4 /r: PADDQ mm, mm/m64          (MMX)
>> 66 0F D4 /r: PADDQ xmm1, xmm2/m128     (SSE2)
>>
>> The SSE2 version is added in a later patch.
> 
> That's not how I read the Intel docs.
> 
> In the CPUID feature flag column of the MMX PADDQ, I see SSE2.  While the insn
> affects the mmx registers, it was not added with the original MMX instruction set.

I know what you mean; for example, PSUBQ is like that. I know about
these kinds of instructions because "{name}_{enc}" does not form a
unique key, and risugen would complain about that. That's why there is
PSUBQ_mm and PSUBQ in the final x86.risu file.

However, I downloaded a fresh copy of Intel SDM off the Intel website
this morning (just to make sure) and in Volume 2B, Section "4.3
Instructions (M-U)," page 4-208 titled "PADDB/PADDW/PADDD/PADDQ—Add
Packed Integers," there's the NP 0F D4 /r PADDQ mm, mm/m64 instruction
in the 4th row, and the CPUID column says MMX. On the other hand, I
can't find it in the Volume 1, Section 5.4 "MMX(tm) Instructions," or
in Vol. 1, Chapter 9 "Programming with Intel(R) MMX(tm) Technology,"
so it's a bit confusing.

If you know for a fact that it didn't come until SSE2 and the manual
is wrong, I will change it.

-Jan

Jan Bobek July 11, 2019, 1:33 p.m. UTC | #12
On 7/11/19 2:45 AM, Alex Bennée wrote:
> 
> Jan Bobek <jan.bobek@gmail.com> writes:
> 
>> On 7/3/19 6:01 PM, Peter Maydell wrote:
>>> On Mon, 1 Jul 2019 at 05:43, Jan Bobek <jan.bobek@gmail.com> wrote:
>>>>
>>>> Add an x86 configuration file with all MMX instructions.
>>>>
>>>> Signed-off-by: Jan Bobek <jan.bobek@gmail.com>
>>>
>>>> --- /dev/null
>>>> +++ b/x86.risu
>>>> @@ -0,0 +1,96 @@
>>>> +###############################################################################
>>>> +# Copyright (c) 2019 Linaro Limited
>>>
>>> I'm guessing from your email address that this copyright line probably
>>> isn't right :-)
>>
>> Haha indeed, I just copy-pasted it from the other files; the same goes for
>> the rest of the source files.
>>
>> Any suggestions on what it should be? I'm not currently employed by
>> anyone (as Google keeps reminding us).
> 
> It should be (c) 2019 Jan Bobek as you wrote it. The license text should
> be the same (assuming you are happy to license it, which I assume you
> are given you are contributing to RISU ;-)

Sounds great, thank you!

-Jan

Richard Henderson July 11, 2019, 1:57 p.m. UTC | #13
On 7/11/19 3:29 PM, Jan Bobek wrote:
> However, I downloaded a fresh copy of Intel SDM off the Intel website
> this morning (just to make sure) and in Volume 2B, Section "4.3
> Instructions (M-U)," page 4-208 titled "PADDB/PADDW/PADDD/PADDQ—Add
> Packed Integers," there's the NP 0F D4 /r PADDQ mm, mm/m64 instruction
> in the 4th row, and the CPUID column says MMX. On the other hand, I
> can't find it in the Volume 1, Section 5.4 "MMX(tm) Instructions," or
> in Vol. 1, Chapter 9 "Programming with Intel(R) MMX(tm) Technology,"
> so it's a bit confusing.
> 
> If you know for a fact that it didn't come until SSE2 and the manual
> is wrong, I will change it.

Interesting.  I see what you see in

  253665-069US January 2019

but I first looked at

  325462-058US April 2016

which definitely has this marked as SSE2.

In the 2019 version, "5.6.3 SSE2 128-Bit SIMD Integer Instructions" is the
first mention of PADDQ.  Whereas "5.4.3 MMX Packed Arithmetic Instructions"
mentions PADD{B,W,D} but not Q.

I tend to think that this is a bug in the current manual.

Checking in binutils I see

> paddq, 2, 0x660fd4, None, 2, CpuSSE2, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
> paddq, 2, 0xfd4, None, 2, CpuSSE2, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoAVX, { Qword|Unspecified|BaseIndex|RegMMX, RegMMX }

and both contain CpuSSE2. If you like, I could run this by one of the Intel GCC
folk to be sure.


r~
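
The binutils check above can be repeated mechanically. Below is a minimal Python sketch that pulls the CPU flag out of opcode-table lines shaped like the two quoted ones. It assumes the first six comma-separated fields are mnemonic, operand count, opcode, extension opcode, opcode length and CPU flags, which is how the quoted lines read. The function name and the dictionary keys are hypothetical, and this is not a parser for the full binutils table format.

```python
def parse_opcode_line(line: str) -> dict:
    """Extract mnemonic, opcode and CPU flags from one opcode-table line.

    Assumes the field order seen in the quoted lines; the trailing modifier
    and operand fields are left unparsed because they contain commas.
    """
    line = line.lstrip("> ").strip()          # drop mail quoting, if any
    fields = [f.strip() for f in line.split(",", 6)]
    if len(fields) < 6:
        raise ValueError("expected at least six comma-separated fields")
    name, n_operands, opcode, _ext, _length, cpu = fields[:6]
    return {
        "name": name,
        "operands": int(n_operands),
        "opcode": int(opcode, 16),            # accepts the 0x prefix
        "cpu": cpu.split("|"),
    }
```

Applied to the two `paddq` lines quoted above, both results carry `CpuSSE2` in their `cpu` list, with opcodes `0x660fd4` and `0xfd4` respectively.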

Jan Bobek July 11, 2019, 9:29 p.m. UTC | #14
On 7/11/19 9:57 AM, Richard Henderson wrote:
> On 7/11/19 3:29 PM, Jan Bobek wrote:
>> However, I downloaded a fresh copy of Intel SDM off the Intel website
>> this morning (just to make sure) and in Volume 2B, Section "4.3
>> Instructions (M-U)," page 4-208 titled "PADDB/PADDW/PADDD/PADDQ—Add
>> Packed Integers," there's the NP 0F D4 /r PADDQ mm, mm/m64 instruction
>> in the 4th row, and the CPUID column says MMX. On the other hand, I
>> can't find it in the Volume 1, Section 5.4 "MMX(tm) Instructions," or
>> in Vol. 1, Chapter 9 "Programming with Intel(R) MMX(tm) Technology,"
>> so it's a bit confusing.
>>
>> If you know for a fact that it didn't come until SSE2 and the manual
>> is wrong, I will change it.
> 
> Interesting.  I see what you see in
> 
>   253665-069US January 2019
> 
> but I first looked at
> 
>   325462-058US April 2016
> 
> which definitely has this marked as SSE2.
> 
> In the 2019 version, "5.6.3 SSE2 128-Bit SIMD Integer Instructions" is the
> first mention of PADDQ.  Whereas "5.4.3 MMX Packed Arithmetic Instructions"
> mentions PADD{B,W,D} but not Q.
> 
> I tend to think that this is a bug in the current manual.
> 
> Checking in binutils I see
> 
>> paddq, 2, 0x660fd4, None, 2, CpuSSE2, Modrm|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
>> paddq, 2, 0xfd4, None, 2, CpuSSE2, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf|NoAVX, { Qword|Unspecified|BaseIndex|RegMMX, RegMMX }
> 
> and both contain CpuSSE2. If you like, I could run this by one of the Intel GCC
> folk to be sure.

I think this is convincing enough for me; it was a good idea to check
binutils! I find it interesting that they'd get it wrong in a more
recent version of the manual, though.

-Jan

Patch

diff --git a/x86.risu b/x86.risu
new file mode 100644
index 0000000..f2dd9b0
--- /dev/null
+++ b/x86.risu
@@ -0,0 +1,96 @@ 
+###############################################################################
+# Copyright (c) 2019 Linaro Limited
+# All rights reserved. This program and the accompanying materials
+# are made available under the terms of the Eclipse Public License v1.0
+# which accompanies this distribution, and is available at
+# http://www.eclipse.org/legal/epl-v10.html
+#
+# Contributors:
+#     Jan Bobek - initial implementation
+###############################################################################
+
+# Input file for risugen defining x86 instructions
+.mode x86
+
+# Data Transfer Instructions
+MOVD            MMX     00001111 011 d 1110 !emit { modrm(mod => MOD_DIRECT, rm => ~REG_ESP); }
+MOVD_mem        MMX     00001111 011 d 1110 !emit { modrm(mod => ~MOD_DIRECT); mem(size => 4); }
+MOVQ            MMX     00001111 011 d 1110 !emit { rex(w => 1); modrm(mod => MOD_DIRECT, rm => ~REG_ESP); }
+MOVQ_mem        MMX     00001111 011 d 1110 !emit { rex(w => 1); modrm(mod => ~MOD_DIRECT); mem(size => 8); }
+MOVQ_mm         MMX     00001111 011 d 1111 !emit { modrm(); mem(size => 8); }
+
+# Arithmetic Instructions
+PADDB           MMX     00001111 11111100 !emit { modrm(); mem(size => 8); }
+PADDW           MMX     00001111 11111101 !emit { modrm(); mem(size => 8); }
+PADDD           MMX     00001111 11111110 !emit { modrm(); mem(size => 8); }
+PADDQ           MMX     00001111 11010100 !emit { modrm(); mem(size => 8); }
+PADDSB          MMX     00001111 11101100 !emit { modrm(); mem(size => 8); }
+PADDSW          MMX     00001111 11101101 !emit { modrm(); mem(size => 8); }
+PADDUSB         MMX     00001111 11011100 !emit { modrm(); mem(size => 8); }
+PADDUSW         MMX     00001111 11011101 !emit { modrm(); mem(size => 8); }
+
+PSUBB           MMX     00001111 11111000 !emit { modrm(); mem(size => 8); }
+PSUBW           MMX     00001111 11111001 !emit { modrm(); mem(size => 8); }
+PSUBD           MMX     00001111 11111010 !emit { modrm(); mem(size => 8); }
+PSUBSB          MMX     00001111 11101000 !emit { modrm(); mem(size => 8); }
+PSUBSW          MMX     00001111 11101001 !emit { modrm(); mem(size => 8); }
+PSUBUSB         MMX     00001111 11011000 !emit { modrm(); mem(size => 8); }
+PSUBUSW         MMX     00001111 11011001 !emit { modrm(); mem(size => 8); }
+
+PMULLW          MMX     00001111 11010101 !emit { modrm(); mem(size => 8); }
+PMULHW          MMX     00001111 11100101 !emit { modrm(); mem(size => 8); }
+
+PMADDWD         MMX     00001111 11110101 !emit { modrm(); mem(size => 8); }
+
+# Comparison Instructions
+PCMPEQB         MMX     00001111 01110100 !emit { modrm(); mem(size => 8); }
+PCMPEQW         MMX     00001111 01110101 !emit { modrm(); mem(size => 8); }
+PCMPEQD         MMX     00001111 01110110 !emit { modrm(); mem(size => 8); }
+PCMPGTB         MMX     00001111 01100100 !emit { modrm(); mem(size => 8); }
+PCMPGTW         MMX     00001111 01100101 !emit { modrm(); mem(size => 8); }
+PCMPGTD         MMX     00001111 01100110 !emit { modrm(); mem(size => 8); }
+
+# Logical Instructions
+PAND            MMX     00001111 11011011 !emit { modrm(); mem(size => 8); }
+PANDN           MMX     00001111 11011111 !emit { modrm(); mem(size => 8); }
+POR             MMX     00001111 11101011 !emit { modrm(); mem(size => 8); }
+PXOR            MMX     00001111 11101111 !emit { modrm(); mem(size => 8); }
+
+# Shift and Rotate Instructions
+PSLLW           MMX     00001111 11110001 !emit { modrm(); mem(size => 8); }
+PSLLD           MMX     00001111 11110010 !emit { modrm(); mem(size => 8); }
+PSLLQ           MMX     00001111 11110011 !emit { modrm(); mem(size => 8); }
+
+PSLLW_imm       MMX     00001111 01110001 !emit { modrm(mod => MOD_DIRECT, reg => 6); imm(size => 1); }
+PSLLD_imm       MMX     00001111 01110010 !emit { modrm(mod => MOD_DIRECT, reg => 6); imm(size => 1); }
+PSLLQ_imm       MMX     00001111 01110011 !emit { modrm(mod => MOD_DIRECT, reg => 6); imm(size => 1); }
+
+PSRLW           MMX     00001111 11010001 !emit { modrm(); mem(size => 8); }
+PSRLD           MMX     00001111 11010010 !emit { modrm(); mem(size => 8); }
+PSRLQ           MMX     00001111 11010011 !emit { modrm(); mem(size => 8); }
+
+PSRLW_imm       MMX     00001111 01110001 !emit { modrm(mod => MOD_DIRECT, reg => 2); imm(size => 1); }
+PSRLD_imm       MMX     00001111 01110010 !emit { modrm(mod => MOD_DIRECT, reg => 2); imm(size => 1); }
+PSRLQ_imm       MMX     00001111 01110011 !emit { modrm(mod => MOD_DIRECT, reg => 2); imm(size => 1); }
+
+PSRAW           MMX     00001111 11100001 !emit { modrm(); mem(size => 8); }
+PSRAD           MMX     00001111 11100010 !emit { modrm(); mem(size => 8); }
+
+PSRAW_imm       MMX     00001111 01110001 !emit { modrm(mod => MOD_DIRECT, reg => 4); imm(size => 1); }
+PSRAD_imm       MMX     00001111 01110010 !emit { modrm(mod => MOD_DIRECT, reg => 4); imm(size => 1); }
+
+# Shuffle, Unpack, Blend, Insert, Extract, Broadcast, Permute, Scatter Instructions
+PACKSSWB        MMX     00001111 01100011 !emit { modrm(); mem(size => 8); }
+PACKSSDW        MMX     00001111 01101011 !emit { modrm(); mem(size => 8); }
+PACKUSWB        MMX     00001111 01100111 !emit { modrm(); mem(size => 8); }
+
+PUNPCKHBW       MMX     00001111 01101000 !emit { modrm(); mem(size => 8); }
+PUNPCKHWD       MMX     00001111 01101001 !emit { modrm(); mem(size => 8); }
+PUNPCKHDQ       MMX     00001111 01101010 !emit { modrm(); mem(size => 8); }
+
+PUNPCKLBW       MMX     00001111 01100000 !emit { modrm(); mem(size => 4); }
+PUNPCKLWD       MMX     00001111 01100001 !emit { modrm(); mem(size => 4); }
+PUNPCKLDQ       MMX     00001111 01100010 !emit { modrm(); mem(size => 4); }
+
+# State Management Instructions
+EMMS            MMX     00001111 01110111 !emit { }
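
As a cross-check of the bit patterns in the patch, here is a minimal Python sketch that expands one pattern into the opcode bytes it can produce. It assumes patterns are space-separated tokens and that any token not made purely of 0s and 1s (such as `d`) is a one-bit field. That holds for the lines in this patch but is an assumption about the format in general. The function name `expand_pattern` is hypothetical and not part of risugen.

```python
from itertools import product


def expand_pattern(pattern: str) -> list:
    """Return every opcode byte string a bit pattern can encode.

    Tokens of 0/1 are fixed bits; every other token is treated as a free
    one-bit field (an assumption that matches the `d` field in this patch).
    """
    tokens = pattern.split()
    free = [i for i, tok in enumerate(tokens) if not set(tok) <= {"0", "1"}]
    encodings = []
    for bits in product("01", repeat=len(free)):
        filled = list(tokens)
        for index, bit in zip(free, bits):
            filled[index] = bit
        bitstring = "".join(filled)
        if len(bitstring) % 8 != 0:
            raise ValueError("pattern does not fill whole bytes")
        encodings.append(int(bitstring, 2).to_bytes(len(bitstring) // 8, "big"))
    return encodings
```

For the MOVD/MOVQ pattern `00001111 011 d 1110` this yields the two opcodes `0F 6E` and `0F 7E`, one per value of the `d` bit. A fully fixed pattern such as the EMMS line yields the single opcode `0F 77`.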