diff mbox series

[v35,05/13] target/avr: Add instruction translation - Arithmetic and Logic Instructions

Message ID 20191029212430.20617-6-mrolnik@gmail.com (mailing list archive)
State New, archived
Headers show
Series QEMU AVR 8 bit cores | expand

Commit Message

Michael Rolnik Oct. 29, 2019, 9:24 p.m. UTC
This includes:
- ADD, ADC, ADIW
- SBIW, SUB, SUBI, SBC, SBCI
- AND, ANDI
- OR, ORI, EOR
- COM, NEG
- INC, DEC
- MUL, MULS, MULSU
- FMUL, FMULS, FMULSU
- DES

Signed-off-by: Michael Rolnik <mrolnik@gmail.com>
---
 target/avr/translate.c | 822 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 822 insertions(+)

Comments

Aleksandar Markovic Nov. 5, 2019, 8:39 a.m. UTC | #1
>
>
> +
> +/*
> + *  This instruction performs 8-bit x 8-bit -> 16-bit signed
> multiplication
> + *  and shifts the result one bit left.
> + */
> +static bool trans_FMULSU(DisasContext *ctx, arg_FMULSU *a)
> +{
> +    if (!avr_have_feature(ctx, AVR_FEATURE_MUL)) {
> +        return true;
> +    }
> +
> +    TCGv R0 = cpu_r[0];
> +    TCGv R1 = cpu_r[1];
> +    TCGv Rd = cpu_r[a->rd];
> +    TCGv Rr = cpu_r[a->rr];
> +    TCGv R = tcg_temp_new_i32();
> +    TCGv t0 = tcg_temp_new_i32();
> +
> +    tcg_gen_ext8s_tl(t0, Rd); /* make Rd full 32 bit signed */
> +    tcg_gen_mul_tl(R, t0, Rr); /* R = Rd * Rr */
> +    tcg_gen_andi_tl(R, R, 0xffff); /* make it 16 bits */
> +
> +    tcg_gen_shri_tl(cpu_Cf, R, 15); /* Cf = R(15) */
> +    tcg_gen_setcondi_tl(TCG_COND_EQ, cpu_Zf, R, 0); /* Zf = R == 0 */
> +
> +    tcg_gen_shli_tl(R, R, 1);
> +
> +    tcg_gen_andi_tl(R0, R, 0xff);
> +    tcg_gen_shri_tl(R1, R, 8);
> +    tcg_gen_andi_tl(R1, R1, 0xff);
> +
> +    tcg_temp_free_i32(t0);
> +    tcg_temp_free_i32(R);
> +
> +    return true;
> +}
> +


Hi, Michael.

The way I understand the spec is that a->rd and a->rd must be between 16
and 23:

https://www.microchip.com/webdoc/avrassembler/avrassembler.wb_FMULSU.html

Is my interpretation right? If yes, where is the corresponding part in the
implementation?

Yours, Aleksandar


>
Aleksandar Markovic Nov. 5, 2019, 9:46 a.m. UTC | #2
On Tuesday, November 5, 2019, Aleksandar Markovic <
aleksandar.m.mail@gmail.com> wrote:

>
>> +
>> +/*
>> + *  This instruction performs 8-bit x 8-bit -> 16-bit signed
>> multiplication
>> + *  and shifts the result one bit left.
>> + */
>> +static bool trans_FMULSU(DisasContext *ctx, arg_FMULSU *a)
>> +{
>> +    if (!avr_have_feature(ctx, AVR_FEATURE_MUL)) {
>> +        return true;
>> +    }
>> +
>> +    TCGv R0 = cpu_r[0];
>> +    TCGv R1 = cpu_r[1];
>> +    TCGv Rd = cpu_r[a->rd];
>> +    TCGv Rr = cpu_r[a->rr];
>> +    TCGv R = tcg_temp_new_i32();
>> +    TCGv t0 = tcg_temp_new_i32();
>> +
>> +    tcg_gen_ext8s_tl(t0, Rd); /* make Rd full 32 bit signed */
>> +    tcg_gen_mul_tl(R, t0, Rr); /* R = Rd * Rr */
>> +    tcg_gen_andi_tl(R, R, 0xffff); /* make it 16 bits */
>> +
>> +    tcg_gen_shri_tl(cpu_Cf, R, 15); /* Cf = R(15) */
>> +    tcg_gen_setcondi_tl(TCG_COND_EQ, cpu_Zf, R, 0); /* Zf = R == 0 */
>> +
>> +    tcg_gen_shli_tl(R, R, 1);
>> +
>> +    tcg_gen_andi_tl(R0, R, 0xff);
>> +    tcg_gen_shri_tl(R1, R, 8);
>> +    tcg_gen_andi_tl(R1, R1, 0xff);
>> +
>> +    tcg_temp_free_i32(t0);
>> +    tcg_temp_free_i32(R);
>> +
>> +    return true;
>> +}
>> +
>
>
> Hi, Michael.
>
> The way I understand the spec is that a->rd and a->rd must be between 16
> and 23:
>
> https://www.microchip.com/webdoc/avrassembler/avrassembler.wb_FMULSU.html
>
> Is my interpretation right? If yes, where is the corresponding part in the
> implementation?
>
>
Or, perhaps,

TCGv Rd = cpu_r[a->rd];

should be

TCGv Rd = cpu_r[a->rd + 16];

(and the same for rs)

?


Yours, Aleksandar
>
>
>>
Richard Henderson Nov. 5, 2019, 1:23 p.m. UTC | #3
On 11/5/19 10:46 AM, Aleksandar Markovic wrote:
> 
> 
> On Tuesday, November 5, 2019, Aleksandar Markovic <aleksandar.m.mail@gmail.com
> <mailto:aleksandar.m.mail@gmail.com>> wrote:
> 
> 
>         +
>         +/*
>         + *  This instruction performs 8-bit x 8-bit -> 16-bit signed
>         multiplication
>         + *  and shifts the result one bit left.
>         + */
>         +static bool trans_FMULSU(DisasContext *ctx, arg_FMULSU *a)
>         +{
>         +    if (!avr_have_feature(ctx, AVR_FEATURE_MUL)) {
>         +        return true;
>         +    }
>         +
>         +    TCGv R0 = cpu_r[0];
>         +    TCGv R1 = cpu_r[1];
>         +    TCGv Rd = cpu_r[a->rd];
>         +    TCGv Rr = cpu_r[a->rr];
>         +    TCGv R = tcg_temp_new_i32();
>         +    TCGv t0 = tcg_temp_new_i32();
>         +
>         +    tcg_gen_ext8s_tl(t0, Rd); /* make Rd full 32 bit signed */
>         +    tcg_gen_mul_tl(R, t0, Rr); /* R = Rd * Rr */
>         +    tcg_gen_andi_tl(R, R, 0xffff); /* make it 16 bits */
>         +
>         +    tcg_gen_shri_tl(cpu_Cf, R, 15); /* Cf = R(15) */
>         +    tcg_gen_setcondi_tl(TCG_COND_EQ, cpu_Zf, R, 0); /* Zf = R == 0 */
>         +
>         +    tcg_gen_shli_tl(R, R, 1);
>         +
>         +    tcg_gen_andi_tl(R0, R, 0xff);
>         +    tcg_gen_shri_tl(R1, R, 8);
>         +    tcg_gen_andi_tl(R1, R1, 0xff);
>         +
>         +    tcg_temp_free_i32(t0);
>         +    tcg_temp_free_i32(R);
>         +
>         +    return true;
>         +}
>         +
> 
> 
>     Hi, Michael.
> 
>     The way I understand the spec is that a->rd and a->rd must be between 16
>     and 23:
> 
>     https://www.microchip.com/webdoc/avrassembler/avrassembler.wb_FMULSU.html
>     <https://www.microchip.com/webdoc/avrassembler/avrassembler.wb_FMULSU.html>
> 
>     Is my interpretation right? If yes, where is the corresponding part in the
>     implementation?
> 
> 
> Or, perhaps,
> 
> TCGv Rd = cpu_r[a->rd];
> 
> should be
> 
> TCGv Rd = cpu_r[a->rd + 16];
>  
> (and the same for rs)

This happens during decode:

+%rd_b           4:3                         !function=to_B
+%rr_b           0:3                         !function=to_B
+@fmul           .... .... . ... . ...       &rd_rr      rd=%rd_b rr=%rr_b
+FMUL            0000 0011 0 ... 1 ...       @fmul
+FMULS           0000 0011 1 ... 0 ...       @fmul
+FMULSU          0000 0011 1 ... 1 ...       @fmul

This means that a->rd = to_B(extract32(insn, 4, 3)), and

> +static int to_B(DisasContext *ctx, int indx) { return 16 + (indx % 8); }

et voila.


r~
Aleksandar Markovic Nov. 5, 2019, 2:37 p.m. UTC | #4
On Tue, Nov 5, 2019 at 2:23 PM Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> On 11/5/19 10:46 AM, Aleksandar Markovic wrote:
> >
> >
> > On Tuesday, November 5, 2019, Aleksandar Markovic <aleksandar.m.mail@gmail.com
> > <mailto:aleksandar.m.mail@gmail.com>> wrote:
> >
> >
> >         +
> >         +/*
> >         + *  This instruction performs 8-bit x 8-bit -> 16-bit signed
> >         multiplication
> >         + *  and shifts the result one bit left.
> >         + */
> >         +static bool trans_FMULSU(DisasContext *ctx, arg_FMULSU *a)
> >         +{
> >         +    if (!avr_have_feature(ctx, AVR_FEATURE_MUL)) {
> >         +        return true;
> >         +    }
> >         +
> >         +    TCGv R0 = cpu_r[0];
> >         +    TCGv R1 = cpu_r[1];
> >         +    TCGv Rd = cpu_r[a->rd];
> >         +    TCGv Rr = cpu_r[a->rr];
> >         +    TCGv R = tcg_temp_new_i32();
> >         +    TCGv t0 = tcg_temp_new_i32();
> >         +
> >         +    tcg_gen_ext8s_tl(t0, Rd); /* make Rd full 32 bit signed */
> >         +    tcg_gen_mul_tl(R, t0, Rr); /* R = Rd * Rr */
> >         +    tcg_gen_andi_tl(R, R, 0xffff); /* make it 16 bits */
> >         +
> >         +    tcg_gen_shri_tl(cpu_Cf, R, 15); /* Cf = R(15) */
> >         +    tcg_gen_setcondi_tl(TCG_COND_EQ, cpu_Zf, R, 0); /* Zf = R == 0 */
> >         +
> >         +    tcg_gen_shli_tl(R, R, 1);
> >         +
> >         +    tcg_gen_andi_tl(R0, R, 0xff);
> >         +    tcg_gen_shri_tl(R1, R, 8);
> >         +    tcg_gen_andi_tl(R1, R1, 0xff);
> >         +
> >         +    tcg_temp_free_i32(t0);
> >         +    tcg_temp_free_i32(R);
> >         +
> >         +    return true;
> >         +}
> >         +
> >
> >
> >     Hi, Michael.
> >
> >     The way I understand the spec is that a->rd and a->rd must be between 16
> >     and 23:
> >
> >     https://www.microchip.com/webdoc/avrassembler/avrassembler.wb_FMULSU.html
> >     <https://www.microchip.com/webdoc/avrassembler/avrassembler.wb_FMULSU.html>
> >
> >     Is my interpretation right? If yes, where is the corresponding part in the
> >     implementation?
> >
> >
> > Or, perhaps,
> >
> > TCGv Rd = cpu_r[a->rd];
> >
> > should be
> >
> > TCGv Rd = cpu_r[a->rd + 16];
> >
> > (and the same for rs)
>
> This happens during decode:
>
> +%rd_b           4:3                         !function=to_B
> +%rr_b           0:3                         !function=to_B
> +@fmul           .... .... . ... . ...       &rd_rr      rd=%rd_b rr=%rr_b
> +FMUL            0000 0011 0 ... 1 ...       @fmul
> +FMULS           0000 0011 1 ... 0 ...       @fmul
> +FMULSU          0000 0011 1 ... 1 ...       @fmul
>
> This means that a->rd = to_B(extract32(insn, 4, 3)), and
>
> > +static int to_B(DisasContext *ctx, int indx) { return 16 + (indx % 8); }
>
> et voila.
>

OK. Thanks for clarification.

>
> r~
Michael Rolnik Nov. 19, 2019, 8:09 p.m. UTC | #5
Hi Aleksandar et al.

how is it going? should I rebase or not?

Michael

On Tue, Nov 5, 2019 at 4:38 PM Aleksandar Markovic
<aleksandar.m.mail@gmail.com> wrote:
>
> On Tue, Nov 5, 2019 at 2:23 PM Richard Henderson
> <richard.henderson@linaro.org> wrote:
> >
> > On 11/5/19 10:46 AM, Aleksandar Markovic wrote:
> > >
> > >
> > > On Tuesday, November 5, 2019, Aleksandar Markovic <aleksandar.m.mail@gmail.com
> > > <mailto:aleksandar.m.mail@gmail.com>> wrote:
> > >
> > >
> > >         +
> > >         +/*
> > >         + *  This instruction performs 8-bit x 8-bit -> 16-bit signed
> > >         multiplication
> > >         + *  and shifts the result one bit left.
> > >         + */
> > >         +static bool trans_FMULSU(DisasContext *ctx, arg_FMULSU *a)
> > >         +{
> > >         +    if (!avr_have_feature(ctx, AVR_FEATURE_MUL)) {
> > >         +        return true;
> > >         +    }
> > >         +
> > >         +    TCGv R0 = cpu_r[0];
> > >         +    TCGv R1 = cpu_r[1];
> > >         +    TCGv Rd = cpu_r[a->rd];
> > >         +    TCGv Rr = cpu_r[a->rr];
> > >         +    TCGv R = tcg_temp_new_i32();
> > >         +    TCGv t0 = tcg_temp_new_i32();
> > >         +
> > >         +    tcg_gen_ext8s_tl(t0, Rd); /* make Rd full 32 bit signed */
> > >         +    tcg_gen_mul_tl(R, t0, Rr); /* R = Rd * Rr */
> > >         +    tcg_gen_andi_tl(R, R, 0xffff); /* make it 16 bits */
> > >         +
> > >         +    tcg_gen_shri_tl(cpu_Cf, R, 15); /* Cf = R(15) */
> > >         +    tcg_gen_setcondi_tl(TCG_COND_EQ, cpu_Zf, R, 0); /* Zf = R == 0 */
> > >         +
> > >         +    tcg_gen_shli_tl(R, R, 1);
> > >         +
> > >         +    tcg_gen_andi_tl(R0, R, 0xff);
> > >         +    tcg_gen_shri_tl(R1, R, 8);
> > >         +    tcg_gen_andi_tl(R1, R1, 0xff);
> > >         +
> > >         +    tcg_temp_free_i32(t0);
> > >         +    tcg_temp_free_i32(R);
> > >         +
> > >         +    return true;
> > >         +}
> > >         +
> > >
> > >
> > >     Hi, Michael.
> > >
> > >     The way I understand the spec is that a->rd and a->rd must be between 16
> > >     and 23:
> > >
> > >     https://www.microchip.com/webdoc/avrassembler/avrassembler.wb_FMULSU.html
> > >     <https://www.microchip.com/webdoc/avrassembler/avrassembler.wb_FMULSU.html>
> > >
> > >     Is my interpretation right? If yes, where is the corresponding part in the
> > >     implementation?
> > >
> > >
> > > Or, perhaps,
> > >
> > > TCGv Rd = cpu_r[a->rd];
> > >
> > > should be
> > >
> > > TCGv Rd = cpu_r[a->rd + 16];
> > >
> > > (and the same for rs)
> >
> > This happens during decode:
> >
> > +%rd_b           4:3                         !function=to_B
> > +%rr_b           0:3                         !function=to_B
> > +@fmul           .... .... . ... . ...       &rd_rr      rd=%rd_b rr=%rr_b
> > +FMUL            0000 0011 0 ... 1 ...       @fmul
> > +FMULS           0000 0011 1 ... 0 ...       @fmul
> > +FMULSU          0000 0011 1 ... 1 ...       @fmul
> >
> > This means that a->rd = to_B(extract32(insn, 4, 3)), and
> >
> > > +static int to_B(DisasContext *ctx, int indx) { return 16 + (indx % 8); }
> >
> > et voila.
> >
>
> OK. Thanks for clarification.
>
> >
> > r~
Aleksandar Markovic Nov. 19, 2019, 9:18 p.m. UTC | #6
On Tue, Nov 19, 2019 at 9:09 PM Michael Rolnik <mrolnik@gmail.com> wrote:
>
> Hi Aleksandar et al.
>
> how is it going? should I rebase or not?
>
> Michael
>

I am just in the process of taking a detailed look to the whole series.

I don't have any serious objection so far, I may just suggest a couple
of things that will improve readability of the code - so nothing of some
really serious nature.

Stay tuned (and be patient). Will hear from me soon.

Yours, Aleksandar
Michael Rolnik Nov. 19, 2019, 9:19 p.m. UTC | #7
Sounds good, thanks!

On Tue, Nov 19, 2019 at 11:18 PM Aleksandar Markovic
<aleksandar.m.mail@gmail.com> wrote:
>
> On Tue, Nov 19, 2019 at 9:09 PM Michael Rolnik <mrolnik@gmail.com> wrote:
> >
> > Hi Aleksandar et al.
> >
> > how is it going? should I rebase or not?
> >
> > Michael
> >
>
> I am just in the process of taking a detailed look to the whole series.
>
> I don't have any serious objection so far, I may just suggest a couple
> of things that will improve readability of the code - so nothing of some
> really serious nature.
>
> Stay tuned (and be patient). Will hear from me soon.
>
> Yours, Aleksandar
diff mbox series

Patch

diff --git a/target/avr/translate.c b/target/avr/translate.c
index 53c9892a60..573c9988b5 100644
--- a/target/avr/translate.c
+++ b/target/avr/translate.c
@@ -130,3 +130,825 @@  static bool avr_have_feature(DisasContext *ctx, int feature)
 static bool decode_insn(DisasContext *ctx, uint16_t insn);
 #include "decode_insn.inc.c"
 
+
+static void gen_add_CHf(TCGv R, TCGv Rd, TCGv Rr)
+{
+    TCGv t1 = tcg_temp_new_i32();
+    TCGv t2 = tcg_temp_new_i32();
+    TCGv t3 = tcg_temp_new_i32();
+
+    tcg_gen_and_tl(t1, Rd, Rr); /* t1 = Rd & Rr */
+    tcg_gen_andc_tl(t2, Rd, R); /* t2 = Rd & ~R */
+    tcg_gen_andc_tl(t3, Rr, R); /* t3 = Rr & ~R */
+    tcg_gen_or_tl(t1, t1, t2); /* t1 = t1 | t2 | t3 */
+    tcg_gen_or_tl(t1, t1, t3);
+
+    tcg_gen_shri_tl(cpu_Cf, t1, 7); /* Cf = t1(7) */
+    tcg_gen_shri_tl(cpu_Hf, t1, 3); /* Hf = t1(3) */
+    tcg_gen_andi_tl(cpu_Hf, cpu_Hf, 1);
+
+    tcg_temp_free_i32(t3);
+    tcg_temp_free_i32(t2);
+    tcg_temp_free_i32(t1);
+}
+
+
+static void gen_add_Vf(TCGv R, TCGv Rd, TCGv Rr)
+{
+    TCGv t1 = tcg_temp_new_i32();
+    TCGv t2 = tcg_temp_new_i32();
+
+    /* t1 = Rd & Rr & ~R | ~Rd & ~Rr & R = (Rd ^ R) & ~(Rd ^ Rr) */
+    tcg_gen_xor_tl(t1, Rd, R);
+    tcg_gen_xor_tl(t2, Rd, Rr);
+    tcg_gen_andc_tl(t1, t1, t2);
+
+    tcg_gen_shri_tl(cpu_Vf, t1, 7); /* Vf = t1(7) */
+
+    tcg_temp_free_i32(t2);
+    tcg_temp_free_i32(t1);
+}
+
+
+static void gen_sub_CHf(TCGv R, TCGv Rd, TCGv Rr)
+{
+    TCGv t1 = tcg_temp_new_i32();
+    TCGv t2 = tcg_temp_new_i32();
+    TCGv t3 = tcg_temp_new_i32();
+
+    /* Cf & Hf */
+    tcg_gen_not_tl(t1, Rd); /* t1 = ~Rd */
+    tcg_gen_and_tl(t2, t1, Rr); /* t2 = ~Rd & Rr */
+    tcg_gen_or_tl(t3, t1, Rr); /* t3 = (~Rd | Rr) & R */
+    tcg_gen_and_tl(t3, t3, R);
+    tcg_gen_or_tl(t2, t2, t3); /* t2 = ~Rd & Rr | ~Rd & R | R & Rr */
+    tcg_gen_shri_tl(cpu_Cf, t2, 7); /* Cf = t2(7) */
+    tcg_gen_shri_tl(cpu_Hf, t2, 3); /* Hf = t2(3) */
+    tcg_gen_andi_tl(cpu_Hf, cpu_Hf, 1);
+
+    tcg_temp_free_i32(t3);
+    tcg_temp_free_i32(t2);
+    tcg_temp_free_i32(t1);
+}
+
+
+static void gen_sub_Vf(TCGv R, TCGv Rd, TCGv Rr)
+{
+    TCGv t1 = tcg_temp_new_i32();
+    TCGv t2 = tcg_temp_new_i32();
+
+    /* Vf */
+    /* t1 = Rd & ~Rr & ~R | ~Rd & Rr & R = (Rd ^ R) & (Rd ^ R) */
+    tcg_gen_xor_tl(t1, Rd, R);
+    tcg_gen_xor_tl(t2, Rd, Rr);
+    tcg_gen_and_tl(t1, t1, t2);
+    tcg_gen_shri_tl(cpu_Vf, t1, 7); /* Vf = t1(7) */
+
+    tcg_temp_free_i32(t2);
+    tcg_temp_free_i32(t1);
+}
+
+
+static void gen_NSf(TCGv R)
+{
+    tcg_gen_shri_tl(cpu_Nf, R, 7); /* Nf = R(7) */
+    tcg_gen_xor_tl(cpu_Sf, cpu_Nf, cpu_Vf); /* Sf = Nf ^ Vf */
+}
+
+
+static void gen_ZNSf(TCGv R)
+{
+    tcg_gen_setcondi_tl(TCG_COND_EQ, cpu_Zf, R, 0); /* Zf = R == 0 */
+    tcg_gen_shri_tl(cpu_Nf, R, 7); /* Nf = R(7) */
+    tcg_gen_xor_tl(cpu_Sf, cpu_Nf, cpu_Vf); /* Sf = Nf ^ Vf */
+}
+
+
+/*
+ *  Adds two registers without the C Flag and places the result in the
+ *  destination register Rd.
+ */
+static bool trans_ADD(DisasContext *ctx, arg_ADD *a)
+{
+    TCGv Rd = cpu_r[a->rd];
+    TCGv Rr = cpu_r[a->rr];
+    TCGv R = tcg_temp_new_i32();
+
+    /* op */
+    tcg_gen_add_tl(R, Rd, Rr); /* Rd = Rd + Rr */
+    tcg_gen_andi_tl(R, R, 0xff); /* make it 8 bits */
+
+    gen_add_CHf(R, Rd, Rr);
+    gen_add_Vf(R, Rd, Rr);
+    gen_ZNSf(R);
+
+    /* R */
+    tcg_gen_mov_tl(Rd, R);
+
+    tcg_temp_free_i32(R);
+
+    return true;
+}
+
+
+/*
+ *  Adds two registers and the contents of the C Flag and places the result in
+ *  the destination register Rd.
+ */
+static bool trans_ADC(DisasContext *ctx, arg_ADC *a)
+{
+    TCGv Rd = cpu_r[a->rd];
+    TCGv Rr = cpu_r[a->rr];
+    TCGv R = tcg_temp_new_i32();
+
+    /* op */
+    tcg_gen_add_tl(R, Rd, Rr); /* R = Rd + Rr + Cf */
+    tcg_gen_add_tl(R, R, cpu_Cf);
+    tcg_gen_andi_tl(R, R, 0xff); /* make it 8 bits */
+
+    gen_add_CHf(R, Rd, Rr);
+    gen_add_Vf(R, Rd, Rr);
+    gen_ZNSf(R);
+
+    /* R */
+    tcg_gen_mov_tl(Rd, R);
+
+    tcg_temp_free_i32(R);
+
+    return true;
+}
+
+
+/*
+ *  Subtracts an immediate value (0-63) from a register pair and places the
+ *  result in the register pair. This instruction operates on the upper four
+ *  register pairs, and is well suited for operations on the Pointer Registers.
+ *  This instruction is not available in all devices. Refer to the device
+ *  specific instruction set summary.
+ */
+static bool trans_SBIW(DisasContext *ctx, arg_SBIW *a)
+{
+    if (!avr_have_feature(ctx, AVR_FEATURE_ADIW_SBIW)) {
+        return true;
+    }
+
+    TCGv RdL = cpu_r[a->rd];
+    TCGv RdH = cpu_r[a->rd + 1];
+    int Imm = (a->imm);
+    TCGv R = tcg_temp_new_i32();
+    TCGv Rd = tcg_temp_new_i32();
+
+    /* op */
+    tcg_gen_deposit_tl(Rd, RdL, RdH, 8, 8); /* Rd = RdH:RdL */
+    tcg_gen_subi_tl(R, Rd, Imm); /* R = Rd - Imm */
+    tcg_gen_andi_tl(R, R, 0xffff); /* make it 16 bits */
+
+    /* Cf */
+    tcg_gen_andc_tl(cpu_Cf, R, Rd);
+    tcg_gen_shri_tl(cpu_Cf, cpu_Cf, 15); /* Cf = R & ~Rd */
+
+    /* Vf */
+    tcg_gen_andc_tl(cpu_Vf, Rd, R);
+    tcg_gen_shri_tl(cpu_Vf, cpu_Vf, 15); /* Vf = Rd & ~R */
+
+    /* Zf */
+    tcg_gen_setcondi_tl(TCG_COND_EQ, cpu_Zf, R, 0); /* Zf = R == 0 */
+
+    /* Nf */
+    tcg_gen_shri_tl(cpu_Nf, R, 15); /* Nf = R(15) */
+
+    /* Sf */
+    tcg_gen_xor_tl(cpu_Sf, cpu_Nf, cpu_Vf); /* Sf = Nf ^ Vf */
+
+    /* R */
+    tcg_gen_andi_tl(RdL, R, 0xff);
+    tcg_gen_shri_tl(RdH, R, 8);
+
+    tcg_temp_free_i32(Rd);
+    tcg_temp_free_i32(R);
+
+    return true;
+}
+
+
+/*
+ *  Adds an immediate value (0 - 63) to a register pair and places the result
+ *  in the register pair. This instruction operates on the upper four register
+ *  pairs, and is well suited for operations on the pointer registers.  This
+ *  instruction is not available in all devices. Refer to the device specific
+ *  instruction set summary.
+ */
+static bool trans_ADIW(DisasContext *ctx, arg_ADIW *a)
+{
+    if (!avr_have_feature(ctx, AVR_FEATURE_ADIW_SBIW)) {
+        return true;
+    }
+
+    TCGv RdL = cpu_r[a->rd];
+    TCGv RdH = cpu_r[a->rd + 1];
+    int Imm = (a->imm);
+    TCGv R = tcg_temp_new_i32();
+    TCGv Rd = tcg_temp_new_i32();
+
+    /* op */
+    tcg_gen_deposit_tl(Rd, RdL, RdH, 8, 8); /* Rd = RdH:RdL */
+    tcg_gen_addi_tl(R, Rd, Imm); /* R = Rd + Imm */
+    tcg_gen_andi_tl(R, R, 0xffff); /* make it 16 bits */
+
+    /* Cf */
+    tcg_gen_andc_tl(cpu_Cf, Rd, R); /* Cf = Rd & ~R */
+    tcg_gen_shri_tl(cpu_Cf, cpu_Cf, 15);
+
+    /* Vf */
+    tcg_gen_andc_tl(cpu_Vf, R, Rd); /* Vf = R & ~Rd */
+    tcg_gen_shri_tl(cpu_Vf, cpu_Vf, 15);
+
+    /* Zf */
+    tcg_gen_setcondi_tl(TCG_COND_EQ, cpu_Zf, R, 0); /* Zf = R == 0 */
+
+    /* Nf */
+    tcg_gen_shri_tl(cpu_Nf, R, 15); /* Nf = R(15) */
+
+    /* Sf */
+    tcg_gen_xor_tl(cpu_Sf, cpu_Nf, cpu_Vf);/* Sf = Nf ^ Vf */
+
+    /* R */
+    tcg_gen_andi_tl(RdL, R, 0xff);
+    tcg_gen_shri_tl(RdH, R, 8);
+
+    tcg_temp_free_i32(Rd);
+    tcg_temp_free_i32(R);
+
+    return true;
+}
+
+
+/*
+ *  Subtracts two registers and places the result in the destination
+ *  register Rd.
+ */
+static bool trans_SUB(DisasContext *ctx, arg_SUB *a)
+{
+    TCGv Rd = cpu_r[a->rd];
+    TCGv Rr = cpu_r[a->rr];
+    TCGv R = tcg_temp_new_i32();
+
+    /* op */
+    tcg_gen_sub_tl(R, Rd, Rr); /* R = Rd - Rr */
+    tcg_gen_andi_tl(R, R, 0xff); /* make it 8 bits */
+
+    gen_sub_CHf(R, Rd, Rr);
+    gen_sub_Vf(R, Rd, Rr);
+    gen_ZNSf(R);
+
+    /* R */
+    tcg_gen_mov_tl(Rd, R);
+
+    tcg_temp_free_i32(R);
+
+    return true;
+}
+
+/*
+ *  Subtracts a register and a constant and places the result in the
+ *  destination register Rd. This instruction is working on Register R16 to R31
+ *  and is very well suited for operations on the X, Y, and Z-pointers.
+ */
+static bool trans_SUBI(DisasContext *ctx, arg_SUBI *a)
+{
+    TCGv Rd = cpu_r[a->rd];
+    TCGv Rr = tcg_const_i32(a->imm);
+    TCGv R = tcg_temp_new_i32();
+
+    /* op */
+    tcg_gen_sub_tl(R, Rd, Rr); /* R = Rd - Imm */
+    tcg_gen_andi_tl(R, R, 0xff); /* make it 8 bits */
+
+    gen_sub_CHf(R, Rd, Rr);
+    gen_sub_Vf(R, Rd, Rr);
+    gen_ZNSf(R);
+
+    /* R */
+    tcg_gen_mov_tl(Rd, R);
+
+    tcg_temp_free_i32(R);
+    tcg_temp_free_i32(Rr);
+
+    return true;
+}
+
+
+/*
+ *  Subtracts two registers and subtracts with the C Flag and places the
+ *  result in the destination register Rd.
+ */
+static bool trans_SBC(DisasContext *ctx, arg_SBC *a)
+{
+    TCGv Rd = cpu_r[a->rd];
+    TCGv Rr = cpu_r[a->rr];
+    TCGv R = tcg_temp_new_i32();
+    TCGv zero = tcg_const_i32(0);
+
+    /* op */
+    tcg_gen_sub_tl(R, Rd, Rr); /* R = Rd - Rr - Cf */
+    tcg_gen_sub_tl(R, R, cpu_Cf);
+    tcg_gen_andi_tl(R, R, 0xff); /* make it 8 bits */
+
+    gen_sub_CHf(R, Rd, Rr);
+    gen_sub_Vf(R, Rd, Rr);
+    gen_NSf(R);
+
+    /*
+     * Previous value remains unchanged when the result is zero;
+     * cleared otherwise.
+     */
+    tcg_gen_movcond_tl(TCG_COND_EQ, cpu_Zf, R, zero, cpu_Zf, zero);
+
+    /* R */
+    tcg_gen_mov_tl(Rd, R);
+
+    tcg_temp_free_i32(zero);
+    tcg_temp_free_i32(R);
+
+    return true;
+}
+
+
+/*
+ *  SBCI -- Subtract Immediate with Carry
+ */
+static bool trans_SBCI(DisasContext *ctx, arg_SBCI *a)
+{
+    TCGv Rd = cpu_r[a->rd];
+    TCGv Rr = tcg_const_i32(a->imm);
+    TCGv R = tcg_temp_new_i32();
+    TCGv zero = tcg_const_i32(0);
+
+    /* op */
+    tcg_gen_sub_tl(R, Rd, Rr); /* R = Rd - Rr - Cf */
+    tcg_gen_sub_tl(R, R, cpu_Cf);
+    tcg_gen_andi_tl(R, R, 0xff); /* make it 8 bits */
+
+    gen_sub_CHf(R, Rd, Rr);
+    gen_sub_Vf(R, Rd, Rr);
+    gen_NSf(R);
+
+    /*
+     * Previous value remains unchanged when the result is zero;
+     * cleared otherwise.
+     */
+    tcg_gen_movcond_tl(TCG_COND_EQ, cpu_Zf, R, zero, cpu_Zf, zero);
+
+    /* R */
+    tcg_gen_mov_tl(Rd, R);
+
+    tcg_temp_free_i32(zero);
+    tcg_temp_free_i32(R);
+    tcg_temp_free_i32(Rr);
+
+    return true;
+}
+
+
+/*
+ *  Performs the logical AND between the contents of register Rd and register
+ *  Rr and places the result in the destination register Rd.
+ */
+static bool trans_AND(DisasContext *ctx, arg_AND *a)
+{
+    TCGv Rd = cpu_r[a->rd];
+    TCGv Rr = cpu_r[a->rr];
+    TCGv R = tcg_temp_new_i32();
+
+    /* op */
+    tcg_gen_and_tl(R, Rd, Rr); /* Rd = Rd and Rr */
+
+    /* Vf */
+    tcg_gen_movi_tl(cpu_Vf, 0); /* Vf = 0 */
+
+    /* Zf */
+    tcg_gen_setcondi_tl(TCG_COND_EQ, cpu_Zf, R, 0); /* Zf = R == 0 */
+
+    gen_ZNSf(R);
+
+    /* R */
+    tcg_gen_mov_tl(Rd, R);
+
+    tcg_temp_free_i32(R);
+
+    return true;
+}
+
+
+/*
+ *  Performs the logical AND between the contents of register Rd and a constant
+ *  and places the result in the destination register Rd.
+ */
+static bool trans_ANDI(DisasContext *ctx, arg_ANDI *a)
+{
+    TCGv Rd = cpu_r[a->rd];
+    int Imm = (a->imm);
+
+    /* op */
+    tcg_gen_andi_tl(Rd, Rd, Imm); /* Rd = Rd & Imm */
+
+    tcg_gen_movi_tl(cpu_Vf, 0x00); /* Vf = 0 */
+    gen_ZNSf(Rd);
+
+    return true;
+}
+
+
+/*
+ *  Performs the logical OR between the contents of register Rd and register
+ *  Rr and places the result in the destination register Rd.
+ */
+static bool trans_OR(DisasContext *ctx, arg_OR *a)
+{
+    TCGv Rd = cpu_r[a->rd];
+    TCGv Rr = cpu_r[a->rr];
+    TCGv R = tcg_temp_new_i32();
+
+    tcg_gen_or_tl(R, Rd, Rr);
+
+    tcg_gen_movi_tl(cpu_Vf, 0);
+    gen_ZNSf(R);
+
+    tcg_gen_mov_tl(Rd, R);
+
+    tcg_temp_free_i32(R);
+
+    return true;
+}
+
+
+/*
+ *  Performs the logical OR between the contents of register Rd and a
+ *  constant and places the result in the destination register Rd.
+ */
+static bool trans_ORI(DisasContext *ctx, arg_ORI *a)
+{
+    TCGv Rd = cpu_r[a->rd];
+    int Imm = (a->imm);
+
+    tcg_gen_ori_tl(Rd, Rd, Imm); /* Rd = Rd | Imm */
+
+    tcg_gen_movi_tl(cpu_Vf, 0x00); /* Vf = 0 */
+    gen_ZNSf(Rd);
+
+    return true;
+}
+
+
+/*
+ *  Performs the logical EOR between the contents of register Rd and
+ *  register Rr and places the result in the destination register Rd.
+ */
+static bool trans_EOR(DisasContext *ctx, arg_EOR *a)
+{
+    TCGv Rd = cpu_r[a->rd];
+    TCGv Rr = cpu_r[a->rr];
+
+    tcg_gen_xor_tl(Rd, Rd, Rr);
+
+    tcg_gen_movi_tl(cpu_Vf, 0);
+    gen_ZNSf(Rd);
+
+    return true;
+}
+
+
+/*
+ *  Clears the specified bits in register Rd. Performs the logical AND
+ *  between the contents of register Rd and the complement of the constant mask
+ *  K. The result will be placed in register Rd.
+ */
+static bool trans_COM(DisasContext *ctx, arg_COM *a)
+{
+    TCGv Rd = cpu_r[a->rd];
+    TCGv R = tcg_temp_new_i32();
+
+    tcg_gen_xori_tl(Rd, Rd, 0xff);
+
+    tcg_gen_movi_tl(cpu_Cf, 1); /* Cf = 1 */
+    tcg_gen_movi_tl(cpu_Vf, 0); /* Vf = 0 */
+    gen_ZNSf(Rd);
+
+    tcg_temp_free_i32(R);
+
+    return true;
+}
+
+
+/*
+ *  Replaces the contents of register Rd with its two's complement; the
+ *  value $80 is left unchanged.
+ */
+static bool trans_NEG(DisasContext *ctx, arg_NEG *a)
+{
+    TCGv Rd = cpu_r[a->rd];
+    TCGv t0 = tcg_const_i32(0);
+    TCGv R = tcg_temp_new_i32();
+
+    /* op */
+    tcg_gen_sub_tl(R, t0, Rd); /* R = 0 - Rd */
+    tcg_gen_andi_tl(R, R, 0xff); /* make it 8 bits */
+
+    gen_sub_CHf(R, t0, Rd);
+    gen_sub_Vf(R, t0, Rd);
+    gen_ZNSf(R);
+
+    /* R */
+    tcg_gen_mov_tl(Rd, R);
+
+    tcg_temp_free_i32(t0);
+    tcg_temp_free_i32(R);
+
+    return true;
+}
+
+
+/*
+ *  Adds one -1- to the contents of register Rd and places the result in the
+ *  destination register Rd.  The C Flag in SREG is not affected by the
+ *  operation, thus allowing the INC instruction to be used on a loop counter in
+ *  multiple-precision computations.  When operating on unsigned numbers, only
+ *  BREQ and BRNE branches can be expected to perform consistently. When
+ *  operating on two's complement values, all signed branches are available.
+ */
+static bool trans_INC(DisasContext *ctx, arg_INC *a)
+{
+    TCGv Rd = cpu_r[a->rd];
+
+    tcg_gen_addi_tl(Rd, Rd, 1);
+    tcg_gen_andi_tl(Rd, Rd, 0xff);
+
+    /* cpu_Vf = Rd == 0x80 */
+    tcg_gen_setcondi_tl(TCG_COND_EQ, cpu_Vf, Rd, 0x80);
+    gen_ZNSf(Rd);
+
+    return true;
+}
+
+
+/*
+ *  Subtracts one -1- from the contents of register Rd and places the result
+ *  in the destination register Rd.  The C Flag in SREG is not affected by the
+ *  operation, thus allowing the DEC instruction to be used on a loop counter in
+ *  multiple-precision computations.  When operating on unsigned values, only
+ *  BREQ and BRNE branches can be expected to perform consistently.  When
+ *  operating on two's complement values, all signed branches are available.
+ */
+static bool trans_DEC(DisasContext *ctx, arg_DEC *a)
+{
+    TCGv Rd = cpu_r[a->rd];
+
+    tcg_gen_subi_tl(Rd, Rd, 1); /* Rd = Rd - 1 */
+    tcg_gen_andi_tl(Rd, Rd, 0xff); /* make it 8 bits */
+
+    /* cpu_Vf = Rd == 0x7f */
+    tcg_gen_setcondi_tl(TCG_COND_EQ, cpu_Vf, Rd, 0x7f);
+    gen_ZNSf(Rd);
+
+    return true;
+}
+
+
+/*
+ *  This instruction performs 8-bit x 8-bit -> 16-bit unsigned multiplication.
+ */
+static bool trans_MUL(DisasContext *ctx, arg_MUL *a)
+{
+    if (!avr_have_feature(ctx, AVR_FEATURE_MUL)) {
+        return true;
+    }
+
+    TCGv R0 = cpu_r[0];
+    TCGv R1 = cpu_r[1];
+    TCGv Rd = cpu_r[a->rd];
+    TCGv Rr = cpu_r[a->rr];
+    TCGv R = tcg_temp_new_i32();
+
+    tcg_gen_mul_tl(R, Rd, Rr); /* R = Rd * Rr */
+
+    tcg_gen_andi_tl(R0, R, 0xff);
+    tcg_gen_shri_tl(R1, R, 8);
+
+    tcg_gen_shri_tl(cpu_Cf, R, 15); /* Cf = R(15) */
+    tcg_gen_setcondi_tl(TCG_COND_EQ, cpu_Zf, R, 0); /* Zf = R == 0 */
+
+    tcg_temp_free_i32(R);
+
+    return true;
+}
+
+
+/*
+ *  This instruction performs 8-bit x 8-bit -> 16-bit signed multiplication.
+ */
+static bool trans_MULS(DisasContext *ctx, arg_MULS *a)
+{
+    if (!avr_have_feature(ctx, AVR_FEATURE_MUL)) {
+        return true;
+    }
+
+    TCGv R0 = cpu_r[0];
+    TCGv R1 = cpu_r[1];
+    TCGv Rd = cpu_r[a->rd];
+    TCGv Rr = cpu_r[a->rr];
+    TCGv R = tcg_temp_new_i32();
+    TCGv t0 = tcg_temp_new_i32();
+    TCGv t1 = tcg_temp_new_i32();
+
+    tcg_gen_ext8s_tl(t0, Rd); /* make Rd full 32 bit signed */
+    tcg_gen_ext8s_tl(t1, Rr); /* make Rr full 32 bit signed */
+    tcg_gen_mul_tl(R, t0, t1); /* R = Rd * Rr */
+    tcg_gen_andi_tl(R, R, 0xffff); /* make it 16 bits */
+
+    tcg_gen_andi_tl(R0, R, 0xff);
+    tcg_gen_shri_tl(R1, R, 8);
+
+    tcg_gen_shri_tl(cpu_Cf, R, 15); /* Cf = R(15) */
+    tcg_gen_setcondi_tl(TCG_COND_EQ, cpu_Zf, R, 0); /* Zf = R == 0 */
+
+    tcg_temp_free_i32(t1);
+    tcg_temp_free_i32(t0);
+    tcg_temp_free_i32(R);
+
+    return true;
+}
+
+
+/*
+ *  This instruction performs 8-bit x 8-bit -> 16-bit multiplication of a
+ *  signed and an unsigned number.
+ */
+static bool trans_MULSU(DisasContext *ctx, arg_MULSU *a)
+{
+    if (!avr_have_feature(ctx, AVR_FEATURE_MUL)) {
+        return true;
+    }
+
+    TCGv R0 = cpu_r[0];
+    TCGv R1 = cpu_r[1];
+    TCGv Rd = cpu_r[a->rd];
+    TCGv Rr = cpu_r[a->rr];
+    TCGv R = tcg_temp_new_i32();
+    TCGv t0 = tcg_temp_new_i32();
+
+    tcg_gen_ext8s_tl(t0, Rd); /* make Rd full 32 bit signed */
+    tcg_gen_mul_tl(R, t0, Rr); /* R = Rd *Rr */
+
+    tcg_gen_andi_tl(R0, R, 0xff);
+    tcg_gen_shri_tl(R1, R, 8);
+
+    tcg_gen_shri_tl(cpu_Cf, R, 16); /* Cf = R(16) */
+    tcg_gen_mov_tl(cpu_Zf, R);
+
+    tcg_temp_free_i32(t0);
+    tcg_temp_free_i32(R);
+
+    return true;
+}
+
+
+/*
+ *  This instruction performs 8-bit x 8-bit -> 16-bit unsigned
+ *  multiplication and shifts the result one bit left.
+ */
+static bool trans_FMUL(DisasContext *ctx, arg_FMUL *a)
+{
+    if (!avr_have_feature(ctx, AVR_FEATURE_MUL)) {
+        return true;
+    }
+
+    TCGv R0 = cpu_r[0];
+    TCGv R1 = cpu_r[1];
+    TCGv Rd = cpu_r[a->rd];
+    TCGv Rr = cpu_r[a->rr];
+    TCGv R = tcg_temp_new_i32();
+
+    tcg_gen_mul_tl(R, Rd, Rr); /* R = Rd * Rr */
+
+    tcg_gen_shri_tl(cpu_Cf, R, 15); /* Cf = R(15) */
+    tcg_gen_setcondi_tl(TCG_COND_EQ, cpu_Zf, R, 0); /* Zf = R == 0 */
+
+    tcg_gen_shli_tl(R, R, 1);
+
+    tcg_gen_andi_tl(R0, R, 0xff);
+    tcg_gen_shri_tl(R1, R, 8);
+    tcg_gen_andi_tl(R1, R1, 0xff);
+
+
+    tcg_temp_free_i32(R);
+
+    return true;
+}
+
+
+/*
+ *  This instruction performs 8-bit x 8-bit -> 16-bit signed multiplication
+ *  and shifts the result one bit left.
+ */
+static bool trans_FMULS(DisasContext *ctx, arg_FMULS *a)
+{
+    if (!avr_have_feature(ctx, AVR_FEATURE_MUL)) {
+        return true;
+    }
+
+    TCGv R0 = cpu_r[0];
+    TCGv R1 = cpu_r[1];
+    TCGv Rd = cpu_r[a->rd];
+    TCGv Rr = cpu_r[a->rr];
+    TCGv R = tcg_temp_new_i32();
+    TCGv t0 = tcg_temp_new_i32();
+    TCGv t1 = tcg_temp_new_i32();
+
+    tcg_gen_ext8s_tl(t0, Rd); /* make Rd full 32 bit signed */
+    tcg_gen_ext8s_tl(t1, Rr); /* make Rr full 32 bit signed */
+    tcg_gen_mul_tl(R, t0, t1); /* R = Rd * Rr */
+    tcg_gen_andi_tl(R, R, 0xffff); /* make it 16 bits */
+
+    tcg_gen_shri_tl(cpu_Cf, R, 15); /* Cf = R(15) */
+    tcg_gen_setcondi_tl(TCG_COND_EQ, cpu_Zf, R, 0); /* Zf = R == 0 */
+
+    tcg_gen_shli_tl(R, R, 1);
+
+    tcg_gen_andi_tl(R0, R, 0xff);
+    tcg_gen_shri_tl(R1, R, 8);
+    tcg_gen_andi_tl(R1, R1, 0xff);
+
+    tcg_temp_free_i32(t1);
+    tcg_temp_free_i32(t0);
+    tcg_temp_free_i32(R);
+
+    return true;
+}
+
+
+/*
+ *  This instruction performs 8-bit x 8-bit -> 16-bit signed multiplication
+ *  and shifts the result one bit left.
+ */
+static bool trans_FMULSU(DisasContext *ctx, arg_FMULSU *a)
+{
+    if (!avr_have_feature(ctx, AVR_FEATURE_MUL)) {
+        return true;
+    }
+
+    TCGv R0 = cpu_r[0];
+    TCGv R1 = cpu_r[1];
+    TCGv Rd = cpu_r[a->rd];
+    TCGv Rr = cpu_r[a->rr];
+    TCGv R = tcg_temp_new_i32();
+    TCGv t0 = tcg_temp_new_i32();
+
+    tcg_gen_ext8s_tl(t0, Rd); /* make Rd full 32 bit signed */
+    tcg_gen_mul_tl(R, t0, Rr); /* R = Rd * Rr */
+    tcg_gen_andi_tl(R, R, 0xffff); /* make it 16 bits */
+
+    tcg_gen_shri_tl(cpu_Cf, R, 15); /* Cf = R(15) */
+    tcg_gen_setcondi_tl(TCG_COND_EQ, cpu_Zf, R, 0); /* Zf = R == 0 */
+
+    tcg_gen_shli_tl(R, R, 1);
+
+    tcg_gen_andi_tl(R0, R, 0xff);
+    tcg_gen_shri_tl(R1, R, 8);
+    tcg_gen_andi_tl(R1, R1, 0xff);
+
+    tcg_temp_free_i32(t0);
+    tcg_temp_free_i32(R);
+
+    return true;
+}
+
+
+/*
+ *  The module is an instruction set extension to the AVR CPU, performing
+ *  DES iterations. The 64-bit data block (plaintext or ciphertext) is placed in
+ *  the CPU register file, registers R0-R7, where LSB of data is placed in LSB
+ *  of R0 and MSB of data is placed in MSB of R7. The full 64-bit key (including
+ *  parity bits) is placed in registers R8- R15, organized in the register file
+ *  with LSB of key in LSB of R8 and MSB of key in MSB of R15. Executing one DES
+ *  instruction performs one round in the DES algorithm. Sixteen rounds must be
+ *  executed in increasing order to form the correct DES ciphertext or
+ *  plaintext. Intermediate results are stored in the register file (R0-R15)
+ *  after each DES instruction. The instruction's operand (K) determines which
+ *  round is executed, and the half carry flag (H) determines whether encryption
+ *  or decryption is performed.  The DES algorithm is described in
+ *  "Specifications for the Data Encryption Standard" (Federal Information
+ *  Processing Standards Publication 46). Intermediate results in this
+ *  implementation differ from the standard because the initial permutation and
+ *  the inverse initial permutation are performed each iteration. This does not
+ *  affect the result in the final ciphertext or plaintext, but reduces
+ *  execution time.
+ */
+static bool trans_DES(DisasContext *ctx, arg_DES *a)
+{
+    /* TODO */
+    if (!avr_have_feature(ctx, AVR_FEATURE_DES)) {
+        return true;
+    }
+
+    return true;
+}