
[RFC,v3,07/11] arm64: compat: Add a 32-bit vDSO

Message ID 20161206160353.14581-8-kevin.brodsky@arm.com (mailing list archive)
State New, archived

Commit Message

Kevin Brodsky Dec. 6, 2016, 4:03 p.m. UTC
Provide the files necessary for building a compat (AArch32) vDSO in
kernel/vdso32.

This is mostly an adaptation of the arm vDSO. The most significant
change in vgettimeofday.c is the use of the arm64 vdso_data struct,
allowing the vDSO data page to be shared between the 32 and 64-bit
vDSOs. Additionally, a different set of barrier macros is used (see
aarch32-barrier.h), as we want to support old 32-bit compilers that
may not support ARMv8 and its new barrier options (the *ld variants).

In addition to the time functions, sigreturn trampolines are also
provided, aiming to replace those in the sigreturn page, as the
latter don't provide any unwinding information (and it's easier to
have just one "user code" page). ARM-specific unwinding directives are
used, based on glibc's implementation. Symbol offsets are made
available to the kernel using the same method as the 64-bit vDSO.

There is unfortunately an important caveat: we cannot get away with
hand-coding 32-bit instructions as in kernel/kuser32.S; this time we
really need a 32-bit compiler. The compat vDSO Makefile relies on
CROSS_COMPILE_ARM32 to provide a 32-bit compiler; appropriate logic
will be added to the arm64 Makefile later on to ensure that an attempt
to build the compat vDSO is made only if this variable has been set
properly.
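
As an illustration (the actual integration into the arm64 Makefile only comes
in a later patch), the intended usage is to point CROSS_COMPILE_ARM32 at a
suitable AArch32 toolchain prefix when invoking make, for instance:

  make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- \
       CROSS_COMPILE_ARM32=arm-linux-gnueabihf-

(both toolchain prefixes above are only examples).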

Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Nathan Lynch <nathan_lynch@mentor.com>
Cc: Christopher Covington <cov@codeaurora.org>
Cc: Dmitry Safonov <dsafonov@virtuozzo.com>
Cc: Jisheng Zhang <jszhang@marvell.com>
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
---
Note regarding the sigreturn trampolines: they should now have proper unwinding
information. I tested them by recompiling glibc to make it use the kernel
sigreturn (instead of setting its own sa_restorer), and then using glibc's
backtrace() inside a signal handler, or, in C++, throwing an exception from a
signal handler (hopefully nobody actually uses this "feature", but who knows...).
Note that in C, you must compile your application with -fasynchronous-unwind-tables
(which is the default on x86 but not on arm), so that backtrace() actually uses the
unwinder to derive the stack trace. gdb is not very useful in this case, because
it has some magic to detect sigreturn trampolines based on the exact sequence of
instructions, and can derive the stack trace from there, without debug/unwinding
information.
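
For reference, a minimal sketch of such a test (hypothetical, not part of this
series) could look like the following. On arm it must be built with
-fasynchronous-unwind-tables, and the kernel trampoline is only used if glibc
does not install its own sa_restorer:

#include <execinfo.h>
#include <signal.h>
#include <unistd.h>

static void handler(int sig)
{
	void *frames[16];
	int n;

	(void)sig;
	n = backtrace(frames, 16);
	/*
	 * With working unwind info, one of the outer frames should be the
	 * sigreturn trampoline (e.g. __kernel_sigreturn_arm).
	 */
	backtrace_symbols_fd(frames, n, STDOUT_FILENO);
	_exit(0);
}

int main(void)
{
	signal(SIGUSR1, handler);
	raise(SIGUSR1);
	return 0;
}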

The only remaining issue (AFAIK) is that backtrace() doesn't manage to print
the name of the Thumb trampolines, and gdb is also confused (the frame shows up
as "#1  0x... in ?? ()"). This seems to be a bug somewhere in the unwinder (and
gdb can't use its magic instruction matching because it expects slightly
different instructions), but I can't tell for sure. Given their limited use, an
option would be to only keep the ARM trampolines; but for now they will remain
there for compatibility with arch/arm.

 arch/arm64/kernel/vdso32/Makefile          | 166 ++++++++++++++++
 arch/arm64/kernel/vdso32/aarch32-barrier.h |  33 ++++
 arch/arm64/kernel/vdso32/sigreturn.S       |  76 ++++++++
 arch/arm64/kernel/vdso32/vdso.S            |  32 ++++
 arch/arm64/kernel/vdso32/vdso.lds.S        |  93 +++++++++
 arch/arm64/kernel/vdso32/vgettimeofday.c   | 295 +++++++++++++++++++++++++++++
 6 files changed, 695 insertions(+)
 create mode 100644 arch/arm64/kernel/vdso32/Makefile
 create mode 100644 arch/arm64/kernel/vdso32/aarch32-barrier.h
 create mode 100644 arch/arm64/kernel/vdso32/sigreturn.S
 create mode 100644 arch/arm64/kernel/vdso32/vdso.S
 create mode 100644 arch/arm64/kernel/vdso32/vdso.lds.S
 create mode 100644 arch/arm64/kernel/vdso32/vgettimeofday.c

Comments

Nathan Lynch Dec. 7, 2016, 6:51 p.m. UTC | #1
Hi Kevin,

Kevin Brodsky <kevin.brodsky@arm.com> writes:
> diff --git a/arch/arm64/kernel/vdso32/vgettimeofday.c b/arch/arm64/kernel/vdso32/vgettimeofday.c
> new file mode 100644
> index 000000000000..53c3d1f82b26
> --- /dev/null
> +++ b/arch/arm64/kernel/vdso32/vgettimeofday.c

As I said at LPC last month, I'm not excited to have arch/arm's
vgettimeofday.c copied into arch/arm64 and tweaked; I'd rather see this
part of the implementation shared between arch/arm and arch/arm64
somehow, even if there's not an elegant way to do so.

The situation which must be avoided is one where the arch/arm64 compat
VDSO incompatibly diverges from the arch/arm VDSO.  That becomes much
less likely if there's only one copy of the userspace-exposed code to
maintain.


[...]

> +/*
> + * We use the hidden visibility to prevent the compiler from generating a GOT
> + * relocation. Not only is going through a GOT useless (the entry couldn't and
> + * mustn't be overridden by another library), it does not even work: the linker
> + * cannot generate an absolute address to the data page.
> + *
> + * With the hidden visibility, the compiler simply generates a PC-relative
> + * relocation (R_ARM_REL32), and this is what we need.
> + */
> +extern const struct vdso_data _vdso_data __attribute__((visibility("hidden")));
> +
> +static inline const struct vdso_data *get_vdso_data(void)
> +{
> +	const struct vdso_data *ret;
> +	/*
> +	 * This simply puts &_vdso_data into ret. The reason why we don't use
> +	 * `ret = &_vdso_data` is that the compiler tends to optimise this in a
> +	 * very suboptimal way: instead of keeping &_vdso_data in a register,
> +	 * it goes through a relocation almost every time _vdso_data must be
> +	 * accessed (even in subfunctions). This is both time and space
> +	 * consuming: each relocation uses a word in the code section, and it
> +	 * has to be loaded at runtime.
> +	 *
> +	 * This trick hides the assignment from the compiler. Since it cannot
> +	 * track where the pointer comes from, it will only use one relocation
> +	 * where get_vdso_data() is called, and then keep the result in a
> +	 * register.
> +	 */
> +	asm("mov %0, %1" : "=r"(ret) : "r"(&_vdso_data));
> +	return ret;
> +}

I think this is an instructive case: this code looks like an improvement
over how the arch/arm VDSO accesses the data page.  Enhancements like
this (not to mention fixes) shouldn't require a patch-per-architecture;
things inevitably get out of sync and it's more of a maintenance burden
in the long term.
Kevin Brodsky Dec. 8, 2016, 10:59 a.m. UTC | #2
On 07/12/16 18:51, Nathan Lynch wrote:
> Hi Kevin,

Hi Nathan,

Thanks for taking a look at these patches!

> Kevin Brodsky <kevin.brodsky@arm.com> writes:
>> diff --git a/arch/arm64/kernel/vdso32/vgettimeofday.c b/arch/arm64/kernel/vdso32/vgettimeofday.c
>> new file mode 100644
>> index 000000000000..53c3d1f82b26
>> --- /dev/null
>> +++ b/arch/arm64/kernel/vdso32/vgettimeofday.c
> As I said at LPC last month, I'm not excited to have arch/arm's
> vgettimeofday.c copied into arch/arm64 and tweaked; I'd rather see this
> part of the implementation shared between arch/arm and arch/arm64
> somehow, even if there's not an elegant way to do so.
>
> The situation which must be avoided is one where the arch/arm64 compat
> VDSO incompatibly diverges from the arch/arm VDSO.  That becomes much
> less likely if there's only one copy of the userspace-exposed code to
> maintain.

I still agree this is very suboptimal. However, I also think this is by far the most
straightforward solution, and I would like to stick to it *as a first step*. If you
diff this "tweaked" vgettimeofday.c with the original one, you'll see that it is
really not obvious how to share even parts of vgettimeofday.c in the current state
of arm and arm64. Besides, arm's vDSO hasn't changed much since it was introduced
last year, and vgettimeofday.c hasn't changed at all, so I think it's fair to say
divergences are unlikely to happen in the near future.

Nevertheless, I completely agree this is not a good situation in the long run, and I 
am not happy with leaving things in this state. After this series is merged, my plan 
is to try and align arm and arm64 both in terms of vDSO features and kernel-vDSO 
interface. This would notably involve:
* Aligning the vDSO data structs (especially vdso_data's member names), so that
vgettimeofday.c can be (mostly) shared.
* Non-trivial Makefile work to share the same one between the arm and arm64 compat 
vDSOs. The refactoring I did in v3 is a first step, as not using top-level flags 
makes it more feasible to share the same Makefile.
* Adding support for CLOCK_MONOTONIC_RAW, as I did in the 64-bit vDSO.
* Moving arm's sigreturn trampolines to its vDSO (if enabled), and convincing glibc
to use the kernel's trampolines like on arm64.

This represents a substantial amount of work, which I think should be in a separate 
series (this one has already grown bigger than I expected).

The final step would be to investigate whether the 64-bit vDSO could reasonably move
to a C implementation, and if so, unify the three vDSOs.

> [...]
>
>> +/*
>> + * We use the hidden visibility to prevent the compiler from generating a GOT
>> + * relocation. Not only is going through a GOT useless (the entry couldn't and
>> + * mustn't be overridden by another library), it does not even work: the linker
>> + * cannot generate an absolute address to the data page.
>> + *
>> + * With the hidden visibility, the compiler simply generates a PC-relative
>> + * relocation (R_ARM_REL32), and this is what we need.
>> + */
>> +extern const struct vdso_data _vdso_data __attribute__((visibility("hidden")));
>> +
>> +static inline const struct vdso_data *get_vdso_data(void)
>> +{
>> +	const struct vdso_data *ret;
>> +	/*
>> +	 * This simply puts &_vdso_data into ret. The reason why we don't use
>> +	 * `ret = &_vdso_data` is that the compiler tends to optimise this in a
>> +	 * very suboptimal way: instead of keeping &_vdso_data in a register,
>> +	 * it goes through a relocation almost every time _vdso_data must be
>> +	 * accessed (even in subfunctions). This is both time and space
>> +	 * consuming: each relocation uses a word in the code section, and it
>> +	 * has to be loaded at runtime.
>> +	 *
>> +	 * This trick hides the assignment from the compiler. Since it cannot
>> +	 * track where the pointer comes from, it will only use one relocation
>> +	 * where get_vdso_data() is called, and then keep the result in a
>> +	 * register.
>> +	 */
>> +	asm("mov %0, %1" : "=r"(ret) : "r"(&_vdso_data));
>> +	return ret;
>> +}
> I think this is an instructive case: this code looks like an improvement
> over how the arch/arm VDSO accesses the data page.  Enhancements like
> this (not to mention fixes) shouldn't require a patch-per-architecture;
> things inevitably get out of sync and it's more of a maintenance burden
> in the long term.

What arm does is suboptimal indeed, as it uses a whole function to derive vdso_data's
address instead of a simple relocation. However, both methods are functionally
equivalent. The approach I used here is simply consistent with how vdso_data is
accessed in the 64-bit vDSO (with these two tricks added because of an assembler
difference and because we are writing C here and GCC is not being smart about it).
Again, I agree the arm vDSO should use the same approach, but unification requires
other changes beyond the scope of this first series.

Thanks,
Kevin

Patch

diff --git a/arch/arm64/kernel/vdso32/Makefile b/arch/arm64/kernel/vdso32/Makefile
new file mode 100644
index 000000000000..87cb8f01d8b7
--- /dev/null
+++ b/arch/arm64/kernel/vdso32/Makefile
@@ -0,0 +1,166 @@ 
+#
+# Building a vDSO image for AArch32.
+#
+# Author: Kevin Brodsky <kevin.brodsky@arm.com>
+# A mix between the arm64 and arm vDSO Makefiles.
+
+CC_ARM32 := $(CROSS_COMPILE_ARM32)gcc
+
+# Same as cc-*option, but using CC_ARM32 instead of CC
+cc32-option = $(call try-run,\
+        $(CC_ARM32) $(1) -c -x c /dev/null -o "$$TMP",$(1),$(2))
+cc32-disable-warning = $(call try-run,\
+	$(CC_ARM32) -W$(strip $(1)) -c -x c /dev/null -o "$$TMP",-Wno-$(strip $(1)))
+cc32-ldoption = $(call try-run,\
+        $(CC_ARM32) $(1) -nostdlib -x c /dev/null -o "$$TMP",$(1),$(2))
+
+# We cannot use the global flags to compile the vDSO files, the main reason
+# being that the 32-bit compiler may be older than the main (64-bit) compiler
+# and therefore may not understand flags set using $(cc-option ...). Besides,
+# arch-specific options should be taken from the arm Makefile instead of the
+# arm64 one.
+# As a result we set our own flags here.
+
+# From top-level Makefile
+# NOSTDINC_FLAGS
+VDSO_CPPFLAGS := -nostdinc -isystem $(shell $(CC_ARM32) -print-file-name=include)
+VDSO_CPPFLAGS += $(LINUXINCLUDE)
+VDSO_CPPFLAGS += $(KBUILD_CPPFLAGS)
+
+# Common C and assembly flags
+# From top-level Makefile
+VDSO_CAFLAGS := $(VDSO_CPPFLAGS)
+VDSO_CAFLAGS += $(call cc32-option,-fno-PIE)
+ifdef CONFIG_DEBUG_INFO
+VDSO_CAFLAGS += -g
+endif
+ifeq ($(shell $(CONFIG_SHELL) $(srctree)/scripts/gcc-goto.sh $(CC_ARM32)), y)
+VDSO_CAFLAGS += -DCC_HAVE_ASM_GOTO
+endif
+
+# From arm Makefile
+VDSO_CAFLAGS += $(call cc32-option,-fno-dwarf2-cfi-asm)
+VDSO_CAFLAGS += -mabi=aapcs-linux -mfloat-abi=soft
+ifeq ($(CONFIG_CPU_BIG_ENDIAN), y)
+VDSO_CAFLAGS += -mbig-endian
+else
+VDSO_CAFLAGS += -mlittle-endian
+endif
+
+# From arm vDSO Makefile
+VDSO_CAFLAGS += -fPIC -fno-builtin -fno-stack-protector
+VDSO_CAFLAGS += -DDISABLE_BRANCH_PROFILING
+
+# Try to compile for ARMv8. If the compiler is too old and doesn't support it,
+# fall back to v7. There is no easy way to check for what architecture the code
+# is being compiled, so define a macro specifying that (see arch/arm/Makefile).
+VDSO_CAFLAGS += $(call cc32-option,-march=armv8-a -D__LINUX_ARM_ARCH__=8,\
+                                   -march=armv7-a -D__LINUX_ARM_ARCH__=7)
+
+VDSO_CFLAGS := $(VDSO_CAFLAGS)
+# KBUILD_CFLAGS from top-level Makefile
+VDSO_CFLAGS += -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs \
+               -fno-strict-aliasing -fno-common \
+               -Werror-implicit-function-declaration \
+               -Wno-format-security \
+               -std=gnu89
+VDSO_CFLAGS  += -O2
+# Some useful compiler-dependent flags from top-level Makefile
+VDSO_CFLAGS += $(call cc32-option,-Wdeclaration-after-statement,)
+VDSO_CFLAGS += $(call cc32-option,-Wno-pointer-sign)
+VDSO_CFLAGS += $(call cc32-option,-fno-strict-overflow)
+VDSO_CFLAGS += $(call cc32-option,-Werror=strict-prototypes)
+VDSO_CFLAGS += $(call cc32-option,-Werror=date-time)
+VDSO_CFLAGS += $(call cc32-option,-Werror=incompatible-pointer-types)
+
+# The 32-bit compiler does not provide 128-bit integers, which are used in
+# some headers that are indirectly included from the vDSO code.
+# This hack makes the compiler happy and should trigger a warning/error if
+# variables of such type are referenced.
+VDSO_CFLAGS += -D__uint128_t='void*'
+# Silence some warnings coming from headers that operate on long's
+# (on GCC 4.8 or older, there is unfortunately no way to silence this warning)
+VDSO_CFLAGS += $(call cc32-disable-warning,shift-count-overflow)
+VDSO_CFLAGS += -Wno-int-to-pointer-cast
+
+VDSO_AFLAGS := $(VDSO_CAFLAGS)
+VDSO_AFLAGS += -D__ASSEMBLY__
+
+VDSO_LDFLAGS := $(VDSO_CPPFLAGS)
+# From arm vDSO Makefile
+VDSO_LDFLAGS += -Wl,-Bsymbolic -Wl,--no-undefined -Wl,-soname=linux-vdso.so.1
+VDSO_LDFLAGS += -Wl,-z,max-page-size=4096 -Wl,-z,common-page-size=4096
+VDSO_LDFLAGS += -nostdlib -shared -mfloat-abi=soft
+VDSO_LDFLAGS += $(call cc32-ldoption,-Wl$(comma)--hash-style=sysv)
+VDSO_LDFLAGS += $(call cc32-ldoption,-Wl$(comma)--build-id)
+VDSO_LDFLAGS += $(call cc32-ldoption,-fuse-ld=bfd)
+
+
+# Borrow vdsomunge.c from the arm vDSO
+munge := arch/arm/vdso/vdsomunge
+hostprogs-y := $(srctree)/$(munge)
+
+c-obj-vdso := vgettimeofday.o
+asm-obj-vdso := sigreturn.o
+
+# Build rules
+targets := $(c-obj-vdso) $(asm-obj-vdso) vdso.so vdso.so.dbg vdso.so.raw
+c-obj-vdso := $(addprefix $(obj)/, $(c-obj-vdso))
+asm-obj-vdso := $(addprefix $(obj)/, $(asm-obj-vdso))
+obj-vdso := $(c-obj-vdso) $(asm-obj-vdso)
+
+obj-y += vdso.o
+extra-y += vdso.lds
+CPPFLAGS_vdso.lds += -P -C -U$(ARCH)
+
+# Force dependency (vdso.s includes vdso.so through incbin)
+$(obj)/vdso.o: $(obj)/vdso.so
+
+include/generated/vdso32-offsets.h: $(obj)/vdso.so.dbg FORCE
+	$(call if_changed,vdsosym)
+
+# Strip rule for vdso.so
+$(obj)/vdso.so: OBJCOPYFLAGS := -S
+$(obj)/vdso.so: $(obj)/vdso.so.dbg FORCE
+	$(call if_changed,objcopy)
+
+$(obj)/vdso.so.dbg: $(obj)/vdso.so.raw $(objtree)/$(munge) FORCE
+	$(call if_changed,vdsomunge)
+
+# Link rule for the .so file, .lds has to be first
+$(obj)/vdso.so.raw: $(src)/vdso.lds $(obj-vdso) FORCE
+	$(call if_changed,vdsold)
+
+# Compilation rules for the vDSO sources
+$(c-obj-vdso): %.o: %.c FORCE
+	$(call if_changed_dep,vdsocc)
+$(asm-obj-vdso): %.o: %.S FORCE
+	$(call if_changed_dep,vdsoas)
+
+# Actual build commands
+quiet_cmd_vdsold = VDSOL   $@
+      cmd_vdsold = $(CC_ARM32) -Wp,-MD,$(depfile) $(VDSO_LDFLAGS) \
+                   -Wl,-T $(filter %.lds,$^) $(filter %.o,$^) -o $@
+quiet_cmd_vdsocc = VDSOC   $@
+      cmd_vdsocc = $(CC_ARM32) -Wp,-MD,$(depfile) $(VDSO_CFLAGS) -c -o $@ $<
+quiet_cmd_vdsoas = VDSOA   $@
+      cmd_vdsoas = $(CC_ARM32) -Wp,-MD,$(depfile) $(VDSO_AFLAGS) -c -o $@ $<
+
+quiet_cmd_vdsomunge = MUNGE   $@
+      cmd_vdsomunge = $(objtree)/$(munge) $< $@
+
+# Generate vDSO offsets using helper script (borrowed from the 64-bit vDSO)
+gen-vdsosym := $(srctree)/$(src)/../vdso/gen_vdso_offsets.sh
+quiet_cmd_vdsosym = VDSOSYM $@
+# The AArch64 nm should be able to read an AArch32 binary
+      cmd_vdsosym = $(NM) $< | $(gen-vdsosym) | LC_ALL=C sort > $@
+
+# Install commands for the unstripped file
+quiet_cmd_vdso_install = INSTALL $@
+      cmd_vdso_install = cp $(obj)/$@.dbg $(MODLIB)/vdso/vdso32.so
+
+vdso.so: $(obj)/vdso.so.dbg
+	@mkdir -p $(MODLIB)/vdso
+	$(call cmd,vdso_install)
+
+vdso_install: vdso.so
diff --git a/arch/arm64/kernel/vdso32/aarch32-barrier.h b/arch/arm64/kernel/vdso32/aarch32-barrier.h
new file mode 100644
index 000000000000..31bfee63d59d
--- /dev/null
+++ b/arch/arm64/kernel/vdso32/aarch32-barrier.h
@@ -0,0 +1,33 @@ 
+/*
+ * Barriers for AArch32 code.
+ *
+ * Copyright (C) 2016 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __VDSO32_AARCH32_BARRIER_H
+#define __VDSO32_AARCH32_BARRIER_H
+
+#include <asm/barrier.h>
+
+#if __LINUX_ARM_ARCH__ >= 8
+#define aarch32_smp_mb()	dmb(ish)
+#define aarch32_smp_rmb()	dmb(ishld)
+#define aarch32_smp_wmb()	dmb(ishst)
+#else
+#define aarch32_smp_mb()	dmb(ish)
+#define aarch32_smp_rmb()	dmb(ish) /* ishld does not exist on ARMv7 */
+#define aarch32_smp_wmb()	dmb(ishst)
+#endif
+
+#endif	/* __VDSO32_AARCH32_BARRIER_H */
diff --git a/arch/arm64/kernel/vdso32/sigreturn.S b/arch/arm64/kernel/vdso32/sigreturn.S
new file mode 100644
index 000000000000..9267a4d78404
--- /dev/null
+++ b/arch/arm64/kernel/vdso32/sigreturn.S
@@ -0,0 +1,76 @@ 
+/*
+ * Sigreturn trampolines for returning from a signal when the SA_RESTORER
+ * flag is not set.
+ *
+ * Copyright (C) 2016 ARM Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Based on glibc's arm sa_restorer. While this is not strictly necessary, we
+ * provide both A32 and T32 versions, in accordance with the arm sigreturn
+ * code.
+ */
+
+#include <linux/linkage.h>
+#include <asm/asm-offsets.h>
+#include <asm/unistd.h>
+
+.macro sigreturn_trampoline name, syscall, regs_offset
+	/*
+	 * We provide directives for enabling stack unwinding through the
+	 * trampoline. On arm, CFI directives are only used for debugging (and
+	 * the vDSO is stripped of debug information), so only the arm-specific
+	 * unwinding directives are useful here.
+	 */
+	.fnstart
+	.save {r0-r15}
+	.pad #\regs_offset
+	/*
+	 * It is necessary to start the unwind tables at least one instruction
+	 * before the trampoline, as the unwinder will assume that the signal
+	 * handler has been called from the trampoline, that is just before
+	 * where the signal handler returns (mov r7, ...).
+	 */
+	nop
+ENTRY(\name)
+	mov	r7, #\syscall
+	svc	#0
+	.fnend
+	/*
+	 * We would like to use ENDPROC, but the macro uses @ which is a
+	 * comment symbol for arm assemblers, so directly use .type with %
+	 * instead.
+	 */
+	.type \name, %function
+END(\name)
+.endm
+
+	.text
+
+	.arm
+	sigreturn_trampoline __kernel_sigreturn_arm, \
+			     __NR_compat_sigreturn, \
+			     COMPAT_SIGFRAME_REGS_OFFSET
+
+	sigreturn_trampoline __kernel_rt_sigreturn_arm, \
+			     __NR_compat_rt_sigreturn, \
+			     COMPAT_RT_SIGFRAME_REGS_OFFSET
+
+	.thumb
+	sigreturn_trampoline __kernel_sigreturn_thumb, \
+			     __NR_compat_sigreturn, \
+			     COMPAT_SIGFRAME_REGS_OFFSET
+
+	sigreturn_trampoline __kernel_rt_sigreturn_thumb, \
+			     __NR_compat_rt_sigreturn, \
+			     COMPAT_RT_SIGFRAME_REGS_OFFSET
diff --git a/arch/arm64/kernel/vdso32/vdso.S b/arch/arm64/kernel/vdso32/vdso.S
new file mode 100644
index 000000000000..fe19ff70eb76
--- /dev/null
+++ b/arch/arm64/kernel/vdso32/vdso.S
@@ -0,0 +1,32 @@ 
+/*
+ * Copyright (C) 2012 ARM Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Author: Will Deacon <will.deacon@arm.com>
+ */
+
+#include <linux/init.h>
+#include <linux/linkage.h>
+#include <linux/const.h>
+#include <asm/page.h>
+
+	.globl vdso32_start, vdso32_end
+	.section .rodata
+	.balign PAGE_SIZE
+vdso32_start:
+	.incbin "arch/arm64/kernel/vdso32/vdso.so"
+	.balign PAGE_SIZE
+vdso32_end:
+
+	.previous
diff --git a/arch/arm64/kernel/vdso32/vdso.lds.S b/arch/arm64/kernel/vdso32/vdso.lds.S
new file mode 100644
index 000000000000..89560e80bd21
--- /dev/null
+++ b/arch/arm64/kernel/vdso32/vdso.lds.S
@@ -0,0 +1,93 @@ 
+/*
+ * Adapted from arm64 version.
+ *
+ * GNU linker script for the VDSO library.
+ *
+ * Copyright (C) 2012 ARM Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Author: Will Deacon <will.deacon@arm.com>
+ * Heavily based on the vDSO linker scripts for other archs.
+ */
+
+#include <linux/const.h>
+#include <asm/page.h>
+#include <asm/vdso.h>
+
+OUTPUT_FORMAT("elf32-littlearm", "elf32-bigarm", "elf32-littlearm")
+OUTPUT_ARCH(arm)
+
+SECTIONS
+{
+	PROVIDE_HIDDEN(_vdso_data = . - PAGE_SIZE);
+	. = VDSO_LBASE + SIZEOF_HEADERS;
+
+	.hash		: { *(.hash) }			:text
+	.gnu.hash	: { *(.gnu.hash) }
+	.dynsym		: { *(.dynsym) }
+	.dynstr		: { *(.dynstr) }
+	.gnu.version	: { *(.gnu.version) }
+	.gnu.version_d	: { *(.gnu.version_d) }
+	.gnu.version_r	: { *(.gnu.version_r) }
+
+	.note		: { *(.note.*) }		:text	:note
+
+	.dynamic	: { *(.dynamic) }		:text	:dynamic
+
+	.rodata		: { *(.rodata*) }		:text
+
+	.text		: { *(.text*) }			:text	=0xe7f001f2
+
+	.got		: { *(.got) }
+	.rel.plt	: { *(.rel.plt) }
+
+	/DISCARD/	: {
+		*(.note.GNU-stack)
+		*(.data .data.* .gnu.linkonce.d.* .sdata*)
+		*(.bss .sbss .dynbss .dynsbss)
+	}
+}
+
+/*
+ * We must supply the ELF program headers explicitly to get just one
+ * PT_LOAD segment, and set the flags explicitly to make segments read-only.
+ */
+PHDRS
+{
+	text		PT_LOAD		FLAGS(5) FILEHDR PHDRS; /* PF_R|PF_X */
+	dynamic		PT_DYNAMIC	FLAGS(4);		/* PF_R */
+	note		PT_NOTE		FLAGS(4);		/* PF_R */
+}
+
+VERSION
+{
+	LINUX_2.6 {
+	global:
+		__vdso_clock_gettime;
+		__vdso_gettimeofday;
+		__kernel_sigreturn_arm;
+		__kernel_sigreturn_thumb;
+		__kernel_rt_sigreturn_arm;
+		__kernel_rt_sigreturn_thumb;
+	local: *;
+	};
+}
+
+/*
+ * Make the sigreturn code visible to the kernel.
+ */
+VDSO_compat_sigreturn_arm	= __kernel_sigreturn_arm;
+VDSO_compat_sigreturn_thumb	= __kernel_sigreturn_thumb;
+VDSO_compat_rt_sigreturn_arm	= __kernel_rt_sigreturn_arm;
+VDSO_compat_rt_sigreturn_thumb	= __kernel_rt_sigreturn_thumb;
diff --git a/arch/arm64/kernel/vdso32/vgettimeofday.c b/arch/arm64/kernel/vdso32/vgettimeofday.c
new file mode 100644
index 000000000000..53c3d1f82b26
--- /dev/null
+++ b/arch/arm64/kernel/vdso32/vgettimeofday.c
@@ -0,0 +1,295 @@ 
+/*
+ * Copyright 2015 Mentor Graphics Corporation.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; version 2 of the
+ * License.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/clocksource.h>
+#include <linux/compiler.h>
+#include <linux/time.h>
+#include <asm/unistd.h>
+#include <asm/vdso_datapage.h>
+
+#include "aarch32-barrier.h"
+
+/*
+ * We use the hidden visibility to prevent the compiler from generating a GOT
+ * relocation. Not only is going through a GOT useless (the entry couldn't and
+ * mustn't be overridden by another library), it does not even work: the linker
+ * cannot generate an absolute address to the data page.
+ *
+ * With the hidden visibility, the compiler simply generates a PC-relative
+ * relocation (R_ARM_REL32), and this is what we need.
+ */
+extern const struct vdso_data _vdso_data __attribute__((visibility("hidden")));
+
+static inline const struct vdso_data *get_vdso_data(void)
+{
+	const struct vdso_data *ret;
+	/*
+	 * This simply puts &_vdso_data into ret. The reason why we don't use
+	 * `ret = &_vdso_data` is that the compiler tends to optimise this in a
+	 * very suboptimal way: instead of keeping &_vdso_data in a register,
+	 * it goes through a relocation almost every time _vdso_data must be
+	 * accessed (even in subfunctions). This is both time and space
+	 * consuming: each relocation uses a word in the code section, and it
+	 * has to be loaded at runtime.
+	 *
+	 * This trick hides the assignment from the compiler. Since it cannot
+	 * track where the pointer comes from, it will only use one relocation
+	 * where get_vdso_data() is called, and then keep the result in a
+	 * register.
+	 */
+	asm("mov %0, %1" : "=r"(ret) : "r"(&_vdso_data));
+	return ret;
+}
+
+static notrace u32 __vdso_read_begin(const struct vdso_data *vdata)
+{
+	u32 seq;
+repeat:
+	seq = ACCESS_ONCE(vdata->tb_seq_count);
+	if (seq & 1) {
+		cpu_relax();
+		goto repeat;
+	}
+	return seq;
+}
+
+static notrace u32 vdso_read_begin(const struct vdso_data *vdata)
+{
+	u32 seq;
+
+	seq = __vdso_read_begin(vdata);
+
+	aarch32_smp_rmb();
+	return seq;
+}
+
+static notrace int vdso_read_retry(const struct vdso_data *vdata, u32 start)
+{
+	aarch32_smp_rmb();
+	return vdata->tb_seq_count != start;
+}
+
+/*
+ * Note: only AEABI is supported by the compat layer, we can assume AEABI
+ * syscall conventions are used.
+ */
+static notrace long clock_gettime_fallback(clockid_t _clkid,
+					   struct timespec *_ts)
+{
+	register struct timespec *ts asm("r1") = _ts;
+	register clockid_t clkid asm("r0") = _clkid;
+	register long ret asm ("r0");
+	register long nr asm("r7") = __NR_compat_clock_gettime;
+
+	asm volatile(
+	"	svc #0\n"
+	: "=r" (ret)
+	: "r" (clkid), "r" (ts), "r" (nr)
+	: "memory");
+
+	return ret;
+}
+
+static notrace int do_realtime_coarse(struct timespec *ts,
+				      const struct vdso_data *vdata)
+{
+	u32 seq;
+
+	do {
+		seq = vdso_read_begin(vdata);
+
+		ts->tv_sec = vdata->xtime_coarse_sec;
+		ts->tv_nsec = vdata->xtime_coarse_nsec;
+
+	} while (vdso_read_retry(vdata, seq));
+
+	return 0;
+}
+
+static notrace int do_monotonic_coarse(struct timespec *ts,
+				       const struct vdso_data *vdata)
+{
+	struct timespec tomono;
+	u32 seq;
+
+	do {
+		seq = vdso_read_begin(vdata);
+
+		ts->tv_sec = vdata->xtime_coarse_sec;
+		ts->tv_nsec = vdata->xtime_coarse_nsec;
+
+		tomono.tv_sec = vdata->wtm_clock_sec;
+		tomono.tv_nsec = vdata->wtm_clock_nsec;
+
+	} while (vdso_read_retry(vdata, seq));
+
+	ts->tv_sec += tomono.tv_sec;
+	timespec_add_ns(ts, tomono.tv_nsec);
+
+	return 0;
+}
+
+static notrace u64 get_ns(const struct vdso_data *vdata)
+{
+	u64 cycle_delta;
+	u64 cycle_now;
+	u64 nsec;
+
+	/* AArch32 implementation of arch_counter_get_cntvct() */
+	isb();
+	asm volatile("mrrc p15, 1, %Q0, %R0, c14" : "=r" (cycle_now));
+
+	/* The virtual counter provides 56 significant bits. */
+	cycle_delta = (cycle_now - vdata->cs_cycle_last) & CLOCKSOURCE_MASK(56);
+
+	nsec = (cycle_delta * vdata->cs_mono_mult) + vdata->xtime_clock_nsec;
+	nsec >>= vdata->cs_shift;
+
+	return nsec;
+}
+
+static notrace int do_realtime(struct timespec *ts,
+			       const struct vdso_data *vdata)
+{
+	u64 nsecs;
+	u32 seq;
+
+	do {
+		seq = vdso_read_begin(vdata);
+
+		if (vdata->use_syscall)
+			return -1;
+
+		ts->tv_sec = vdata->xtime_clock_sec;
+		nsecs = get_ns(vdata);
+
+	} while (vdso_read_retry(vdata, seq));
+
+	ts->tv_nsec = 0;
+	timespec_add_ns(ts, nsecs);
+
+	return 0;
+}
+
+static notrace int do_monotonic(struct timespec *ts,
+				const struct vdso_data *vdata)
+{
+	struct timespec tomono;
+	u64 nsecs;
+	u32 seq;
+
+	do {
+		seq = vdso_read_begin(vdata);
+
+		if (vdata->use_syscall)
+			return -1;
+
+		ts->tv_sec = vdata->xtime_clock_sec;
+		nsecs = get_ns(vdata);
+
+		tomono.tv_sec = vdata->wtm_clock_sec;
+		tomono.tv_nsec = vdata->wtm_clock_nsec;
+
+	} while (vdso_read_retry(vdata, seq));
+
+	ts->tv_sec += tomono.tv_sec;
+	ts->tv_nsec = 0;
+	timespec_add_ns(ts, nsecs + tomono.tv_nsec);
+
+	return 0;
+}
+
+notrace int __vdso_clock_gettime(clockid_t clkid, struct timespec *ts)
+{
+	const struct vdso_data *vdata = get_vdso_data();
+	int ret = -1;
+
+	switch (clkid) {
+	case CLOCK_REALTIME_COARSE:
+		ret = do_realtime_coarse(ts, vdata);
+		break;
+	case CLOCK_MONOTONIC_COARSE:
+		ret = do_monotonic_coarse(ts, vdata);
+		break;
+	case CLOCK_REALTIME:
+		ret = do_realtime(ts, vdata);
+		break;
+	case CLOCK_MONOTONIC:
+		ret = do_monotonic(ts, vdata);
+		break;
+	default:
+		break;
+	}
+
+	if (ret)
+		ret = clock_gettime_fallback(clkid, ts);
+
+	return ret;
+}
+
+static notrace long gettimeofday_fallback(struct timeval *_tv,
+					  struct timezone *_tz)
+{
+	register struct timezone *tz asm("r1") = _tz;
+	register struct timeval *tv asm("r0") = _tv;
+	register long ret asm ("r0");
+	register long nr asm("r7") = __NR_compat_gettimeofday;
+
+	asm volatile(
+	"	svc #0\n"
+	: "=r" (ret)
+	: "r" (tv), "r" (tz), "r" (nr)
+	: "memory");
+
+	return ret;
+}
+
+notrace int __vdso_gettimeofday(struct timeval *tv, struct timezone *tz)
+{
+	struct timespec ts;
+	const struct vdso_data *vdata = get_vdso_data();
+	int ret;
+
+	ret = do_realtime(&ts, vdata);
+	if (ret)
+		return gettimeofday_fallback(tv, tz);
+
+	if (tv) {
+		tv->tv_sec = ts.tv_sec;
+		tv->tv_usec = ts.tv_nsec / 1000;
+	}
+	if (tz) {
+		tz->tz_minuteswest = vdata->tz_minuteswest;
+		tz->tz_dsttime = vdata->tz_dsttime;
+	}
+
+	return ret;
+}
+
+/* Avoid unresolved references emitted by GCC */
+
+void __aeabi_unwind_cpp_pr0(void)
+{
+}
+
+void __aeabi_unwind_cpp_pr1(void)
+{
+}
+
+void __aeabi_unwind_cpp_pr2(void)
+{
+}