Friday, February 13, 2015

Capsicumizing strings

Like many others, I had an initial reaction of boggle when our very own Michal Zalewski showed that the strings utility isn't safe to run on untrusted files (CVE-2014-8485), because of its use of libbfd to parse object file internals.

However, the strings code gives us a perfect target to demonstrate the process of applying a Capsicum sandbox to an existing small application (similar to Ben Laurie's example of Capsicumizing bzip2). The net result is a sandboxed version of the program, where any remote code execution (RCE) attack is significantly restricted in what it can do.

In particular, the Capsicum-contained code would only be able to read the input files (content and metadata) and output to stdout / stderr. Given that the attacker could already read the input file (since they provided it) and could already output arbitrary strings to stdout (just by including them in the input file), exploiting a Capsicum-protected strings doesn't gain an attacker much*.

[*: There are still some avenues of attack against the Capsicum-protected code. The two most obvious are that an attacker can spew out arbitrary data, not just ASCII strings, and that Capsicum provides little protection against local DoS / resource exhaustion attacks.]

Existing Code Structure

We start by looking at the main loop that does the work of string-hunting (here and below, I'm compacting the code and removing some error-checking arms to save space and help clarity):

  if (optind >= argc) {
      datasection_only = FALSE;
      print_strings("{standard input}", stdin, 0, 0, 0, (char *) NULL);
  } else {
      for (; optind < argc; ++optind) {
        if (strcmp(argv[optind], "-") == 0) {
          datasection_only = FALSE;
        } else {
          exit_status |= strings_file(argv[optind]) == FALSE;
        }
      }
  }

The datasection_only global variable governs whether the binary file descriptor (BFD) library is used for parsing, and is controlled by the -a and -d options. Each file specified on the command line is passed into strings_file() in turn, so that needs a closer examination:

static bfd_boolean strings_file(char *file)
{
  struct stat st;
  if (stat(file, &st) < 0)
    return FALSE;

  if (!datasection_only || !strings_object_file(file)) {
      FILE *stream = fopen(file, FOPEN_RB);
      if (stream == NULL) {
        fprintf(stderr, "%s: ", program_name); perror(file);
        return FALSE;
      }

      print_strings(file, stream, (file_ptr) 0, 0, 0, (char *) 0);

      if (fclose(stream) == EOF) {
        fprintf(stderr, "%s: ", program_name); perror(file);
        return FALSE;
      }
  }
  return TRUE;
}

In both the main loop and this per-file function, the print_strings() function is used to do a raw search of a particular FILE* for strings, but the strings_object_file() may get first crack at each file. Assuming the file looks like a supported object format, this function asks the BFD library to iterate over sections and call back into strings_a_section() for each section:

static bfd_boolean strings_object_file(const char *file)
{
  filename_and_size_t filename_and_size = {file, 0};
  bfd *abfd = bfd_openr(file, target);
  if (abfd == NULL)  /* Treat the file as a non-object file.  */
    return FALSE;

  if (!bfd_check_format(abfd, bfd_object)) {
    bfd_close(abfd);
    return FALSE;
  }

  got_a_section = FALSE;
  bfd_map_over_sections(abfd, strings_a_section, &filename_and_size);

  if (!bfd_close(abfd)) {
    bfd_nonfatal(file);
    return FALSE;
  }
  return got_a_section;
}

Finally, the callback function strings_a_section() limits its search to specific sections, gets the section information from the BFD library, and passes the location of the section to the workhorse print_strings() function:

static void strings_a_section(bfd *abfd, asection *sect, void *arg)
{
  filename_and_size_t *filename_and_sizep = (filename_and_size_t *)arg;
  bfd_size_type sectsize;
  void *mem;

  if ((sect->flags & DATA_FLAGS) != DATA_FLAGS)
    return;
  sectsize = bfd_get_section_size(sect);
  if (sectsize <= 0)
    return;

  mem = xmalloc(sectsize);
  if (bfd_get_section_contents (abfd, sect, mem, (file_ptr) 0, sectsize)) {
    got_a_section = TRUE;
    print_strings(filename_and_sizep->filename, NULL, sect->filepos,
                  0, sectsize, (char *) mem);
  }
  free(mem);
}

This also makes clear that print_strings() has two different modes – one where it reads characters from a location in a FILE*, and one where it processes already-read bytes in a buffer; the details of this are implemented in the get_char() utility function.

Plan of Attack

Now that we understand the code structure, we can plan how to Capsicumize the program. The default plan of attack for these kinds of applications works here:

  • Open all of the input files before entering the main loop.
  • Restrict the input files to just have CAP_READ and CAP_SEEK rights
  • Restrict the outputs (stdout and stderr here) to just have the CAP_WRITE right
  • Enter Capsicum capability mode before starting the main loop.

However, there are additional tasks that are specific to this particular application; in particular, functions that take a char *filename argument need to be converted to alternatively take a pre-opened int fd or FILE *stream argument. We also need to take care that the detailed output of strings is preserved – for example, by deferring error messages from failed file-open operations until the appropriate moment.

(It's worth mentioning that there are other design patterns for Capsicumizing more complicated applications. For example, a compartmentalized application might pass (capability) file descriptors around over local sockets.)

FILEs Not File Names

Converting strings to use FILE* streams rather than filenames internally turns out to be fairly straightforward, thanks to the richness of the BFD API. The main cause for concern is the call to bfd_openr(), which asks the BFD library to open a particular filename; however, the BFD API also includes the bfd_openstreamr() and bfd_fopen() entrypoints that allow the user to supply an existing FILE * stream or file descriptor instead of a filename.

One small wrinkle is that any call to bfd_close on the resulting bfd object also closes the underlying file. This is awkward if the BFD library decides that this isn't a valid object file after all; we can't fall back to byte-by-byte examination of the file if it has been closed already. So we dup() the underlying file descriptor before letting the BFD library have at it.

The remainder of the patch merely moves the file open/close code out of the strings_file() function, and into its caller, ready for the next stage.

From d662f77cb63c4b8247ba8bc2288c42576dc7f7a0 Mon Sep 17 00:00:00 2001
From: David Drysdale <drysdale@google.com>
Date: Tue, 25 Nov 2014 12:52:24 +0000
Subject: [PATCH 1/4] Pass around opened FILE* not a filename

strings_file() and strings_object_file() are converted to take a FILE*
to operate on, rather than just a filename that they open themselves.
---
 binutils/strings.c | 62 +++++++++++++++++++++++++++++++-----------------------
 1 file changed, 36 insertions(+), 26 deletions(-)

diff --git a/binutils/strings.c b/binutils/strings.c
index 2cf046fded1a..09eefc250e69 100644
--- a/binutils/strings.c
+++ b/binutils/strings.c
@@ -139,8 +139,8 @@ typedef struct
 } filename_and_size_t;

 static void strings_a_section (bfd *, asection *, void *);
-static bfd_boolean strings_object_file (const char *);
-static bfd_boolean strings_file (char *);
+static bfd_boolean strings_object_file (const char *, FILE *);
+static bfd_boolean strings_file (char *, FILE *);
 static void print_strings (const char *, FILE *, file_ptr, int, int, char *);
 static void usage (FILE *, int);
 static long get_char (FILE *, file_ptr *, int *, char **);
@@ -306,8 +306,26 @@ main (int argc, char **argv)
             datasection_only = FALSE;
           else
             {
+              FILE *stream;
+
               files_given = TRUE;
-              exit_status |= strings_file (argv[optind]) == FALSE;
+              stream = fopen (argv[optind], FOPEN_RB);
+              if (stream == NULL)
+                {
+                  fprintf (stderr, "%s: ", program_name);
+                  perror (argv[optind]);
+                  exit_status = TRUE;
+                }
+              else
+                {
+                  exit_status |= strings_file (argv[optind], stream) == FALSE;
+                  if (fclose (stream) == EOF)
+                    {
+                      fprintf (stderr, "%s: ", program_name);
+                      perror (argv[optind]);
+                      return FALSE;
+                    }
+                }
             }
         }
     }
@@ -383,16 +401,24 @@ strings_a_section (bfd *abfd, asection *sect, void *arg)
    FALSE if not (such as if FILE is not an object file).  */

 static bfd_boolean
-strings_object_file (const char *file)
+strings_object_file (const char *file, FILE *stream)
 {
   filename_and_size_t filename_and_size;
+  int fd;
   bfd *abfd;

-  abfd = bfd_openr (file, target);
+  fd = dup(fileno(stream));
+  if (fd < 0)
+    return FALSE;
+
+  abfd = bfd_fopen (file, target, FOPEN_RB, fd);

   if (abfd == NULL)
-    /* Treat the file as a non-object file.  */
-    return FALSE;
+    {
+      /* Treat the file as a non-object file.  */
+      close(fd);
+      return FALSE;
+    }

   /* This call is mainly for its side effect of reading in the sections.
      We follow the traditional behavior of `strings' in that we don't
@@ -420,13 +446,13 @@ strings_object_file (const char *file)
 /* Print the strings in FILE.  Return TRUE if ok, FALSE if an error occurs.  */

 static bfd_boolean
-strings_file (char *file)
+strings_file (char *file, FILE *stream)
 {
   struct stat st;

   /* get_file_size does not support non-S_ISREG files.  */

-  if (stat (file, &st) < 0)
+  if (fstat (fileno(stream), &st) < 0)
     {
       if (errno == ENOENT)
         non_fatal (_("'%s': No such file"), file);
@@ -440,26 +466,10 @@ strings_file (char *file)
      try to open it as an object file and only look at
      initialized data sections.  If that fails, fall back to the
      whole file.  */
-  if (!datasection_only || !strings_object_file (file))
+  if (!datasection_only || !strings_object_file (file, stream))
     {
-      FILE *stream;
-
-      stream = fopen (file, FOPEN_RB);
-      if (stream == NULL)
-        {
-          fprintf (stderr, "%s: ", program_name);
-          perror (file);
-          return FALSE;
-        }
-
       print_strings (file, stream, (file_ptr) 0, 0, 0, (char *) 0);

-      if (fclose (stream) == EOF)
-        {
-          fprintf (stderr, "%s: ", program_name);
-          perror (file);
-          return FALSE;
-        }
     }

   return TRUE;
--
1.9.1

Opening In Advance

The next change we need is to ensure that all of the files we need are opened before the main business of scanning (untrusted, user-provided) files starts. This is a fairly straightforward re-structure, but the need to preserve the existing behaviour and order of output leads to the creation of a new stream_into_t structure. This structure tracks:

  • the filename
  • the FILE* stream (if successfully opened)
  • whether this input is a direct stream (i.e. stdin) rather than a file-based stream
  • whether this input should trigger a switch to non-BFD mode
  • any errno value from a failed attempt to open the file

With an array of these structures describing the input files, the main loop has all the information needed to perform the work of strings, while preserving the output behaviour of the existing code.

From 5ab9c0f0797790d1184c54bf6221f6bc95c75088 Mon Sep 17 00:00:00 2001
From: David Drysdale <drysdale@google.com>
Date: Tue, 25 Nov 2014 17:08:03 +0000
Subject: [PATCH 2/4] Open all streams in advance

Preserve the order of output by saving errors on file open,
and emit the error message later.
---
 binutils/strings.c | 83 ++++++++++++++++++++++++++++++++++++++----------------
 1 file changed, 59 insertions(+), 24 deletions(-)

diff --git a/binutils/strings.c b/binutils/strings.c
index 09eefc250e69..e5dce2f5a8bc 100644
--- a/binutils/strings.c
+++ b/binutils/strings.c
@@ -138,6 +138,15 @@ typedef struct
   bfd_size_type filesize;
 } filename_and_size_t;

+typedef struct
+{
+  char * filename;
+  FILE * stream;
+  bfd_boolean datasection_only;
+  bfd_boolean direct;
+  int error;
+} stream_info_t;
+
 static void strings_a_section (bfd *, asection *, void *);
 static bfd_boolean strings_object_file (const char *, FILE *);
 static bfd_boolean strings_file (char *, FILE *);
@@ -155,6 +164,9 @@ main (int argc, char **argv)
   bfd_boolean files_given = FALSE;
   char *s;
   int numeric_opt = 0;
+  int ii;
+  int num_streams;
+  stream_info_t *streaminfo;

 #if defined (HAVE_SETLOCALE)
   setlocale (LC_ALL, "");
@@ -291,45 +303,68 @@ main (int argc, char **argv)
   bfd_init ();
   set_default_bfd_target ();

+  /* Pre-open all of the streams involved, and save any errors */
+  num_streams = (optind >= argc) ? 1 : (argc - optind);
+  streaminfo = xmalloc (sizeof (stream_info_t) * num_streams);
   if (optind >= argc)
     {
-      datasection_only = FALSE;
+      streaminfo[0].filename = "{standard input}";
+      streaminfo[0].stream = stdin;
+      streaminfo[0].datasection_only = FALSE;
+      streaminfo[0].direct = TRUE;
+      streaminfo[0].error = 0;
       SET_BINARY (fileno (stdin));
-      print_strings ("{standard input}", stdin, 0, 0, 0, (char *) NULL);
       files_given = TRUE;
     }
   else
     {
-      for (; optind < argc; ++optind)
+      for (ii = 0; ii < num_streams; ++ii)
         {
-          if (strcmp (argv[optind], "-") == 0)
-            datasection_only = FALSE;
+          streaminfo[ii].filename = argv[optind + ii];
+          streaminfo[ii].stream = NULL;
+          streaminfo[ii].datasection_only = TRUE;
+          streaminfo[ii].direct = FALSE;
+          streaminfo[ii].error = 0;
+          if (strcmp (streaminfo[ii].filename, "-") == 0)
+            streaminfo[ii].datasection_only = FALSE;
           else
             {
-              FILE *stream;
-
               files_given = TRUE;
-              stream = fopen (argv[optind], FOPEN_RB);
-              if (stream == NULL)
-                {
-                  fprintf (stderr, "%s: ", program_name);
-                  perror (argv[optind]);
-                  exit_status = TRUE;
-                }
-              else
-                {
-                  exit_status |= strings_file (argv[optind], stream) == FALSE;
-                  if (fclose (stream) == EOF)
-                    {
-                      fprintf (stderr, "%s: ", program_name);
-                      perror (argv[optind]);
-                      return FALSE;
-                    }
-                }
+              streaminfo[ii].stream = fopen (streaminfo[ii].filename, FOPEN_RB);
+              if (streaminfo[ii].stream == NULL)
+                streaminfo[ii].error = errno;
             }
         }
     }

+  for (ii = 0; ii < num_streams; ++ii)
+    {
+      if (streaminfo[ii].datasection_only == FALSE)
+        datasection_only = FALSE;
+      if (streaminfo[ii].stream)
+        {
+          if (streaminfo[ii].direct)
+            print_strings (streaminfo[ii].filename, streaminfo[ii].stream,
+                           0, 0, 0, (char *) NULL);
+          else
+            exit_status |= strings_file (streaminfo[ii].filename,
+                                         streaminfo[ii].stream) == FALSE;
+          if (fclose (streaminfo[ii].stream) == EOF)
+          {
+            fprintf (stderr, "%s: ", program_name);
+            perror (argv[optind]);
+            return FALSE;
+          }
+        }
+      else if (streaminfo[ii].error)
+        {
+          fprintf (stderr, "%s: %s: %s\n", program_name,
+                   streaminfo[ii].filename, strerror(streaminfo[ii].error));
+          exit_status = TRUE;
+        }
+    }
+  free(streaminfo);
+
   if (!files_given)
     usage (stderr, 1);

--
1.9.1

Restricting Rights

With all the preparatory work done, the first pass at adding Capsicum support is straightforward.

From 036f463b51b03f17cb0ea15025d69840a7d13ab0 Mon Sep 17 00:00:00 2001
From: David Drysdale <drysdale@google.com>
Date: Tue, 25 Nov 2014 17:42:48 +0000
Subject: [PATCH 3/4] Add initial rights restriction on file descriptors

---
 binutils/strings.c   | 15 +++++++++++++++
 1 files changed, 15 insertions(+)

diff --git a/binutils/strings.c b/binutils/strings.c
index e5dce2f5a8bc..24bec7c68895 100644
--- a/binutils/strings.c
+++ b/binutils/strings.c
@@ -72,6 +72,8 @@
 #include "safe-ctype.h"
 #include "bucomm.h"

+#include <sys/capsicum.h>
+
 #define STRING_ISGRAPHIC(c) \
       (   (c) >= 0 \
        && (c) <= 255 \
@@ -337,6 +339,19 @@ main (int argc, char **argv)
         }
     }

+  {
+    cap_rights_t rights;
+    cap_rights_limit(fileno(stdout), cap_rights_init(&rights, CAP_WRITE));
+    cap_rights_limit(fileno(stderr), cap_rights_init(&rights, CAP_WRITE));
+    cap_rights_init(&rights, CAP_READ, CAP_SEEK, CAP_FSTAT);
+    for (ii = 0; ii < num_streams; ++ii)
+      {
+        if (streaminfo[ii].stream)
+          cap_rights_limit(fileno(streaminfo[ii].stream), &rights);
+      }
+  }
+  cap_enter();
+
   for (ii = 0; ii < num_streams; ++ii)
     {
       if (streaminfo[ii].datasection_only == FALSE)
--
1.9.1

Hunting Other Operations

However, running the resulting strings binary over a few test files quickly reveals that not all is well. Files that are processed in -a mode are scanned fine, but files that should be parsed as object files are not generating output.

To fine-tune the Capsicumization process, strace is your friend. Repeating the failed run with strace quickly shows some ENOTCAPABLE errors (which show up as ERRNO_135 because strace isn't yet Capsicum-aware):

    fcntl(7, F_GETFL)                       = -1 ERRNO_135 (Unknown error 135)
    fcntl(3, F_GETFL)                       = -1 ERRNO_135 (Unknown error 135)
    fstat(1, 0x7fffab1bac40)                = -1 ERRNO_135 (Unknown error 135)

The last of these just means that stdout needs CAP_FSTAT as an additional right, and the earlier errors tell us that the input files need CAP_FCNTL to allow the BFD library to check for file status flags. We don't want to open up all possible fcntl(2) operations, so we further restrict the CAP_FCNTL right so that only F_GETFL is allowed. Our penultimate code patch tweaks these rights, and the resulting strings binary works fine.

From 7d8d0de149a7e6811d3f499db5fcacd361415cbd Mon Sep 17 00:00:00 2001
From: David Drysdale <drysdale@google.com>
Date: Tue, 25 Nov 2014 17:44:40 +0000
Subject: [PATCH 4/4] Tweak rights needed

On examining strace output, there are some additional rights
needed:
 - CAP_FSTAT is needed for stdout
 - CAP_FCNTL is needed for the input file descriptors

For the latter, only allow the F_GETFL operation.
---
 binutils/strings.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/binutils/strings.c b/binutils/strings.c
index 24bec7c68895..cd02bf5a7b41 100644
--- a/binutils/strings.c
+++ b/binutils/strings.c
@@ -341,13 +341,15 @@ main (int argc, char **argv)

   {
     cap_rights_t rights;
-    cap_rights_limit(fileno(stdout), cap_rights_init(&rights, CAP_WRITE));
+    cap_rights_limit(fileno(stdout), cap_rights_init(&rights, CAP_WRITE, CAP_FSTAT));
     cap_rights_limit(fileno(stderr), cap_rights_init(&rights, CAP_WRITE));
-    cap_rights_init(&rights, CAP_READ, CAP_SEEK, CAP_FSTAT);
+    cap_rights_init(&rights, CAP_READ, CAP_SEEK, CAP_FSTAT, CAP_FCNTL);
     for (ii = 0; ii < num_streams; ++ii)
       {
-        if (streaminfo[ii].stream)
+        if (streaminfo[ii].stream) {
           cap_rights_limit(fileno(streaminfo[ii].stream), &rights);
+          cap_fcntls_limit(fileno(streaminfo[ii].stream), CAP_FCNTL_GETFL);
+        }
       }
   }
   cap_enter();
--
1.9.1

Closing Holes

At this point, we think we've restricted all of the file descriptors for the application, but it's best to make sure. To make this easy, we temporarily insert a stop in the program (kill(getpid(), SIGSTOP);), just after the call to cap_enter(). We can now explore information about the stopped program, to check everything is covered.

The first thing to check is that capability mode has been engaged. In the Linux implementation of Capsicum, this is (mostly) implemented as a seccomp-BPF filter program, so running cat /proc/6620/status | grep Seccomp shows the value 2 (SECCOMP_MODE_FILTER), as expected.

More interesting is to examine the open file descriptors of the process, via Linux's /proc filesystem:

    % cd /proc/6620/fdinfo
    % ls
    0  1  2  3  4  5
    % more *
    ::::::::::::::
    0
    ::::::::::::::
    pos:        0
    flags:        0100002
    mnt_id:        18
    ::::::::::::::
    1
    ::::::::::::::
    pos:        0
    flags:        0100002
    mnt_id:        18
    rights:        0x200000000080002        0x400000000000000
     fcntls: 0x000000
    ::::::::::::::
    2
    ::::::::::::::
    pos:        0
    flags:        0100002
    mnt_id:        18
    rights:        0x200000000000002        0x400000000000000
     fcntls: 0x000000
    ::::::::::::::
    3
    ::::::::::::::
    pos:        0
    flags:        0100000
    mnt_id:        20
    rights:        0x20000000008800d        0x400000000000000
     fcntls: 0x000008
    ::::::::::::::
    4
    ::::::::::::::
    pos:        0
    flags:        0100000
    mnt_id:        20
    rights:        0x20000000008800d        0x400000000000000
     fcntls: 0x000008
    ::::::::::::::
    5
    ::::::::::::::
    pos:        0
    flags:        0100000
    mnt_id:        20
    rights:        0x20000000008800d        0x400000000000000
     fcntls: 0x000008

This shows that rights have been limited for all of the file descriptors, except for file descriptor zero – stdin. So our final patch just closes stdin when it's not needed.

From cc972e504a187d21a37857a720f1f5a91240e7d9 Mon Sep 17 00:00:00 2001
From: David Drysdale <drysdale@google.com>
Date: Fri, 5 Dec 2014 12:09:51 +0000
Subject: [PATCH] Close stdin if not needed

---
 binutils/strings.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/binutils/strings.c b/binutils/strings.c
index cd02bf5a7b41..5a81e2d813ab 100644
--- a/binutils/strings.c
+++ b/binutils/strings.c
@@ -337,6 +337,7 @@ main (int argc, char **argv)
                 streaminfo[ii].error = errno;
             }
         }
+      fclose(stdin);
     }

   {
--
1.9.1

And that's the end of the changes needed to provide Capsicum protection for strings. Again, we should emphasize that this doesn't prevent the RCE – Michal's original test file would still crash strings. However, if the exploit were to actually try to do anything "useful" (scan the local system, contact a remote C&C server, join a DDoS botnet, …), those operations would fail.

Onward

Having made the fairly straightforward code changes for Capsicumizing strings, the next step is considerably harder: updating the autotools configuration to detect whether Capsicum support is available at compile-time. But that's a story for another day…

No comments:

Post a Comment