Thursday, March 31, 2011

How to portably convert a string into an uncommon integer type?

Some background: If I wanted to use, for instance, scanf() to convert a string into a standard integer type, like uint16_t, I’d use SCNu16 from <inttypes.h>, like this:

#include <stdio.h>
#include <inttypes.h>
uint16_t x;
char *xs = "17";
sscanf(xs, "%" SCNu16, &x);

But a more uncommon integer type like pid_t does not have any such thing; only the normal integer types are supported by <inttypes.h>. To convert the other way, to portably printf() a pid_t, I can cast it to intmax_t and use PRIdMAX, like this:

#include <stdio.h>
#include <inttypes.h>
#include <sys/types.h>
pid_t x = 17;
printf("%" PRIdMAX, (intmax_t)x);

However, there does not seem to be a way to portably scanf() into a pid_t. So this is my question: How to do this portably?

#include <stdio.h>
#include <sys/types.h>
pid_t x;
char *xs = "17";
sscanf(xs, "%u", &x);  /* Not portable! pid_t might not be int! */

I thought of scanf()ing to an intmax_t and then checking that the value is within pid_t’s limits before casting to pid_t, but there does not seem to be a way to get the maximum or minimum values for pid_t.

From stackoverflow
  • It depends on exactly how portable you want to be. POSIX says that pid_t is a signed integer type used to store process IDs and process group IDs. In practice, you could assume with safety that long is big enough. Failing that, your intmax_t must be big enough (so it will accept any valid pid_t); the trouble is, that type could accept values that are not legitimate in pid_t. You're stuck between a rock and a hard place.

    I would use long and not worry very much about it except for an obscure comment somewhere that a software archaeologist of 100 years hence will find and observe gives a reason why the 256-bit CPU is creaking to a halt when handed a 512-bit value as a pid_t.

    POSIX 1003.1-2008 is now available on the web (all 3872 pages of it, in PDF and HTML). You have to register (free). I got to it from the Open Group Bookstore.

    All that I see there is that it must be a signed integer type. Clearly, all valid signed integer values fit into intmax_t. I cannot find any information in <inttypes.h> or <unistd.h> that indicates PID_T_MAX or PID_T_MIN or other such values (but I've only just this evening got access to it, so it could be hidden where I haven't looked for it). OTOH, I stand by my original comment - I believe that 32-bit values are pragmatically adequate, and I would use long anyway, which would be 64-bit on 64-bit machines. I suppose that roughly the worst thing that could happen is that an 'appropriately privileged' process read a value that was too large, and sent a signal to the wrong process because of a mismatch of types. I'm not convinced I'd be worried about that.

    ...oooh!...p400 under <sys/types.h>

    The implementation shall support one or more programming environments in which the widths of blksize_t, pid_t, size_t, ssize_t, and suseconds_t are no greater than the width of type long.

    Teddy : I heard that the next POSIX standard would only require it to fit into an intmax_t, so using long is out. What I actually did was depend on the GNU C library's docs saying pid_t would always be int, and commenting it. The code uses other glibc-specific stuff anyway, so this is not a real problem.
    Teddy : I’ve already GOT a "pragmatically adequate" solution, by using a non-portable assumption (in code that was never portable in the first place). What I WANTED was a truly portable solution, if such a thing actually exists.
  • Concur with the above. (Can't post comments yet - I'm a newbie).

    If you are super concerned, you can assert(sizeof(pid_t) <= sizeof(long)), or use whatever type you choose for your '%' stuff.

    Like Jonathan says - spec says signed int. If 'int' changes, your '%u' by definition changes with it.

    Teddy : sizeof does not actually concern itself with the possible values of a type, only the storage requirements in bytes. So your code does not actually guarantee what you think it does. Also, the spec says "signed integer type", not "signed int". Big difference.
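A rough sketch of the distinction drawn in the comment above, assuming a C11 compiler. static_assert can compare storage sizes, but a maximum value derived from sizeof is only right if pid_t has no padding bits, which POSIX does not promise. The helper name pid_max_if_no_padding is hypothetical and does not come from any standard header.

```c
#include <assert.h>     /* static_assert (C11) */
#include <limits.h>     /* CHAR_BIT */
#include <stdint.h>     /* intmax_t */
#include <sys/types.h>  /* pid_t */

/* Compile-time check of storage size only; it says nothing about
   which values pid_t can actually represent. */
static_assert(sizeof(pid_t) <= sizeof(long),
              "pid_t needs more storage than long");

/* Hypothetical helper: the largest pid_t value, ASSUMING pid_t has
   no padding bits. That assumption is not guaranteed by POSIX. */
static intmax_t pid_max_if_no_padding(void)
{
    /* All bits except the sign bit are value bits under the assumption. */
    unsigned value_bits = (unsigned)(sizeof(pid_t) * CHAR_BIT) - 1;

    /* Compute 2^value_bits - 1 in two steps so the intermediate
       result cannot overflow intmax_t even if pid_t is that wide. */
    return (((intmax_t)1 << (value_bits - 1)) - 1) * 2 + 1;
}
```

Because the result depends on an unverifiable assumption, it is better treated as a sanity check than as a substitute for a real PID_T_MAX.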
  • I just had a thought: would this always work? I don’t know, but the idea is interesting.

    #include <inttypes.h>
    #include <stdio.h>
    #include <iso646.h>
    #include <sys/types.h>
    pid_t x;
    intmax_t xmax;
    char *xs = "17";
    int ret;
    int numchars;
    
    ret = sscanf(xs, "%" SCNdMAX "%n", &xmax, &numchars);
    if(ret < 1 or xmax != (pid_t)xmax or xs[numchars] != '\0'){
        fprintf(stderr, "Bad PID!\n");
    } else {
        x = (pid_t)xmax;
        ...
    }
    

    That is, I scan an intmax_t and see if it "fits" in a pid_t by casting it and comparing it to the original intmax_t value. I also need to use %n to check that there are no junk characters at the end.

    Teddy : There’s a problem with this: if pid_t happens to be the same size as intmax_t, scanf DOES NOT CHECK for over- or underflow, and will happily truncate to INTMAX_MIN or INTMAX_MAX. The only ones checking for errors are the strtoll() and related functions, and none of them actually take an intmax_t.
    Teddy : It would be better to use strtoimax, which does check for over- and underflow. See separate answer for code which does this.
  • There is one robust and portable solution, which is to use strtoimax() and check for overflows.

    That is, I parse an intmax_t, check for an error from strtoimax(), and then also see if it "fits" in a pid_t by casting it and comparing it to the original intmax_t value.

    #include <errno.h>
    #include <inttypes.h>
    #include <stdio.h>
    #include <iso646.h>
    #include <sys/types.h>
    char *xs = "17";            /* The string to convert */
    intmax_t xmax;
    char *tmp;
    pid_t x;                    /* Target variable */
    
    errno = 0;
    xmax = strtoimax(xs, &tmp, 10);
    if(errno != 0 or tmp == xs or *tmp != '\0'
       or xmax != (pid_t)xmax){
      fprintf(stderr, "Bad PID!\n");
    } else {
      x = (pid_t)xmax;
      ...
    }
    

    It is not possible to use scanf() because (as I said in a comment) scanf() will not detect overflows. But I was wrong in saying that none of the strtoll()-related functions handles intmax_t; strtoimax() does!

    Nothing other than strtoimax() will work either, unless you know the size of your integer type (pid_t, in this case).
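The strtoimax() fragment in this answer could be wrapped into a reusable function along the following lines. This is only a sketch: the name parse_pid and the bool return convention are hypothetical choices, not part of the original answer. Note also that converting an out-of-range value to a signed type is implementation-defined in C, so the round-trip comparison works on common platforms but is not guaranteed by the C standard itself.

```c
#include <assert.h>
#include <errno.h>
#include <inttypes.h>   /* strtoimax, intmax_t */
#include <stdbool.h>
#include <stddef.h>     /* NULL */
#include <sys/types.h>  /* pid_t */

/* Hypothetical helper: parse a decimal PID from s into *out.
   Returns true on success, false on any parse or range error. */
static bool parse_pid(const char *s, pid_t *out)
{
    char *end;
    intmax_t v;

    assert(s != NULL && out != NULL);

    errno = 0;
    v = strtoimax(s, &end, 10);
    if (errno != 0)                  /* e.g. ERANGE: too big for intmax_t */
        return false;
    if (end == s || *end != '\0')    /* no digits at all, or trailing junk */
        return false;
    if (v != (intmax_t)(pid_t)v)     /* value changed when narrowed to pid_t */
        return false;

    *out = (pid_t)v;
    return true;
}
```

A caller would write something like `if (!parse_pid(argv[1], &pid)) { /* report bad PID */ }`. Like strtoimax() itself, this sketch accepts leading whitespace and a sign, so a caller that wants only positive PIDs would need an extra check.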
