Re: [RFC][v8][PATCH 9/10]: Define clone3() syscall

From: Michael Kerrisk
Date: Wed Oct 21 2009 - 00:26:34 EST


On Tue, Oct 20, 2009 at 1:50 AM, Matt Helsley <matthltc@xxxxxxxxxx> wrote:
> On Tue, Oct 20, 2009 at 06:31:20AM +0900, H. Peter Anvin wrote:
>> On 10/20/2009 02:44 AM, Matt Helsley wrote:
>>>> |
>>>> | I know I'm late to this discussion, but why the name clone3()? It's
>>>> | not consistent with any other convention used for syscall naming,
>>
>> This assumption, of course, is just plain wrong.  Look at the wait
>> system calls, for example.  However, when a small integer is used like
>> that, it pretty much always reflects numbers of arguments.
>>
>>>> | AFAICS. I think a name like clone_ext() or clonex() (for extended)
>>>> | might make more sense.
>>>>
>>>> Sure, we talked about calling it clone_extended() and I can go back
>>>> to that.
>>>>
>>>> Only minor concern with that name was if this new call ever needs to
>>>> be extended, what would we call it :-). With clone3() we could add a
>>>> real/fake parameter and call it clone4() :-p
>>>
>>> Perhaps clone64 (somewhat like stat64 for example)?
>>>
>>
>> I think that doesn't exactly reflect the nature of the changes.
>
> Yes. Without adopting an impractical encoding scheme it's quite
> unlikely a small number like 3 or 64 could exactly reflect all the
> changes :). I don't think that's a realistic objection though so...
>
>> clone3() is actually pretty good.
>
> I agree.

My question here is: what does "3" actually mean? In general, system
calls have not followed any convention of numbering to indicate
successive versions -- clone2() being the one possible exception that
I know of.

The only other conventions used for numbering new versions of system
calls relate either to argument size (e.g., 32 versus 64) or to the
number of arguments (dup2(), dup3(), wait3(), wait4(), accept4(),
eventfd2(), inotify_init1(), epoll_create1(), signalfd4()). The former
convention makes some sense, but the latter is rather questionable,
for a couple of reasons. One is that the number of arguments for the
system call may change in the future (several of the newer system
calls have a flags argument which could be used to indicate the
presence of an additional, optional argument). Another is that the
number of arguments exposed to userspace by glibc may be different
from the number of arguments in the raw syscall. For example,
signalfd4() has 4 arguments, but the glibc interface
(signalfd() http://www.kernel.org/doc/man-pages/online/pages/man2/signalfd.2.html)
has 3.
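
To make the signalfd point concrete, here is a minimal sketch that
opens a signal file descriptor both ways. The function names
(usr1_mask, open_sigfd_glibc, open_sigfd_raw) are made up for
illustration, and the sketch assumes, as glibc's own wrapper does,
that the kernel's signal set is _NSIG / 8 bytes long:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <sys/signalfd.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: build a signal set containing only SIGUSR1. */
static void usr1_mask(sigset_t *mask)
{
    assert(mask != NULL);
    sigemptyset(mask);
    sigaddset(mask, SIGUSR1);
}

/* glibc interface: three arguments (fd, mask, flags). */
int open_sigfd_glibc(void)
{
    sigset_t mask;

    usr1_mask(&mask);
    return signalfd(-1, &mask, 0);
}

/*
 * Raw system call: four arguments (fd, mask, sizemask, flags).
 * Assumption: the kernel's signal set occupies _NSIG / 8 bytes,
 * which is smaller than glibc's sigset_t on common architectures.
 */
int open_sigfd_raw(void)
{
    sigset_t mask;

    usr1_mask(&mask);
    return (int)syscall(SYS_signalfd4, -1, &mask, (size_t)(_NSIG / 8), 0);
}
```

Both functions should return a usable file descriptor on a kernel
that provides signalfd4(), even though the caller sees a different
argument count in each case; the "4" describes only the raw
interface.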

Using the name clone3() follows no historical convention, which is why
it seems unwise to me. Thus my suggestion of clonex() (like e.g.,
adjtimex()), though quite possibly there could be some better name.

Sukadev, you wrote "With clone3() we could add a real/fake parameter
and call it clone4()". This raises for me the question: should
clone3() have a flags argument, so as to allow these types of
extensions (i.e., not for clone flags, but rather to indicate changes
in the system call interface)? Yes, I understand there is a 64-bit
flags field in 'struct clone_struct', but I wonder whether there is any
virtue in having an additional flags argument in the base signature of
the function, as per the following:
http://www.kernel.org/doc/man-pages/online/pages/man2/dup3.2.html
http://www.kernel.org/doc/man-pages/online/pages/man2/signalfd4.2.html
http://www.kernel.org/doc/man-pages/online/pages/man2/eventfd2.2.html
see also
http://linux-man-pages.blogspot.com/2008/10/recent-changes-in-file-descriptor.html
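
As a rough illustration of what a flags argument in the base
signature buys, here is a sketch contrasting dup2() with dup3(). The
helper names (has_cloexec, dup_plain, dup_with_flags) are invented
for this example; only dup2(), dup3(), fcntl() and close() are real
interfaces:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if close-on-exec is set on fd, 0 if not, -1 on error. */
static int has_cloexec(int fd)
{
    int fdflags = fcntl(fd, F_GETFD);

    if (fdflags < 0)
        return -1;
    return (fdflags & FD_CLOEXEC) != 0;
}

/* Older interface: two arguments, no room for options. */
int dup_plain(int oldfd, int newfd)
{
    int fd, state;

    assert(oldfd != newfd);
    fd = dup2(oldfd, newfd);
    if (fd < 0)
        return -1;
    state = has_cloexec(fd);
    close(fd);
    return state;
}

/* Newer interface: the same operation plus a flags argument. */
int dup_with_flags(int oldfd, int newfd, int flags)
{
    int fd, state;

    assert(oldfd != newfd);   /* dup3() rejects equal descriptors */
    fd = dup3(oldfd, newfd, flags);
    if (fd < 0)
        return -1;
    state = has_cloexec(fd);
    close(fd);
    return state;
}
```

Each helper duplicates the descriptor, reports whether close-on-exec
ended up set, and closes the duplicate again. With flags of 0,
dup_with_flags() behaves like dup_plain(); passing O_CLOEXEC changes
the behaviour without needing yet another system call name.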

By the way, one further thought: why "struct clone_struct"? We know
it's a struct. Therefore, it seems pointless to include that in the
name. Something like "struct clone_args" would seem less redundant and
slightly more meaningful as a name.

Cheers,

Michael


--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Author of "The Linux Programming Interface" http://blog.man7.org/