Discussion:
[MLton] MLton broken FFI on AMD64???
Henry Cejtin
2011-02-03 20:47:29 UTC
Permalink
It seems that either there is a new bug in the foreign function calling
sequence or something has changed.

I'm using the version of MLton in Debian testing:
MLton 20100608 (built Tue Jun 15 01:10:03 UTC 2010 on barber)
on a Linux AMD64 architecture.

The following program segfaults when run:

val printf = _import "printf": string * int -> unit;
val () = printf ("You are a dog %d0000", 99)
val () = printf ("Hello silly %d0000", 123)

It prints out the first line but then segfaults on the second call to
printf.

I looked in
/usr/share/doc/mlton/guide/CallingFromSMLToC
page and what I'm doing looks like it is following the description
there.


On Linux i386, this code works both using the Debian stable version of
MLton:
MLton 20070826 (built Mon Sep 03 18:05:04 2007 on eponym)
and also a version I just built from the repository:
MLton rexported (built Tue Feb 1 23:22:44 CST 2011 on jupiter)

I thought perhaps it was something involving int vs. Int64.int, but
changing the printf type in the ML code to Int64.int made no difference
on the AMD64 machine.

Any suggestions?
Wesley W. Terpstra
2011-02-04 06:32:34 UTC
Permalink
Post by Henry Cejtin
val printf = _import "printf": string * int -> unit;
val () = printf ("You are a dog %d0000", 99)
val () = printf ("Hello silly %d0000", 123)
My first answer was: null terminate your string.

However, the segfault persists. If you replace "printf" with "test":
void test(const char* s, int x) {
printf(s, x);
}
... it works.

The problem also only appears for me when I target amd64. i386 works.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mlton.org/pipermail/mlton/attachments/20110204/22c10826/attachment.htm
Wesley W. Terpstra
2011-02-04 06:58:01 UTC
Permalink
Post by Wesley W. Terpstra
void test(const char* s, int x) {
printf(s, x);
}
... it works.
I've confirmed that MLton produces identical code except replacing 'printf'
with 'test'. I tried a similar test with gcc. For some reason gcc clears eax
before a call to printf but not a call to 'printm'. Perhaps this is related
to varargs?

#include <stdio.h>
int printm(const char* x, int k);

int beh() {
printm("dog %d\n", 44);
printf("dog %d\n", 44);
return 1;
}

beh:
.LFB13:
subq $8, %rsp
.LCFI0:
movl $44, %esi
movl $.LC0, %edi
call printm
movl $44, %esi
movl $.LC0, %edi
xorl %eax, %eax <<<<<<<<<<<<<<<< WTF?
call printf
movl $1, %eax
addq $8, %rsp
ret

Adding an 'xorl %eax,%eax' directly before the call to printf in the MLton
generated assembler makes the problem disappear.

L_114:
addq $0xFFFFFFFFFFFFFFD8,%rbp
movq (c_stackP+0x0)(%rip),%rsp
movl $0x7B,%r15d
movl %r15d,%esi
movq (globalObjptr+0x80)(%rip),%rdi
addq $0x28,%rbp
leaq (L_115+0x0)(%rip),%r15
movq %r15,0xFFFFFFFFFFFFFFF8(%rbp)
movq %rbp,(gcState+0x10)(%rip)
movq %r12,(gcState+0x0)(%rip)
xorl %eax,%eax
call printf
movq (gcState+0x0)(%rip),%r12
movq (gcState+0x10)(%rip),%rbp
jmp L_115

I then did some digging and found this choice quote in the "System V
Application Binary Interface AMD64 Architecture Processor Supplement Draft
Version 0.99.5":
" When a function taking variable-arguments is called, %rax must be set
to the
total number of floating point parameters passed to the function in vector
registers"

... either we need to add another tag to imports like:
_import "printf" stdarg : ... ;
... or we should just always set rax for every FFI call on AMD64?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mlton.org/pipermail/mlton/attachments/20110204/4a37e6d6/attachment.html
Matthew Fluet
2011-02-04 13:02:07 UTC
Permalink
Post by Wesley W. Terpstra
I then did some digging and found this choice quote in the "System V
Application Binary Interface AMD64 Architecture Processor Supplement Draft
"??? When a function taking variable-arguments is called, %rax must be set
to the
total number of floating point parameters passed to the function in vector
registers"
_import "printf" stdarg : ... ;
... or we should just always set rax for every FFI call on AMD64?
Did you mean a "varargs" attribute? That is, I would expect "stdargs"
(or "fixargs"?) to be the silent default.

Interestingly, it seems that MacOSX also demands that rax be set to
the the number of floating point parameters.

It seems that setting rax for every FFI call on AMD64 would be the
simplest solution.
Wesley W. Terpstra
2011-02-04 13:19:48 UTC
Permalink
Post by Matthew Fluet
Post by Wesley W. Terpstra
_import "printf" stdarg : ... ;
... or we should just always set rax for every FFI call on AMD64?
Did you mean a "varargs" attribute? That is, I would expect "stdargs"
(or "fixargs"?) to be the silent default.
The new varargs in C is called stdarg. A very poor name, I agree.
Post by Matthew Fluet
It seems that setting rax for every FFI call on AMD64 would be the
simplest solution.
I agree, if only because MLton64 FFI calls are already so slow. ;)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mlton.org/pipermail/mlton/attachments/20110204/bc3c56ae/attachment.html
Matthew Fluet
2011-02-06 23:38:53 UTC
Permalink
Post by Matthew Fluet
Interestingly, it seems that MacOSX also demands that rax be set to
the the number of floating point parameters.
It seems that setting rax for every FFI call on AMD64 would be the
simplest solution.
I committed a patch that always sets %rax to the number of XMMS
registers passed for non-win64 AMD64 platforms; since %rax is caller
save, this doesn't really affect register allocation and just incurs
an immediate->register move instruction. That solved the issue on
Linux and MacOSX (and presumably on other Unices).
Matthew Fluet
2011-02-14 16:08:05 UTC
Permalink
Post by Matthew Fluet
Post by Matthew Fluet
Interestingly, it seems that MacOSX also demands that rax be set to
the the number of floating point parameters.
It seems that setting rax for every FFI call on AMD64 would be the
simplest solution.
I committed a patch that always sets %rax to the number of XMMS
registers passed for non-win64 AMD64 platforms; since %rax is caller
save, this doesn't really affect register allocation and just incurs
an immediate->register move instruction. ?That solved the issue on
Linux and MacOSX (and presumably on other Unices).
BTW, it occurs to me that there isn't a good solution to this problem
with the C codegen. With the C codegen, MLton emits prototypes for
_import-ed functions that are derived from the type of the imported
function. The prototype assumes that the function is a non-varargs.
[There is a second issue with Henry's particular example, where the
MLton emitted prototype for printf disagrees with the prototype
exported by stdio.h, so one actually gets a compile error there.] I
suppose we could also be conservative here and always emit a varargs
prototype.

The other alternative is to simply not support _import-ing of varargs functions.
Wesley W. Terpstra
2011-02-17 13:57:06 UTC
Permalink
Post by Matthew Fluet
BTW, it occurs to me that there isn't a good solution to this problem
with the C codegen. With the C codegen, MLton emits prototypes for
_import-ed functions that are derived from the type of the imported
function. The prototype assumes that the function is a non-varargs.
That's a good point. Yuck.

[There is a second issue with Henry's particular example, where the
Post by Matthew Fluet
MLton emitted prototype for printf disagrees with the prototype
exported by stdio.h, so one actually gets a compile error there.]
It's not just printf. I know of several platforms where system calls in C go
through some header magic-fu. There are #define's that rename things
depending on the _XOPEN_SOURCE / etc selection. Importing *any* symbols from
system libraries is rather error-prone.

Really, the best solution is to do like the MLton runtime: write your own
functions that do the C calls and import them instead. That works with the
system header defines/etc (and also dodges the varargs problem). The
overhead is quite low; gcc optimizes those proxy methods down to a single
branch instruction if the prototypes match.
Post by Matthew Fluet
I suppose we could also be conservative here and always emit a varargs
prototype.
What do you propose? Append a ", ..." to the end of the prototype?

I don't know of any architecture where it is a problem, but might passing a
variable argument and a normal argument differ on an input-by-input basis?
Imagine "void foo(int x, ...);" where passing the first integer goes as
32-bit on the stack but the second gets sign-extended to 64-bit. I don't
think anything forbids such a hypothetically evil ABI.

The other alternative is to simply not support _import-ing of varargs
Post by Matthew Fluet
functions.
Whatever we do, we should do it consistently across the codegens.

If we can't support C properly, then maybe varargs is a losing battle.
On the other hand, people rather expect 'printf' to work.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mlton.org/pipermail/mlton/attachments/20110217/997a6955/attachment.html
Matthew Fluet
2011-02-17 16:43:52 UTC
Permalink
Post by Wesley W. Terpstra
Post by Matthew Fluet
BTW, it occurs to me that there isn't a good solution to this problem
with the C codegen. ?With the C codegen, MLton emits prototypes for
_import-ed functions that are derived from the type of the imported
function. ?The prototype assumes that the function is a non-varargs.
That's a good point. Yuck.
Post by Matthew Fluet
[There is a second issue with Henry's particular example, where the
MLton emitted prototype for printf disagrees with the prototype
exported by stdio.h, so one actually gets a compile error there.]
It's not just printf. I know of several platforms where system calls in C go
through some header magic-fu. There are #define's that rename things
depending on the _XOPEN_SOURCE / etc selection. Importing *any* symbols from
system libraries is rather error-prone.
Sure, there could be symbols that work in a C file because the are
cpp-ed down to the symbol that actually appears in the C library.
Although, POSIX is pretty clear on what things might be implemented by
macros and what things must be true symbols. But, of course, that
doesn't mean everyone follows it.
Post by Wesley W. Terpstra
Really, the best solution is to do like the MLton runtime: write your own
functions that do the C calls and import them instead. That works with the
system header defines/etc (and also dodges the varargs problem). The
overhead is quite low; gcc optimizes those proxy methods down to a single
branch instruction if the prototypes match.
It only dodges the varargs problem to the degree that one only writes
non-varargs functions. But that seems to be a reasonable restriction.
Post by Wesley W. Terpstra
Post by Matthew Fluet
I suppose we could also be conservative here and always emit a varargs
prototype.
What do you propose? Append a ", ..." to the end of the prototype?
That's what I meant by being conservative.
Post by Wesley W. Terpstra
I don't know of any architecture where it is a problem, but might passing a
variable argument and a normal argument differ on an input-by-input basis?
Imagine "void foo(int x, ...);" where passing the first integer goes as
32-bit on the stack but the second gets sign-extended to 64-bit. I don't
think anything forbids such a hypothetically evil ABI.
Well, I always thought that there wouldn't be an ABI where the varags
and non-varargs calling conventions wouldn't be different. What we
see with amd64 is that the varags convention is compatible with the
non-vargs, but not the other way around. As you say, there seems to
be little that would prevent an ABI from having varargs and
non-varargs conventions incompatible.
Post by Wesley W. Terpstra
Post by Matthew Fluet
The other alternative is to simply not support _import-ing of varargs functions.
Whatever we do, we should do it consistently across the codegens.
Agreed. Though, for all many of the same reasons, there doesn't seem
to be any way to detect/prevent the _import of a varargs functions (at
a fixed number of arguments); it just means that on some platforms and
some codegens it would work and on others it would not.
Post by Wesley W. Terpstra
If we can't support C properly, then maybe varargs is a losing battle.
On the other hand, people rather expect 'printf' to work.
I've certainly used the _import "printf" trick when needing a
meaningful, but small program with which to debug one of the compiler
passes. But, apparently, never before on amd64.

After this discussion, I'm leaning towards not supporting varargs functions.
Baojian Hua
2011-02-22 06:19:42 UTC
Permalink
hi,

There is a bug related to the "-stop" switch, which is
located at line 740 of mlton/main/main.fun. The arg
is "{f|g|o|sml|tc}", whereas the code next disallows
"sml".

-Paul
Matthew Fluet
2011-02-22 11:42:41 UTC
Permalink
Thanks for the bug report. I've updated the description of the
command-line argument. The "-stop sml" was eliminated because the
resulting SML file was nothing more than the concat of all of the
individual SML files, which means it didn't obey any of the scoping
that the MLB file provided.
Post by Baojian Hua
hi,
There is a bug related to the "-stop" switch, which is
located at line 740 of mlton/main/main.fun. The arg
is "{f|g|o|sml|tc}", whereas the code next disallows
"sml".
-Paul
_______________________________________________
MLton mailing list
http://mlton.org/mailman/listinfo/mlton
Henry Cejtin
2011-02-04 13:56:09 UTC
Permalink
I had seen the clear of eax in the gcc code for printf but didn't find
the ABI item you found. I guess that in the case that there is no C
prototype, the C compiler has to set that %eax `correctly' just in case
the function is varargs.

Perhaps the safe thing to do is to add an attribute to mean `definitely
NOT varargs', and to always set %eax unless that attribute is present.
Ugly, but upward compatible. More importantly, it's pretty easy to miss
the perhaps even undocumented fact that a function CAN take a variable
number of args. YUCK.

Very quick detective work Wesley, thanks.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mlton.org/pipermail/mlton/attachments/20110204/fca56b2d/attachment.html
Florian Weimer
2011-02-04 18:04:56 UTC
Permalink
Post by Wesley W. Terpstra
_import "printf" stdarg : ... ;
... or we should just always set rax for every FFI call on AMD64?
The latter is what libffi is doing, FWIW. Given that varargs are
somewhat complicated and expensive on amd64 anyway, this shouldn't
matter much.
Matthew Fluet
2011-02-04 19:35:45 UTC
Permalink
Post by Wesley W. Terpstra
I then did some digging and found this choice quote in the "System V
Application Binary Interface AMD64 Architecture Processor Supplement Draft
"??? When a function taking variable-arguments is called, %rax must be set
to the
total number of floating point parameters passed to the function in vector
registers"
Wesley, can you confirm that this calling convention does not apply to win64?
Florian Weimer
2011-02-05 07:31:45 UTC
Permalink
Post by Matthew Fluet
Wesley, can you confirm that this calling convention does not apply to win64?
For vararg or unprototyped functions, any float values must be
duplicated in the corresponding general-purpose register.
<http://msdn.microsoft.com/en-us/library/ms235286%28v=vs.80%29.aspx>

This suggests that you actually have to mark all varargs
functions. 8-(
Wesley W. Terpstra
2011-02-07 06:46:16 UTC
Permalink
Post by Florian Weimer
Post by Matthew Fluet
Wesley, can you confirm that this calling convention does not apply to win64?
It does not apply.
Post by Florian Weimer
Post by Matthew Fluet
For vararg or unprototyped functions, any float values must be
duplicated in the corresponding general-purpose register.
<http://msdn.microsoft.com/en-us/library/ms235286%28v=vs.80%29.aspx>
That's a pain. We don't do this at the moment. I'll work out a patch which
does this for Win64 varargs, also by default.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mlton.org/pipermail/mlton/attachments/20110207/470ee022/attachment.htm
Anders Petersson
2011-02-04 12:07:43 UTC
Permalink
Post by Wesley W. Terpstra
void test(const char* s, int x) {
printf(s, x);
}
... it works.
Yes, varargs was my first instinct.

I don't know the C ABI well enough to know what (if anything) actually
changes nor if mlton is supposed to handle it, but I am passing loads of
data through the MLton FFI on amd64 without any issues.

/Anders
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mlton.org/pipermail/mlton/attachments/20110204/4f21495f/attachment.htm
Henry Cejtin
2011-02-04 13:50:00 UTC
Permalink
Sorry for the confusion, the strings were null terminated but a
backslash got lost in the mail. The actual program was:

val printf = _import "printf": string * int -> unit;
val () = printf ("You are a dog %d\n\000", 99)
val () = printf ("Hello silly %d\n\000", 123)





________________________________
From: Wesley W. Terpstra <***@terpstra.ca>
To: Henry Cejtin <***@sbcglobal.net>
Cc: ***@mlton.org
Sent: Fri, February 4, 2011 2:32:00 AM
Subject: Re: [MLton] MLton broken FFI on AMD64???
Post by Henry Cejtin
val printf = _import "printf": string * int -> unit;
val () = printf ("You are a dog %d0000", 99)
val () = printf ("Hello silly %d0000", 123)
My first answer was: null terminate your string.

However, the segfault persists. If you replace "printf" with "test":
void test(const char* s, int x) {
printf(s, x);
}
... it works.

The problem also only appears for me when I target amd64. i386 works.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mlton.org/pipermail/mlton/attachments/20110204/65a64caf/attachment.htm
Loading...