Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

translate-c: correctly translate pointers to cv-qualified values #23394

Draft
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

theofabilous
Copy link

@theofabilous theofabilous commented Mar 28, 2025

Fix #23329.

TLDR: const/volatile-qualified value-types (non-pointer types) in C are not always correctly translated when pointers to such types are taken.

For example, the following C code is mistranslated:

typedef struct {
    volatile int reg;
} hw_t;
extern hw_t *hw;
// ...
const int data = hw->reg; // volatile read, equivalent to *(&hw->reg)
const typeof(&hw->reg) data2 = &hw->reg; // volatile pointer

The penultimate line will not be a volatile load, because the type of reg in the translated zig code is just a bare int. The last line will generate a regular non-volatile int pointer for the same reason.


Fixed by explicitly casting pointers to the correct qualified types whenever the qualifiers can legally appear in zig, i.e. when taking the address of a cv-qualified value and right before dereferencing pointers thereto. Additionally, typedefs for volatile-qualified value-types are annotated by prefixing the typename with volatile_. Volatile-qualified value-types are converted to a helper type std.zig.c_translation.Volatile(T).


TODO

  • Remove/handle all my todo comments when associated notes/questions are resolved
  • Add doc comments for restoreValueQualifiersValue/restoreValueQualifiersPtr
  • Possibly wait for Fixed an issue where external local variable references in C code were translated to incorrect zig code #23384 and rebase when/if merged because it seems the changes in that PR deal with pointer type decay which might need updating to handle qualifiers
  • Squash commits for clean history
  • Fix implicit rvalue conversions which translate to volatile loads and the result type is the unqualified inner type
  • Look into ABI mismatches for structs vs primitive types
    • Possibly fix by not translating function parameter types to Volatile(T) and bitcast/ptrcast when such types are passed to func params

@Vexu
Copy link
Member

Vexu commented Mar 28, 2025

I think using a wrapper struct for volatile types could also work:

pub fn Volatile(T: type) type {
    return extern struct {
        inner: T,

        pub inline fn load(v: *volatile Volatile(T)) T {
            return v.inner;
        }
    };
}

const mmio_int = Volatile(c_int);
const mmio_int_ptr = *volatile c_int;
const hw_t = struct {
    reg: mmio_int,
    regs: [4]mmio_int,
    regm: [2][2]mmio_int,
    ptr: mmio_int_ptr,
};
extern var hw: *hw_t;
pub export fn entry() void {
    _ = hw.reg.load();
    _ = hw.regs[2].load();
}

It does seems to generate correct LLVM IR and is a much better solution at least in the case of renaming the typedef to volatile_x.

The check for volatile qualifiers should also exclude pointers since those can be volatile in Zig too.

@theofabilous
Copy link
Author

I think using a wrapper struct for volatile types could also work: ... and is a much better solution at least in the case of renaming the typedef to volatile_x.

Agreed, that's much nicer. Does it make sense to put this in std.zig.c_translation? (kind of like FlexibleArrayType)

The check for volatile qualifiers should also exclude pointers since those can be volatile in Zig too.

Correct me if I'm wrong, and I may have misunderstood what you meant here, but I think QualType::isVolatileQualified() only checks the "direct" qualifications of a type (with the exception of arrays, for some forsaken reason). So it doesn't return the qualifiers for the pointee. Then, excluding pointers types would actually exclude volatile pointers to (arbitrarily qualified pointee), i.e. the pointer value is volatile, which zig can't represent.

Example output from zig translate-c with this PR's branch (as-is):

static volatile int* foo(volatile int **x) {
    return *x;
}
pub fn foo(arg_x: [*c][*c]volatile c_int) callconv(.c) [*c]volatile c_int {
    var x = arg_x;
    _ = &x;
    return x.*;
}

No explicit cast here, since the pointee of x is not volatile qualified, the child of the pointee is (which is ok in zig). Is this what you meant?

@theofabilous
Copy link
Author

NOTE: just realized that you may have meant that adding explicit casts when using "existing" volatile pointers isn't necessary. If this isn't what was meant then feel free to ignore


I think it's safer to add casts whenever dereferencing because the pointer might not be correctly typed at the point of usage (even though the type is valid in zig).

My main concern is that given the following zig expression produced by translate-c: some_c_pointer.*, I'm not 100% certain that some_c_pointer is actually correctly explicitly typed. For example, the following C code:

*(&some_volatile_int) // assume some_volatile_int has type 'volatile int'

was previously mistranslated to:

(&some_volatile_int).*

even though it is "structurally" correct and the type of &some_volatile_int is reprensentable in zig as *volatile c_int -- the only reason that the zig code was wrong in the first place is that the translate-c module didn't explicitly specify the type at that point.

The issue is that by not explicitly casting on usage, the way the pointer is obtained matters, which isn't really checked by translate-c. Now of course this PR diagnoses the above code by correctly restoring qualifiers when the address-of operator is used, but I'm personally not sure that every way to obtain a pointer to a cv-qualified value can be handled in this way. There are others situations wherein pointers can sort of just appear out of nowhere in C (like -> and array-to-pointer decay, but there might be others?)

If this doesn't seem like an issue or if I'm wrong, then the explicit casts for pointer usage can be dropped. Just want to make sure we're on the same page

@Vexu
Copy link
Member

Vexu commented Mar 29, 2025

Does it make sense to put this in std.zig.c_translation? (kind of like FlexibleArrayType)

Yes.

Correct me if I'm wrong, and I may have misunderstood what you meant here, but I think QualType::isVolatileQualified() only checks the "direct" qualifications of a type (with the exception of arrays, for some forsaken reason). So it doesn't return the qualifiers for the pointee. Then, excluding pointers types would actually exclude volatile pointers to (arbitrarily qualified pointee), i.e. the pointer value is volatile, which zig can't represent.

You're right, seems like I got a bit confused there.

@theofabilous
Copy link
Author

Instead of using Volatile(T).load() and Volatile(T).store(), I used .constPtr() and .ptr() to make translation easier.

.store() is non-trivial to implement because it would require keeping track of lhs/rhs in assignments and changing the translation order (rhs needs to be inside the store(...))

.load() also poses a similar problem, it's not an lvalue, but the .ptr() variant is:

const x = hw_struct.*.load();
const x = hw_struct.*.ptr().*;

hw_struct.*.load() = 0; // doesn't work
hw_struct.*.ptr().* = 0; // ok

Using .load() would require extra care around lhs/rhs stuff, but .ptr() can be used anywhere.

I still kept the load() and store() methods in the Volatile(T) type so that they can be used externally, but only emit the ptr() methods during translation.

@alexrp
Copy link
Member

alexrp commented Mar 31, 2025

I think using a wrapper struct for volatile types could also work

I don't know if this has any bearing on this PR, but I seem to recall that there are C ABIs where e.g. int x is not equivalent to struct { int value; } x. I want to say that's the case for x86-windows-msvc, maybe?

@theofabilous
Copy link
Author

I don't know if this has any bearing on this PR, but I seem to recall that there are C ABIs where e.g. int x is not equivalent to struct { int value; } x. I want to say that's the case for x86-windows-msvc, maybe?

That would likely be problematic here. Do you mean that they have different sizes/alignment? If so, that's a problem, and the few temporary tests I wrote for std.zig.c_translation.Volatile(T) will fail.

If it's related to calling conventions (which, now that I think of it, don't most ABIs have different parameter passing conventions for structs?), that can be solved by bitCast or ptrCast (which isn't currently handled)

Regardless, this made me realize that this PR doesn't handle the following simple case of assignment:

typedef volatile int mmio_int; // const mmio_int = Volatile(c_int)
mmio_int x = ...; // const x: mmio_int = ...
mmio_int y = x; // ok

// will not be a volatile load (it is in C)
volatile int y = x;

// will be a type error (assign struct to int)
int y = x;

I think I just need to add a check for lvalue->rvalue conversions in ImplicitCast or something.

If there's a parameter passing convention mismatch, I also need to make sure the function parameter types are not translated to Volatile(T).

@alexrp
Copy link
Member

alexrp commented Mar 31, 2025

That would likely be problematic here. Do you mean that they have different sizes/alignment? If so, that's a problem, and the few temporary tests I wrote for std.zig.c_translation.Volatile(T) will fail.

If it's related to calling conventions (which, now that I think of it, don't most ABIs have different parameter passing conventions for structs?), that can be solved by bitCast or ptrCast (which isn't currently handled)

My memory is fuzzy here, but I believe it has to do with the calling convention, yes. I think size/alignment are the same.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

translate-c: incorrect type for pointers to cv-qualified values
3 participants