Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

C stack overflow in error handler #24

Open
jcelliott opened this issue Jan 31, 2014 · 11 comments
Open

C stack overflow in error handler #24

jcelliott opened this issue Jan 31, 2014 · 11 comments

Comments

@jcelliott
Copy link

When calling a lua function that errors, the C stack is not cleaned up. If the lua function is called many times, it will eventually overflow the stack. Any further calls to the error handler result in the error "error in error handling" being returned.

This is the smallest test case I could come up with to reproduce it. I'm on Linux using lua 5.1.5 and the most recent version of golua.

error.go:

package main

import "github.com/aarzilli/golua/lua"
import "fmt"

func main() {
    L := lua.NewState()
    L.OpenLibs()
    if err := L.DoFile("error.lua"); err != nil {
        panic(err)
    }

    // 120 worked on my machine, might be different for you
    for i:=0; i<120; i++ {
        top := L.GetTop()
        L.GetGlobal("doError")
        err := L.Call(0, lua.LUA_MULTRET)
        if err != nil {
            fmt.Println("error:", err)
            fmt.Println("stack length:", len(L.StackTrace()))
            // fmt.Println("stack:", L.StackTrace())
        }
        L.SetTop(top)
    }
}

error.lua:

function doError()
    error("something bad")
end
@beatgammit
Copy link

Is there any progress on this? This is kind of a big deal for long-running servers that use lua scripting for plugins (that are always loaded).

If you don't have time to fix this now, could you explain where to look to fix it?

@aarzilli
Copy link
Owner

It's pretty complicated, the root of the problem is that when luavm hits an error it will longjump all the way back to the last pcall. We can't let this happen, in general because if between the last pcall and the call to lua_error there is a go function the go runtime will eventually find out that the C stack changed under its feets and choke.
What I did was install a lua error handler that bypassed lua's error handling entirely using a call to panic, this is what causes your bug: the error handler never returns and the C stack is never rolled back.
It wouldn't be so much of a problem if lua_error were only called through lua code, unfortunately there are several lua API calls that can end up calling lua_error if the call was not set up correctly.

@beatgammit
Copy link

I'm working around this now by closing and re-initializing scripts when the "C stack overflow" error is reported. I don't see any memory leakage on initial tests, so this may be acceptable.

Perhaps this should be noted in the README?

@siddontang
Copy link

I replace lua error with my own error handler, and find no “c stack overflow”, maybe.

func doError(l *lua.State) int {
    message := l.ToString(-1)

    panic(fmt.Errorf(message))
}

L.Register("error", doError)

siddontang added a commit to ledisdb/ledisdb that referenced this issue Sep 3, 2014
error may cause c stack overflow
aarzilli/golua#24
@navy1125
Copy link

navy1125 commented Sep 7, 2015

https://github.com/navy1125/golua
I replace panic to error string , it can resolve this problem ugly...

I'm pleasure to put a pull request if nessary

@aarzilli
Copy link
Owner

aarzilli commented Sep 8, 2015

Could you split the go fmt from the changes? It's hard to see what you patched like this. Thank you.

@navy1125
Copy link

navy1125 commented Sep 9, 2015

sorry ,last two commits

@aarzilli
Copy link
Owner

aarzilli commented Sep 9, 2015

There are two problems with this approach:

  1. changing the behavior of lua.(*State).RaiseError like you did breaks API backwards compatibility (this could be overlooked or worked around if problem n. 2 didn't exist)
  2. this approach only works for errors raised from go, if lua raises an error and we let lua's error handling continue it will do a longjmp which will make cgo choke. This can be seen by running TestCheckStringFail in lua/lua_test.go.

@andrewhare
Copy link

I have a fix for this issue: #70

@Shados
Copy link

Shados commented Sep 6, 2019

@aarzilli the correct (but somewhat annoying) solution is to wrap FFI calls in both directions, with wrappers written in the foreign language that handle converting the exception-propagation mechanism appropriately.

Specifically:

  • Lua->Go calls: for any called Go function that might panic, wrap it in a Go function that recovers the panic, convert it to an error value, return it to the Lua side, and have the Lua code convert it to a Lua error*
  • Go->Lua calls: in any place where a Lua C API call may trigger a longjmp, you have to wrap that call with a lua_pcall (thus effectively capturing the error and converting it to a value before returning to the Go caller), and have the Go code convert any returned error value to a Go error

I believe this should ensure that the stack is always unwound appropriately, and never incorrectly done across FFI boundaries. This approach is taken by some other existing Lua embedding/binding projects.


*: Or, possibly store the information about the panic in a variant error, and modify Lua's pcall/xpcall so they can't catch these special panic-wrapped-in-errors, and have your Go->Lua wrapper convert them back to panics at the original Go caller

@rnovatorov
Copy link

rnovatorov commented Aug 27, 2021

As far as I can see, the only problem with error handling is the case when an error is raised from the Go side. In that case the Go stack is left unwound because of longjmp which results in a corrupted syscall frame. But the errors raised from the Lua side seem to be safe even if there is a completed Go function call between Lua pcall and error.

So what if we remove the panic msghandler, move the error call to the Lua side and make sure that no C API calls that can raise an error are ever made in the lib?

lua.(*State).PCall then would become something like this:

func (L *State) PCall(nargs, nresults int) error {
	r := L.pcall(nargs, nresults, 0)
	if r != 0 {
		msg := L.ToString(-1)
		st := L.StackTrace()
		L.Pop(1)
		return &LuaError{r, msg, st}
	}
	return nil
}

The "move error to the Lua side" part could be implemented by using debug hooks called on function return and a global variable containing an error:

debug = require("debug")
debug.sethook(function()
	if _LUAGO_ERROR ~= nil then
		local err = _LUAGO_ERROR
		_LUAGO_ERROR = nil
		error(err)
	end
end, "r")

On the client we will need to set the value of the global variable and successfully complete the call:

L.Register("foo", func(L *State) int {
	if !L.IsNumber(-1) {
		v := L.ToString(-1) // won't handle anything other than strings, but it is just an example
		L.PushString(fmt.Sprintf("oh no, %q is not a number", v))
		L.SetGlobal("_LUAGO_ERROR")
	}
	return 0
})

L.GetGlobal("foo")
L.PushString("oops")
if err := L.PCall(1, 0); err != nil {
	fmt.Println("error:", err)
}

Please, correct me if I am missing something.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants