Windows bug: hangs appveyor build #12109

Closed
wants to merge 1 commit into from

Conversation

MikeInnes
Member

No description provided.

@yuyichao
Contributor

I was excited when I saw the title because it made me think that you had found the real cause and fixed it...

@tkelman
Contributor

tkelman commented Jul 10, 2015

actually it also happened on the most recently completed build on master - https://ci.appveyor.com/project/StefanKarpinski/julia/build/1.0.6734/job/41e1ujwp78e22yaw

that was merging #11925, another case where we need integration-test-then-merge, not the other way around.

@kshyatt kshyatt added the system:windows Affects only Windows label Jul 10, 2015
@MikeInnes
Member Author

@yuyichao Looks like I did fix it after all – weird. Not sure if this is safe to merge or not now.

@@ -12,7 +12,7 @@ const modules = Module[]

 const META′ = :__META__

-meta(mod) = mod.(META′)
+@eval meta(mod) = mod.$META′
Member

This is so weird. Why doesn't meta(mod) = mod.__META__ work?

Contributor

I think the real issue is probably some corner case in llvm 3.3, and tweaking here and there just happened to make it appear and disappear...

Contributor

This might actually be more intermittent than I first thought it was. The startup freezes are frequent enough that from a usability standpoint it might even be release-blocking. Whatever the issue is appears to be deep in the old LLVM JIT's memory management, as far as we've been able to tell so far.

Contributor

I did a search for related llvm issues but didn't find anything (not so surprising given we're probably one of the heaviest llvm jit users).

@tkelman Were you able to learn anything from the buffer pointers in llvm? (The retry condition is a pointer comparison IIRC and I'm wondering if we can get a hint from how they are changed in the infinite loop.)

Contributor

No not really, I have no clue what to look for. Dunno whether @ihnorton has been able to catch the freeze in gdb like I've done a handful of times. Or you could probably head in and do some work on the dual-boot laptop on @andreasnoack's desk during the week.

Contributor

@tkelman

> No not really, I have no clue what to look for.

Since I can't see anything obviously wrong from reading the code, the only thing I can suggest is to print the following fields of MCE in the runOnMachineFunction loop you are stuck in: before and after calling MCE.startFunction(MF), as well as before calling MCE.finishFunction(MF) (so three points in the loop).

ActualSize;
SizeEstimate;
BufferBegin;
SavedBufferBegin;
BufferEnd;
SavedBufferEnd;
CurBufferPtr;
SavedCurBufferPtr;

I don't remember how gdb prints C++ classes by default, and it might be easier to print out the whole class.

Watching the address of these might also be helpful but that might generate too much noise before we know more details.

If no one else has a better idea, this should at least help us narrow down the problem.
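
To make the three logging points concrete, here is a minimal, self-contained sketch of a retry loop of the kind being discussed. It is not LLVM code: `MockEmitter`, `emitWithRetry`, the size-doubling policy, and the `maxPasses` cap are all assumptions made for illustration. Only the pointer field names, the three places where they are printed, and the idea that the retry condition is a pointer comparison come from the comments above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

// Hypothetical stand-in for the emitter called "MCE" in the thread.
// Field names mirror the ones listed above; the behavior is assumed.
struct MockEmitter {
    std::vector<std::uint8_t> storage;
    std::uint8_t *BufferBegin = nullptr;
    std::uint8_t *BufferEnd = nullptr;
    std::uint8_t *CurBufferPtr = nullptr;
    std::size_t SizeEstimate;

    explicit MockEmitter(std::size_t initial) : SizeEstimate(initial) {}

    // Allocate a fresh buffer of SizeEstimate bytes and reset the cursor.
    void startFunction() {
        storage.assign(SizeEstimate, 0);
        BufferBegin = storage.data();
        BufferEnd = BufferBegin + storage.size();
        CurBufferPtr = BufferBegin;
    }

    // Write one byte; once the buffer is full the cursor stays at BufferEnd.
    void emitByte(std::uint8_t b) {
        if (CurBufferPtr != BufferEnd) *CurBufferPtr++ = b;
    }

    // Returns true when the caller must retry. The check is a pointer
    // comparison: a cursor sitting at BufferEnd is treated as an overflow.
    bool finishFunction() {
        if (CurBufferPtr == BufferEnd) {
            SizeEstimate *= 2;  // assumed growth policy; 0 never grows
            return true;
        }
        return false;
    }

    void dump(const char *where) const {
        std::fprintf(stderr, "%s: BufferBegin=%p CurBufferPtr=%p BufferEnd=%p SizeEstimate=%zu\n",
                     where,
                     static_cast<const void *>(BufferBegin),
                     static_cast<const void *>(CurBufferPtr),
                     static_cast<const void *>(BufferEnd),
                     SizeEstimate);
    }
};

// Emits codeSize bytes, logging at the three suggested points.
// Returns the number of passes, or -1 if maxPasses (a hypothetical
// guard, not part of any real emitter) is reached without finishing.
int emitWithRetry(MockEmitter &mce, std::size_t codeSize, int maxPasses = 64) {
    int passes = 0;
    do {
        if (passes == maxPasses) return -1;  // loop is not converging
        ++passes;
        mce.dump("before startFunction");
        mce.startFunction();
        mce.dump("after startFunction");
        for (std::size_t i = 0; i < codeSize; ++i) mce.emitByte(0x90);
        mce.dump("before finishFunction");
    } while (mce.finishFunction());
    return passes;
}
```

In this mock, a 4-byte initial estimate needs several passes to fit 10 bytes, and an estimate of 0 never grows, so the loop only ends because of the cap. Whether anything similar happens in the real emitter is exactly what printing these fields would help establish.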

> Or you could probably head in and do some work on the dual-boot laptop on @andreasnoack's desk during the week.

I'm afraid not in the coming weeks. (Weekends work for me though...)

@vtjnash vtjnash closed this Jul 16, 2015
@vtjnash
Member

vtjnash commented Jul 16, 2015

this does not seem like it can possibly be the real fix for the windows hang

@ihnorton
Member

I believe the point of this was to demonstrate a one-line change that
causes the build to hang, not to fix the larger ongoing issue.


@tkelman tkelman deleted the omm/windows-bug branch March 22, 2016 12:35
Labels
system:windows Affects only Windows
7 participants