Skip to content

refactor LambdaInfo #14913

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 6 commits into from
Closed

refactor LambdaInfo #14913

wants to merge 6 commits into from

Conversation

vtjnash
Copy link
Member

@vtjnash vtjnash commented Feb 2, 2016

eventually, MethodInfo should be able to merge with Method in the MethodTable->defs
and LambdaInfo (MethodSpecialization?) should be able to replace Method in the MethodTable->cache

the difference between these types is very small in usage (resulting in a small amount of code duplication), but previously jl_lambda_info_t and been performing two nearly completely independent functionalities, identified only by which fields were active. this is a minor but annoying breaking internal change, so I figured doing it on the heals of the big renaming of LambdaStaticData was perhaps the best time to consider this.

the savings is small, but not entirely insignificant (~2% of sysimg)

# specTypes::Any
# unspecialized::LambdaInfo
# def::MethodInfo
# pure::Bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be removed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the pure field, as it exists currently, is essentially a placeholder for "ast properties that inference wants to stash somewhere"

@JeffBezanson
Copy link
Member

+1 Looks like a good direction to go, and I agree now is the time to do it. How soon do you think we can merge Method and MethodInfo? Would be good to get through as much of that change as possible.

@vtjnash
Copy link
Member Author

vtjnash commented Feb 2, 2016

I think I can replace the method caches with MethodSpecialization (renamed from LambdaInfo) fairly quickly. I think merging Method and MethodInfo could be pretty quick if I simply merge their fields and then fill them in as they become known. I think the tricky case is the code pattern from #10373, where some properties of the Method get filled in differently on each iteration (name, signature, tvars)

one other re-arrangement here that I think could be considered is reintroducing LambdaInfo as a limited subset of its current self that exactly replaces the existing ast field with fields. i'm not really certain how that should look (which is why i started this as an RFC).

@JeffBezanson
Copy link
Member

Let's keep LambdaInfo; I think MethodSpecialization is too long. I think the LambdaInfo you have so far here is very close to its final form, and should be used by all producers and consumers of ASTs. Its inner "ast" field will actually be an array of statements, possibly encoded compactly. This array of statements can't be used on its own, and will always need the context info provided by LambdaInfo.

@StefanKarpinski
Copy link
Member

Making methods more first class has the potential to simplify a number of reflection APIs. One that occurs to me off the top of my head is being able to call a Method on any arguments that satisfy its dispatch constraints. That would allow invoke to be defined in terms of finding a Method and calling it.

@JeffBezanson
Copy link
Member

I think the real way to do invoke is to make a new function that contains only the selected method. Then specialization falls out naturally the way you expect. The question is where to store these per-method functions; currently we have a special field for caching them inside Method objects, which is what makes the feature somewhat "built in". It seems hard to make invoke efficient without that field.

@vtjnash vtjnash changed the title rfc / wip: split part of LambdaInfo into MethodInfo move much of LambdaInfo into Method Feb 3, 2016
eventually, MethodInfo should be able to merge with Method in the MethodTable->defs
and LambdaInfo should be able to replace Method in the MethodTable->cache
MethodTable.defs is now a linked list of Method objects
and MethodList now contains LambdaInfo specializations, not Methods
…er of the uninferred ast

also rename li->unspecialized to li->unspecialized_ducttape, to make clear that this field is a temporary workaround until the interpreter is fully-functional
@vtjnash vtjnash changed the title move much of LambdaInfo into Method refactor LambdaInfo Feb 5, 2016
@vtjnash
Copy link
Member Author

vtjnash commented Feb 5, 2016

I've now tried roughy 4 different variations on this idea to try out some different combinations of field divisions. fwiw, the current situation is roughly:

type Method
  next::Method
  sig::Type
  # fields for (cache) + reference to LambdaInfo (cache)
  # fields for (def) + reference to LambdaInfo (def)
end

type LambdaInfo
  ast
  # fields for (cache) + reference to LambdaInfo (def)
  # fields for (def)
end

# where (def) means reached as an entry from MethodTable.defs
# and (cache) means reached as an entry from MethodTable.cache

the commits vary in how they rearrange the fields among various old and new types. for example, the design of the latest commit (a202fd1 "merge LambdaInfo and MethodList, but split off AstInfo") is:

type Method # one entry in a MethodTable of a Function
  next::Method
  sig::Type
  code::AstInfo
  # fields for def
end # was LambdaInfo (def) + Method (def)

type AstInfo # a syntax tree
  def::Method
  ast
  # metadata for ast
end # was `linfo.ast`

type LambdaInfo # an executable thunk (also may be one entry in a cache)
  next::LambdaInfo
  sig::Type
  rettype::Type
  func # code (e.g. an AstInfo, but could be #undef for a builtin)
  sparam_vals # really a closure-env
  fptrs::Ptr{Void}
end # was LambdaInfo (cache) + Method (cache)

// used to avoid infinite recursion
// sparam_vals is the closure data (a vector of values indexed by func->def->sparam_syms)
jl_svec_t *sparam_vals; // NULL if this is a guard entry
jl_value_t *func; // where this function came from, NULL if there isn't an associated definition
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really understand this field. The fact that it's not always the same type makes me a bit suspicious.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this commit was an exploration in making LambdaInfo the minimal set of fields to make it a purely executable thunk (without any extraneous metadata like an AST, name, debug info, etc.)

it was an abstract representation of the contract fptr(this, sig) -> rettype where fptr = compile(func)

@JeffBezanson
Copy link
Member

I see that it makes sense to separate a code object from something that is actually executable, due to (1) a Method definition only contains "raw code" and nothing you can actually execute yet, (2) being executable brings in lots of extra objects (pointers to native code), (3) we need to add static parameters. But I'm not yet 100% sold on splitting LambdaInfo.

Issue 1: IIUC, LambdaInfo.sig is sometimes the types the code was specialized on, and sometimes a method cache key. I find this confusing. I also think we should be able to efficiently use the same compiled code for many method cache entries, by pointing to the same object. Especially with jb/functions there can be bad cases where we might need a large number of such entries. We could, for example, represent the method cache like a typical linear probing hash table, with alternating keys (types) and LambdaInfo pointers in a flat array. I think that might also resolve my confusion about the func field, since it could then always be an AstInfo.

Issue 2: We should probably stop calling these things ASTs. They are not syntax trees, and could even become bytecode. If we do this split I think AstInfo should be called something like Code, and LambdaInfo something like Runnable. Those names probably aren't ideal, but they should clearly convey the difference between the two types.

@vtjnash
Copy link
Member Author

vtjnash commented Feb 10, 2016

Issue 1: IIUC, LambdaInfo.sig is sometimes the types the code was specialized on, and sometimes a method cache key....

agreed. i was exploring whether it could make sense to merge the two ideas (as can easily be done for Method) and remove the distinction between your two usages. the experiment worked and provided some intriguing flexibility in what a LambdaInfo could represent / do, but i have mostly concluded that it was not an overall improvement.

my current plan forward was to scratch that last commit, keeping only a few things from it:

  1. removal of unspecialized logic from gf.c (i liked how it could become a property of codegen, rather than method dispatch. i believe this was a nice reduction in code duplication, at the expense of possibly making calling a function generator marginally more expensive).
  2. giving CodeInfo (née AstInfo) a unique type. per discussions with @carnaval, we should try to design this such that we aren't modifying the LambdaInfo during type-inferenece as an additional criterion / benefit (i'm not sure exactly what he wants there, however)

Issue 2: We should probably stop calling these things ASTs.

sure, it seems reasonable to switch to a more generic term such as IR or Code for this

@vtjnash
Copy link
Member Author

vtjnash commented Apr 12, 2016

replaced by #15779

@vtjnash vtjnash closed this Apr 12, 2016
@vtjnash vtjnash deleted the jn/astinfo branch April 12, 2016 16:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants