
add sys.long_info #896

Open
wants to merge 3,125 commits into master
Conversation

@aisk (Contributor) commented Sep 4, 2015

Implement sys.long_info.

Since we can't use CPython's long object implementation, just put two constant numbers in sys.long_info to make run_test_long pass. (These two numbers are taken from a 64-bit OS X CPython's sys.long_info.)

kmod and others added 30 commits August 6, 2015 23:45
Refactor location and names of gc visit functions.
add __doc__ attribute to capifunc
and friends.  Should be pretty much the last step for letting
calls go all the way through our runtime using CAPI exceptions.
note: people need to `find -name '*.pyston.so' -delete`, since although
I think we correctly know to try to rebuild the sharedmodules,
distutils won't end up rebuilding it.
Previously, if a CAPI-style callsite tried to call a function but
only CXX versions were available (and vice versa), it would bail
on rewriting.  Now enable CAPI rewrites of CXX functions by using
a helper function, and enable CXX rewrites of CAPI functions by
just calling checkAndThrowCAPIException like we normally do.
I think we track the dependency correctly in cmake, but distutils
doesn't know that the system headers can change from underneath it,
so manually specify them as extra dependencies.
Rewrite unaryop and property get
This all showed up by running test/cpython/test_generators.py;
sadly there are still a few failures until we can run the whole test.
Which piggy-back on the integration tests for getting their dependencies.
Our IR doesn't explicitly represent the data transfer between an Invoke
statement and its corresponding LandingPad.  We use a couple different
techniques to pass it through: in the ast interpreter/bjit, we stash it
into an interpreter-local variable and then pull it back out.  Previous
to this change, in the LLVM tier we would pass it directly through
an exception, using either the C++ or CAPI exception-passing mechanism.

This works but is a pain, since it requires coordination between the invoke
and the landingpad.  These live in different basic blocks, so we ended up
having this other code that lives separate from the normal irgen that has
to decide which exception style to use, and it has to respect certain
restrictions within irgen (ie it has to be careful to not request CAPI
exceptions for cases that we haven't added support for yet).

This commit changes the approach so that the exception data is passed
directly as LLVM Values, and it's up to the Invoke to figure out how
to get it into that form.  This adds a bit more complexity to the invoke,
but it should make the interface easier to extend (such as in the next commit).
In theory should help with pyxl which throws a decent number
of StopIterations from calling generator.next() directly, but
pretty few of those calls actually make it into the llvm JIT
to benefit from this.
Be able to jit functions that throw CAPI exceptions
It had gotten pretty ad hoc.  There were two things it was doing
- some things dealing with unwinding; moved those to codegen/unwinding.cpp
- most of our exception-throwing logic; moved these to a new runtime/exceptions.cpp
kmod and others added 22 commits September 1, 2015 13:58
when calling a BoxedWrapperDescriptor don't create a BoxedWrapperObject
Notion of redundant visits to slowly move towards scanning everything
(motivated by namedtuple)

This involves two main changes:
- changing the calling convention to pass `globals` as an argument if needed
  (this only applies going into compiled code, it's already passed into the interpreter)
- changing the llvm irgenerator to use the new globals object
It does use the old parser, but it also forces the use
of the llvm tier for everything which usually ends up being
the more important part of the configuration.
With no modifications in this commit, so that we can track changes.
Unmodified from llvm rev r230300.
- add note + license
- put in our namespace
- change header guards
- make it match our lint style
- don't apply our formatting
The default is 64, which is quite a lot for our uses.
By switching to our DenseMap/Set fork, which allows parameterizing
this.

This is the same number that CPython uses.  We were previously
using llvm's default of 64, which is pretty high -- probably fine
for their use cases, but in Python programs with lots of sets/dicts
we spend a lot of time doing malloc() for all those 1kB+ allocations.
This also increases the time it takes to iterate over the dict/set,
since there are more empty buckets that have to be read + skipped.
Support non-module-globals in the llvm tier
Lower initial dict/set size from 64->8
One of the problems was that, depending on whether all files were already parsed, we JITed more or fewer functions;
this changed the total number of JITed functions, which changed all the module names.
As a result, if the pyc files were stale, we needed at least 3 runs to get all entries into the cache.

The naming scheme should be improved at some point, but for now this should be enough.
add tp_iternext implementation for our iterators
Generate better module names to make the object cache more effective
Add override keyword where necessary.
@@ -17,6 +17,7 @@
#include <cmath>
#include <langinfo.h>
#include <sstream>
#include <gmp.h>
Contributor:
Do we really need gmp.h here?
BTW, can we just hardcode the sys.long_info values in test_long? We can implement it later; maybe we will use a different approach to implement long in the future.

Contributor Author:

Hardcoding sys.long_info in the test is OK, but I think more test cases would pass if sys.long_info were actually available (maybe no other test cases depend on it, LOL, so I don't know which choice is suitable).

@kmod (Collaborator) commented Sep 4, 2015

Yeah, it's tricky to decide what we should do with this. It's hard to imagine what kind of Python code would rely on this metadata (as opposed to something like sys.maxint which is a bit more reasonable) -- but if there is any code out there that actually relies on the values in this field, filling it with arbitrary numbers seems like it would break the code just as much as it not existing. So I'm in favor of not adding sys.long_info to say "hey we don't have that metadata in any meaningful way" rather than making something up for it.

I think the cpython test only uses it in a pretty superficial way, so we could probably just replace the single usage of sys.long_info.bits_per_digit with 64. If we did want to support sys.long_info, I think the correct values to use are to set sizeof_digit to sizeof(mp_limb_t) and bits_per_digit to 8 * sizeof(mp_limb_t) (I think these are correct but I'm not really sure). But since we're not really using a "digits-based" representation (we just have a single mpz_t black-box object) I'd prefer to just not add sys.long_info for now.
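kmod's tentative values can be sketched in Python. Note the stand-in here is an assumption: `ctypes.c_ulong` plays the role of GMP's `mp_limb_t`, which on common 64-bit Unix builds is a native machine word:

```python
import ctypes

# Hypothetical stand-in for sizeof(mp_limb_t): assume GMP limbs are
# unsigned longs, as on common 64-bit Unix builds (8 bytes there).
sizeof_limb = ctypes.sizeof(ctypes.c_ulong)

# kmod's proposed long_info fields under that assumption:
sizeof_digit = sizeof_limb          # sizeof(mp_limb_t)
bits_per_digit = 8 * sizeof_limb    # 8 * sizeof(mp_limb_t)
print(sizeof_digit, bits_per_digit)
```

This only makes sense if Pyston's mpz_t-backed longs are thought of as arrays of full limbs, which is exactly the "digits-based representation" kmod notes Pyston doesn't really have.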

@aisk (Contributor Author) commented Sep 5, 2015

@kmod Should I just add these two numbers in cpython/long_test.py?

@kmod (Collaborator) commented Sep 22, 2015

Sorry, yeah I would recommend replacing the existing SHIFT = sys.long_info.bits_per_digit with SHIFT = 64 (along with a "Pyston change" comment) in the test file.

6 participants